# Help with Russian Book displayed in Cyrillic on Kindle 3



## surfside8605 (Apr 9, 2011)

I have downloaded several books from bookz.ru and the format of the .txt files from the site is inconsistent. Most of their book links offer a free download option, but after download some of the text files do not appear to be in the format the Kindle 3 needs to display the text in Cyrillic. The characters in the .txt files that work properly also look different from those in the files that don't. I was hoping someone could help me figure out what the difference is between the .txt files that work and those that come out as gibberish. I tried running the non-working files through Calibre and still have not seen any change after a few conversions. I am not highly experienced at this, so I would be very grateful for any help you can provide.

I think the best thing would be for me to give an example so maybe you guys could compare the two types of .txt files I'm downloading.

For example, if you visit bookz.ru and download Janusz Wisniewski - Mistress, the downloaded .txt has the proper format and displays Cyrillic on the Kindle. However, when I try other books such as Gabriel Garcia Marquez - One Hundred Years of Solitude or Oscar Wilde - The Picture of Dorian Gray, the .txt file I get results in gibberish on the Kindle, and the .txt also looks very different when viewed in Notepad.

Thanks for all your help with this!


----------



## WrongTale (Feb 16, 2011)

I've been told that you should change the encoding of the .txt file from ANSI to UTF-8 when doing a Save As...
Later, when converting the file with Calibre, you should choose UTF-8 in Look and Feel / Input Character Encoding.

I have not tested it myself, as the few Russian files I have are readable fine on Kindle.


----------



## surfside8605 (Apr 9, 2011)

Thanks for your help man. I gave it a try. Converted to UTF-8 using save as. Then I used Calibre to convert to .mobi after changing the look and feel input setting to UTF-8. Still getting gibberish. Here's the difference I'm seeing in the .txt documents before and after conversion. I will paste the characters here so you can see.

*A working file that appears in proper Cyrillic looks like this as a .txt:*

Òðèäöàòü ëåò íàçàä ýòî ñ÷èòàëîñü ôàíòàñòèêîé.

Òðèäöàòü ëåò íàçàä ýòî ÷èòàëîñü êàê ôàíòàñòèêà. Èññëåäóþùàÿ è ðàñøèðÿþùàÿ ãðàíèöû æàíðà, æàäíî âïèòûâàþùàÿ âñåâîçìîæíûå íîâåéøèå âåÿíèÿ, ïðèìåðÿþùàÿ îáùå÷åëîâå÷åñêîå ëèöî, îòâàæíî èãíîðèðóþùàÿ êàèíîâó ïå÷àòü «æàíðîâîãî ãåòòî».

Ñåé÷àñ ýòî âîñïðèíèìàåòñÿ êàê îäíî èç ñàìûõ ÷åëîâå÷íûõ ïðîèçâåäåíèé íîâåéøåãî âðåìåíè, êàê ðîìàí ïðîíçèòåëüíîé ïñèõîëîãè÷åñêîé ñèëû, êàê ôèëèãðàííîå ðàçâèòèå òåìû ëþáâè è îòâåòñòâåííîñòè.

*The files that don't seem to work at all and are displaying gibberish look like this as .txt:*

¥à¥¢®¤ á ­£«. Œ.€¡ª¨­

…„ˆ‘‹Ž‚ˆ…

•ã¤®¦­¨ª -- â®â, ªâ® á®§¤ ¥â ¯à¥ªà á­®¥.
 áªàëâì «î¤ï¬ á¥¡ï ¨ áªàëâì åã¤®¦­¨ª -- ¢®â ª  ç¥¬ã áâà¥¬¨âáï
¨áªãááâ¢®.
Šà¨â¨ª -- íâ® â®â, ªâ® á¯®á®¡¥­ ¢ ­®¢®© ä®à¬¥ ¨«¨ ­®¢ë¬¨ áà¥¤áâ¢ ¬¨
¯¥à¥¤ âì á¢®¥ ¢¯¥ç â«¥­¨¥ ®â ¯à¥ªà á­®£®.
‚ëáè ï, ª ª ¨ ­¨§è ï, ä®à¬ ªà¨â¨ª¨ -- ®¤¨­ ¨§ ¢¨¤®¢ ¢â®¡¨®£à ä¨¨.
'¥, ªâ® ¢ ¯à¥ªà á­®¬ ­ å®¤ïâ ¤ãà­®¥, -- «î¤¨ ¨á¯®àç¥­­ë¥, ¨ ¯à¨â®¬
¨á¯®àç¥­­®áâì ­¥ ¤¥« ¥â ¨å ¯à¨¢«¥ª â¥«ì­ë¬¨. â® ¡®«ìè®© £à¥å.
'¥, ªâ® á¯®á®¡­ë ã§à¥âì ¢ ¯à¥ªà á­®¬ ¥£® ¢ëá®ª¨© á¬ëá«, -- «î¤¨
ªã«ìâãà­ë¥. Ž­¨ ­¥ ¡¥§­ ¤¥¦­ë.

Here's a direct link to an example of one of the .txt files on bookz.ru that is having these problems...

http://translate.googleusercontent.com/translate_c?hl=en&sl=ru&u=http://bookz.ru/dl2.php%3Fid%3D15229%26t%3Dz%26g%3D6%26f%3Ddoriangr%26a_id%3D1539&prev=/search%3Fq%3Dbookz.ru%26hl%3Den%26prmd%3Divns&rurl=translate.google.com&usg=ALkJrhiE6TBmT1yREPMlTbPdlQJhXeTfIw

Thanks again man.


----------



## SusanCassidy (Nov 9, 2008)

You have to know what encoding the original book has, and use something to convert to UTF-8 from that encoding, if it is not already UTF-8.  Otherwise, it does not know how to interpret the codes used for the characters.
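As a rough illustration of that idea, here is a minimal Python sketch. It assumes you already know (or have guessed) the source encoding; the function name `ensure_utf8` is made up for this example and is not part of Calibre or any other tool mentioned in this thread.

```python
def ensure_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Return the bytes of a text file re-encoded as UTF-8.

    If the bytes already decode cleanly as UTF-8 they are returned unchanged.
    Otherwise they are interpreted using source_encoding (for example
    "cp1251") and then re-encoded as UTF-8.
    """
    try:
        raw.decode("utf-8")  # strict decoding raises on invalid UTF-8
        return raw
    except UnicodeDecodeError:
        return raw.decode(source_encoding).encode("utf-8")


# Example: a short Russian word stored as Windows-1251 bytes.
converted = ensure_utf8("Привет".encode("cp1251"), "cp1251")
print(converted.decode("utf-8"))
```

The "already UTF-8" check is only a heuristic: a very short file in another encoding could, in rare cases, happen to be valid UTF-8 as well.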


----------



## surfside8605 (Apr 9, 2011)

Thanks for your help Susan. Do you know of any way I can figure out what these .txt files are encoded as?


…„ˆ‘‹Ž‚ˆ…

    •ã¤®¦­¨ª -- â®â, ªâ® á®§¤ ¥â ¯à¥ªà á­®¥.
     áªàëâì  «î¤ï¬  á¥¡ï  ¨  áªàëâì  åã¤®¦­¨ª  --  ¢®â  ª  ç¥¬ã áâà¥¬¨âáï
¨áªãááâ¢®.
    Šà¨â¨ª -- íâ® â®â, ªâ® á¯®á®¡¥­ ¢ ­®¢®©  ä®à¬¥  ¨«¨  ­®¢ë¬¨  áà¥¤áâ¢ ¬¨
¯¥à¥¤ âì á¢®¥ ¢¯¥ç â«¥­¨¥ ®â ¯à¥ªà á­®£®.
    ‚ëáè ï, ª ª ¨ ­¨§è ï, ä®à¬  ªà¨â¨ª¨ -- ®¤¨­ ¨§ ¢¨¤®¢  ¢â®¡¨®£à ä¨¨.
    ’¥,  ªâ®  ¢  ¯à¥ªà á­®¬  ­ å®¤ïâ  ¤ãà­®¥, -- «î¤¨ ¨á¯®àç¥­­ë¥, ¨ ¯à¨â®¬
¨á¯®àç¥­­®áâì ­¥ ¤¥« ¥â ¨å ¯à¨¢«¥ª â¥«ì­ë¬¨. â® ¡®«ìè®© £à¥å.
    ’¥, ªâ® á¯®á®¡­ë  ã§à¥âì  ¢  ¯à¥ªà á­®¬  ¥£®  ¢ëá®ª¨©  á¬ëá«,  --  «î¤¨
ªã«ìâãà­ë¥. Ž­¨ ­¥ ¡¥§­ ¤¥¦­ë.

The .txt files that fail to work look like this, and even when I save them in Notepad as UTF-8, I see no change. The text still appears this way. So I am guessing that is why I am having no success. Is there a way to figure out what this encoding is and change it?
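One possible explanation for why Save As changes nothing, sketched in Python. It assumes the failing file is really stored in DOS code page 866 (the sample above is consistent with that) and that the editor is interpreting it as Windows-1252, which is what "ANSI" means on a Western system. Under those assumptions, saving as UTF-8 faithfully stores the wrong characters rather than repairing them.

```python
# The word "Художник" as it would be stored in a CP866 (DOS Cyrillic) file.
raw = "Художник".encode("cp866")

# An editor that assumes Windows-1252 shows these bytes as Western symbols.
misread = raw.decode("cp1252")
print(misread)

# "Save As UTF-8" writes out the symbols the editor is showing, so the
# gibberish is preserved, only stored with different bytes.
saved = misread.encode("utf-8")
still_gibberish = saved.decode("utf-8")
print(still_gibberish == misread)

# Interpreting the original bytes with the right code page recovers the text.
recovered = raw.decode("cp866")
print(recovered)
```

The fix therefore has to start from the original downloaded file, read with the correct source encoding, not from a copy that was already re-saved.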

Thanks again.


----------



## SusanCassidy (Nov 9, 2008)

There's no easy way.  .txt is especially difficult because the file carries no internal metadata of any kind, so nothing in it says which encoding was used.  You could try a few of the more common Cyrillic encodings.  Wikipedia has lots of info on encodings.  I don't remember the names of them off-hand, and I'm not at work right now.
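Trying a few common Cyrillic encodings can be automated. The Python sketch below decodes the file bytes with each candidate and keeps the one whose result looks most like Russian. The candidate list and the scoring rule (reward lowercase Russian letters, penalise other non-ASCII characters) are assumptions made for this example, not an established detection algorithm, and it can guess wrong on short or unusual texts.

```python
from typing import Optional

# Python codec names for encodings commonly used for Russian text.
CANDIDATES = ["utf-8", "cp1251", "cp866", "koi8_r", "iso8859_5", "mac_cyrillic"]


def russian_score(text: str) -> int:
    """Crude measure of how much a string looks like Russian prose."""
    score = 0
    for ch in text:
        if "а" <= ch <= "я" or ch == "ё":
            score += 1  # lowercase Russian letter
        elif ord(ch) > 127 and not ("А" <= ch <= "Я" or ch == "Ё"):
            score -= 1  # non-ASCII character that is not a Russian letter
    return score


def guess_encoding(raw: bytes) -> Optional[str]:
    """Return the best-scoring candidate, or None if nothing looks Russian."""
    best_name, best_score = None, 0
    for name in CANDIDATES:
        try:
            text = raw.decode(name)
        except UnicodeDecodeError:
            continue  # these bytes are not valid in this encoding
        score = russian_score(text)
        if score > best_score:
            best_name, best_score = name, score
    return best_name


sample = "Художник -- тот, кто создает прекрасное."
print(guess_encoding(sample.encode("cp866")))
```

To use it on a real file, read the bytes with `open(path, "rb").read()` and pass them to `guess_encoding`.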

Notepad is probably not the best app to try.  I don't remember seeing any options to change encodings.  Try Word or Wordpad, OpenOffice or something fancier.  There are command-line tools for Linux that are very helpful in converting encodings, like iconv.  Cygwin may support them.


----------



## Bjorn2Read (Mar 24, 2011)

This is how I handle Cyrillic TXT files, and so far encountered no problems:
1. Right-click on the TXT file, choose "Open with..." and choose MS Word. You will be presented with a conversion dialog box. Say "Yes" to whatever it suggests.
2. Save the converted document in Word format (DOC). Do any text/format cleanup if necessary. Save.
3. Use "Save As..." command and save the file as "Filtered HTML".
4. Open the resulting HTML file in an HTML editor (I use Notepad++) and find the character encoding tag (at the top of the file, "char-set" or something like that). Change it to "Windows-1251" (that's Cyrillic) from whatever it is, most likely "Windows-1252" (that's Western). Save.
5. Now add the HTML file to Calibre and convert to MOBI.
This has consistently worked for me - hope it helps you!
P.S. I use Kindle 2 with added Cyrillic support - I can only hope that the "improvements" to the next version did not muss up Cyrillic capabilities...
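Step 4 can also be scripted. This is a hedged Python sketch: the helper names `set_html_charset` and `relabel_file` are invented for this example, and the regular expression assumes the declaration looks like `charset=windows-1252` inside a `meta` tag. Note that it only changes the label; it does not convert any text, so the declared charset still has to match how the file's bytes are actually encoded.

```python
import re
from pathlib import Path


def set_html_charset(html: str, charset: str = "windows-1251") -> str:
    """Rewrite the first charset=... declaration found in an HTML string."""
    pattern = re.compile(r"""(charset\s*=\s*["']?)[\w-]+""", re.IGNORECASE)
    return pattern.sub(lambda m: m.group(1) + charset, html, count=1)


def relabel_file(path: str, charset: str = "windows-1251") -> None:
    """Apply set_html_charset to a file without touching any other byte."""
    p = Path(path)
    # latin-1 maps every byte to exactly one character, so decoding and
    # re-encoding with it leaves the rest of the file byte-for-byte intact.
    html = p.read_bytes().decode("latin-1")
    p.write_bytes(set_html_charset(html, charset).encode("latin-1"))


tag = '<meta http-equiv=Content-Type content="text/html; charset=windows-1252">'
print(set_html_charset(tag))
```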


----------



## kindleman (Oct 22, 2010)

Download Notepad++ (it's free) and open your file. Open the Encoding menu and check which encoding is marked with a bullet.
You can change the text encoding with the Convert to... option. No need to convert to mobi; Kindle 3 correctly displays Cyrillic txt files encoded in UTF-8.

Added:

I apologize for being too optimistic :-(, things are a bit more complicated.
Gabriel Garcia Marquez's One Hundred Years of Solitude and many other books are encoded in the old DOS encoding, code page 866 (the OEM encoding).  
In Notepad++, select Encoding/Character sets/Cyrillic/OEM 866 and you'll be able to read the displayed text. Then go to Encoding/Convert to UTF-8 and save the file in UTF-8 encoding.
Now you can either read it on the Kindle as is (just decrease the font size and use landscape mode) or convert the text to another readable format.
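For anyone who prefers a script to Notepad++, the same CP866 to UTF-8 conversion can be sketched in Python. The function name and the file names in the usage note are placeholders for this example; the only real assumption is that the source file is in code page 866.

```python
from pathlib import Path


def convert_cp866_to_utf8(src: str, dst: str) -> None:
    """Read a CP866-encoded text file and write a UTF-8 copy of it."""
    raw = Path(src).read_bytes()             # original bytes, untouched
    text = raw.decode("cp866")               # interpret them as DOS Cyrillic
    Path(dst).write_bytes(text.encode("utf-8"))
```

Usage would look like `convert_cp866_to_utf8("book.txt", "book_utf8.txt")`. Working on raw bytes keeps line endings exactly as they were in the original file.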


----------



## NogDog (May 1, 2009)

I was under the impression that the out-of-the-box Kindle only supported the Latin1 (Western European) character set, and that if you want Cyrillic or other character sets, you'd have to install the font hack and an appropriate font set (e.g. a Unicode font). I'm by no means 100% positive, but it might be worth checking on before you go much further with this.


----------



## kindleman (Oct 22, 2010)

This is from Kindle page on amazon.com (http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M)

"Support for Non-Latin Characters
Kindle can now display Cyrillic (such as Russian), Japanese, Chinese (Traditional and Simplified), and Korean characters in addition to Latin and Greek scripts for certain file types."


----------



## NogDog (May 1, 2009)

kindleman said:


> This is from Kindle page on amazon.com (http://www.amazon.com/Kindle-Wireless-Reader-Wifi-Graphite/dp/B002Y27P3M)
> 
> "Support for Non-Latin Characters
> Kindle can now display Cyrillic (such as Russian), Japanese, Chinese (Traditional and Simplified), and Korean characters in addition to Latin and Greek scripts for certain file types."


That's certainly a step in the right direction. However, I noted this within that section:

"To view your personal documents with non-Latin characters on your Kindle, send your file as a Microsoft Word document (DOC) attachment to your Kindle's e-mail address ("name"@free.kindle.com). The file will be converted to Kindle format and sent to your Kindle via the Wi-Fi connection and also to the e-mail address associated with your Amazon.com account at no charge. See more details about Kindle's Personal Document Service via Whispernet here. Loading TXT files containing non-Latin characters over USB is currently not supported as some characters may not display properly."


----------



## kindleman (Oct 22, 2010)

NogDog said:


> Loading TXT files containing non-Latin characters over USB is currently not supported as some characters may not display properly.


Amazon people just want to hedge against possible trouble. Yesterday I loaded two plain-text Cyrillic files over USB, one in UTF-8 and another in CP 1251. UTF-8 reads perfectly well and (surprise, surprise!) CP 1251 is also readable.


----------



## *DrDLN* (dr.s.dhillon) (Jan 19, 2011)

kindleman said:


> "Support for Non-Latin Characters
> Kindle can now display Cyrillic (such as Russian), Japanese, Chinese (Traditional and Simplified), and Korean characters in addition to Latin and Greek scripts for certain file types."


Is it possible to add fonts for languages other than those mentioned above?


----------



## SusanCassidy (Nov 9, 2008)

Only via the font hack, not via normal channels.


----------

