# Kindle adding odd hyphens to scanned-in text - how can I get rid of them?



## April Henry (Dec 16, 2009)

I have five out-of-print books I've put on the Kindle (and I'm so thankful I have the opportunity to have folks read them again).

For one book, Circles of Confusion, I no longer had the original Word document available, so I paid to have it scanned in. A potential buyer took a look at the sample, and said there were odd hyphens.

She wrote me that: "On location 338-44 (76% into the sample) (when Karl Zehner shows up at the trailer when Claire and Evan are cleaning it out) the word precise is appearing as "pre-cise" and in the same paragraph, the word "methodi-cally." These are not words or phrases meant to be hyphenated, like "barrel-chested" or "cardboard-colored." A few pages later, the word "com-pany." ... When Claire first finds the painting and describes it (at 90% in, location 397-403), the words "win-dow" and "dan-gled."

But when I look at the htm file I submitted (exporting it from MS Word) I do not see these odd hyphens.

http://www.amazon.com/Circles-of-Confusion-ebook/dp/B0032UY4B6/ref=sr_1_1?ie=UTF8&s=books&qid=1263779122&sr=8-1

Can anyone help? I do not have a Kindle (sadly, although if I keep selling Kindle editions, I think I will buy one) and I don't have a PC, so I can't use the Kindle for PC app.

Thank you!
April Henry

Circles of Confusion
Square in the Face
Heart Shaped Box
Buried Diamonds
and 
Learning to Fly


----------



## marianneg (Nov 4, 2008)

If I had to guess I would say that your original book had the hyphens to break the word across lines.  But if you're not seeing it in the HTML file you submitted, well, that's odd.  Maybe it's just not showing up in the editor you're using to look at it.  Have you tried looking in Notepad or something?
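One way to check what a visual editor may be hiding is to count hyphen-like characters in the raw file. Here is a minimal Python sketch. It assumes the exported HTML has already been read into a string, and that any invisible hyphens are soft hyphens (U+00AD or the `&shy;` entity), which the thread does not confirm. The name `count_hyphen_like` is made up for illustration.

```python
from collections import Counter

# Hyphen-like characters and entities that an editor may render
# identically, or not render at all.
HYPHEN_PATTERNS = {
    "hyphen-minus (U+002D)": "-",
    "soft hyphen (U+00AD)": "\u00ad",
    "hyphen (U+2010)": "\u2010",
    "non-breaking hyphen (U+2011)": "\u2011",
    "&shy; entity": "&shy;",
    "&#173; entity": "&#173;",
}

def count_hyphen_like(raw_html: str) -> Counter:
    """Count each hyphen-like character or entity found in raw_html."""
    counts = Counter()
    for label, needle in HYPHEN_PATTERNS.items():
        n = raw_html.count(needle)
        if n:
            counts[label] = n
    return counts

# Made-up sample, not taken from the actual book file.
sample = "<p>pre\u00adcise and methodi&shy;cally, but barrel-chested</p>"
print(count_hyphen_like(sample))
```

If the soft-hyphen counts come back non-zero, that would explain hyphens that appear on the Kindle but cannot be found by searching for an ordinary hyphen.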


----------



## Mike D. aka jmiked (Oct 28, 2008)

I agree with marianneg, it's probably hyphens that were in the printed source material. I've seen this on quite a few books that weren't proofed very well, presumably from OCR input. Additionally, there may be words that run together, another behavior I've seen with OCR.

I don't know of any way to fix this other than doing a search for hyphens and manually deciding if it needs to be there or not.
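The manual search can at least be narrowed down. The rough Python sketch below lists hyphenated words whose joined form is a known word, so that only those need a human decision. The function name `suspect_hyphens` and the tiny word set are illustrative assumptions. A real pass would need a full dictionary, and would still need someone to confirm each hit.

```python
import re

# Two runs of letters joined by an ordinary hyphen, e.g. "pre-cise".
HYPHENATED = re.compile(r"\b([A-Za-z]+)-([A-Za-z]+)\b")

def suspect_hyphens(text, known_words):
    """Return hyphenated words whose joined form is in known_words."""
    suspects = []
    for match in HYPHENATED.finditer(text):
        joined = (match.group(1) + match.group(2)).lower()
        if joined in known_words:
            suspects.append(match.group(0))
    return suspects

# Hypothetical word list; in practice this would be loaded from a dictionary file.
words = {"precise", "methodically", "company", "window", "dangled"}
text = "He was pre-cise and worked methodi-cally, a barrel-chested man."
print(suspect_hyphens(text, words))
```

Intentional compounds such as "barrel-chested" are left alone because "barrelchested" is not in the word list.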

Mike

Edit: I just took a look at the sample.... it has many more formatting errors than the hyphens. I see misspelled words,  missing new paragraph codes, erratic left margins, and some strange characters that apparently have no purpose, such as (3SO , at location 259-67.

And of course a big no-no in my eyes: it has full justification instead of ragged right, but that point could be argued.


----------



## April Henry (Dec 16, 2009)

Mike - can you take another look?  

This is the only one I had scanned in and did not have the original Word file for.

I think part of the problem was that the Word file they sent me had hyphenation turned on.  So it was inserting hyphens that I couldn't see by searching for hyphens.  I went through and manually selected sections and labelled them as either normal or headline (for chapter heads).  

Is that what you mean by paragraph codes?

What I submitted wasn't justified - not sure why it is on Kindle.  Argh!  

April


----------



## Mike D. aka jmiked (Oct 28, 2008)

Well, you can't do anything about the right justification; Amazon has removed the ability to turn full justification on and off.

If I get a chance, I'll re-download and have another look at the sample (if it's been changed).

Mike


----------



## April Henry (Dec 16, 2009)

Yes, I've changed it a whole bunch of times trying to make it right.

Anyone else reading this want to take a look at the sample and see if it looks any better?

April


----------



## Mike D. aka jmiked (Oct 28, 2008)

It still has scads of extra hyphens, random periods, one occurrence of the word "Chee*to", etc.

What are you using to convert the Word file into a MOBI file?

I don't think that turning hyphenation on or off in Word affects the output; I think that's only a display setting. The HTML code that gets generated will still have all the hyphens in it, I think. The Kindle normally just ignores all those.
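If the extra hyphens do turn out to be soft hyphens in the exported HTML (an assumption, nothing in this thread confirms it), they could be stripped before uploading. A minimal Python sketch, with `strip_soft_hyphens` as a made-up helper name:

```python
import re

# Soft-hyphen spellings: the raw character, the named entity,
# and the decimal and hexadecimal numeric entities.
SOFT_HYPHEN = re.compile(r"\u00ad|&shy;|&#0*173;|&#[xX]0*[aA][dD];")

def strip_soft_hyphens(html: str) -> str:
    """Remove soft hyphens; ordinary hyphens are left untouched."""
    return SOFT_HYPHEN.sub("", html)

# Made-up sample, not taken from the actual book file.
dirty = "pre\u00adcise methodi&shy;cally com&#173;pany win&#xAD;dow barrel-chested"
print(strip_soft_hyphens(dirty))
```

This only removes soft hyphens. Ordinary hyphens that OCR copied from line breaks in the printed book would still have to be reviewed one by one.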

I think you're facing a hopeless task if you don't have something to read the output on: a Kindle, an iPod Touch/iPhone, PC, etc.

Mike


----------



## April Henry (Dec 16, 2009)

I didn't change Chee*-to because I remember them typesetting it that way because of some trademark - although now when I look at the Cheetos page, they don't have any weird doo-hickey in the word.  

I have the Word file, I save it as "Web page," which converts it to html, and then I upload to Kindle.  I could also upload it as just a Word file.  Do you think I would have better luck with that?  

Or stripping out all formatting and going back again and reformatting it?  

I really want it to look right.  I wish I had kept the original Word file, but it is lost to the sands of time.

Any advice anyone has is appreciated.

April


----------



## Mike D. aka jmiked (Oct 28, 2008)

April Henry said:


> I didn't change Chee*-to because I remember them typesetting it that way because of some trademark - although now when I look at the Cheetos page, they don't have any weird doo-hickey in the word.
> 
> I have the Word file, I save it as "Web page," which converts it to html, and then I upload to Kindle. I could also upload it as just a Word file. Do you think I would have better luck with that?


Cheetos is a trademarked name, but I don't believe you have to put the "TM" at the end in a work of fiction. I probably would, because it's trivial to do.

Using Word to export HTML may be one of the things causing problems. I have people that tell me that Word is notorious for producing bad HTML (whatever that means, I'm not knowledgeable enough to know). For what it's worth, the same sources tell me that OpenOffice (which is what I use, it's a free MS Office clone) produces very good HTML.

What are your options? What OS does your computer use? Can you upload a MOBI format file? I'm not at all familiar with how individuals do self-publishing on the Kindle.

Mike


----------



## April Henry (Dec 16, 2009)

I tried a fourth or possibly fifth time.  The Kindle now lets you upload directly from Word, rather than saving as "Web page" which is html and then uploading it.

Two days ago I tried saving it as a text file, opening it in Pages (for Mac), reformatting it, etc.  When you upload it to Amazon, it lets you look at it, and I did again.  It looked okay to me, but that's what has happened before.  

I'm hoping this fifth brand new time is a charm.


----------

