# Calibre - Weird Text Bugs After Conversion (pdf to mobi)



## Kindle_Matt (Jun 30, 2010)

I have a few PDFs that I want to convert for Kindle. They are primarily text. It's unpredictable, but some documents end up with very weird character bugs. For example, in words with two Ls, one of the Ls gets stripped off. Or, if there are dashes in the doc, the dashes are removed along with the space that was once between the words.

Has anyone seen that? The source PDF is fine and perfectly legible.

One other thing - it seems to be completely random which files will be "searchable" on the Kindle after I convert, i.e., have the ability to search the text. Is there a particular setting for that?

Thanks much.


----------



## Mike D. aka jmiked (Oct 28, 2008)

You didn’t mention which program you were using to try to do this. There are always glitches of one sort or another when converting PDF to reflowable text. PDFs were never intended to be decoded; they were meant to fix the format for printing. Even Adobe’s own programs can’t do an adequate job of reversing the encoding. I’ve used a number of free and paid-for tools to try to convert these, but nothing does a very good job; all the results needed to be hand-tweaked.

Mike


----------



## Kindleing (Aug 19, 2010)

Mike - he mentioned Calibre conversion in the subject line.

I had the same problem using Calibre to convert a PDF file the other night.  It appeared to be a character code mismatch; I don't remember the details, but it substituted incorrect characters in a couple of places.  It was repeatable within that document in that the same character pair was always replaced by the same incorrect character.
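A consistent swap like that is what you would expect if the PDF stores letter pairs such as "fi" or "ffi" as single ligature glyphs, though that is only a guess about this particular file. As a minimal sketch under that assumption, and assuming you can first get the text out as plain Unicode, Python's standard `unicodedata` module can expand such ligature characters back into ordinary letters. The function name `expand_ligatures` is made up for this illustration.

```python
import unicodedata


def expand_ligatures(text: str) -> str:
    """Expand compatibility characters (e.g. the single-glyph 'fi' ligature)
    into their plain-letter equivalents using NFKC normalization."""
    return unicodedata.normalize("NFKC", text)


# Sample text containing U+FB03 (ffi ligature) and U+FB01 (fi ligature).
sample = "The o\ufb03ce \ufb01le"
print(expand_ligatures(sample))  # The office file
```

Note that NFKC also rewrites other compatibility characters (superscripts and non-breaking spaces, for example), so it is worth comparing the output against the original. It also cannot help if the converter dropped the character entirely rather than keeping it as a ligature.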

Wally


----------



## Kindle_Matt (Jun 30, 2010)

Yes, Calibre is the product. I'm going to compare the results with those from Mobipocket. I love the Calibre interface, but I hope I don't need to use different techniques for every document.


----------



## Mike D. aka jmiked (Oct 28, 2008)

Kindleing said:


> Mike - he mentioned Calibre conversion in the subject line.


Oops. That went right by me. 

Mike


----------



## Kindle_Matt (Jun 30, 2010)

Well, the PDF part was helpful, but I'm still getting weird problems. I don't know if it's a strange character set or whatever, but very weird things happen, like dashes being removed and long strings of words being concatenated for no apparent reason. If I find anything repeatable to fix it, I'll post back. I've tried converting from PDF to Word, then from Word to HTML and Text. Other conversions I've done have been flawless, so it must be source dependent. I'm going to try another round with Text.
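If the Text route is used, one way to make the hand-tweaking less painful might be to scan the extracted text for suspiciously long runs of letters, which is roughly what concatenated words look like. This is only a rough sketch: the function name `flag_long_tokens` and the 20-letter threshold are arbitrary choices for illustration, and it will not catch short words that were joined.

```python
import re


def flag_long_tokens(text: str, max_len: int = 20) -> list[tuple[int, str]]:
    """Return (line_number, token) pairs for alphabetic runs longer than max_len."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Consider only unbroken runs of ASCII letters on each line.
        for token in re.findall(r"[A-Za-z]+", line):
            if len(token) > max_len:
                hits.append((lineno, token))
    return hits


sample = "A normal line.\nThesewordswerejoinedtogetherbytheconverter here."
for lineno, token in flag_long_tokens(sample):
    print(f"line {lineno}: {token}")
```

The flagged line numbers give a short list of places to check against the source PDF, rather than rereading the whole document.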


----------

