# Is there a specification of the PDR format?



## kindleuser (Nov 12, 2010)

Apparently, notes and highlights for PDF files are stored in a separate file with the extension .pdr.
The file content is stored in binary form.

Is there any specification of this format available?

My motivation: I want to merge my PDF annotations with the original PDF document
to forward them to other people in one PDF file. Many others want to do the same
thing, but there does not seem to be a solution to that. I am wondering whether
one could simply read the PDR file and merge its content in the PDF file. To do so,
I need to understand the format.


----------



## pidgeon92 (Oct 27, 2008)

Are you sure about the extension, or that you are looking at the correct file? This is what I found when I googled that extension:



> File extension PDR description:
> The pdr file extension is associated with Windows Port device drivers, that allow certain USB devices, such as USB flash drives, to be recognized and used by a Windows-based computer.


----------



## kindleuser (Nov 12, 2010)

Yes, I am sure about the format. PDR is an overloaded acronym. If you annotate
a file X.pdf, you will obtain a file X.pdr in the Documents directory. In the Kindle
context, it does not relate to Windows drivers.


----------



## tsemple (Apr 27, 2009)

You can be sure that Amazon has a spec for it, since they invented the format. It would be nice if they would publish the spec, so 3rd parties could more confidently integrate Kindle into PDF workflows.

With any luck, it is similar to the .mbp format (for mobi/azw annotation), which has been partially reverse engineered (see http://www.angelfire.com/ego2/idleloop/mbp_reader.html). One caveat there: highlighted text is no longer stored in .mbp files, so you have to be able to decode the mobi/azw file to extract it, which would require stripping DRM, etc.

At any rate, I looked at a hex dump after creating some bookmarks, highlights and notes in a PDF. I can see the PDF page labels for bookmarks in clear text, clear text for my note text, and clear text tags of the form '#pdfloc(1fce,119,138,0,20,0,1,1)', 2 for each highlight (marking start and end presumably) and one for each note. And some 'junk DNA' in between to be figured out.

The structure looks something like this:
- a header
- an array of bookmark locations
- an array of highlight start/end locations (pairs of #pdfloc tags)
- an array of notes (single #pdfloc tags)
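As a rough illustration of how the clear-text tags could be pulled out of the binary, here is a minimal Python sketch. It assumes the tags always look like the one quoted above (a comma-separated list of hex-looking fields inside `#pdfloc(...)`); the function name `find_pdfloc_tags` is made up for this example, and the meaning of the individual fields is still unknown.

```python
import re

# Matches clear-text tags such as b"#pdfloc(1fce,119,138,0,20,0,1,1)".
# Assumption: every field consists of hex digits only, as in the dump above.
PDFLOC_RE = re.compile(rb"#pdfloc\(([0-9a-fA-F]+(?:,[0-9a-fA-F]+)*)\)")


def find_pdfloc_tags(data: bytes):
    """Return a list of (byte_offset, fields) for each #pdfloc tag in data.

    fields is the list of raw field strings; they are not interpreted,
    because their meaning has not been worked out.
    """
    tags = []
    for match in PDFLOC_RE.finditer(data):
        fields = match.group(1).decode("ascii").split(",")
        tags.append((match.start(), fields))
    return tags


if __name__ == "__main__":
    # Synthetic bytes for demonstration only, not a real .pdr file.
    sample = b"\x00\x01junk#pdfloc(1fce,119,138,0,20,0,1,1)\x00\xff"
    for offset, fields in find_pdfloc_tags(sample):
        print(offset, fields)
```

On a real file you would pass in something like `open("X.pdr", "rb").read()`. If the guess about the layout is right, consecutive tags in the highlight section would pair up as start/end, but the script does not try to decide which section a tag belongs to.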

Since the Kindle's reader app is Java, perhaps this uses Java serialization APIs to write/read these files and someone familiar with that might be able to make more sense of everything than I can.

But even without understanding it fully, you could probably at least extract bookmark locations and notes pretty easily with a script, and then (with more specialized PDF knowledge) have those results fed to a script to add them to the PDF's annotation layer so that Adobe Reader etc. could see them. It would also be useful to be able to create new .pdr's from scratch (i.e. so you could extract some of the PDF annotations and make them viewable/editable on Kindle).
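A first pass at such a script could simply list every run of readable text together with its byte offset, much like the Unix `strings` tool. This is only a sketch: it assumes the note text and page labels are stored as plain single-byte ASCII (which is what the hex dump suggested), and the name `printable_runs` and the minimum length of 4 are arbitrary choices for this example.

```python
import re


def printable_runs(data: bytes, min_len: int = 4):
    """Yield (offset, text) for each run of printable ASCII bytes.

    Only runs of at least min_len bytes are reported, which filters out
    most of the accidental matches inside the binary filler.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    for match in pattern.finditer(data):
        yield match.start(), match.group().decode("ascii")


if __name__ == "__main__":
    # Synthetic bytes for demonstration only, not a real .pdr file.
    sample = b"\x00\x01My note\x00ab\x02#pdfloc(1,2)\xff"
    for offset, text in printable_runs(sample):
        print(offset, repr(text))
```

The offsets make it easier to line the text up with a hex dump and see which bytes sit between a note and its `#pdfloc` tag. Text stored in another encoding (UTF-16, for instance) would not show up with this approach.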

Note that, whether you like it or not, the Kindle will display overlaid PDF highlights and text/graphic annotations made with Adobe Reader/Acrobat etc., and show a 'sticky note' icon where there is a sticky note annotation, but these are all 'read only' and likely to remain that way. I don't think Amazon is motivated enough to do full and genuine PDF annotation support on Kindle. It might have been nice a couple of years ago, but now there are the iPad and other tablets, which will remain much more capable for PDF annotation even if Amazon added everything they could think of in terms of PDF features.


----------



## kindleuser (Nov 12, 2010)

Yes, indeed it looks feasible to reverse engineer the format and obtain the data.
Unfortunately, there does not seem to be any official documentation from Amazon.
The more difficult part is likely to add the data to the PDF. But maybe there
are libraries or tools around to do that.


----------



## NogDog (May 1, 2009)

kindleuser said:


> ...
> The more difficult part is likely to add the data to the PDF. But maybe there
> are libraries or tools around to do that.


PDFlib, perhaps?


----------



## NiLuJe (Jun 23, 2010)

Calibre (among other things) is using PoDoFo (and Poppler on the rendering side).


----------

