Converting Project Gutenberg eTexts into BBeB eBooks
November 11th, 2007
In my previous post I shed some light on how the PRS505 ebook reader renders the various supported file formats. In the meantime, I couldn’t resist tinkering with converting text ebooks into the better-rendering BBeB format.
While creating quality ebooks this way is still a long way off, a little Python script (gut2lrf.py) from my first attempt already gives eTexts a bit of polish.
I started off by checking the various existing tools for creating BBeB (.lrf) files, and found makelrf3 to be a simple yet good enough candidate for the bytecode compilation. To build a version that runs on Linux I needed to put together a small Makefile, but otherwise it compiled without problems.
Project Gutenberg, with more than 20,000 titles and growing, is an excellent source of free eTexts. However, in order to avoid file-format traps, the project has a strict policy of using plain text files for its text books. What’s more, some of its conventions keep us from easily reflowing the text: “Plain text eBooks should have line wraps at 72 characters and skip a line between paragraphs with no indentation.”
My little script - mentioned above - comes to the rescue here, as it preprocesses Project Gutenberg eTexts to remove unnecessary line breaks. It also fetches the Title and the Author of the book, and then calls makelrf3 to do the text to lrf conversion. Makelrf does a pretty good job of splitting the text up into chapters, and quickly generates an lrf file.
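The preprocessing step can be sketched roughly as follows. This is not the actual gut2lrf.py, only a minimal illustration in Python: the function names parse_header and unwrap are made up here, and the sketch assumes that the eText carries "Title:" and "Author:" lines in its header and separates paragraphs with blank lines.

```python
import re


def parse_header(text):
    """Return (title, author) taken from 'Title:' / 'Author:' lines.

    Assumption: the eText header contains such lines; a missing field
    is returned as None.
    """
    def field(name):
        match = re.search(r"^%s:\s*(.+)$" % name, text, re.MULTILINE)
        return match.group(1).strip() if match else None

    return field("Title"), field("Author")


def unwrap(text):
    """Join hard-wrapped lines so that each paragraph becomes one line.

    Paragraphs are taken to be runs of lines separated by blank lines.
    """
    text = text.replace("\r\n", "\n")
    paragraphs = re.split(r"\n\s*\n", text)
    joined = [
        " ".join(line.strip() for line in para.split("\n") if line.strip())
        for para in paragraphs
    ]
    return "\n\n".join(para for para in joined if para)
```

Note that such a naive join also flattens verse, tables, and the Gutenberg header itself, so a real script would need to treat those separately.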

These compiled books are much better to read, yet they are still not perfect. They still lack a navigable Table of Contents; rich text formatting such as stand-out chapter headings, tighter spacing between paragraphs, and words in italic or bold; illustrations; footnotes; and page headers and footers. So do expect some upgrades to my script down the line. Until that happens, I wish you happy reading on…
The Sony PRS505 - love at first sight
November 4th, 2007
I am as fond of books as I am of gadgets. Furthermore, I like traveling light just as much as I hate commuting idly. It was only a question of time before I’d have an ebook reader in my pocket.
When the Sony PRS-505 ebook reader hit the market, I instantly knew I needed one. Now that I hold it in my hands, it’s love at first sight. I’d say that with this nicety to chew on, my hunger for useful gadgets is cured for a while.
Apart from the sleek design, what clinched my choice was that Sony included an SD card slot and a USB Mass Storage interface. These two features mean that the reader can be extended cheaply (not that the 200MB internal flash can’t hold enough books for many commutes), and that it will communicate seamlessly with a Linux PC.
The eInk display technology is a salvation for the eyes of many, including mine. An anti-glare, daylight-readable, non-backlit, high-contrast display with 8 grey levels makes my LCD-strained eyes very happy.
As I was planning to use this device not just to read books but also to keep reference manuals and tutorials at hand, I was curious about the ebook formats it supports. So the first thing I did was to play around with various document formats. The rest of this post is dedicated to this topic.
Text files are a developer’s friend. Fortunately, the reader lays these out quite nicely. TXT files appear in the booklist with the name of the file as the book title, and the file creation date as the book author. The reader provides 3 zoom levels, with 30, 25, and 20 lines of text per page displayed in portrait mode, or 15 + 2, 12 + 2, and 10 + 1 overlapping lines per (half) page in landscape mode. The font used to render txt documents appears to be Bitstream’s Dutch 801 Roman BT. The reader seems to assume ISO-8859-1 as the character encoding of text files.
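If the reader does indeed assume ISO-8859-1 (which is only my observation above), a text file stored in another encoding would have to be re-encoded before copying it over. A minimal sketch in Python, where the function name to_latin1 is made up for illustration and UTF-8 is assumed as the default source encoding:

```python
def to_latin1(data, src_encoding="utf-8"):
    """Re-encode raw bytes from src_encoding to ISO-8859-1.

    Characters that ISO-8859-1 cannot represent are replaced by '?'.
    """
    text = data.decode(src_encoding)
    return text.encode("iso-8859-1", errors="replace")
```

Replacing unsupported characters with '?' loses information, so for texts with many such characters a transliteration step would be the better choice.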
When an ebook is opened for the first time, the reader works for a couple of seconds to paginate the contents. The results get cached, however, so this only slows things down once per ebook (per zoom level used).
RTF documents add support for multiple font faces and font decoration. Also, the booklist displays the document’s own title and author, so these need to be set up properly for easier lookup.
The next widespread format supported is PDF, though it has some issues. The reader’s screen is too small to display an A4 or Letter size PDF in a readable way. You may use the landscape function, which shows the top or the bottom half of the page. This, together with the zoom function, results in a readable half-page (without the margins), but the zoom level resets to default whenever you turn the page. Furthermore, special fonts/charsets don’t always render properly, and password protected PDFs don’t even show up in the booklist.
On the positive side, internal links in PDFs can be used to navigate within the document. Ah, and I’ve found quite a few reader-optimized ebook titles in PDF format at Feedbooks.com.
Documents in Sony’s proprietary ebook format, BBeB, obviously work the most seamlessly, providing three zoom levels, where the number of lines displayed also depends on the font size settings of the ebook. However, it’s hard to find anything useful in this format outside the CONNECT eBooks universe. I’m planning to write about tools for creating BBeB documents in a later post.
Concerning pictures (jpeg, png and gif formats), the PRS-505 is somewhat slow at rendering, and with its very limited colorspace of 8 gray levels it is not likely to become my primary electronic photo album. However, it is good enough to enjoy my favorite comics and, what’s more, probably an ideal device for sharing the greatest cartoons of savage chickens with my friends in the offline universe.

Finally, regarding the MP3 playing capabilities: while it is certainly a gimme feature to allow listening and reading at the same time, I don’t yet consider it a big thing, but time will tell whether I’ll use this ebook reader as a walkman too. For now, I haven’t even tried this feature.
All in all, it is a charm to hold and read on. It is definitely worth all the 300 bucks of its introductory price tag.
