diff --git a/www/contribute/producing-an-ebook-step-by-step.php b/www/contribute/producing-an-ebook-step-by-step.php
index dad2c248..27e06180 100644
--- a/www/contribute/producing-an-ebook-step-by-step.php
+++ b/www/contribute/producing-an-ebook-step-by-step.php
@@ -124,8 +124,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
If you open up any of the chapter files we now have in the src/epub/text/ folder, you’ll notice that the code isn’t very clean. Paragraphs are split over multiple lines, indentation is all wrong, and so on.
If you try opening a chapter in a web browser, you’ll also likely get an error if the chapter includes any HTML entities, like &mdash;. This is because Gutenberg uses plain HTML, which allows named entities, but epub uses XHTML, which doesn’t recognize them (apart from the few that XML itself predefines, such as &amp;).
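The entity problem can be reproduced outside a browser. The following is a minimal sketch, assuming Python's standard XML parser is a fair stand-in for a browser's XHTML parser; the helper name parses_as_xml and the sample strings are hypothetical and are not part of the Standard Ebooks toolset.

```python
import xml.etree.ElementTree as ET


def parses_as_xml(markup: str) -> bool:
    """Return True if the markup is well-formed XML (hypothetical helper)."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# A named HTML entity is undefined to a plain XML parser.
named = "<p>Jekyll&mdash;or Hyde</p>"
# A numeric character reference for the same character is valid XML.
numeric = "<p>Jekyll&#8212;or Hyde</p>"

print(parses_as_xml(named))
print(parses_as_xml(numeric))
```

The first string fails to parse and the second succeeds, which is why a cleanup pass over the Gutenberg source is needed before the files will open as XHTML.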
We can fix all of this pretty quickly using se clean. se clean accepts as its argument the root of a Standard Ebook directory, and with the --single-lines option it’ll remove the hard line wrapping that Gutenberg is fond of. We’re already in the root, so we pass it a single dot (.), which means the current directory.
se clean --single-lines .
- Things look much better now, but we’re not perfect yet. If you open a chapter you’ll notice that the <p> and <h2> tags have a space between the tag and the text. We can clean that up with a few perl commands.
perl -pi -e "s|<(p|h2)>\s+|<\1>|g" src/epub/text/chapter*
perl -pi -e "s|\s+</(p|h2)>|</\1>|g" src/epub/text/chapter*
+ We can fix all of this pretty quickly using se clean. se clean accepts as its argument the root of a Standard Ebook directory. We’re already in the root, so we pass it a single dot (.), which means the current directory.
se clean .
Finally, we have to do a quick run-through of each file by hand to cut out any lingering Gutenberg markup that doesn’t belong. In Jekyll, notice that each chapter ends with some extra empty <div>s and <p>s. These were used by the original transcriber to put spaces between the chapters, and they’re not necessary anymore, so remove them before continuing.
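The guide asks for this cleanup to be done by hand. As an aid for spotting the leftovers, here is a minimal sketch, assuming the spacer elements are literally empty (whitespace only, possibly with attributes). The function name strip_empty_blocks and the sample markup are hypothetical, and a regular expression is not an XML parser, so the result should still be reviewed manually.

```python
import re

# Matches one empty <div> or <p>: optional attributes, only whitespace inside.
EMPTY_BLOCK = re.compile(r"<(div|p)\b[^>]*>\s*</\1>\s*")


def strip_empty_blocks(xhtml: str) -> str:
    """Remove empty <div> and <p> elements, repeating until none remain."""
    previous = None
    while previous != xhtml:
        previous = xhtml
        # Removing an inner empty <p> can leave its parent <div> empty,
        # so keep substituting until the text stops changing.
        xhtml = EMPTY_BLOCK.sub("", xhtml)
    return xhtml


# Hypothetical chapter ending with transcriber spacers.
sample = (
    "<p>Last line of the chapter.</p>\n"
    "<div style=\"height: 4em;\"><p></p></div>\n"
    "<p> </p>\n"
)
print(strip_empty_blocks(sample))
```

The loop matters because spacers are often nested: a single pass would remove the inner <p> but leave the now-empty <div> behind.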
Now our chapter 1 source looks like this:
<?xml version="1.0" encoding="utf-8"?>