From 01bd373914c08b7cac747a5960e5a3ab6a772551 Mon Sep 17 00:00:00 2001
From: David
Date: Sun, 1 Jan 2023 09:31:44 +0800
Subject: [PATCH] Remove title= from H2 in Step guide

---
 .../producing-an-ebook-step-by-step.php | 52 +++++++++----------
 1 file changed, 26 insertions(+), 26 deletions(-)

diff --git a/www/contribute/producing-an-ebook-step-by-step.php b/www/contribute/producing-an-ebook-step-by-step.php
index 77200f66..26f6e1a2 100644
--- a/www/contribute/producing-an-ebook-step-by-step.php
+++ b/www/contribute/producing-an-ebook-step-by-step.php
@@ -49,7 +49,7 @@ require_once('Core.php');
  1. -

    Set up the Standard Ebooks toolset and make sure it’s up-to-date

    +

    Set up the Standard Ebooks toolset and make sure it’s up-to-date

    Standard Ebooks has a toolset that will help you produce an ebook. The toolset installs the se command, which has various subcommands related to creating Standard Ebooks. You can read the complete installation instructions, or if you already have pipx installed, run:

    pipx install standardebooks

    The toolset changes frequently, so if you’ve installed the toolset in the past, make sure to update the toolset before you start a new ebook:

    @@ -58,7 +58,7 @@ require_once('Core.php');
    se --version
  2. -

    Select an ebook to produce

    +

    Select an ebook to produce

    The best place to look for public domain ebooks to produce is Project Gutenberg. If downloading from Project Gutenberg, be careful of the following:

    • @@ -72,7 +72,7 @@ require_once('Core.php');

      For this guide, we’ll use The Strange Case of Dr. Jekyll and Mr. Hyde, by Robert Louis Stevenson. If you search for it on Gutenberg, you’ll find that there are two versions; the most popular one is a poor choice to produce, because the transcriber included the page numbers smack in the middle of the text! What a pain those’d be to remove. The less popular one is a better choice to produce, because it’s a cleaner transcription.

    • -

      Locate page scans of your book online

      +

      Locate page scans of your book online

      As you produce your book, you’ll want to check your work against the actual page scans. Often the scans contain formatting that is missing from the source transcription. For example, older transcriptions sometimes throw away italics entirely, and you’d never know unless you looked at the page scans. So finding page scans is essential.

      Below are the three big resources for page scans. You should prefer them in this order:

        @@ -100,7 +100,7 @@ require_once('Core.php');

        You’ll enter a link to the page scans you used in the content.opf metadata as a <dc:source> element.

      • -

        Create a Standard Ebooks epub skeleton

        +

        Create a Standard Ebooks epub skeleton

        An epub file is just a bunch of files arranged in a particular folder structure, then all zipped up. That means editing an epub file is as easy as editing a bunch of text files within a certain folder structure, then creating a zip file out of that folder.

        You can’t just arrange files willy-nilly, though—the epub standard expects certain files in certain places. So once you’ve picked a book to produce, create the basic epub skeleton in a working directory. se create-draft will create a basic Standard Ebooks epub folder structure, initialize a Git repository within it, and prefill a few fields in content.opf (the file that contains the ebook’s metadata).
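The "folder of files, zipped up" idea can be illustrated with a minimal sketch. It is not how the Standard Ebooks toolset builds an epub; the function name `zip_epub` and its arguments are hypothetical. The one rule it relies on comes from the epub container format: the `mimetype` file must be the first entry in the archive and must be stored uncompressed.

```python
import zipfile
from pathlib import Path


def zip_epub(source_dir, output_file):
    """Zip an already-arranged epub folder (hypothetical helper, not part of `se`)."""
    source = Path(source_dir)
    mimetype = source / "mimetype"
    with zipfile.ZipFile(output_file, "w") as archive:
        # The mimetype entry goes first and is stored without compression.
        archive.write(mimetype, "mimetype", compress_type=zipfile.ZIP_STORED)
        # Every other file is added with its path relative to the folder root.
        for path in sorted(source.rglob("*")):
            if path.is_file() and path != mimetype:
                archive.write(
                    path,
                    path.relative_to(source).as_posix(),
                    compress_type=zipfile.ZIP_DEFLATED,
                )
```

In practice, se build handles packaging; the sketch only shows why an epub can be edited as ordinary text files in a folder.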

          @@ -123,7 +123,7 @@ require_once('Core.php');
      • -

        Do a rough cleanup of the source text and perform the first commit

        +

        Do a rough cleanup of the source text and perform the first commit

        If you inspect the folder we just created, you’ll see it looks something like this:

        A tree view of a new Standard Ebooks draft folder
        @@ -154,7 +154,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

        For this first commit:

        git add -A
        git commit -m "Initial commit"
      • -

        Split the source text at logical divisions

        +

        Split the source text at logical divisions

        The file we downloaded contains the entire work. Jekyll is a short work, but for longer works it quickly becomes impractical to have the entire text in one file. Not only is it a pain to edit, but ereaders often have trouble with extremely large files.

        The next step is to split the file at logical places; that usually means at each chapter break. For works that contain their chapters in larger “parts,” the part division should also be its own file. For example, see Treasure Island.

        To split the work, we use se split-file. se split-file takes a single file and starts a new file every time it encounters the markup <!--se:split-->. It automatically includes basic header and footer markup in each split file.
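The splitting rule described above can be sketched in a few lines. This is an illustration of the idea only, not the se split-file implementation: the function name `split_at_markers` is hypothetical, and the sketch leaves out the header and footer markup that the real tool adds to each file.

```python
SPLIT_MARKER = "<!--se:split-->"


def split_at_markers(source):
    """Return the non-empty chunks of `source` found between split markers."""
    chunks = []
    for part in source.split(SPLIT_MARKER):
        part = part.strip()
        if part:  # skip empty chunks caused by leading or doubled markers
            chunks.append(part)
    return chunks
```

Each returned chunk corresponds to what would become one chapter file.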

        @@ -163,7 +163,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

        Once we’re happy that the source file has been split correctly, we can remove it.

        rm src/epub/text/body.xhtml
      • -

        Clean up the source text and perform the second commit

        +

        Clean up the source text and perform the second commit

        If you open up any of the chapter files we now have in the src/epub/text/ folder, you’ll notice that the code isn’t very clean. Paragraphs are split over multiple lines, indentation is all wrong, and so on.

        If you try opening a chapter in a web browser, you’ll also likely get an error if the chapter includes any HTML entities, like &mdash;. This is because Gutenberg uses plain HTML, which allows entities, but epub uses XHTML, which doesn’t.
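The difference between HTML and XHTML parsing can be checked with a small sketch. It is an assumption-laden illustration, not what se clean does internally: the helper names `is_well_formed` and `replace_html_entities` are hypothetical, and the sketch simply swaps named HTML entities for their literal characters while leaving the five entities that XML itself predefines untouched.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# Entities that XML predefines; these must stay escaped.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


def replace_html_entities(markup):
    """Replace named HTML entities (such as &mdash;) with literal characters."""
    def substitute(match):
        name = match.group(1)
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)  # leave XML entities and unknown names alone
        return chr(name2codepoint[name])

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", substitute, markup)
```

An XML parser rejects `<p>Jekyll&mdash;Hyde</p>` because `&mdash;` is undefined in XML, but accepts the same paragraph once the entity is replaced by the literal dash character.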

        We can fix all of this pretty quickly using se clean. se clean accepts as its argument the root of a Standard Ebook directory. We’re already in the root, so we pass it a single dot (.), meaning the current directory.

        se clean .
        @@ -190,7 +190,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

        Once the file split and cleanup is complete, you can perform your second commit.

        git add -A
        git commit -m "Split files and clean"
      • -

        Typogrify the source text and perform the corresponding commit(s)

        +

        Typogrify the source text and perform the corresponding commit(s)

        Now that we have a clean starting point, we can start getting the real work done. se typogrify can do a lot of the heavy lifting necessary to bring an ebook up to Standard Ebooks typography standards.

        Like se clean, se typogrify accepts as its argument the root of a Standard Ebook directory.

        se typogrify .

        Among other things, se typogrify does the following:

        @@ -268,12 +268,12 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

        Once you’ve searched the work for the common issues above, if any manual changes were necessary, you should perform the fourth commit.

        git add -A
        git commit -m "Manual typography changes"
      • -

        Check for transcription errors

        +

        Check for transcription errors

        Transcriptions often have errors, because the O.C.R. software might confuse letters for other, more unusual characters, or because the ebook’s character set got mangled somewhere along the way from the source to your repository. You’ll find most transcription errors when you proofread the text, but for now you can use the se find-unusual-characters tool to list any unusual characters in the transcription. If the tool outputs any, check the source to make sure those characters aren’t errors.

        se find-unusual-characters .
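To make "unusual characters" concrete, here is a rough sketch of one way to surface them. It does not reproduce the tool's actual rules: the function name `unusual_characters` and the `EXPECTED` allowlist are assumptions chosen for illustration. The sketch counts every non-ASCII character that is not on a short list of expected typographic characters.

```python
from collections import Counter

# Hypothetical allowlist: curly quotes, em and en dashes, and the ellipsis.
EXPECTED = set("\u2018\u2019\u201c\u201d\u2014\u2013\u2026")


def unusual_characters(text):
    """Count non-ASCII characters that are not on the expected list."""
    counts = Counter(
        ch for ch in text if ord(ch) > 127 and ch not in EXPECTED
    )
    return dict(counts)
```

A character that appears only once or twice in a whole book, such as a stray currency sign, is a good candidate for an O.C.R. error worth checking.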

        If any errors had to be corrected, a commit is needed as well.

        git add -A
        git commit -m "Correct transcription errors"
      • -

        Convert footnotes to endnotes

        +

        Convert footnotes to endnotes

        Works often include footnotes, either added by an annotator or as part of the work itself. Since ebooks don’t have a concept of a “page,” there’s no place for footnotes to go. Instead, we convert footnotes to a single endnotes file, which will provide popup references in the final epub.

        The endnotes file and the format for endnote links are standardized in the SEMoS.

        If you find that you accidentally mis-ordered an endnote, never fear! se shift-endnotes will allow you to quickly rearrange endnotes in your ebook.

        @@ -281,13 +281,13 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

        Jekyll doesn’t have any footnotes or endnotes, so we skip this step.

      • -

        Add a list of illustrations

        +

        Add a list of illustrations

        If a work has illustrations besides the cover and title pages, we include a “list of illustrations” at the end of the book, after the endnotes but before the colophon. The LoI file is also standardized.

        If you create an LoI, commit it.

        git add -A
        git commit -m "Add LOI"

        Jekyll doesn’t have any illustrations, so we skip this step.

      • -

        Converting British quotation to American quotation

        +

        Converting British quotation to American quotation

        If the work you’re producing uses British quotation style (single quotes for dialog and other outer quotes versus double quotes in American), we have to convert it to American style. We use American style in part because it’s easier to programmatically convert from American to British than it is to convert the other way around. Skip this step if your work is already in American style.

        se british2american attempts to automate the conversion. Your work must already be typogrified (the previous step in this guide) for the script to work.

        se british2american .

        While se british2american tries its best, thanks to the quirkiness of English punctuation rules it’ll invariably mess some stuff up. Proofreading is required after running the conversion.

        @@ -297,7 +297,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

        After you’ve run the conversion, do another commit.

        git commit -am "Convert from British-style quotation to American style"
      • -

        Add semantics

        +

        Add semantics

        Part of producing a book for Standard Ebooks is adding meaningful semantics wherever possible in the text. se semanticate does a little of that for us—for example, for some common abbreviations—but much of it has to be done by hand.

        Adding semantics means two things:

          @@ -344,7 +344,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

          After you’ve added semantics according to the SEMoS, do another commit.

          git commit -am "Manually add additional semantics"
        1. -

          Modernize spelling and hyphenation

          +

          Modernize spelling and hyphenation

          Many older works use outdated spelling and hyphenation that would distract a modern reader (for example, to-night instead of tonight). se modernize-spelling automatically removes hyphens from words that used to be hyphenated but no longer are in modern English spelling.

          Do run this tool on prose. Don’t run this tool on poetry.
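The kind of replacement involved can be sketched as a word-list substitution. This is only an illustration: the `MODERN` mapping below is a hypothetical three-word list (only to-night comes from this guide), and the real tool's word list and rules are not reproduced here.

```python
import re

# Hypothetical word list; "to-night" is the guide's own example.
MODERN = {"to-night": "tonight", "to-day": "today", "to-morrow": "tomorrow"}

PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in MODERN) + r")\b",
    re.IGNORECASE,
)


def modernize(text):
    """Replace archaic hyphenated words, keeping an initial capital."""
    def substitute(match):
        original = match.group(0)
        modern = MODERN[original.lower()]
        return modern.capitalize() if original[0].isupper() else modern

    return PATTERN.sub(substitute, text)
```

Because the substitution is blind to context, it can damage meter or deliberate archaisms, which is one reason to keep it away from poetry.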

          se modernize-spelling .
          @@ -493,7 +493,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
          git commit -am "[Editorial] Modernize hyphenation and spelling"
        2. -

          Check for consistent diacritics

          +

          Check for consistent diacritics

          Sometimes during transcription or even printing, instances of some words might have diacritics while others don’t. For example, a word in one chapter might be spelled châlet, but in the next chapter it might be spelled chalet.
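One way to spot such pairs, sketched independently of the toolset, is to strip the diacritics from every word and group the words that collapse to the same base form. The function names below are hypothetical and the approach is an assumption, not a description of how any Standard Ebooks tool works.

```python
import re
import unicodedata
from collections import defaultdict


def strip_diacritics(word):
    """Remove combining marks, so that an accented letter becomes its base letter."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def mismatched_diacritics(text):
    """Map each base spelling to its variants when more than one variant occurs."""
    variants = defaultdict(set)
    text = unicodedata.normalize("NFC", text)
    for word in re.findall(r"[^\W\d_]+", text):
        word = word.lower()
        variants[strip_diacritics(word)].add(word)
    return {
        base: sorted(forms) for base, forms in variants.items() if len(forms) > 1
    }
```

The result lists only candidates; two words that differ by a diacritic can both be correct, so each pair still needs a human decision.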

          se find-mismatched-diacritics lists these instances for you to review. Spelling should be normalized across the work so that all instances of the same word are spelled in the same way. Keep the following in mind as you review these instances:

            @@ -511,13 +511,13 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

            If any changes had to be made, a corresponding editorial commit should be done as well.

            git commit -am "[Editorial] Correct mismatched diacritics"
          • -

            Check for consistent dashes

            +

            Check for consistent dashes

            Similar to se find-mismatched-diacritics, se find-mismatched-dashes lists instances where a compound word is spelled both with and without a dash. Dashes in words should be normalized to one or the other style.

            se find-mismatched-dashes .
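The check can be pictured with a short sketch: collect every word in the text, then report each hyphenated word whose unhyphenated form also appears. The function name `mismatched_dashes` is hypothetical and the logic is a simplification, not the tool's implementation.

```python
import re


def mismatched_dashes(text):
    """Return (hyphenated, unhyphenated) pairs that both occur in the text."""
    words = {
        word.lower()
        for word in re.findall(r"[A-Za-z]+(?:-[A-Za-z]+)*", text)
    }
    return sorted(
        (word, word.replace("-", ""))
        for word in words
        if "-" in word and word.replace("-", "") in words
    )
```

For a text containing both bed-room and bedroom, the sketch returns the pair so that one spelling can be chosen for the whole work.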

            If corrections were made, another commit is needed.

            git commit -am "[Editorial] Correct mismatched dashes"
          • -

            Set <title> elements

            +

            Set <title> elements

            After you’ve added semantics and correctly marked up section headers, it’s time to update the <title> elements in each chapter to match their expected values.

            The se build-title tool takes a well-marked-up section header from a file, and updates the file’s <title> element to match:

            se build-title .
            @@ -525,7 +525,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
            git commit -am "Add titles"
          • -

            Build the manifest and spine

            +

            Build the manifest and spine

            In content.opf, the manifest is a list of all of the files in the ebook. The spine is the reading order of the various XHTML files.

            se build-manifest and se build-spine will create these for you. Run these on our source directory and they’ll update the <manifest> and <spine> elements in content.opf.

          • -

            Build the table of contents

            +

            Build the table of contents

            With the spine in the right order, we can now build the table of contents.

            The table of contents is a structured document that lets the reader easily navigate the book. In a Standard Ebook, it’s stored outside of the readable text directory with the assumption that the reading system will parse it and display a navigable representation for the user.

            Use se build-toc to generate a table of contents for this ebook.

            @@ -552,7 +552,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
            git commit -am "Add ToC"
          • -

            Clean and lint

            +

            Clean and lint

            Before you build the ebook for proofreading, it’s a good idea to check the ebook for some common problems you might have run into during production.

            First, run se clean one more time, both to clean up the source files and to alert you if there are XHTML parsing errors. Even though we ran se clean before, it’s likely that in the course of production the ebook ended up with less-than-perfect markup formatting. Remember you can run se clean as many times as you want; it should always produce the same output.

            se clean .
            @@ -561,7 +561,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

            If there are no errors, se lint will complete silently—but again, at this stage we’re expecting to see some errors because our ebook isn’t done yet.

          • -

            Build and proofread, proofread, proofread!

            +

            Build and proofread, proofread, proofread!

            At this point, our ebook is still missing some important things—a cover, the colophon, and some metadata—but the actual book is in a state where we can start proofreading. We complete a cover-to-cover proofread now, even though there’s still work to be done on the ebook, because once you’ve actually read the book, you’ll have a better idea of what kind of cover to select and what to write in the metadata description.

            se build will create a usable epub file for transfer to your ereader. We’ll run it with the --kindle and --kobo flags to build files for Kindles and Kobos too. If you won’t be using a Kindle or Kobo, you can omit those flags.

            se build --output-dir=$HOME/dist/ --kindle --kobo .
            @@ -606,7 +606,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
          • -

            Create the cover image

            +

            Create the cover image

          • -

            Complete content.opf

            +

            Complete content.opf

            content.opf is the file that contains the ebook metadata like author, title, description, and reading order. Most of the work is filling in that basic information and including links to various resources related to the text. We already completed the manifest and spine in an earlier step.

            content.opf is standardized. See the Metadata section of the SEMoS for details on how to fill it out.

            The last details to fill out here will be the short and long descriptions, verifying any Wikipedia links that se create-draft automatically found, adding cover artist metadata, filling out any missing author or contributor metadata, and adding your own metadata as the ebook producer.

            @@ -665,7 +665,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
            git commit -am "Complete content.opf"
          • -

            Complete the imprint and colophon

            +

            Complete the imprint and colophon

            se create-draft put a skeleton imprint.xhtml file in the ./src/epub/text/ folder. Fill out the links to the transcription and page scans.

            There’s also a skeleton colophon.xhtml file. Now that we have the cover image and artist, we can fill out the various fields there. Make sure to credit the original transcribers of the text (generally we assume them to be whoever’s name is on the file we download from Project Gutenberg) and to include a link back to the Gutenberg text we used, along with a link to any scans we used (from the Internet Archive or Hathi Trust, for example).

            You can also include your own name as the producer of this Standard Ebooks edition. Besides that, the colophon is standardized; don’t get too creative with it.

            @@ -674,7 +674,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
            git commit -am "Complete the imprint and colophon"
          • -

            Final checks

            +

            Final checks

            It’s a good idea to run se typogrify and se clean one more time before running these final checks. Make sure to review the changes with git difftool before accepting them—se typogrify is usually right, but not always!

            Now that our ebook is complete, let’s verify that there are no errors at the S.E. style level:

            se lint .
            @@ -683,7 +683,7 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll

            Once that completes without errors, we’re ready to move on to the final step!

          • -

            Initial publication

            +

            Initial publication

            You’re ready to publish!