diff --git a/www/contribute/producing-an-ebook-step-by-step.php b/www/contribute/producing-an-ebook-step-by-step.php
index e8cf6aa0..a1556992 100644
--- a/www/contribute/producing-an-ebook-step-by-step.php
+++ b/www/contribute/producing-an-ebook-step-by-step.php
@@ -571,7 +571,21 @@ proceed to seal up my confession, I bring the life of that unhappy Henry Jekyll
Now, transfer the ebook to your ereader and start a cover-to-cover proofread.
-It’s extremely common for transcriptions sourced from Project Gutenberg to have various typos and formatting errors (like missing italics), and it’s also not uncommon for one of Standard Ebook’s tools to make the wrong guess about things like a closing quotation mark somewhere. As you proofread, mark any obvious, or possible but not obvious, errors so that you can compare them with the page scans you found earlier. Keep an eye out for things that we may have to adjust in order to make the text conform to the Typography section of the SEMoS.
+“Proofreading” means a close reading of the text to try to spot any transcription errors or issues which the SEMoS says we must update. It’s typically not a line-by-line comparison to the page scans—that work was already done by the initial transcriber. Rather, proofreading is reading the book as you would any other book, but with careful attention to possible problems in the transcription or in your production.
+Missing or incorrect punctuation. O.C.R. software often misses small punctuation marks like commas and periods. Does a sentence sound awkward, as if it were missing a comma? Is a period obviously missing between sentences? Mark it and check it against the page scans.
Missing formatting. Transcribers often remove formatting like blockquotes or italics. Is there a section in the book that looks like it should be styled as a blockquote, like verse or a letter? Are characters speaking emphatically, but without italics? Mark these cases to compare against the page scans to see if formatting has to be restored.
Missing thought or paragraph breaks. Is a paragraph unusually long? Does a scene change occur without <hr/>? They might have been lost during transcription.
Errors caused by the S.E. toolset. Tools like se british2american or even se typogrify can cause unexpected typography errors like quotation marks curled in the wrong direction, or dashes spaced incorrectly.
Archaic spellings. Is a particular word spelled in a surprising way? Mark it to check if it should be modernized. The Google Books Ngram Viewer is a great tool to get an idea of whether a word used to be spelled one way, but isn’t spelled that way anymore. Remember to change spellings in their own commits, prefaced with [Editorial]!
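Some of the checks in the list above can be roughly pre-screened before the cover-to-cover read. The following is a minimal sketch, not part of the S.E. toolset: the function names, the 400-word threshold, and the quotation mark heuristics are all assumptions made for illustration. It flags unusually long paragraphs and quotation marks that look curled in the wrong direction. Every hit still has to be checked by a human against the page scans.

```python
import re

# Hypothetical threshold; tune it for the book being proofread.
MAX_WORDS = 400

# Heuristic only (an assumption, not an S.E. rule): a closing quote that
# follows whitespace and precedes a word character, or an opening quote that
# follows a word character and precedes whitespace, is probably curled the
# wrong way.
SUSPICIOUS_QUOTES = re.compile(r"(?<=\s)”(?=\w)|(?<=\w)“(?=\s)")


def long_paragraphs(paragraphs, max_words=MAX_WORDS):
    """Return the indexes of paragraphs with more than max_words words."""
    return [i for i, p in enumerate(paragraphs) if len(p.split()) > max_words]


def suspicious_quotes(text, context=15):
    """Return short snippets around quotation marks that look miscurled."""
    hits = []
    for match in SUSPICIOUS_QUOTES.finditer(text):
        start = max(0, match.start() - context)
        hits.append(text[start:match.end() + context])
    return hits
```

A hit from either function is only a prompt to look more closely. A long paragraph may be exactly what the author wrote, and the quotation mark patterns will miss many errors.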
There are some things that you don’t have to worry much about when proofreading:
+Spelling errors. Actual spelling errors are very rare. If a word appears to be misspelled, it’s worth checking the page scans, but such words are often misspelled on purpose by the author, use an older spelling, or are spelled differently in en-US vs. en-GB.
Keeping a 100% faithful representation of a print page layout. Sometimes books have complicated page layouts in print. But ebooks are not the same as print books, with the most important distinction being that there is no “page” to align items to. So, we’re not so concerned with maintaining a pixel-perfect reproduction of print layouts; rather, we wish to adapt print layouts as best we can to the ebook medium.