Typesetting a Book with LaTeX

The other day, I talked about building an e-book for Kindle and Nook.  Today, I want to add a few things to the Makefile we created so that we can produce a PDF.  The end result will be that every time you want to create a PDF of the book, you will only need to type 'make pdf'.

If you're on a Mac, you'll want to install MacTeX.  If you're on Linux/UNIX, you'll want to install the package that contains pdflatex (typically part of your distribution's TeX Live packages).  We'll be using LaTeX to typeset our book.

Converting Your HTML to TeX

When we built our e-book, we were working with our text as HTML.  LaTeX doesn't work with HTML, so we need to convert our HTML into something that LaTeX can understand.  Most manuscripts use only a few HTML features (mainly paragraphs and a few special characters), so it's relatively easy to convert the HTML into LaTeX with a simple Perl script.

Make a backup of all of your files before running this!

The html2tex.pl script converts curly quotes, em dashes, and ï's into their LaTeX equivalents. It also puts a blank line between paragraphs and removes everything outside the <body> of the HTML.  A few things might still need to be cleaned up by hand, such as changing ellipses (. . .) into \ldots, but the script makes the job almost painless.

You'll need to make the script executable (chmod a+x html2tex.pl) and run it once for each of your HTML chapters, for example: html2tex.pl ch1.html (changing ch1.html to the name of each HTML file you need to convert).  You'll get ch1.tex as a result.  Don't run it on your front material, table of contents, or anything else in the book that isn't part of the main material or appendices.  Again, make a backup of all of your files before running this!

Things to note about LaTeX and TeX:

  • Paragraphs are separated by blank lines.  You can break lines as much as you want, but as long as there isn't a blank line, they will all be in the same paragraph.
  • Italics are created by enclosing the italicized text within \emph{...}.
  • A scene break, or section break as I call it in the script, is created by putting \sectionbreak in a paragraph all by itself.
  • When typing, use ` and ' to type left and right single quotes.  Double them (`` and '') to create left and right double quotes.  Don't use the straight double-quote character on the keyboard (") since it won't typeset the way you want it to.
  • When typing a single quote within a double quote (or a double quote within a single quote), put a small space between them with \, (the TeX to HTML script will replace it with a non-breaking space).
  • When you need to put a regular non-breaking space between words, use the tilde (~).
  • When you need a line break, use two backslashes (\\).
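
To see those conventions working together, here is a small test file you can run through pdflatex.  It's only a sketch: the \sectionbreak definition in it is a hypothetical stand-in so the file compiles on its own (your main book file is assumed to supply the real one), and the sample sentences are made up.

```latex
\documentclass{article}

% Hypothetical stand-in so this test file compiles by itself.
% \providecommand only defines \sectionbreak if nothing else has.
\providecommand{\sectionbreak}{\begin{center}* * *\end{center}}

\begin{document}

This is the first paragraph.  You can break
its lines anywhere you like; with no blank line
between them, they all stay in one paragraph.

A blank line starts a new paragraph, and \emph{this text} is italicized.

\sectionbreak

% `...' gives single quotes, ``...'' gives double quotes,
% and \, adds a small space where the two kinds meet.
``She said, `Meet me at noon,'\,'' he recalled.

% The tilde keeps "Dr." and "Smith" together on one line.
Dr.~Smith arrived late.

% Two backslashes force a line break inside a paragraph.
First line of a short note\\
second line of the same paragraph.

\end{document}
```

Save it as something like test.tex and run pdflatex test.tex to check that your quotes and breaks come out the way you expect before converting a whole chapter.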

Converting TeX to HTML

Our Kindle and Nook e-books still need the HTML, but I find it a lot easier to write in TeX since I don't have to worry about putting in paragraph tags or using character entities to get the quotes and dashes just right.  To go from the TeX/LaTeX format to HTML, you can use the following script:

We'll assume for now that you put this in the same directory as the kindlegen program. You'll need to mark the script as executable (chmod a+x tex2html.pl).

Open up your Makefile from before and add the following near the top with the other variable assignments:

Remember that make requires a tab, not spaces, wherever a line in the Makefile looks like it starts with spaces.

After the mobi and epub sections, you can add the following to build the PDF:

Of course, you can change the filename of the PDF as long as you change it in all the places it appears (similar to changing novel.mobi or novel.epub).

Describing the Book

You've got your Makefile all set up and ready to go.  Your chapters are in TeX format.  You run the command to create your PDF document:

make pdf

and nothing happens.  What's going wrong?

We haven't created the novel.tex file referenced in the Makefile.  The following is a simple one that you can use to get started.

As you can tell by reading through it, there are a few places you need to change things, such as your name or the title of the book.
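
For reference, a bare-bones novel.tex could look something like the following.  Treat it as a sketch rather than the file that produced the pages shown below: the trim size, the inner margin, the \sectionbreak definition, the chapter titles, and the chapter file names (ch1.tex, ch2.tex) are all assumptions for you to adapt.  It also assumes your converted chapter files contain only body text and sit in the same directory.

```latex
% novel.tex -- a hypothetical starting point, not the author's file
\documentclass[11pt,twoside,openright]{book}

% Placeholder trim size and margins; change these measurements to
% experiment (for example, 1in on the top, bottom, and outside edge).
\usepackage[paperwidth=6in,paperheight=9in,
            top=0.75in,bottom=0.75in,
            outer=0.75in,inner=1in]{geometry}

% Scene break used inside the chapter files (assumed definition).
\newcommand{\sectionbreak}{\begin{center}* * *\end{center}}

% Change these to your own title and name.
\title{Your Book Title}
\author{Your Name}
\date{}

\begin{document}

\frontmatter
\maketitle

% Copyright page, printed on the back of the title page.
\thispagestyle{empty}
\null\vfill
\noindent Copyright \copyright{} YEAR Your Name.  All rights reserved.
\clearpage

\mainmatter

\chapter{Chapter One}
\input{ch1}   % reads ch1.tex

\chapter{Chapter Two}
\input{ch2}   % reads ch2.tex

\end{document}
```

The twoside option is what lets the margins be given as inner and outer, so the binding edge can be wider than the outside edge on both left-hand and right-hand pages.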

Looking at the Results

As can be expected when using LaTeX, the results are pretty good with only a little effort.  The images here will look a little different than what you'll get right away because I have a few tweaks that are specific to my novel, but the general look should be similar.

My title page is very simple:

Draft title page for Of Fish and Swimming Swords

Likewise, here is the general feel of the first few pages, including the copyright page (the other side of the title page), with the margins as set in the novel.tex file here:

First four pages in draft typesetting of Of Fish and Swimming Swords

I also ran the typesetting with 1 inch margins on the top, bottom, and outside edge (change the corresponding measurements at the top of the novel.tex file):

Draft typesetting with 1 inch margins of Of Fish and Swimming Swords

Of course, to get the most out of LaTeX, you'll want to read up on the language and play around a bit.  Keep in mind that TeX and LaTeX are designed for typesetting, not for word processing.  They take a typesetter's approach to the page, so think of the pages and blocks of text as physical objects being juggled around to find the best-looking page, and you'll be well on your way to understanding LaTeX.

Typesetting beautiful books isn't difficult though once you get the patterns down.  It's easy to iterate through changes until you get to something you like (for example, I think I like the 1 inch margins better than the 0.75 inch margins) without having to rework the entire book every time you make a change.  And any time I make a change in a chapter, I can just run a make command and both e-books and the PDF version are rebuilt:

make epub pdf

It's hard to get simpler than that after all of the hard work in the beginning.

LaTeX Resources
