Cutting your document into pieces with HACHA

7 Cutting your document into pieces with HACHA

HEVEA outputs a single .html file. This file can be cut into pieces at various sectional units by HACHA.

7.1 Simple usage

First generate your html document by applying HEVEA:

# hevea doc.tex

Then cut doc.html into pieces by the command:

# hacha doc.html

This will generate a simple root file index.html. This root file holds the document title, the abstract and a simple table of contents. Every item in the table of contents contains a link to or into a file that holds a “cutting” sectional unit. By default, the cutting sectional unit is section in the article style and chapter in the book style. The names of those files are doc001.html, doc002.html, etc.

Additionally, one level of sectioning below the cutting unit (i.e. subsections in the article style and sections in the book style) is shown as an entry in the table of contents. Sectional units above the cutting section (i.e. parts in both article and book styles) close the current table of contents and open a new one. Cross-references are properly handled, that is, the local links generated by HEVEA are changed into remote links.
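
As a rough illustration of the default scheme, consider the following hypothetical article outline. The titles are invented, and the file assigned to each section is an assumption based on the naming pattern above, not verified output. The comments sketch where each unit is expected to land:

```latex
\documentclass{article}
\begin{document}
\title{A sample document}\maketitle  % title goes to the root file index.html
\section{Introduction}     % cutting unit: new file (doc001.html, presumably)
\subsection{Motivation}    % one level below: entry in the table of contents
\subsubsection{Details}    % two levels below: no entry by default
\section{Conclusion}       % cutting unit: next file (doc002.html, presumably)
\end{document}
```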

The name of the root file can be changed using the -o option:

# hacha -o root.html doc.html

Some of HEVEA's output gets replicated in all the files generated by HACHA. Users can supply a header and a footer, which will appear at the beginning and end of every page generated by HACHA. It suffices to include the following commands in the document preamble:

\htmlhead{header}
\htmlfoot{footer}
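
For instance, a preamble along the following lines would add a simple banner and a closing line to every page. This is only a sketch: the header and footer texts are placeholders, and the commands are wrapped in %HEVEA comments (see section 7.2) on the assumption that the document must also remain processable by LaTeX without the hevea package.

```latex
\documentclass{article}
% Both lines are plain comments for LaTeX and are read by HEVEA only.
%HEVEA\htmlhead{\begin{center}\textbf{My Project Manual}\end{center}}
%HEVEA\htmlfoot{\begin{center}\emph{End of page}\end{center}}
\begin{document}
\section{First section}
Some text.
\end{document}
```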

HACHA also makes every page it generates a clone of its input as regards attributes to the <body ...> opening tag and meta-information from the <head>… </head> block. See section B.2 for examples of this replication feature.

By contrast, style information specified in the style elements from the <head>… </head> block is not replicated. Instead, all style definitions are collected into an external style sheet file whose name is doc.css, and all generated html files adopt doc.css as an external style sheet. It is important to notice that, since version 1.08, HEVEA produces a style element by itself, even if users do not explicitly use styles. As a consequence, HACHA normally produces a file doc.css, which should not be forgotten while copying files to their final destination after a run of HACHA.

7.2 Advanced usage

HACHA's behaviour can be altered from the document source, by using a counter and a few macros.

A document that explicitly includes cutting macros can still be typeset by LaTeX, provided it loads the hevea.sty style file from the HEVEA distribution. (See section 5 for details on this style file.) An alternative to loading the hevea package is to put all cutting instructions in comments starting with %HEVEA.

7.2.1 Principle

HACHA recognizes all sectional units, ordered as follows, from top to bottom: part, chapter, section, subsection, subsubsection, paragraph and subparagraph.

At any point between \begin{document} and \end{document}, there exist a current cutting sectional unit (cutting unit for short), a current cutting depth, a root file and an output file. Table of contents output goes to the root file, while normal output goes to the output file. Cutting units start a new output file, whereas units lying between the cutting unit and the cutting unit plus the cutting depth add new entries to the table of contents.

At document start, the root file and the output file are both HACHA's output file (i.e. index.html). The cutting unit and the cutting depth are set to default values that depend on the document style.

7.2.2 Cutting macros

The following cutting instructions are for use in the document preamble. They command the cutting scheme of the whole document:

\cuttingunit: This is a macro that holds the document cutting unit. You can change the default (which is section in the article style and chapter in the book style) by doing:
\renewcommand{\cuttingunit}{secname}.
\tocnumber: Instruct HEVEA to put section numbers into table of contents entries.
\notocnumber: Instruct HEVEA not to put section numbers into table of contents entries. This is the default.
cuttingdepth: This is a counter that holds the document cutting depth. You can change the default value of 1 by doing \setcounter{cuttingdepth}{numvalue}. A cutting depth of zero means no other entries than the cutting units in the table of contents.
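
A preamble combining these instructions might look as follows. This is only a sketch: the choice of subsection as cutting unit and of 2 as cutting depth is arbitrary, and the %HEVEA comments are used so that LaTeX ignores the instructions.

```latex
\documentclass{article}
% One output file per subsection instead of one per section.
%HEVEA\renewcommand{\cuttingunit}{subsection}
% Also list units up to two levels below the cutting unit.
%HEVEA\setcounter{cuttingdepth}{2}
% Show section numbers in table of contents entries.
%HEVEA\tocnumber
\begin{document}
\section{A section}
\subsection{A subsection}
Some text.
\end{document}
```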

Other cutting instructions are to be used after \begin{document}. They all generate html comments in HEVEA output. These comments then act as instructions to HACHA.

\cuthere{secname}{itemtitle}

Attempt a cut.

If secname is the current cutting unit or the keyword now, then a new output file is started and an entry in the current table of contents is generated, with title itemtitle. This entry holds a link to the new output file.
If secname is above the cutting unit, then the current table of contents is closed. The output file is set to the current root file.
If secname is below the cutting unit and less than the cutting depth away from it, then an entry is added in the table of contents. This entry contains itemtitle and a link to the point where \cuthere appears.
Otherwise, no action is performed.
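
The first case can be used to force a cut at an arbitrary point. In the sketch below, the keyword now is meant to start a new output file regardless of the current cutting unit. The title “Acknowledgements” and the surrounding text are invented for illustration.

```latex
\section{Results}
Some text that stays in the file of this section.

% For LaTeX this line is a comment; HEVEA turns it into a cut instruction.
%HEVEA\cuthere{now}{Acknowledgements}
\paragraph{Acknowledgements}
This text is expected to go to a new output file, reachable from an
``Acknowledgements'' entry in the current table of contents.
```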

\cutdef[depth]{secname}

Open a new table of contents, with cutting depth depth and cutting unit secname. If the optional depth is absent, the cutting depth does not change. The output file becomes the root file. Result is unspecified if whatever secname expands to is a sectional unit name above the current cutting unit, is not a valid sectional unit name or if depth does not expand to a small positive number.

\cutend

End the current table of contents. This closes the scope of the previous \cutdef. The cutting unit and cutting depth are restored. Note that \cutdef and \cutend must be properly balanced.

Commands \cutdef and \cutend have starred variants, which behave identically except for footnotes (see 7.3.7).

Default settings work as follows: \begin{document} performs

\cutdef*[\value{cuttingdepth}]{\cuttingunit}

and \end{document} performs \cutend*. All sectioning commands perform \cuthere, with the sectional unit name as first argument and the (optional, if present) sectioning command argument (i.e. the section title) as second argument. Note that starred versions of the sectioning commands also perform cutting instructions.
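
Put differently, for a hypothetical article processed with default settings, the implicit cutting instructions can be pictured as in the following sketch. The titles are invented; the comments merely restate the rules above and are not actual HEVEA output.

```latex
\documentclass{article}
\begin{document}       % performs \cutdef*[\value{cuttingdepth}]{\cuttingunit}
\section{First}        % performs \cuthere{section}{First}
\subsection{Inside}    % performs \cuthere{subsection}{Inside}
\section*{Unnumbered}  % starred variant, also performs \cuthere{section}{Unnumbered}
\end{document}         % performs \cutend*
```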

7.2.3 Table of links organisation

A table of links generated by HACHA is a list of links to generated files. Additionally, some sublists may be present, up to a certain depth. The items in those sublists are links inside generated files; they point to sectional unit titles below the cutting unit, up to a certain depth.

More precisely, let A be a certain sectional unit (e.g. “part”), let B be just below A (e.g. “section”), and let C be just below B (e.g. “subsection”). Further assume that cutting is performed at level B with a depth of more than one. Then, every unit A holds one or several tables of links to generated files, and each generated file normally holds a B unit. Sublists with links to C units inside B units normally appear in the tables of links of level A. The command-line options -tocbis and -tocter instruct hacha to put sublists at other places. With -tocbis, sublists are duplicated at the beginning of the B level files; with -tocter, sublists only appear at the beginning of the B level files.

In my opinion, default style is appropriate for documents with short B units; while -tocbis style is appropriate for documents with long B units with a few sub-units; and -tocter style is appropriate for documents with long B units with a lot of sub-units. As you may have noticed, this manual is cut by following the -tocbis style.

Whatever the style is, if a B unit is cut (e.g. because its text is enclosed in \cutdef{C}… \cutend), then every C unit goes into its own file and there is no sublist after the relevant B level entry in the A level table of links.

7.2.4 Examples

Consider, for instance, a book document with a long chapter that you want to cut at the section level, showing subsections:

\chapter{A long chapter}
.....

\chapter{The next chapter}

Then, you should insert a \cutdef at chapter start and a \cutend at chapter end:

\chapter{A long chapter}
%HEVEA\cutdef[1]{section}
.....
%HEVEA\cutend
\chapter{The next chapter}

Then, the file that would otherwise contain the long chapter now contains the chapter title and a table of sections. No other change is needed, since the command \section already performs the appropriate \cuthere{section}{...} commands, which were ignored by default. (Also note that cutting macros are placed inside %HEVEA comments, so that LaTeX is not disturbed.)

The \cuthere macro can be used to put some document parts into their own file. This may prove appropriate for long cover pages or abstracts that would otherwise go into the root file. Consider the following document:

\documentclass{article}

\begin{document}

\begin{abstract} A big abstract \end{abstract}
...

Then, you make the abstract go to its own file, as if it were a cutting unit, by typing:

\documentclass{article}
\usepackage{hevea}

\begin{document}
\cuthere{\cuttingunit}{Abstract}
\begin{abstract} A big abstract \end{abstract}
...

(Note that, this time, cutting macros appear unprotected in the source. However, LaTeX can still process the document, since the hevea package is loaded.)

7.2.5 More and More Pages in Output

In some situations it may be appropriate to produce many pages from one source file. More specifically, loading the deepcut package will put all sectioning units of your document (from \part to \subsection) in their own file.

Similarly, loading the figcut package will make all figures and tables go into their own file. The figcut package accepts two options, show and noshow. The former, which is the default, instructs HEVEA to repeat the caption into the main flow of text, with a link to the figure. The latter option disables the feature.
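
As an illustration, the following preamble requests both behaviours, with captions not repeated in the main text. It is a sketch only: the packages are loaded inside %HEVEA comments on the assumption that no LaTeX counterpart of these packages is installed, so that LaTeX simply ignores the two lines.

```latex
\documentclass{article}
% Every sectional unit from \part to \subsection gets its own file.
%HEVEA\usepackage{deepcut}
% Figures and tables get their own file; noshow disables the repeated caption.
%HEVEA\usepackage[noshow]{figcut}
\begin{document}
\section{A section}
Some text.
\end{document}
```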

7.3 More Advanced Usage

In this section we show how to alter some details of HACHA's behaviour. This includes controlling output file names and the title of generated web pages, and introducing arbitrary cuts.

7.3.1 Controlling output file names

When invoked as hacha doc.html, HACHA produces an index.html table of links file that points into the doc001.html, doc002.html, etc. content files. This is not very convenient when one wishes to point inside the document from outside. However, the \cutname{name} command sets the name of the current output file to name.

Consider a document cut at the section level, which contains the following important section:

\section{Important\label{important} section}
...

To make the important section go into file important.html, one writes:

\section{Important\label{important} section}\cutname{important.html}
...

Then, section “Important section” can be referenced from an HEVEA-unaware html page by:

In this document, there is a very
<a href="important.html#important">important section</a>.

If you are reading the html version of this manual, you may check that you are now reading file cutname.html. This particular file name has been specified from the source using \cutname{cutname.html}.

7.3.2 Controlling page titles

When HACHA creates a web page from a given sectional unit, the title of this page normally is the name of the sectional unit. For instance, the title of this very page should be “Cutting your document into pieces with HACHA”. It is possible to insert some text at the beginning of all page titles, by using the \htmlprefix command. Hence, by writing \htmlprefix{\hevea{} Manual: } in the document, the title of this page would become: “HEVEA Manual: Cutting your document into pieces with HACHA” and the title of all other pages would show the same prefix.

7.3.3 Links for the root file

The command \toplinks{prev}{up}{next} instructs HACHA to put links to a “previous”, “up” and “next” page in the root file. The following points are worth noticing:

The \toplinks command must appear in the document preamble (i.e. before \begin{document}).
The arguments prev, up and next should expand to urls. Notice that these arguments are processed (see section 8.1.1).
When one of the expected arguments is left empty, the corresponding link is not generated.

This feature can prove useful to relate documents that are generated independently by HEVEA and HACHA.
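
For example, assuming a hypothetical set of manuals stored in sibling directories (all paths below are invented), one of them could be linked to its neighbours as sketched here. The command sits in the preamble, as required, and is hidden from LaTeX in a %HEVEA comment.

```latex
\documentclass{article}
% prev, up and next links for the root file; the URLs are placeholders.
%HEVEA\toplinks{../install/index.html}{../index.html}{../reference/index.html}
\begin{document}
\section{A section}
Some text.
\end{document}
```

Leaving the first argument empty, as in \toplinks{}{../index.html}{../reference/index.html}, would omit the “previous” link.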

7.3.4 Controlling link contents from the document

By default the links to the previous, up and next pages show a small icon (an appropriate arrow). This can be changed with the command \setlinkstext{prev}{up}{next}, where prev, up and next are some LaTeX source. For instance the default behaviour is equivalent to:

\setlinkstext
  {\imgsrc[alt="Previous"]{previous_motif.svg}}
  {\imgsrc[alt="Up"]{contents_motif.svg}}
  {\imgsrc[alt="Next"]{next_motif.svg}}

Command \setlinkstext behaves as \toplinks does. That is, it must occur in document preamble, arguments are processed and empty arguments yield no effect (i.e. defaults apply).
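
For instance, textual links could replace two of the icons as sketched below. The link texts are arbitrary, and the empty second argument is meant to leave the default “up” icon in place, following the rule just stated.

```latex
\documentclass{article}
% Text for the "previous" and "next" links; the empty argument keeps the default.
%HEVEA\setlinkstext{$\leftarrow$ Previous}{}{Next $\rightarrow$}
\begin{document}
\section{A section}
Some text.
\end{document}
```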

7.3.5 Complete control over navigation links

The previous commands only impact the contents of the navigation links. It is possible, although reserved to advanced users, to achieve greater control by using the \formatlinks command. The \formatlinks command takes four arguments, which are commands themselves. The last three commands format the “previous”, “up” and “next” links respectively, while the first argument formats the resulting group of links. For instance, one can avoid images for arrows and typeset the full set of navigation links in a purple border (see Section 9 for styling techniques) as follows:

\newstyle{a.navarrow}{font-family:monospace;font-size:x-large;color:purple}
\newstyle{div.navarrows}{border:solid purple;display:inline-block;padding:1ex;}
\newcommand{\myprev}[1]{\ahref[class="navarrow" title="Previous" ]{#1}{$\leftarrow$}\quad}
\newcommand{\myup}[1]{\quad\ahref[class="navarrow" title="Up" ]{#1}{$\uparrow$}\quad}
\newcommand{\mynext}[1]{\quad\ahref[class="navarrow" title="Next" ]{#1}{$\rightarrow$}}

\newcommand{\mylinks}[1]{\begin{center}\@open{div}{class="navarrows"}#1\@close{div}\end{center}}
\formatlinks{\mylinks}{\myprev}{\myup}{\mynext}

7.3.6 Cutting a document anywhere

Part of a document goes to a separate file when enclosed in a cutflow environment:

\begin{cutflow}{title}…\end{cutflow}

The content “…” will go into a file of its own, while the argument title is used as the title of the introduced html page.

The html page introduced here does not belong to the normal flow of text. Consequently, one needs an explicit reference from the normal flow of text into the content of the cutflow environment. This will occur naturally when the content of the cutflow environment contains a \label construct. This looks natural in the following quiz example:

\paragraph{A small quiz}
\begin{enumerate}
\item What is black?
\item What is white?
\item What is Dylan?
\end{enumerate}
Answers in section~\ref{answers}.
\begin{cutflow}{Answers}
\paragraph{Quiz answers}\label{answers}
\begin{enumerate}
\item Black is black.
\item White is white.
\item Dylan is Dylan.
\end{enumerate}
\end{cutflow}

The example yields:

A small quiz

What is black?
What is white?
What is Dylan?

Answers in section 7.3.6.

However, introducing html hyperlink targets and references with the \aname and \ahrefloc commands (see section 8.1.1) will be more practical most of the time.
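
A shortened variant of the quiz with explicit hyperlinks might read as follows. It is a sketch that assumes the argument order \aname{label}{text} and \ahrefloc{label}{text} described in section 8.1.1, and that the hevea package is loaded if the source is also to be processed by LaTeX. The label name answers is reused from the example above.

```latex
\paragraph{A small quiz}
\begin{enumerate}
\item What is black?
\end{enumerate}
% Hyperlink from the main flow of text into the cutflow page.
\ahrefloc{answers}{See the answers}.
\begin{cutflow}{Answers}
\paragraph{Quiz answers}
% Hyperlink target inside the separate page.
\aname{answers}{Here are the answers.}
\begin{enumerate}
\item Black is black.
\end{enumerate}
\end{cutflow}
```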

The starred variant environment cutflow* is the same as cutflow, save for the html header and footer (see Section 7.1) which are not replicated in the introduced page.

7.3.7 Footnotes

Footnote texts (given as arguments either to \footnote or \footnotetext) do not go directly to output. Instead, footnote texts accumulate internally in a buffer, waiting to be flushed. The flushing of notes is controlled by means of a current flushing unit, which is a sectional unit name or document (a fictional unit above all units). At any point, the current flushing unit is the value of the command \@footnotelevel. In practice, the flushing of footnote texts is performed by two commands:

\flushdef{secname} simply sets the flushing unit to secname.
\footnoteflush{secname} acts as follows:
- If argument secname is equal to or above the current flushing unit, then footnote texts are flushed (if any). In the output, the texts themselves are surrounded by special comments that tag them as footnote texts and record secname.
- Otherwise, no action is performed.

The article style file performs \flushdef{document}, while the book style file performs \flushdef{chapter}. At the end of processing, \end{document} performs \footnoteflush{\@footnotelevel}, so as to flush any pending notes.

Cutting commands interact with footnote flushing as follows:

\cuthere{secname} executes \footnoteflush{secname}. Remember that all sectioning commands perform \cuthere with their sectional unit name as argument.
\cutdef{secname} saves the current flushing unit and buffer on some internal stack, starts a new buffer for footnote texts, and sets the current flushing unit to secname (by performing \flushdef{secname}).
\cutend first flushes any pending texts (by performing \footnoteflush with the current flushing unit as argument), and restores the flushing unit and footnote text buffer saved by the matching \cutdef.
The starred variants \cutdef* and \cutend* perform no operation that is related to footnotes.

Later, when running across footnote texts in its input file, HACHA sometimes puts notes in a separate file. More precisely, HACHA has knowledge of the current cutting level, that is, the current sectional unit where cuts occur, as given by the relevant \cutdef. Moreover, HACHA knows the current section level, that is, the last sectional command processed. Besides, HACHA extracts the note level from the comments that surround the notes (as given by the command \footnoteflush that produced the notes). Then, HACHA creates a separate file for notes when the cutting level and the note level differ, or when the current level is above the cutting level (e.g. the current level is document while the cutting level is chapter). As a result, notes should stay where they are when they occur at the end of a HACHA output file, and otherwise go to a separate file.

To make a complicated story even more complicated, footnotes in minipage environments or in the arguments to \title or \author have a different, I guess satisfactory, behaviour.

Given the above description, footnotes are managed by default as follows.

In style article, hevea puts all footnotes at the end of the html file. A later run of hacha creates a separate footnote file.
In style book, footnotes are collected at the end of chapters. A later run of hacha leaves them where they are. Footnotes in the title or author names are managed specially; they will normally appear at the end of the root file.

In case you wish to adopt a book-like behaviour for an article (footnotes at the end of sections), it suffices to insert \flushdef{section} in the document preamble.
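
A minimal sketch of such an article follows. The section titles and footnote texts are invented, and the comments describe the behaviour expected from the rules above rather than verified output.

```latex
\documentclass{article}
% Flush footnotes at the end of each section rather than at document end.
%HEVEA\flushdef{section}
\begin{document}
\section{First}
Some text\footnote{Expected at the end of the first section.}.
% The next \section performs \cuthere{section}, which flushes pending notes.
\section{Second}
More text\footnote{Expected at the end of the second section.}.
\end{document}
```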

We now give a few examples of interaction between notes and cutting. We first consider normal behaviour. The page you are reading is a section page, since the current cutting unit is “section”. The current unit is “subsection”. The following two subsubsections are sent to their own files by means of a \cutdef{subsubsection}/\cutend pair. As a result, the text of footnotes appears at the end of the subsubsection pages.

The following two subsubsections are sent to their own files by means of a \cutdef*{subsubsection}/\cutend* pair. As a result, the text of footnotes in the subsubsections appears at the end of the current section page.⁴

Finally, to send the footnotes in subsubsections to a separate web page, one should use a \cutdef{subsubsection}/\cutend pair (to create a proper buffer for subsubsection notes), redefine the flushing unit, and flush notes explicitly.

\cutdef{subsubsection}\flushdef{document}%
\subsubsection{...}
  ...
\footnoteflush{document}\cutend

4: Standard section footnote.
5: Sent at the end of cutname.html
6: Sent at the end of cutname.html
