How to Use HEVEA with the Thai Character Set
Andrew Seagar and นิตยา ซีการ์
Thai LaTeX is written in the TIS-620 character encoding. Some people call this ISO-8859-11, but for a long time that name was not officially recognised.
The TIS-620 character encoding is an 8-bit single byte character set.
It encodes both the ASCII Latin characters (0-127) and the Thai
characters (128-255). For the official Thai definition, see the
document “ISO 8859-11 Latin/Thai Character Set standard” at the website:
www.nectec.or.th/it-standards/iso8859-11/
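The single-byte layout described above can be sketched with Python's built-in `tis-620` codec. This is only an illustration: the sample string and variable names are assumptions, not part of the original document.

```python
# Illustrative sketch: TIS-620 stores every character in exactly one byte.
sample = "HEVEA ไทย"  # hypothetical sample text (Latin letters plus Thai)

encoded = sample.encode("tis-620")

# One byte per character, for Latin and Thai alike.
assert len(encoded) == len(sample)

for char, byte in zip(sample, encoded):
    half = "ASCII half (0-127)" if byte < 128 else "Thai half (128-255)"
    # Print the code point rather than the glyph so the output is ASCII-only.
    print(f"U+{ord(char):04X} -> 0x{byte:02X}  {half}")

# Decoding with the same codec recovers the original text.
decoded = encoded.decode("tis-620")
```

The Latin letters keep their ASCII values, while each Thai letter lands in the upper half of the byte range.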
Non-Thai variations to the official Thai character set were introduced by some vendors. The Windows Thai character set (874) places an unofficial ‘smart quote’ character into one of the empty (illegal) slots in the official Thai set. The DEC (Digital Equipment Corporation) character set places an unofficial ‘no-break space’ character into another of the empty (illegal) slots in the original official Thai set. It is not entirely clear what is now “official” and what is not, so some care is needed. Importing “Thai” documents from Windows into a Linux environment via (for example) OpenOffice doesn’t always produce a faithful copy of the original text.
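One way to be careful is to scan a file for bytes that fall in the empty slots before processing it. The sketch below is a hypothetical helper, not a tool from the original document. It assumes the assigned TIS-620 slots are 0xA1-0xDA and 0xDF-0xFB, and it uses 0x93 (a curly quote in Python's `cp874` codec) and 0xA0 (where ISO-8859-11 places the no-break space) as sample vendor additions.

```python
from typing import List, Tuple


def is_official_tis620(byte: int) -> bool:
    """True for ASCII and for slots assumed to be assigned in TIS-620."""
    return byte < 0x80 or 0xA1 <= byte <= 0xDA or 0xDF <= byte <= 0xFB


def find_unofficial_bytes(data: bytes) -> List[Tuple[int, int]]:
    """Return (offset, byte) pairs for bytes that sit in unassigned slots."""
    return [(i, b) for i, b in enumerate(data) if not is_official_tis620(b)]


# Hypothetical input: one Thai letter followed by two vendor-specific bytes.
suspect = "ก".encode("tis-620") + b"\x93\xa0"
problems = find_unofficial_bytes(suspect)
for offset, byte in problems:
    print(f"offset {offset}: 0x{byte:02X} is not an assigned TIS-620 slot")
```

Any byte reported this way suggests the file came from a vendor variant and may need converting before it is used as TIS-620 input.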
Figure 1 shows the Thai characters according to the Unicode Standard (version 3.0).
For Thai in LaTeX the package ‘thai’ (file: thai.sty) is used, i.e. \usepackage{thai}.
The source is run through a preprocessor (cttex) to encapsulate all
Thai text within bracketed pairs {\thai ....} and to insert
the thai-break ‘\tb’ separator.
Normally Thai text is written in a continuous stream with few (if any)
blank (space) characters. The preprocessor inserts the ‘\tb’
command to indicate places where the text may be broken if near the
end of a line. If these separators are not inserted, LaTeX has
great difficulty producing a flush right margin without leaving
huge gaps in the text.
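The shape of the marked-up text can be illustrated with a small Python sketch. This is not cttex and performs no word segmentation: it assumes the words have already been split by hand, and the function name `wrap_thai` and the exact spacing after `\tb` are assumptions made for illustration only.

```python
def wrap_thai(words):
    r"""Join pre-segmented Thai words with \tb and wrap them in {\thai ...}."""
    # The space after \tb ends the control word; TeX skips it on input.
    return r"{\thai " + r"\tb ".join(words) + "}"


# Hypothetical, hand-segmented opening words of the test paragraph.
words = ["ศึกษา", "ความหมาย"]
marked_up = wrap_thai(words)
```

The real preprocessor decides where the breaks belong; the sketch only shows how the two constructs fit around the text once those positions are known.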
The style file ‘thai.sty’ contains the definitions for {\thai ....}
and \tb. The {\thai ....} command is used to switch the
LaTeX font.
After passing through the preprocessor, the file is compiled by LaTeX in the normal fashion.
For HEVEA the style (package) file ‘thai.sty’ is not used. HEVEA
does not recognise the {\thai ....} or \tb constructs.
If these constructs are encountered, warnings are issued and the
constructs are ignored.
To use the Thai language with HEVEA, do not run the preprocessor that
is normally used before invoking LaTeX.
Pass the original (as typed) Thai LaTeX file directly to
HEVEA. HEVEA detects the command \usepackage{thai} in the file
and uses it to establish a Thai character encoding. (It is
no longer necessary to use the command line flag –charset=TIS-620;
this flag is no longer operational.)
The commands required to process this file for both Thai LaTeX and Thai HEVEA are listed in table 1. The original LaTeX filename is assumed to be ‘thaihevea.ttex’ (ttex = Thai tex).
for LaTeX
    cttex < thaihevea.ttex > thaihevea.tex    run preprocessor
    latex thaihevea.tex                       compile using LaTeX
    dvips thaihevea.dvi -o                    convert using dvips
    gv thaihevea.ps                           view using ghostview
for HEVEA
    cp thaihevea.ttex thaihevea.tex           ‘rename’ file for benefit of HEVEA
    hevea thaihevea.tex                       compile using HEVEA
    imagen thaihevea                          convert image to bitmap
    firefox thaihevea.html                    view using web browser
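For repeated runs, the two command sequences from table 1 could be collected in a small Python helper. This is a sketch under stated assumptions: the function names `build_commands` and `run` are hypothetical, the commands are copied from the table, and the default is a dry run that only prints them.

```python
import shlex
import subprocess
from typing import List


def build_commands(stem: str, target: str) -> List[str]:
    """Return the shell commands from table 1 for 'latex' or 'hevea'."""
    name = shlex.quote(stem)  # protect file stems containing odd characters
    if target == "latex":
        return [
            f"cttex < {name}.ttex > {name}.tex",  # run preprocessor
            f"latex {name}.tex",                  # compile using LaTeX
            f"dvips {name}.dvi -o",               # convert using dvips
            f"gv {name}.ps",                      # view using ghostview
        ]
    if target == "hevea":
        return [
            f"cp {name}.ttex {name}.tex",  # 'rename' file for HEVEA
            f"hevea {name}.tex",           # compile using HEVEA
            f"imagen {name}",              # convert image to bitmap
            f"firefox {name}.html",        # view using web browser
        ]
    raise ValueError(f"unknown target: {target!r}")


def run(commands: List[str], dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(command)
        if not dry_run:
            subprocess.run(command, shell=True, check=True)


run(build_commands("thaihevea", "hevea"))  # dry run: prints, executes nothing
```

Calling `run(..., dry_run=False)` would execute the commands in order and stop at the first failure, which requires the tools themselves to be installed.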
Since the Thai text is not processed to indicate where the text may be broken, the decision is left to the application displaying the HTML code. The browser I currently use (Firefox) doesn’t know how to break continuous Thai text in suitable places without external help. However, the screen is wider than a printed page, which means that on average there are more natural breaks in any line, and because the browser left-justifies the text it doesn’t make large ugly gaps. The right margin is ragged, not flush, but that looks acceptable (to me).
Following is a paragraph of Thai text. It doesn’t say anything important; it is simply here to serve as a basic test. Even if you can’t compile this with LaTeX (e.g. you don’t have the file thai.sty or a Thai character set for printing), you can still compile it with HEVEA and make an English/Thai web page.
If you want to eliminate the Thai so you can compile an English-only
version of this document, simply insert a comment % character before
the \thaistuff command at the top of the file and uncomment the
second version of the command (which eliminates the Thai) on the
adjacent line.
ศึกษาความหมาย ความสำคัญของสิ่งแวดล้อมศึกษา วิธีการเผยแพร่ประชาสัมพันธ์ ความรู้ทางสิ่งแวดล้อม วิธีการเขียนแผนงานเพื่อเผยแพร่ความรู้ทางสิ่งแวดล้อม นำ สิ่งแวดล้อมศึกษาไปประยุกต์ใช้ในการพัฒนาและเผยแพร่ความรู้ข้อมูล ข่าวสารต่างๆ ในโครงการอื่นๆ ที่มีความสัมพันธ์เกี่ยวข้อง
This document was translated from LaTeX by HEVEA.