Gérard Huet's Sanskrit Site
Version 203 [2005-08-09]
Welcome to my Sanskrit site.
You are invited to visit the first hypertext Sanskrit
dictionary, available interactively through its
index.
It currently gives meanings in
French, but has been designed for multilingual use.
You may also download printable versions of this dictionary, under several
formats, as explained below.
This site also offers certain linguistic services for the Sanskrit language, such
as the Sanskrit Reader, which parses transliterated Sanskrit
text
into Sanskrit banks of tagged hypertext.
Various phonological and morphological tools are also provided, as explained below.
Book form
Portable Document Format
You may download the pdf file from
PDF.
This document is readable through Acrobat Reader,
a well-known browser plug-in from Adobe, freely available on the Internet.
Since the document is rather large (3 MB), expect some delay
in loading it.
Postscript
For those of you who prefer Postscript, it is also available as
a compressed Postscript file.
Web form
Interactive browsing
The dictionary may be accessed through a search engine implementing
an index.
Your browser must be HTML 4.0 compliant, and for proper viewing
of diacritics you must have installed on your system OpenType fonts
for roman transliteration with diacritics, and for devanagari.
For instance, install fonts IndUni, available from John Smith
at site Cambridge Indology.
A Unicode-compliant font for devanagari with proper ligatures is Apple's
Devanagari MT for Mac OS X machines. For Windows users,
installation of the font 'Arial Unicode MS' is advised for proper rendering.
You may have to fiddle with the controls of your browser, so that the font
declarations from the dictionary pages get precedence over the standard
selection, and thus encoding is specified as Unicode compliant (UTF-8 encoding).
Note that most words are given with their etymology as hypertext links. You
may thus navigate from a word to its components, down to its roots.
Also, the gender declarations of
the main entries are mouse-sensitive, and give you direct access to the
relevant declension table. Similarly, the present class mark of the verbal roots
gives access to the conjugation schemes. Also for verb entries, preverbs
lead you to the correspondingly prefixed derived verbs.
Sanskrit made easy
If you want to search for a Sanskrit word without knowing its exact
transliteration, go to section "Sanskrit made easy" of the index page,
where diacritics may be omitted.
For instance, search for Vishnou, Siva, or the grammarian Panini.
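To give an idea of how such a tolerant search can work, here is a minimal sketch in Python. It is not the algorithm used by this site: the names `loose_key`, `HEADWORDS` and `lookup`, the three sample headwords, and the two spelling rules ("ou" and "sh") are assumptions chosen only to cover the examples above. Both the query and the headwords are folded to a crude key with diacritics removed, and words are matched on that key.

```python
import unicodedata
from collections import defaultdict


def loose_key(word: str) -> str:
    """Fold a word to a crude ASCII-like key (illustrative rules only)."""
    # Decompose accented letters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFD", word.lower())
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Hypothetical popular-spelling rules: French "ou" for u, "sh" for a sibilant.
    for spelled, folded in (("ou", "u"), ("sh", "s")):
        bare = bare.replace(spelled, folded)
    return bare


# Hypothetical sample of headwords, written with diacritics.
HEADWORDS = ["viṣṇu", "śiva", "pāṇini"]

# Several headwords may share one key, so each key maps to a list.
INDEX = defaultdict(list)
for headword in HEADWORDS:
    INDEX[loose_key(headword)].append(headword)


def lookup(query: str) -> list:
    """Return every headword whose loose key equals that of the query."""
    return list(INDEX.get(loose_key(query), []))
```

With this sample, `lookup("Vishnou")`, `lookup("Siva")` and `lookup("Panini")` each return the single corresponding headword.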
Sanskrit Grammarian
This interface gives the declension tables for Sanskrit substantives.
Try out this
declension engine by submitting Sanskrit stems
with intended gender. The same transliteration conventions as for the
dictionary index apply. For instance, submit "deva" with gender Mas,
or "devii" with gender Fem, or "brahman" with gender Neu. The fourth
button, labeled "Contextual", may be used for the words which take their
gender from the context, such as "aham", "tvad", or the number words
such as "dva", "tri", etc.
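The examples on this page ("devii", "m.rj", "j~naa", "a.s.tau") appear to follow the Velthuis ASCII scheme. As an illustration only, the following Python sketch converts such input to romanization with diacritics. The table covers the standard Velthuis digraphs; the names `VELTHUIS_TO_IAST` and `velthuis_to_iast` are hypothetical, and the conventions actually accepted by the index may differ in detail.

```python
# Velthuis ASCII sequences and their romanized equivalents with diacritics.
VELTHUIS_TO_IAST = {
    "aa": "ā", "ii": "ī", "uu": "ū",
    ".rr": "ṝ", ".r": "ṛ", ".l": "ḷ",
    ".m": "ṃ", ".h": "ḥ",
    '"n': "ṅ", "~n": "ñ",
    ".t": "ṭ", ".d": "ḍ", ".n": "ṇ",
    '"s': "ś", ".s": "ṣ",
}


def velthuis_to_iast(text: str) -> str:
    """Convert left to right, always trying the longest sequence first."""
    out = []
    i = 0
    while i < len(text):
        for width in (3, 2):
            token = text[i:i + width]
            if token in VELTHUIS_TO_IAST:
                out.append(VELTHUIS_TO_IAST[token])
                i += len(token)
                break
        else:
            # No multi-character sequence starts here: copy the character.
            out.append(text[i])
            i += 1
    return "".join(out)
```

For instance, `velthuis_to_iast("devii")` gives "devī" and `velthuis_to_iast("j~naa")` gives "jñā", while plain stems such as "deva" or "brahman" are left unchanged.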
A conjugation engine for roots is also available. It handles
the full present system: present indicative, imperfect, imperative and
optative, as well as the passive present, the perfect, the aorist
and the future. Some secondary conjugations (causative, intensive,
desiderative) are also generated, for the full present and future systems.
Try out this conjugation engine
with data such as bhuu 1, as 2, m.rj 2, han 2, haa 3, hu 3, daa 4, su 5,
p.r 6, yuj 7, k.r 8, j~naa 9, namas 10.
Lemmatizer
Conversely, a
lemmatizer
attempts to tag inflected words.
Try for instance devaat, jagmivaan, a.s.tau (clicking on Noun)
or apibat, akaar.siit, dudoha, vaahyate etc (clicking on Verb).
This lemmatizer knows about inflected forms of derived stems which
are not apparent in the display of the main stem inflection.
For instance, dar"sayi.syati is found as the conjugated form
{ ca. fut. a. sg. 3 }[d.r"s_1],
dariid.r"syate yields { int. pr. m. sg. 3 }[d.r"s_1],
did.rk.sate yields { des. pr. m. sg. 3 }[d.r"s_1]
and bibhik.se yields { des. pft. m. sg. 3 | des. pft. m. sg. 1 }[bhaj].
N.B. Do not attempt to lemmatize verbal forms with preverbs: this will
not work, since the lemmatizer only knows how to invert root forms. Lemmatizing
more complex constructions is possible through the Sanskrit Reader below.
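If you wish to process such answers by program, the tag notation shown above is regular enough to be read with a small parser. The Python sketch below is an assumption about the textual format as it is displayed in the examples, not a description of the lemmatizer's internals; `parse_tag` is a hypothetical helper, and reading the suffix "_1" as a homonym index is a guess.

```python
import re

# "{ tags | tags ... }[lemma]" as displayed in the examples above.
TAG_PATTERN = re.compile(r"\{\s*(?P<analyses>[^}]*?)\s*\}\s*\[(?P<lemma>[^\]]+)\]")


def parse_tag(text: str) -> dict:
    """Split a displayed tag into its alternative analyses and its lemma."""
    match = TAG_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError("unrecognized tag: " + repr(text))
    # Alternatives are separated by "|"; each one is a sequence of abbreviations.
    analyses = [alt.split() for alt in match.group("analyses").split("|")]
    lemma = match.group("lemma")
    # Assumption: a trailing "_<digits>" distinguishes homonymous entries.
    stem, sep, index = lemma.rpartition("_")
    if sep and index.isdigit():
        return {"lemma": stem, "homonym": int(index), "analyses": analyses}
    return {"lemma": lemma, "homonym": None, "analyses": analyses}
```

Applied to the answer for bibhik.se, the result lists two alternative analyses for the lemma "bhaj", each a list of five abbreviations.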
Declined forms
A dictionary of inflected forms of Sanskrit words is provided.
The declensions of all substantives, adjectives,
pronouns and numerals are available as a
PDF file Volume I. A second volume contains
the conjugated forms of roots in the present, imperfect, imperative,
optative, perfect, aorist and future tenses as another PDF document
Volume II. A third volume
Volume III contains participial forms,
a fourth Volume IV absolutives,
infinitives and other undeclinable words, and
a fifth Volume V gives additional
generative morphemes.
Sanskrit Reader
Try our experimental Sanskrit Reader.
It is able to segment simple sentences, where the (optional) finite verb form
occurs in final position.
You may use it to analyse sandhi from compounds in the Segmentation mode.
Try for instance to segment "sugandhi.mpu.s.tivardhanam". Then
push the "Tagging" button and get the fully tagged sentence.
For a simple sentence, try "maarjaarodugdha.mpibati".
Sanskrit Parser
If in the reader you press the "Parsing" button, many irrelevant
pseudo-solutions are eliminated. Try for instance the example
"hya.hsaayamaha.mpitraasahavipa.nimagaccham". There are 154 potential
segmentation solutions, but the parser keeps only 6.
Sentences may be broken with spaces for piecewise reading and for curbing
overgeneration. For instance, if you cut the previous sentence
in two chunks, entering "hya.hsaayamaha.mpitraasaha vipa.nimagaccham",
only two solutions are proposed, the first of which is the intended one.
Each solution returned by the parser is marked with a green check sign,
which may be pressed to get the semantic analysis of the sentence in
terms of roles (kaarakas).
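The role of spaces as forced word boundaries can be pictured with a toy segmenter. The Python sketch below is deliberately naive and is not how the Sanskrit Reader works: it ignores sandhi altogether and simply cuts each chunk into words of a tiny hypothetical lexicon (`TOY_LEXICON`), which happens to suffice for the example sentence because its words join without change. The function names are assumptions as well.

```python
from functools import lru_cache

# Hypothetical mini-lexicon of inflected forms, in the input transliteration.
TOY_LEXICON = frozenset({
    "hya.h", "saayam", "aha.m", "pitraa", "saha", "sa", "ha",
    "vipa.nim", "agaccham",
})


def segmentations(chunk: str) -> list:
    """All ways to cut one chunk into lexicon words (plain concatenation)."""

    @lru_cache(maxsize=None)
    def from_position(start: int) -> tuple:
        if start == len(chunk):
            return ((),)  # one way to segment the empty remainder
        found = []
        for end in range(start + 1, len(chunk) + 1):
            word = chunk[start:end]
            if word in TOY_LEXICON:
                for rest in from_position(end):
                    found.append((word,) + rest)
        return tuple(found)

    return [list(solution) for solution in from_position(0)]


def read_sentence(text: str) -> list:
    """Segment each space-separated chunk on its own, then combine."""
    solutions = [[]]
    for chunk in text.split():
        solutions = [prefix + segment
                     for prefix in solutions
                     for segment in segmentations(chunk)]
    return solutions
```

With this toy lexicon the unbroken sentence has two readings, which differ only in keeping "saha" whole or splitting it as "sa" followed by "ha". A space placed inside a word removes every reading that contains that word. The much larger number of candidates reported by the real reader comes from sandhi analysis over a full lexicon, which this sketch does not attempt to model.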
The Zen Library
This site reflects an ongoing project of Sanskrit processing
on a comprehensive software platform.
The project is based on a structured lexicographic database, and on
the Zen library of computational linguistics tools, implemented in
Pidgin ML, the functional core of the
Objective Caml
programming language. The Zen library and its documentation are available
as free software under the GNU General Public License from the
Zen site.
The Sanskrit Portal
Please visit our Sanskrit Portal
to find links to other Sanskrit resources.
Artwork credits
Orissan artwork at this site © B. K. Arts & Crafts,
Bhubaneshwar, Orissa. All rights reserved.
Index page border design courtesy of
Sushama Londhe.
Wallpaper om images courtesy of
Vishvarupa.com.
Ganesh wallpaper courtesy of
François Patte.
Shri Yantra design ©
Gérard Huet 1990.