The Linguistic Type Database processes grammars and treebanks to make online documentation for grammars made with the LKB.

The code can be found on GitHub:

The Linguistic Type Database (LTDB, née Lextype DB) describes the types and rules of a grammar, together with frequency information from its treebank. So far, the LTDB has been applied to grammars and treebanks of Chinese, English, Japanese and Spanish.

The minimal Linguistic Type Database offers the following:

The LTDB has been updated by Francis Bond and Luis Morgado da Costa, using PyDelphin and visualization from delphin-viz. The software was originally written in Perl by Chikara Hashimoto and Francis Bond, and used the HTML output provided by Stephan Oepen.

Earlier versions of the lexical type database also included links to external references and other lexicons. We hope to revive them at some stage.

Sample In Line Documentation

n_-_c_le := n_intr_lex_entry
Intransitive count noun (icn)
<ex>The dog barked.

case-p-lex-np-kara := case-p-lex-np &
<name lang='ja'>承名詞受身主格助詞</name>
<ex>子供 が 親 から たしなめ られる
<ex>親戚 から 怒ら れる
<nex>友人 から 自転車 を 買う
(07-03-30) Should also be made usable with indirect passives. (lkb::do-parse-tty "親戚 から 怒ら れる")
(07-03-30) A type that attaches after postp-lex is also needed. (lkb::do-parse-tty "子供 が 親 とか から たしなめ られる")
                        [SYNSEM.LOCAL.CAT.HEAD.CASE kara-case].

Comments should appear inside the TDL doc-strings and should be written in reStructuredText. Two special kinds of markup are recognized: examples and names.


<ex>A good example of this type
<nex>A bad example of this type
<mex>An ungrammatical example of this type, which we nevertheless parse using robust rules, mal-rules or constructions.

Ideally, parses of positive examples should contain the type in question, while parses of negative examples should not (although they may be grammatical under other circumstances). Each example is assumed to end at a newline. In the human-readable documentation, <nex> examples are shown with an asterisk (∗) and <mex> examples with a circled asterisk (⊛). Neither is accompanied by an Obelisk.
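
The example markup described above can be picked out of a doc-string with a few lines of Python. This is only a minimal sketch of the idea, not the LTDB's actual implementation: the names `extract_examples` and `render_example` are hypothetical, and the sketch assumes that every example runs from its tag to the end of the line.

```python
import re

# Prefix shown before each kind of example in the rendered documentation:
# <ex> gets no mark, <nex> an asterisk, <mex> a circled asterisk.
MARKERS = {"ex": "", "nex": "∗", "mex": "⊛"}

# An example starts at its tag and is finished by the newline.
EXAMPLE_RE = re.compile(r"<(ex|nex|mex)>(.*)$", re.MULTILINE)


def extract_examples(docstring):
    """Return (kind, sentence) pairs in order of appearance (hypothetical helper)."""
    return [(kind, text.strip()) for kind, text in EXAMPLE_RE.findall(docstring)]


def render_example(kind, sentence):
    """Prefix a sentence with the mark for its kind (hypothetical helper)."""
    return MARKERS[kind] + sentence


if __name__ == "__main__":
    doc = "<ex>犬 が 走る\n<nex>彼 は 帰っ た が\n"
    for kind, sentence in extract_examples(doc):
        print(render_example(kind, sentence))
```

Keeping the extraction separate from the rendering means the same pairs could also be handed to a parser, to check that positive examples really contain the type.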


case-p-lex-np-ga := case-p-lex-np &
"""<name lang='ja'>承名詞主格助詞
<ex>犬 が 走る
<ex>バナナ が 猿 に 食べ られる
<ex>犬 に 芸 が できる か
<nex>彼 は 帰っ た が
"""
                        [SYNSEM.LOCAL.CAT.HEAD.CASE ga].

It is assumed that the name is finished by a newline.
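
Since the closing tag is optional, a reader of the name markup has to accept both forms. The following Python sketch shows one way to do that; it is an illustration under assumptions rather than the LTDB's own code. The function `extract_names` is hypothetical, and the sketch assumes the `lang` attribute is the only attribute that occurs.

```python
import re

# Matches <name lang='xx'>...</name> as well as an unclosed <name lang='xx'>...
# that simply ends at the newline. The lang attribute is treated as optional.
NAME_RE = re.compile(
    r"<name(?:\s+lang=['\"](?P<lang>[^'\"]+)['\"])?>(?P<name>.*?)(?:</name>)?\s*$",
    re.MULTILINE,
)


def extract_names(docstring):
    """Return (lang, name) pairs; lang is None when no attribute is given."""
    return [
        (match.group("lang"), match.group("name").strip())
        for match in NAME_RE.finditer(docstring)
    ]


if __name__ == "__main__":
    closed = "<name lang='ja'>承名詞受身主格助詞</name>\n<ex>親戚 から 怒ら れる\n"
    unclosed = "<name lang='ja'>承名詞主格助詞\n<ex>犬 が 走る\n"
    print(extract_names(closed))
    print(extract_names(unclosed))
```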


There is a README file that describes how to build the database. In summary:

./make-ltdb.bash --grmdir /path/to/grammar


./make-ltdb.bash --grmdir ~/git/jacy
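
If the build is driven from a script rather than typed by hand, the same call can be wrapped in Python. This is a small sketch under assumptions: the helper names `ltdb_build_command` and `build_ltdb` are hypothetical, the script is assumed to be in the current directory, and only the `--grmdir` option shown above is used.

```python
import os
import subprocess


def ltdb_build_command(grammar_dir, script="./make-ltdb.bash"):
    """Build the argument list for one LTDB build (hypothetical helper).

    The ~ in paths such as ~/git/jacy is expanded here, because no shell
    is involved when the list is passed to subprocess.run.
    """
    return [script, "--grmdir", os.path.expanduser(grammar_dir)]


def build_ltdb(grammar_dir, script="./make-ltdb.bash"):
    """Run the build and raise CalledProcessError if the script fails."""
    subprocess.run(ltdb_build_command(grammar_dir, script), check=True)
```

Calling `build_ltdb("~/git/jacy")` would then correspond to the second command above.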

The code makes certain assumptions:

For the dependencies, please see the GitHub page.

To enable CGI in user directories, add the following lines to the appropriate Apache configuration file. That could be /etc/apache2/httpd.conf or, more correctly, the appropriate file in /etc/apache2/sites-enabled/.

<Directory /home/*/public_html/cgi-bin/>
   Options ExecCGI
   SetHandler cgi-script
</Directory>

Tool Support

As of 2018-11-04, doc-strings are supported by the latest LKB-FOS, PyDelphin, PET and ACE, with support in the LOGON LKB promised soon.

Currently, the LKB does NOT support doc-strings in instances (such as rules, roots and lexical entries), only in types. LTDB and ACE do support them, but it is recommended that you wait until the LKB supports them as well.


To Do

LkbLtdb (last edited 2018-11-15 06:34:25 by FrancisBond)
