4799
Comment:
|
4674
|
Line 3: | Line 3: |
Before you start: suitable for POWER USERS only. | The Lex DB is useful for grammars with large lexicons. Rather than storing the lexicon in a .tdl file, which has to be loaded each time the lexicon is changed, the Lex DB stores the lexical entries in a postgres database. The instructions for using the Lex DB are somewhat sparse and it is not expected that novice LKB users will make use of the system. The best approach for users who want to make use of the Lex DB for their own grammars is to start by working with an existing Lex DB in order to become familiar with the set up. The initial choice that has to be made is whether to use the Lex DB in the 'single user' or 'multiple user' modes. In the multiple user mode, the revisions are stored in the database itself. The assumption is that the database is running on a server to which multiple users may have access. This mode assumes that the database is set up with its own id (something that system administrators may dislike because of potential security risks). In single user mode, the database user may be the actual user. Revisions are not stored in the database (the assumption is that Subversion or CVS will be used to manage changes). |
Line 5: | Line 6: |
You must be running a LexDB-enabled build of the LKB: | The ERG uses multi-user mode, but a single-user DB can be created by following the single user instructions below. Jacy uses a single user Lex DB. |
Line 7: | Line 8: |
1. download the [http://lingo.stanford.edu/ftp/ LKB binary] and the [http://www.cl.cam.ac.uk/~bmw20/DT/lkb/latest/ LexDB setup files]; or 2. download the [http://lingo.stanford.edu/ftp/ LKB source code]. |
* [http://www.cl.cam.ac.uk/~bmw20/Papers/ Papers] |
Line 10: | Line 10: |
If running the binary, ensure the environment variable `PSQL` is set: | * LkbLexDbSingleUser for basic instructions for the SINGLE USER LexDB |
Line 12: | Line 12: |
{{{ export PSQL=t }}} | * LkbLexDbMultiUser for basic instructions for the MULTI USER LexDB |
Line 14: | Line 14: |
If building from source, ensure your {{{.clinit.cl}}} file contains the following: | * JacyLexDb |
Line 16: | Line 16: |
{{{ (pushnew :psql *features*) }}} | * LexDbEmacsInterface |
Line 18: | Line 18: |
InitializePostgres [[BR]] ["InitializeSkeletalLexDB"] |
* LexDbFieldMappings * LexDbInternals The remainder of this page contains miscellaneous information about use of the Lex DB. == LKB Compilation == The Lex DB code requires that *features* includes :psql. This is now true by default for supported platforms. |
Line 22: | Line 29: |
''LexDB -> Filter'' | |
Line 24: | Line 30: |
The filter specified will be interpreted as an SQL WHERE clause. | The filter is used to select subparts of a (multi-user) Lex DB. In the expanded LKB menu |
Line 26: | Line 32: |
Eg. | ''LexDB -> Set Filter'' The filter specified is interpreted as an SQL WHERE clause. Examples: |
Line 35: | Line 43: |
(Note: the default is {{{ TRUE }}}. This represents the empty condition and will select all available entries.) | (Note: the default is `NULL`. Change this to a non-NULL value to select a subset of the lexical database.) |
Line 37: | Line 45: |
The lexicon as seen by the login user is determined by that user's database filter. Only revision entries matching the conditions in filter can form part of the lexicon. In general multiple revisions for a given entry will be returned; the most recent will become part of the visible lexicon. | The filter determines the entries in the lexicon as seen by a particular user. Only revision entries matching the filter conditions in filter can form part of the lexicon (of these, the most recent revision is the one actually used). The filter is not available in single-user mode. |
Line 41: | Line 49: |
The LexDB may be dumped to text files which can then be uploaded to storage in CVS. | The LexDB may be dumped to text files which can then be uploaded to storage in CVS or SVN. |
Line 45: | Line 53: |
(This will dump public schema tables to text files -- eg. {{{ lexdb.rev }}}, {{{lexdb.dfn}}}, and {{{lexdb.fld}}}) [[BR]](Note: a TDL dump will be performed also unless you set {{{*lexdb-dump-tdl*}}} to {{{nil}}}) [[BR]](Note: the database dump files are tab-separated with null as {{{\N}}}) |
(This will dump public schema tables to text files -- eg. `lexdb.rev`, `lexdb.rev_key`, `lexdb.dfn`, `lexdb.fld`, `lexdb.meta`) [[BR]](Note: the dump mechanism will also produce a `.tdl` file if `*lexdb-dump-tdl*` is set to `t`) [[BR]](Note: the database dump files are tab-separated with null as `\N`) |
Line 49: | Line 57: |
2. Run the cvs commit command. Eg. | 2. Run the cvs commit command. E.g. |
Line 51: | Line 59: |
{{{cvs commit ~/erg/lexdb.*}}} | `cvs commit ~/erg/lexdb.*` |
Line 55: | Line 63: |
1. Run the cvs update command to retrieve the latest dump file. Eg. | 1. Run the cvs update command to retrieve the latest dump file. E.g. |
Line 59: | Line 67: |
2. ''LexDB -> Merge new entries'' | 2. ''LexDB -> Load ('rev' entries)'' |
Line 61: | Line 69: |
These steps update the LexDB (public schema) to include all new revisions stored in a CVS dump file. The new entries will be copied to the table public.revision_new. Any changes made to your copy of the LexDB since the last update will be preserved. | These steps update the LexDB (public schema) to include all new revisions stored in a CVS dump file. The new entries will be copied to the table public.rev_new. Any changes made to your copy of the LexDB since the last update will be preserved. |
Line 63: | Line 71: |
== HOW TO dump LexDB as TDL file == | == HOW TO export LexDB to TDL file == |
Line 65: | Line 73: |
''LexDB -> Dump (TDL format)'' | ''LexDB -> Export (TDL file)'' |
Line 67: | Line 75: |
Dumps active LexDB entries (see filter) to {{{.tdl}}} file. | Dumps active LexDB entries (determined by filter) to `.tdl` file. |
Line 69: | Line 77: |
== HOW TO edit entries in the LexDB == | == HOW TO import TDL entries from a file == |
Line 71: | Line 79: |
The LexDB-Emacs interface allows editing of lexical entries from within an Emacs environment (with browsing functionality, field completion, etc.). New revision entries are first stored in the users private schema, and hence are visible only to the particular user. To commit the entries to the public table (public.revision): | To add a small number of new (revision) entries from a `.tdl` file: ''LexDB -> Import (TDL file)''. You will be queried to provide values for other certain non-grammar fields. The entries will go into the private ''rev'' table. |
Line 73: | Line 81: |
0. Add the following line to your {{{.emacs}}} file: | == HOW TO commit entries to public rev == |
Line 75: | Line 83: |
{{{(load "pg-interface")}}} 1. In [http://www.gnu.org/software/emacs/emacs.html GNU Emacs]: ''M-x lexdb'' to enter LexDB major mode. Then see the PG menu. Available commands in LexDB major mode are: ''C-l'' : load (active revision of lexical entry) into Emacs ''C-c'' : commit (edited/new revision of) lexical entry into LexDB ''TAB'' : field completion ''M-TAB'' : get (ring of) (active) entries in LexDB where value of current field matches that in buffer ''M-n'' : cycle through ring of entries obtained above ''M-s'' : as M-TAB, but explicitly specify field value ''M-va'' : view entries added in merge operation from dump file ''M-vs'' : view entries in user's scratch space Note: To remove a lexical entry from the active grammar, create a (head) revision where the {{{flags}}} field is set to `0` (rather than `1`). This is necessary as in order to preserve revision history entry. No revision entry should ever be deleted from the lexical database itself.) == HOW TO load TDL entries into private schema == To add a small number of new (revision) entries from a `.tdl` file: ''LexDB -> Load TDL entries''. The grammatical fields of the LexDB will be obtained from the TDL code; you will be queried to provide values for other necessary fields. == HOW TO commit entries to public schema == The LexDB consists of a single public schema and a set of private schemas, one per user. New (revision) entries are placed initially in your private schema. To commit (all) entries in your private schema to the public table: ''LexDB -> Commit scratch'' |
The LexDB consists of a single public schema and a set of private schemas, one per user. New (revision) entries are placed initially in your private schema. To commit (all) entries in your private schema to the public table: ''LexDB -> Commit private 'rev' '' |
Line 108: | Line 86: |
== HOW TO list entries in private schema == | == HOW TO list entries in private rev == |
Line 110: | Line 88: |
From LKB: ''LexDB -> View scratch'' | From LKB: ''LexDB -> Display private 'rev' '' |
Line 116: | Line 94: |
== HOW TO clear entries in private schema == | == HOW TO clear entries in private rev == |
Line 118: | Line 96: |
''LexDB -> Clear scratch'' | ''LexDB -> Clear private 'rev''' |
Line 120: | Line 98: |
== Further Topics == | == Troubleshooting == |
Line 122: | Line 100: |
["InternalLexDBStructure"] [[BR]]["MWEs and Idiomatic Expressions"] [[BR]] [http://www.cl.cam.ac.uk/~bmw20/DT/Papers/ Papers] |
* If M-x lexdb results in "no connection to LexDb", make sure that a working grammar is loaded. |
LexDB Usage Instructions
The Lex DB is useful for grammars with large lexicons. Rather than storing the lexicon in a .tdl file, which has to be reloaded each time the lexicon is changed, the Lex DB stores the lexical entries in a PostgreSQL database. The instructions for using the Lex DB are somewhat sparse, and novice LKB users are not expected to use the system. Users who want to use the Lex DB for their own grammars should start by working with an existing Lex DB to become familiar with the setup. The first choice to make is whether to use the Lex DB in 'single user' or 'multiple user' mode. In multiple user mode, revisions are stored in the database itself; the assumption is that the database runs on a server to which multiple users may have access. This mode also assumes that the database is set up with its own id (something that system administrators may dislike because of potential security risks). In single user mode, the database user may be the actual user, and revisions are not stored in the database (the assumption is that Subversion or CVS will be used to manage changes).
The ERG uses multi-user mode, but a single-user DB can be created by following the single user instructions below. Jacy uses a single user Lex DB.
LkbLexDbSingleUser for basic instructions for the SINGLE USER LexDB
LkbLexDbMultiUser for basic instructions for the MULTI USER LexDB
The remainder of this page contains miscellaneous information about use of the Lex DB.
LKB Compilation
The Lex DB code requires that *features* includes :psql. This is now true by default for supported platforms.
HOW TO set the filter
The filter is used to select subparts of a (multi-user) Lex DB. In the expanded LKB menu, choose:
LexDB -> Set Filter
The filter specified is interpreted as an SQL WHERE clause. Examples:
userid = 'danf'
userid = 'danf' AND dialect = 'my_dialect'
userid IN ('danf', 'aac')
confidence > 0.5
(Note: the default is NULL. Change this to a non-NULL value to select a subset of the lexical database.)
The filter determines the entries in the lexicon as seen by a particular user. Only revision entries matching the filter conditions can form part of the lexicon (of these, the most recent revision of each entry is the one actually used). The filter is not available in single-user mode.
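The selection rule described above (keep only the revisions matching the filter, then use the most recent one per entry) can be sketched as follows. This is a minimal illustration only, not the LexDB implementation: it uses an in-memory SQLite database instead of PostgreSQL, and the table name rev, the columns name and modstamp, and the sample rows are assumptions made for the example. Treating an unset (NULL) filter as "no restriction" is also an assumption, based on the note about the default.

```python
import sqlite3


def make_demo_db():
    """Build a small in-memory revision table (hypothetical layout)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE rev (name TEXT, userid TEXT, dialect TEXT, "
        "confidence REAL, modstamp INTEGER)"
    )
    conn.executemany(
        "INSERT INTO rev VALUES (?, ?, ?, ?, ?)",
        [
            ("cat_n1", "danf", None, 0.9, 1),
            ("cat_n1", "aac", None, 0.4, 2),
            ("dog_n1", "danf", None, 0.8, 1),
            ("dog_n1", "danf", None, 0.7, 3),
        ],
    )
    return conn


def visible_lexicon(conn, filter_clause=None):
    """Return (name, userid, modstamp) of the newest matching revision per entry."""
    # Assumption: an unset filter places no restriction on the revisions.
    where = filter_clause if filter_clause else "1 = 1"
    # The filter is spliced in as raw SQL, mirroring its use as a WHERE
    # clause; only do this with trusted input.
    query = f"""
        WITH matching AS (SELECT * FROM rev WHERE {where})
        SELECT name, userid, modstamp
        FROM matching AS a
        WHERE modstamp = (SELECT MAX(modstamp) FROM matching AS b
                          WHERE b.name = a.name)
        ORDER BY name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    demo = make_demo_db()
    print(visible_lexicon(demo))
    print(visible_lexicon(demo, "userid = 'danf'"))
```

With the filter userid = 'danf', the newer cat_n1 revision made by aac is ignored, so the older danf revision becomes the visible one.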
HOW TO store LexDB in CVS
The LexDB may be dumped to text files which can then be uploaded to storage in CVS or SVN.
1. LexDB -> Dump
(This will dump public schema tables to text files, e.g. lexdb.rev, lexdb.rev_key, lexdb.dfn, lexdb.fld, lexdb.meta)
(Note: the dump mechanism will also produce a .tdl file if *lexdb-dump-tdl* is set to t)
(Note: the database dump files are tab-separated, with null written as \N)
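For readers who want to inspect a dump file outside the LKB, the format described in the note (tab-separated fields, null written as \N) can be read with a few lines of Python. This is a rough sketch under stated assumptions: the function names are hypothetical, the file is assumed to be UTF-8, and backslash escapes other than the \N null marker are not decoded.

```python
def parse_dump_line(line):
    """Split one tab-separated dump line, mapping the null marker to None."""
    fields = line.rstrip("\n").split("\t")
    # r"\N" is the two-character null marker mentioned in the note above.
    return [None if field == r"\N" else field for field in fields]


def load_dump(path):
    """Read a whole dump file into a list of rows (assumes UTF-8 text)."""
    with open(path, encoding="utf-8") as handle:
        return [parse_dump_line(line) for line in handle if line.rstrip("\n")]
```

For example, load_dump on a copy of lexdb.rev would give one list per revision row, with None wherever the database held a null.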
2. Run the cvs commit command. E.g.
cvs commit ~/erg/lexdb.*
HOW TO retrieve LexDB from CVS
1. Run the cvs update command to retrieve the latest dump file. E.g.
cvs update ~/erg/lexdb.*
2. LexDB -> Load ('rev' entries)
These steps update the LexDB (public schema) to include all new revisions stored in a CVS dump file. The new entries will be copied to the table public.rev_new. Any changes made to your copy of the LexDB since the last update will be preserved.
HOW TO export LexDB to TDL file
LexDB -> Export (TDL file)
Dumps active LexDB entries (determined by filter) to a .tdl file.
HOW TO import TDL entries from a file
To add a small number of new (revision) entries from a .tdl file: LexDB -> Import (TDL file). You will be asked to provide values for certain other non-grammar fields. The entries will go into the private rev table.
HOW TO commit entries to public rev
The LexDB consists of a single public schema and a set of private schemas, one per user. New (revision) entries are placed initially in your private schema. To commit (all) entries in your private schema to the public table: LexDB -> Commit private 'rev'
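Conceptually, the commit step copies every row from the private revision table into the public one. The sketch below illustrates that idea only; it is not how the LKB performs the commit. It uses SQLite with two flat tables named public_rev and private_rev in place of PostgreSQL schemas, and the column names are assumptions. It leaves the private table untouched, since clearing it is a separate menu command.

```python
import sqlite3


def make_schema(conn):
    """Create hypothetical public and private revision tables."""
    for table in ("public_rev", "private_rev"):
        conn.execute(
            f"CREATE TABLE {table} (name TEXT, userid TEXT, modstamp INTEGER)"
        )


def commit_private_rev(conn):
    """Copy all private revisions into the public table; return how many."""
    with conn:  # single transaction: all rows are copied, or none
        count = conn.execute("SELECT COUNT(*) FROM private_rev").fetchone()[0]
        conn.execute("INSERT INTO public_rev SELECT * FROM private_rev")
    return count
```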
HOW TO list entries in private rev
From LKB: LexDB -> Display private 'rev'
or
from Emacs LexDB major mode: M-vs
HOW TO clear entries in private rev
LexDB -> Clear private 'rev'
Troubleshooting
If M-x lexdb results in "no connection to LexDb", make sure that a working grammar is loaded.