This page is a brief introduction to the profiles used by [incr tsdb()]. For more information, see the User & Reference Manual.
Overview
A Competence and Performance Profile is a test suite instance that includes competence and performance information from a processing system (such as a parser or generator) and a grammar (and possibly pre-processor).
A profile typically is made up of a directory consisting of a database stored as several text files. The format of the files is controlled by the relations file. The most up-to-date relations file should be here: http://svn.emmtee.net/trunk/lingo/lkb/src/tsdb/skeletons/english/Relations. Within each file, the rows are lines, and the columns are separated by the @ mark.
Profiles can be manipulated within [incr tsdb()] proper (the default) and some other tools.
Some of the files are described below. Note, however, that the [incr tsdb()] database schema is subject to occassional revision; the canonical current version is available through SVN.
item
Schema
item: i-id :integer :key i-origin :string i-register :string i-format :string i-difficulty :integer i-category :string i-input :string i-tokens :string i-gloss :string i-translation :string i-wf :integer i-length :integer i-comment :string i-author :string i-date :date
This is the input file. The actual input is typically column 7: i-input. If it is tokenized/pre-processed before processing, the results of this should go into field 8: i-tokens.
run
This gives some information about the configuration of the actual test-run, including the processing engine and grammar versions.
run: run-id :integer :key # unique test run identifier run-comment :string # descriptive narrative platform :string # implementation platform (version) tsdb :string # tsdb(1) (version) used application :string # application (version) used environment :string # application-specific information grammar :string # grammar (version) used avms :integer # number of avm types in image sorts :integer # number of sort types in image templates :integer # number of templates in image lexicon :integer # number of lexical entries lrules :integer # number of lexical rules rules :integer # number of (non-lexical) rules user :string # user who did the test run host :string # machine used for this run os :string # operating system (version) start :date # start time of this test run end :date # end time for this test run items :integer # number of test items in this run status :string # exit status (PVM only)
parse
This gives information about the entire process for a single input item: this information is mainly used for competence/performance profiling. The i-id should be linked to the id in item.
parse: parse-id :integer :key # unique parse identifier run-id :integer :key # test run for this parse i-id :integer :key # item parsed p-input :string # initial (pre-processed) parser input p-tokens :string # internal parser input: lexical lookup readings :integer # number of readings obtained first :integer # time to find first reading (msec) total :integer # total time for parsing (msec) tcpu :integer # total (cpu) processing time (msec) tgc :integer # gc time used (msec) treal :integer # overall real time (msec) words :integer # lexical entries retrieved l-stasks :integer # successful lexical rule applications p-ctasks :integer # parser contemplated tasks (LKB) p-ftasks :integer # parser filtered tasks p-etasks :integer # parser executed tasks p-stasks :integer # parser succeeding tasks aedges :integer # active items in chart (PAGE) pedges :integer # passive items in chart raedges :integer # active items contributing to result rpedges :integer # passive items contributing to result unifications :integer # number of (node) unifications copies :integer # number of (node) copy operations conses :integer # cons() cells allocated symbols :integer # symbols allocated others :integer # bytes of memory allocated gcs :integer # number of garbage collections i-load :integer # initial load (start of parse) a-load :integer # average load date :date # date and time of parse error :string # error string (if applicable |:-) comment :string # application-specific comment
result
This stores the actual result of the processing. There may be multiple results for a single input. The i-id should be linked to the id in item, the parse-id to the is in parse. Typically this file will be used as input by any post-processors, such as exporting, transfer, generation and so on.
result: parse-id :integer :key # parse for this result result-id :integer # unique result identifier time :integer # time to find this result (msec) r-ctasks :integer # parser contemplated tasks r-ftasks :integer # parser filtered tasks r-etasks :integer # parser executed tasks r-stasks :integer # parser succeeding tasks size :integer # size of feature structure r-aedges :integer # active items for this result r-pedges :integer # passive items in this result derivation :string # derivation tree for this reading surface :string # surface string (e.g. realization) tree :string # phrase structure tree (CSLI labels) mrs :string # mrs for this reading flags :string # arbitrary annotation (e.g. BLEU)