Documentation for MatrixTDB tables

Some brief notes about the tables in MatrixTDB2, what they are for, and what is stored in each of the columns.

Note that we discuss MatrixTDB as the system, but the current database on the MySQL server on capuchin being used by the system is MatrixTDB2, not MatrixTDB. MatrixTDB (the database) still exists, but is no longer being used.

Overview of tables

First, what they are for, in alphabetical order:

extr_source_profile

not used in MatrixTDB2

extract_profile_run

not used in MatrixTDB2

extract_result

not used in MatrixTDB2

extract_results

not used in MatrixTDB2

feat_grp

Puts combinations of features that are used in filters together. Every feature/value combination found in filters and imported choices files (language types) is entered as a singleton group (group with just one feature/value combo). Every 'and' group of feature/value combos in filters is entered as a group, i.e. with a common fg_grp_id

filter

Filter repository whose purpose is to match the names of the filters in s_filters.py to the IDs used as foreign keys in fltr_feat_grp.

fltr_feat_grp

Intersection table that matches filters up to the feature groups to which they apply. Used in combination with lt_feat_grp to get the filters that apply to a language type

fltr_mrs

For all filters, lists the mrs tags (aka semantic equivalence classes) that they apply to.

grammar

not used in MatrixTDB2

harv_str

Stores the harvester strings given when importing a source profile

item

Mirrors the item file in a [incr_tsdb()] profile with the addition of i_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id

lt

A repository of language types that have been imported, either by importing source profiles or separately

lt_feat_grp

Intersection table that matches language types up with the feature groups they have. Used in combination with fltr_feat_grp to find filters that are relevant to a language type

mrs

Stores the mrs semantics for each of the mrs tags imported

orig_source_profile

Stores import information about the source profiles

parse

Mirrors the item file in a [incr_tsdb()] profile with the addition of p_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id

res_fltr

Obsolete intersection table that stores the 'result' of applying a universal filter to the 'result' in a [incr_tsdb()] profile. Was only being used to track at most the first fail found for a given result, now not being used at all since we run through universal filters before even adding to item_tsdb, parse, result tables

res_sfltr

Intersection table that stores the 'result' of applying a specific filter to the 'result' in a [incr_tsdb()] profile. Currently only used to store fails, not passes or does-not-applys

result

Mirrors the result file in the source profile with the addition of the r_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id and r_esp_id which is NULL

seed_str

Stores the seed strings which are the harvester strings after having gone through stringmods stored in a canonical form: words, prefixes, suffixes, with all three of those being in alphabetical order

sp_item

Mirrors the item file in the source profile with the addition of the spi_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id

sp_parse

Mirrors the parse file in the source profile with the addition of the spp_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id.

sp_result

Mirrors the result file in the source profile with the addition of the spr_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id.

str_lst

Intersection table that maps seed strings to mrs tags

Documentation of columns

Next, for each table, some documentation of the columns. Note that these are grouped by function.

Tables used to store source profile information

These tables are updated by import_from_itsdb.py and used by add_permutes.py.

orig_source_profile

Stores import information about the source profiles

+----------------------+------------------+------+-----+-------------------+----------------+
| Field                | Type             | Null | Key | Default           | Extra          |
+----------------------+------------------+------+-----+-------------------+----------------+
| osp_orig_src_prof_id | int(10) unsigned | NO   | PRI | NULL              | auto_increment | 
| osp_developer_name   | char(24)         | YES  |     | NULL              |                | 
| osp_prod_date        | timestamp        | NO   |     | CURRENT_TIMESTAMP |                | 
| osp_orig_lt_id       | int(11)          | YES  |     | NULL              |                | 
| osp_comment          | varchar(1000)    | NO   |     |                   |                | 
+----------------------+------------------+------+-----+-------------------+----------------+

osp_orig_src_prof_id

Primary key

osp_developer_name

The developer who imported the source profile

osp_prod_date

The date the profile was imported

osp_orig_ld_id

he ID of the language type that produced the profile

osp_comment

A user-entered description of the source profile

harv_str

Stores the harvester strings given when importing a source profile. Harvester strings serve a dual purpose: They are parsed with a harvester grammar in order to get the mrs for the semantic equivalence class they represent, and they are also used by the constructor function as the starting point for creating the strings for that equivalence class.

+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| hs_id          | int(11)      | NO   | PRI | NULL    | auto_increment | 
| hs_string      | varchar(100) | NO   | MUL |         |                | 
| hs_mrs_tag     | char(20)     | NO   | MUL |         |                | 
| hs_init_osp_id | int(11)      | NO   |     |         |                | 
| hs_cur_osp_id  | int(11)      | NO   | MUL |         |                | 
+----------------+--------------+------+-----+---------+----------------+

hs_id

Primary key

hs_string

The harvester string

hs_mrs_tag

The mrs tag that labels the sem. equiv. class

hs_init_osp_id

The initial original source profile this string/tag combo was imported as a part of

hs_cur_osp_id

The most recent original source profile this string/tag combo was imported as a part of (this is how we update mrss)

mrs

The filters (and parts of the constructor function) are sensistive to the semantic equivalence class of the items they apply to. We use mrs tags to identify the equivalence classes instead of the actual mrss for two reasons 1) they're much easier to manipulate (and store) and 2) this should help when we get to irreducibly ambiguous harvester strings.

Because the actual mrs values will change over time, we allow multiple rows in this table per mrs tag. The field mrs_current flags only one row per mrs tag as the one to use in test suites for the current system. Previous mrs as preserved in case we need to create test suites corresponding to an older state of the system.

+-------------+---------------+------+-----+---------+----------------+
| Field       | Type          | Null | Key | Default | Extra          |
+-------------+---------------+------+-----+---------+----------------+
| mrs_id      | int(11)       | NO   | PRI | NULL    | auto_increment | 
| mrs_tag     | char(20)      | YES  | MUL | NULL    |                | 
| mrs_value   | varchar(1000) | YES  | MUL | NULL    |                | 
| mrs_date    | datetime      | YES  |     | NULL    |                | 
| mrs_osp_id  | int(11)       | YES  |     | NULL    |                | 
| mrs_current | tinyint(4)    | YES  |     | 1       |                | 
+-------------+---------------+------+-----+---------+----------------+

mrs_id

Primary key

mrs_tag

Mrs semantic tag

mrs_vale

The hairy mrs semantics produced in the source profile by [incr_tsdb()]

mrs_date

NULL

mrs_osp_id

The ID of the original source profile the mrs was imported from

mrs_current

1 if this value of the tag is the current one

sp_item

Mirrors the item file in the source profile with the addition of the spi_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. sp_item stands for `source profile: item' In [incr tsdb()], the item file contains the input items, along with information about their well-formedness, length, and other characteristics.

Here are the columns:

+----------------+------------------+------+-----+---------+----------------+
| Field          | Type             | Null | Key | Default | Extra          |
+----------------+------------------+------+-----+---------+----------------+
| spi_id         | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| spi_origin     | char(20)         | YES  |     |         |                | 
| spi_register   | char(20)         | YES  |     |         |                | 
| spi_format     | char(20)         | YES  |     |         |                | 
| spi_difficulty | int(4)           | YES  |     | -1      |                | 
| spi_category   | char(20)         | YES  |     |         |                | 
| spi_input      | varchar(512)     | YES  |     |         |                | 
| spi_wf         | int(4)           | YES  |     | 1       |                | 
| spi_length     | int(4)           | YES  |     | -1      |                | 
| spi_comment    | varchar(256)     | YES  |     |         |                | 
| spi_author     | char(50)         | YES  |     |         |                | 
| spi_date       | char(10)         | YES  |     |         |                | 
| spi_osp_id     | int(11)          | NO   |     |         |                | 
+----------------+------------------+------+-----+---------+----------------+

Most of these are not of interest here (and will likely soon be suppressed). We have meaningful data in the following columns:

spi_id

the item id

spi_input

the actual input string

spi_length

the number of words in the input string

spi_author

the author of the item as recorded by [incr tsdb()]

spi_osp_id

the ID of the original source profile the item came from (not an itsdb field)

sp_parse

Mirrors the parse file in the source profile with the addition of the spp_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. sp_parse stands for `source profile: parse' and corresponds to the [incr tsdb()] parse file. The parse file contains information about the processing of each item in a given test run. For our purposes (constructing gold standard profiles for comparison), most of this data is not useful. Instead, parse serves to link item and result, and we may end up dispensing with it and creating it on the fly on export.

Here are the columns:

+------------------+------------------+------+-----+---------+----------------+
| Field            | Type             | Null | Key | Default | Extra          |
+------------------+------------------+------+-----+---------+----------------+
| spp_parse_id     | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| spp_run_id       | int(11)          | YES  |     | -1      |                | 
| spp_i_id         | int(11)          | NO   |     |         |                | 
| spp_readings     | int(11)          | YES  |     | -1      |                | 
| spp_first        | int(11)          | YES  |     | -1      |                | 
| spp_total        | int(11)          | YES  |     | -1      |                | 
| spp_tcpu         | int(11)          | YES  |     | -1      |                | 
| spp_tgc          | int(11)          | YES  |     | -1      |                | 
| spp_treal        | int(11)          | YES  |     | -1      |                | 
| spp_words        | int(11)          | YES  |     | -1      |                | 
| spp_l_stasks     | int(11)          | YES  |     | -1      |                | 
| spp_ctasks       | int(11)          | YES  |     | -1      |                | 
| spp_ftasks       | int(11)          | YES  |     | -1      |                | 
| spp_etasks       | int(11)          | YES  |     | -1      |                | 
| spp_stasks       | int(11)          | YES  |     | -1      |                | 
| spp_aedges       | int(11)          | YES  |     | -1      |                | 
| spp_pedges       | int(11)          | YES  |     | -1      |                | 
| spp_raedges      | int(11)          | YES  |     | -1      |                | 
| spp_rpedges      | int(11)          | YES  |     | -1      |                | 
| spp_unifications | int(11)          | YES  |     | -1      |                | 
| spp_copies       | int(11)          | YES  |     | -1      |                | 
| spp_conses       | int(11)          | YES  |     | -1      |                | 
| spp_symbols      | int(11)          | YES  |     | -1      |                | 
| spp_others       | int(11)          | YES  |     | -1      |                | 
| spp_gcs          | int(11)          | YES  |     | -1      |                | 
| spp_i_load       | int(11)          | YES  |     | -1      |                | 
| spp_a_load       | int(11)          | YES  |     | -1      |                | 
| spp_date         | char(22)         | YES  |     |         |                | 
| spp_error        | varchar(1000)    | YES  |     |         |                | 
| spp_comment      | varchar(1000)    | YES  |     |         |                | 
| spp_osp_id       | int(11)          | NO   |     |         |                | 
+------------------+------------------+------+-----+---------+----------------+

The only ones we are actively using are as follows:

spp_parse_id

id for this row, linked by sp_result

spp_i_id

pointer to item used as input to the parse

spp_osp_id

id of original source profile this information came from

sp_result

Mirrors the result file in the source profile with the addition of the spr_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id. sp_result stands for `source profile: result' and corresponds to the [incr tsdb()] result file. The result file contains information about the result of processing of each item in a given test run, including the derivation and the output MRS. Again, there are many columns we are not using (which are in there initially to mirror the [incr tsdb()] tables). The ones we are using are glossed below:

+----------------+------------------+------+-----+---------+----------------+
| Field          | Type             | Null | Key | Default | Extra          |
+----------------+------------------+------+-----+---------+----------------+
| spr_parse_id   | int(11)          | NO   | MUL |         |                | 
| spr_result_id  | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| spr_time       | int(11)          | YES  |     | -1      |                | 
| spr_ctasks     | int(11)          | YES  |     | -1      |                | 
| spr_ftasks     | int(11)          | YES  |     | -1      |                | 
| spr_etasks     | int(11)          | YES  |     | -1      |                | 
| spr_stasks     | int(11)          | YES  |     | -1      |                | 
| spr_size       | int(11)          | YES  |     | -1      |                | 
| spr_aedges     | int(11)          | YES  |     | -1      |                | 
| spr_pedges     | int(11)          | YES  |     | -1      |                | 
| spr_derivation | varchar(1000)    | YES  |     |         |                | 
| spr_surface    | varchar(1000)    | YES  |     |         |                | 
| spr_tree       | varchar(1000)    | YES  |     |         |                | 
| spr_mrs        | varchar(1000)    | YES  |     |         |                | 
| spr_flags      | varchar(24)      | YES  |     |         |                | 
| spr_wf         | tinyint(4)       | YES  |     | 1       |                | 
| spr_osp_id     | int(11)          | NO   |     |         |                | 
+----------------+------------------+------+-----+---------+----------------+

spr_parse_id

pointer to row in sp_parse table and through it, row in sp_item

spr_result_id

id for this result

spr_mrs

tag corresponding to actual mrs value to be found in the mrs table; itsdb would have an actual mrs here

spr_wf

?

spr_osp_id

id of original source profile this information came from

Tables for storing master item list

After we import items, we then run add_permutes to make a master list of derived items on the basis of the imported ones. We store these derived items in the tables item_tsdb, parse, result, which are parallel to sp_item, sp_parse, and sp_result.

Note: add_permutes only adds those items to item_tsdb/parse/result that pass all universal filters.

seed_str

Stores the seed strings which are the harvester strings after having gone through stringmods stored in a canonical form: words, prefixes, suffixes, with all three of those being in alphabetical order.

+----------------+--------------+------+-----+---------+----------------+
| Field          | Type         | Null | Key | Default | Extra          |
+----------------+--------------+------+-----+---------+----------------+
| seed_id        | int(11)      | NO   | PRI | NULL    | auto_increment | 
| seed_str_value | varchar(100) | YES  |     | NULL    |                | 
+----------------+--------------+------+-----+---------+----------------+

seed_id

Primary key

seed_str_value

Seed string in normalized form

str_lst

Intersection table that maps seed strings to mrs tags.

+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| sl_id        | int(11)          | NO   | PRI | NULL    | auto_increment | 
| sl_mrs_tag   | char(20)         | YES  |     | NULL    |                | 
| sl_seed_type | char(2)          | YES  |     | NULL    |                | 
| sl_seed_id   | int(10) unsigned | YES  |     | NULL    |                | 
+--------------+------------------+------+-----+---------+----------------+

sl_id

Primary key

sl_mrs_tag

The semantic mrs tag this seed string belongs to. Can serve as a foreign key to mrs.mrs_tag and result.r_mrs.

sl_seed_type

intended to indicate seed/harv status; currently NULL

sl_seed_id

Foreign key to seed_str.seed_id

item

Mirrors the item file in a [incr_tsdb()] profile with the addition of i_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id.

+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| i_id         | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| i_origin     | char(20)         | YES  |     |         |                | 
| i_register   | char(20)         | YES  |     |         |                | 
| i_format     | char(20)         | YES  |     |         |                | 
| i_difficulty | int(4)           | YES  |     | -1      |                | 
| i_category   | char(20)         | YES  |     |         |                | 
| i_input      | varchar(512)     | YES  | MUL |         |                | 
| i_wf         | int(4)           | YES  |     | 1       |                | 
| i_length     | int(4)           | YES  |     | -1      |                | 
| i_comment    | varchar(256)     | YES  |     |         |                | 
| i_author     | char(50)         | YES  |     |         |                | 
| i_date       | char(10)         | YES  |     |         |                | 
| i_osp_id     | int(11)          | NO   |     |         |                | 
+--------------+------------------+------+-----+---------+----------------+

i_id

Primary key

i_input

string, as created by add_permutes.py

i_length

length of string in words

i_author

records add_permutes.py as the author of the item

i_osp_id

original source profile that the harvester string the item is based on comes from

parse

Mirrors the item file in a [incr_tsdb()] profile with the addition of p_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id.

 +----------------+------------------+------+-----+---------+----------------+
| Field          | Type             | Null | Key | Default | Extra          |
+----------------+------------------+------+-----+---------+----------------+
| p_parse_id     | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| p_run_id       | int(11)          | YES  |     | -1      |                | 
| p_i_id         | int(11)          | NO   | MUL |         |                | 
| p_readings     | int(11)          | YES  |     | -1      |                | 
| p_first        | int(11)          | YES  |     | -1      |                | 
| p_total        | int(11)          | YES  |     | -1      |                | 
| p_tcpu         | int(11)          | YES  |     | -1      |                | 
| p_tgc          | int(11)          | YES  |     | -1      |                | 
| p_treal        | int(11)          | YES  |     | -1      |                | 
| p_words        | int(11)          | YES  |     | -1      |                | 
| p_l_stasks     | int(11)          | YES  |     | -1      |                | 
| p_ctasks       | int(11)          | YES  |     | -1      |                | 
| p_ftasks       | int(11)          | YES  |     | -1      |                | 
| p_etasks       | int(11)          | YES  |     | -1      |                | 
| p_stasks       | int(11)          | YES  |     | -1      |                | 
| p_aedges       | int(11)          | YES  |     | -1      |                | 
| p_pedges       | int(11)          | YES  |     | -1      |                | 
| p_raedges      | int(11)          | YES  |     | -1      |                | 
| p_rpedges      | int(11)          | YES  |     | -1      |                | 
| p_unifications | int(11)          | YES  |     | -1      |                | 
| p_copies       | int(11)          | YES  |     | -1      |                | 
| p_conses       | int(11)          | YES  |     | -1      |                | 
| p_symbols      | int(11)          | YES  |     | -1      |                | 
| p_others       | int(11)          | YES  |     | -1      |                | 
| p_gcs          | int(11)          | YES  |     | -1      |                | 
| p_i_load       | int(11)          | YES  |     | -1      |                | 
| p_a_load       | int(11)          | YES  |     | -1      |                | 
| p_date         | char(22)         | YES  |     |         |                | 
| p_error        | varchar(1000)    | YES  |     |         |                | 
| p_comment      | varchar(1000)    | YES  |     |         |                | 
| p_osp_id       | int(11)          | NO   |     |         |                | 
+----------------+------------------+------+-----+---------+----------------+

p_parse_id

Primary key

p_i_id

id of the input item this parse corresponds to. As of now they are always the same.

p_osp_id

original source profile that the harvester string the item is based on comes from

result

Mirrors the result file in the source profile with the addition of the r_osp_id which is a foreign key to orig_source_profile.osp_orig_src_prof_id and r_esp_id which is NULL

+--------------+------------------+------+-----+---------+----------------+
| Field        | Type             | Null | Key | Default | Extra          |
+--------------+------------------+------+-----+---------+----------------+
| r_parse_id   | int(11)          | NO   | MUL |         |                | 
| r_result_id  | int(10) unsigned | NO   | PRI | NULL    | auto_increment | 
| r_time       | int(11)          | YES  |     | -1      |                | 
| r_ctasks     | int(11)          | YES  |     | -1      |                | 
| r_ftasks     | int(11)          | YES  |     | -1      |                | 
| r_etasks     | int(11)          | YES  |     | -1      |                | 
| r_stasks     | int(11)          | YES  |     | -1      |                | 
| r_size       | int(11)          | YES  |     | -1      |                | 
| r_aedges     | int(11)          | YES  |     | -1      |                | 
| r_pedges     | int(11)          | YES  |     | -1      |                | 
| r_derivation | varchar(1000)    | YES  |     |         |                | 
| r_surface    | varchar(1000)    | YES  |     |         |                | 
| r_tree       | varchar(1000)    | YES  |     |         |                | 
| r_mrs        | varchar(1000)    | YES  | MUL |         |                | 
| r_flags      | varchar(24)      | YES  |     |         |                | 
| r_wf         | tinyint(4)       | YES  |     | 1       |                | 
| r_osp_id     | int(11)          | NO   |     |         |                | 
| r_esp_id     | int(11)          | YES  |     | NULL    |                | 
+--------------+------------------+------+-----+---------+----------------+

r_parse_id

parse row this result corresponds to. As of now, the IDs are always the same.

r_result_id

id for this result. As of now, the result, parse, and item IDs are always the same

r_mrs

mrs tag corresponding to mrs_value in mrs

r_wf

1 if this result passes all universal filters, 0 if it fails at least one; filled in by stored procedure in MatrixTDB

r_osp_id

original source profile that the harvester item the result is based on comes from

r_esp_id

NULL. Originally intended to be most recent profile this has been exported to?

Tables for language types and filters

These tables are used by sql_lg_type (to import language types), run_specific_filters (to run the master items through specific filters and record their pass/fail results), and generate_s_profile (to generate a validation [incr_tsdb()] profile).

feat_grp

This table stores groups of features referred to elsewhere. There are singleton groups (containing just one feature) and larger groups. The largest groups (whole language types) aren't stored here, but rather in lt_feat_grp (see below). The non-singleton groups which are stored here are the conjunctions referred to by filters.

+-----------+----------+------+-----+---------+----------------+
| Field     | Type     | Null | Key | Default | Extra          |
+-----------+----------+------+-----+---------+----------------+
| fg_id     | int(11)  | NO   | PRI | NULL    | auto_increment | 
| fg_grp_id | int(11)  | YES  | MUL | NULL    |                | 
| fg_feat   | char(20) | YES  | MUL | NULL    |                | 
| fg_value  | char(20) | YES  |     | NULL    |                | 
+-----------+----------+------+-----+---------+----------------+

fg_id

Primary key

fg_grp_id

the ID of the feature group

fg_feat

feature name

fg_value

feature value

filter

Filter repository whose purpose is to match the names of the filters in s_filters.py to the IDs used as foreign keys in fltr_feat_grp.

+-------------+-----------+------+-----+---------+----------------+
| Field       | Type      | Null | Key | Default | Extra          |
+-------------+-----------+------+-----+---------+----------------+
| filter_id   | int(11)   | NO   | PRI | NULL    | auto_increment | 
| filter_name | char(100) | YES  |     | NULL    |                | 
| filter_type | char(2)   | YES  |     | s       |                | 
+-------------+-----------+------+-----+---------+----------------+

filter_id

Primary key

filter_name

The name member of the filter in s_filters.py

filter_type

Either 'u' for universal or 's' for specific. Now we're just using 's' since add_permutes only enters those items that pass all universal filters and we're skipping run_u_filters.py which used this table for 'u' filters.

fltr_feat_grp

Intersection table that stores the feature groups that each specific filter cares about. These feature groups are the specification of the language types that the filter applies to. Used in combination with lt_feat_grp to get the filters thatapply to a language type

+-------------+---------+------+-----+---------+----------------+
| Field       | Type    | Null | Key | Default | Extra          |
+-------------+---------+------+-----+---------+----------------+
| ffg_id      | int(11) | NO   | PRI | NULL    | auto_increment | 
| ffg_fltr_id | int(11) | YES  | MUL | NULL    |                | 
| ffg_grp_id  | int(11) | YES  | MUL | NULL    |                | 
+-------------+---------+------+-----+---------+----------------+

ffg_id

Primary key

ffg_fltr_id

A foreign key to filter.filter_id

ffg_grp_id

A foreign key to feat_grp.fg_grp_id

fltr_mrs

NOTE: Not used in MatrixTDB2.

Each filter has some set of semantic equivalence classes that it applies to. This table seems to have been intended to record that information. Note that there can be multiple rows with for any given filter (i.e., with the same fm_fltr_id).

+------------+----------+------+-----+---------+----------------+
| Field      | Type     | Null | Key | Default | Extra          |
+------------+----------+------+-----+---------+----------------+
| fm_id      | int(11)  | NO   | PRI | NULL    | auto_increment | 
| fm_fltr_id | int(11)  | YES  |     | NULL    |                | 
| fm_mrs_tag | char(20) | YES  |     | NULL    |                | 
| fm_value   | int(11)  | YES  |     | NULL    |                | 
+------------+----------+------+-----+---------+----------------+

fm_id

row id

fm_fltr_id

filter id, points to filter

fm_mrs_tag

mrs tag, indicating equivalence class

fm_value

whether or not the filter applies?

NOTE: There is no data in this table right now. The filters are defined as filter objects in the python code, and they each contain as list of mrs tags they apply to. This information could be stored in the DB, for reference, I supposed, but it isn't currently.

lt

This table stores language types that we have worked with.

+------------+---------------+------+-----+---------+----------------+
| Field      | Type          | Null | Key | Default | Extra          |
+------------+---------------+------+-----+---------+----------------+
| lt_id      | int(11)       | NO   | PRI | NULL    | auto_increment | 
| lt_origin  | char(2)       | YES  |     | NULL    |                | 
| lt_comment | varchar(1000) | YES  |     | NULL    |                | 
+------------+---------------+------+-----+---------+----------------+

lt_id

Primary key

lt_origin

Either 'r' or 'p'. Indicates whether the language type was purpose-built or randomly-generated

lt_comment

A user-entered comment about what the language type is

lt_feat_grp

Intersection table that matches language types up with the feature groups they have. This table stores pointers from language types to the feature groups they comprise. On the one hand, it points to all the singleton group (feature-value pairs) that define the lt. On the other hand, it also points to the larger groups that are consistent with the lt definition. We update this table through create_or_update_lt() in sql_lg_types.py and update_all_lts_in_lfg() in run_specific_filters.py. We use it to find the filters that are appropriate for a given lt. Used in combination with fltr_feat_grp to find filters that are relevant to a language type.

+------------+---------+------+-----+---------+----------------+
| Field      | Type    | Null | Key | Default | Extra          |
+------------+---------+------+-----+---------+----------------+
| lfg_id     | int(11) | NO   | PRI | NULL    | auto_increment | 
| lfg_lt_id  | int(11) | YES  | MUL | NULL    |                | 
| lfg_grp_id | int(11) | YES  | MUL | NULL    |                | 
+------------+---------+------+-----+---------+----------------+

lfg_id

Primary key

lfg_lt_id

Foreign key to lt.lt_id

lfg_grp_id

A foreign key to feat_grp.fg_grp_id

One final note on this set of tables: If you are debugging the system and trying to figure out why a given string was not grammatical, you will want to know what specific filters it failed. The way to do this is to run the query that generate_s_profile runs when determine which strings fail filters specific to the relevant language type. This query, in a slightly different from from that used by genereate_s_profile, is:

where lt_id is the lt_id of the language type you are generating the profile for and result_id is the id of the result you are wondering about. Be sure you find the ID that matches item_tsdb.i_input (the string) and r.r_mrs (the semantic tag) you're wondering about. This will get you a list of all specific filters that the item failed. If the item didn't fail any specific filters and still isn't appearing as a grammatical item in your profile it probably failed a universal filter (in which case you won't even have a result id for it.)

Tables for running filters

These tables are used by run_u_filters (no longer used) and run_specific_filters to record the results or running filters on string/mrs combos.

res_fltr

NOTE: No longer used since add_permutes just puts in those items that pass all universal filters.

This table records the results of each universal filter on each item (row from result, actually) that it applies to. Now rows are created in this table for filter/result combinations where the filter does not apply to that result.

Might get renamed to res_ufltr for clarity.

+------------+---------+------+-----+---------+----------------+
| Field      | Type    | Null | Key | Default | Extra          |
+------------+---------+------+-----+---------+----------------+
| rf_id      | int(11) | NO   | PRI | NULL    | auto_increment | 
| rf_res_id  | int(11) | YES  | MUL | NULL    |                | 
| rf_fltr_id | int(11) | YES  |     | NULL    |                | 
| rf_value   | int(4)  | YES  |     | NULL    |                | 
+------------+---------+------+-----+---------+----------------+

rf_id

Primary key

rf_res_id

Foreign key to result.r_result_id, parse.p_parse_id, and item_tsdb.i_id

rf_fltr_id

Foreign key to filter.filter_id

rf_value

The 'result' of applying the filter to the 'result'. 0 for fail, 1 for pass, 2 for does-not-apply

res_sfltr

Intersection table that stores the 'result' of applying a specific filter to the 'result' in a [incr_tsdb()] profile. Currently only used to store fails, not passes or does-not-applys.

+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| rsf_id       | int(11) | NO   | PRI | NULL    | auto_increment | 
| rsf_res_id   | int(11) | YES  | MUL | NULL    |                | 
| rsf_sfltr_id | int(11) | YES  | MUL | NULL    |                | 
| rsf_value    | int(4)  | YES  |     | NULL    |                | 
+--------------+---------+------+-----+---------+----------------+

rsf_id

row id

rsf_res_id

Foreign key to result.r_result_id, parse.p_parse_id, and item_tsdb.i_id

rsf_sfltr_id

Foreign key to filter.filter_id

rsf_value

The 'result' of applying the filter to the 'result'. 0 for fail, 1 for pass, 2 for does-not-apply

Ancillary tables

MatrixTDB2Tables (last edited 2011-10-08 21:12:09 by localhost)

(The DELPH-IN infrastructure is hosted at the University of Oslo)