ACE configuration file

This page documents various options in ACE's grammar-specific configuration file.

The ACE configuration file uses a TDL-like syntax; comments start with semicolons, and each other line looks like:

parameter := value.

The format of the value field varies depending on the parameter.

Values that are filesystem paths are always in double quotes and are relative to the configuration file's location.

Paths to other files:

version

path to a text file containing the grammar version within double quotes

grammar-top

path to a TDL file that includes the rest of the grammar

preprocessor

path to a .rpp file defining the REPP

preprocessor-modules

list of unquoted paths to REPP modules to enable

maxent-model

path to a .mem file for ranking

variable-property-mapping

path to the VPM definition for parsing/generating

input-vpm

path to the input VPM definition, when 'transfer' is set to 'enabled'

output-vpm

path to the output VPM definition, when 'transfer' is set to 'enabled'

parse-node-labels

path to a TDL file defining parse node labels (unused)

generation-ignore-signs

path to a file listing rule and lexeme names to disable for generation

generation-ignore-lexemes

path to a file listing more lexeme names to disable for generation

generation-ignore-rules

path to a file listing more rule names to disable for generation

generation-trigger-rules

path to an MTR file defining trigger rules for generation

generation-fixup-rules

path to an MTR file defining transfer rules to massage MRSes before generation

cleanup-rules

path to an MTR file defining transfer rules to massage MRSes read off of TFSes

idiom-rules

path to an MTR file defining idiom-checking rules for parsing

quickcheck-code

path to a text file defining the quickcheck skeleton

quickcheck-instance

name of an instance containing a PET-style quickcheck skeleton (not a string)

semantic-interface

path to a .smi file defining the SEMI

irregular-forms

list of unquoted paths to irregular form tables

Types

top-type

name of the type at the root of the type hierarchy

semarg-type

grammar-internal supertype of MRS variables

handle-type

SEMI type for MRS handles and holes

skolem-feature

feature name where skolem constants are put (defaults to INSTLOC)

list-type

type for lists

cons-type

type for non-empty lists

null-type

type for empty lists

diff-list-type

type for diff-lists

semantic-interface-top-type

the name of the type at the top of the SEMI type hierarchy

Paths within AVMs

orth-path

path to orthography list within a sign

lex-rels-path

path to list of EPs supplied by a lexeme

lex-carg-path

path to CARG of a lexeme's key relation (old, unused)

lex-pred-path

path to PRED of a lexeme's key relation (old, unused)

rule-rels-path

path to list of EPs supplied by a rule

semantics-path

path to the MRS structure of a sign

hook-path

path to the structure containing LTOP and INDEX inside an MRS TFS

label-path

path to the label string on a parse-node instance

mrs-rels-list

path to the RELS list within an MRS structure; default is "RELS LIST"

mrs-hcons-list

path to the HCONS list within an MRS structure; default is "HCONS LIST"

mrs-icons-list

path to the ICONS list within an MRS structure (if enabled); default is "ICONS LIST"

icons-left

feature for the left side of an ICONS element; default is "IC-LEFT"

icons-right

feature for the right side of an ICONS element; default is "IC-RIGHT"

chart-dependencies

a (flat) list of pairs of quoted paths used for removing unwanted lexemes from the chart before parsing. a lexeme with a non-*top* value for the first path in a pair will only survive if there is another lexeme with a compatible value at the second path in the pair.

Token Mapping

token-mapping

enabled (normal) or disabled (legacy; not well supported)

token-type

type of a token feature structure

lexicon-tokens-path

path to the list of tokens that license a lexeme

lexicon-tokens-last-path

path to the LAST pointer for the diff-list-like token list

token-form-path

path within a token to the surface form string

token-from-path

path within a token to the CFROM (character start position) field

token-to-path

path within a token to the CTO (character end position) field

token-id-path

path within a token to the ID

token-postags-path

path within a token to the POS tag list

token-posprobs-path

path within a token to the POS tag probability list

lattice-mapping-input-path

path within a lattice mapping rule to the input list

lattice-mapping-output-path

path within a lattice mapping rule to the output list

lattice-mapping-context-path

path within a lattice mapping rule to the context list

lattice-mapping-position-path

path within a lattice mapping rule to the positional constraints

Miscellaneous Switches

index-accessibility-filtering

enabled or disabled. this significantly reduces generation time, but if important references to MRS variables are on the packing restrictor then it can cause generation to fail.

generalize-edge-top-types

enabled or disabled. sets the top-level type of all passive edges to sign, which results in significantly increased packing with some grammars. may or may not be a net win, considering additional failures in unpacking.

english-pos-tagger

enabled or disbled. runs the input through a built-in trigram HMM tagger (but you need to supply the training data) and records the results in the token structures.

simplify-lexicon

enabled or disabled. ERG-specific; attempts to represent the lexicon more compactly.

extra-erg-dag-stash

enabled or disabled. ERG-specific; keeps extra copies of some common GLB types DAGs on hand.

process-chart-dependencies-before-lexical-parsing

enabled or disabled. when enabled, chart reduction happens before lexical parsing (default is after lexical parsing).

enable-icons

enabled or disabled. Controls support for ICONS in MRSes.

General Configuration

transfer

enabled or disabled. defaults to disabled. when enabled, ACE processes transfer grammars instead of syntactic grammars. adjustment of several MRS configuration parameters may be necessary.

spanning-only-rules

list of rules to only allow in the spanning chart cell

fragment-only-rules

list of rules to only allow when fragment parsing is enabled

hyper-active-rules

list of rules for which hyperactive parsing should be used

deleted-daughters

list of features to delete on new passive edges

parsing-packing-restrictor

list of features to suppress until unpacking while parsing

generation-packing-restrictor

list of features to suppress until unpacking while generating

mrs-deleted-roles

list of features on internal MRS variable AVMs that should not be exported when presenting final MRSes (superceded by VPM?)

parsing-roots

list of root instances enabled for parsing

generation-roots

list of root instances enabled for generation

non-idiom-root

the name of an instance to check against all parsing results to see whether further idiom processing is required

generic-les-for-semantic-index

list of the names of the generic lexemes that can be used to provide an unknown relation in generation

ortho-max-rules

maximum length of a chain of orthographemic rules to allow in parsing

ortho-warn-on-max-rules

whether to print a warning when the limit of orthographemic rules is reached

freezer-megabytes

defaults to 256; make it bigger if you get a warning about it.

irregular-form-rule-suffix

a string to append to the rule names found in irregular form tables.

invent-ltop

enabled or disabled. if disabled, the MRS LTOP will be extracted from the grammar. otherwise, it will be invented as a new hole connected to the grammar-provided LTOP by a QEQ.

AceConfiguration (last edited 2013-02-06 21:06:00 by WoodleyPackard)

(The DELPH-IN infrastructure is hosted at the University of Oslo)