Differences between revisions 19 and 20
Revision 19 as of 2013-06-28 23:44:54
Size: 5237
Editor: StephanOepen
Revision 20 as of 2013-07-09 19:30:32
Size: 5047
Editor: StephanOepen
The treebanking tool currently does not provide what in the ‘classic’ setup is
called an interactive update, i.e. annotation seeded from recorded decisions
in a ‘gold’ profile.


This page provides (emerging) information on support in the LOGON tree for the Answer Constraint Engine (ACE). As of mid-2013, binaries for parsing-generating and treebanking with ACE are included with the LOGON tree, albeit (currently) only for 64-bit environments. These binaries are maintained by the ACE developer, WoodleyPackard.


  $LOGONROOT/parse --binary --terg+tnt/ace --protocol 2 --best 1 --limit 0 --count 8 cb

Full-Forest Treebanking

The ‘answer’ wrapper script in the LOGON tree currently does not support the full range of standard command-line options for activating the grammars that come with the LOGON distribution; it accepts only ‘--erg’ and ‘--terg’, for the release and trunk versions of the ERG, respectively.
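Because only two grammar options are accepted, a calling script may want to check the option before handing it to the wrapper. The following is a minimal sketch of such a check; the function name answer_grammar_supported is hypothetical and not part of the LOGON tree, and the only facts it encodes are the two option names given above.

```shell
# Hypothetical helper (not part of the LOGON tree): succeed only for the
# two grammar options the wrapper is documented to accept.
answer_grammar_supported() {
  case "$1" in
    --erg|--terg) return 0 ;;
    *) return 1 ;;
  esac
}

# Example use: report on an option before passing it to the wrapper.
grammar="--terg"
if answer_grammar_supported "$grammar"; then
  echo "grammar option $grammar is supported"
else
  echo "grammar option $grammar is not supported by the wrapper" >&2
fi
```

The check is deliberately a plain ‘case’ statement, so that further options can be added in one place if the wrapper later supports more grammars.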

To invoke full-forest treebanking, in the [incr tsdb()] podium, select ‘Trees|Switches|External Treebanking Tool’. While this toggle is in effect, the ‘Trees|Annotate’ and ‘Trees|Update’ commands will invoke the external ACE Full-Forest Treebanker. To give an illusion of tight integration, the following [incr tsdb()] parameters will have an effect on the external treebanking tool: (a) the selection of a ‘working set’ of items, through a condition on profile entries (e.g. as determined through ‘Options|TSQL Condition’ or ‘Options|New Condition’); (b) the selection of a ‘gold’ profile (by clicking the middle mouse button in the [incr tsdb()] podium), as the source for update information; and (c) the toggle for batch vs. interactive updates (‘Trees|Switches|Automatic Update’).

Furthermore, the invocation of the Answer treebanker application can be customized through ‘Trees|Variables|Treebanking Tool’ or setting the [incr tsdb()] variable *redwoods-treebanker-application* in the per-user ‘.tsdbrc’. The default value, for the time being, is "answer --annotate --terg".
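As an illustration of the per-user route, the sketch below appends a setting for *redwoods-treebanker-application* to a ‘.tsdbrc’ file. Several things here are assumptions rather than facts from this page: that a plain Lisp ‘setf’ form is the right way to set the variable, that no package prefix is needed, and that "answer --annotate --erg" (the default with ‘--terg’ swapped for ‘--erg’) is a valid invocation. The helper name set_treebanker_application is hypothetical.

```shell
# Hypothetical helper: append a treebanker setting to a .tsdbrc file.
# Assumption: the file is read as Lisp and a plain 'setf' form is accepted.
# The command line must not itself contain double quotes.
set_treebanker_application() {
  command_line="$1"
  tsdbrc="${2:-$HOME/.tsdbrc}"   # second argument overrides the target file
  printf '(setf *redwoods-treebanker-application* "%s")\n' "$command_line" >> "$tsdbrc"
}

# Example (commented out so that nothing is written by accident):
# set_treebanker_application "answer --annotate --erg"
```

The helper only appends to the file, so an earlier setting is left in place; if the file is evaluated top to bottom, the last form would presumably take effect, but this has not been checked against [incr tsdb()].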

In principle, it should work to follow the ‘common’ release procedure for treebank creation, i.e. to first call for an automatic update immediately following parsing, but with the option ‘--update/external’ added to the ‘parse’ command line. This functionality remains to be validated, though.

Known Issues

Edge identifiers in full ACE derivations (as reported to [incr tsdb()]) are not unique in the context of one input to the parser-generator; this means that ‘classic’ [incr tsdb()] treebanking tools (i.e. tree-based annotation, using syntactic or semantic discriminants) cannot be used on these profiles.

The ACE parser-generator does not (yet) report the token lattices (before and after token mapping, i.e. the tsdb(1) fields ‘p-input’ and ‘p-tokens’) to [incr tsdb()]; precise token information is often desirable when exporting and post-processing results, as well as in regression testing (or cross-platform comparison).

The exact interpretation of the ‘--best’ and ‘--limit’ parameters remains to be defined for forest-based parsing-generating (unlike in the classic setups, a ‘--limit’ of 0 should probably mean recording no derivations, rather than an unlimited number of them; arguably, this interpretation should also be applied retroactively to n-best parsing-generating).

Resource limit specifications through command-line options to the LOGON ‘parse’ or ‘generate’ scripts (i.e. ‘--time’, ‘--memory’, and ‘--edges’) are not communicated to the ACE parser-generator.

Derivations deposited in [incr tsdb()] profiles upon successful forest disambiguation fail to preserve the token identifiers (from the entries of the original forest), which may not matter to many downstream applications but would be a missing link in the conversion of ERG derivations to PTB-style tokenization.

Use of the Answer treebanking tool for updates from an existing ‘gold’ profile presupposes prior normalization of the ‘gold’ profile, i.e. the treebanker will not honor in-profile versioning (through the ‘t-version’ field in the various relations used to record annotations).

During automated updates, the treebanking tool will not explicitly flag items that remain unannotated, i.e. for which the update was not successful. [incr tsdb()] provides the flag ‘Trees|Switches|Update Flag Failures’ (on by default, but currently not communicated to the external treebanking tool) to make it easy to select the sub-set of remaining items that require annotation.

The <Control-G> and <Control-C> key bindings in the [incr tsdb()] podium are not yet enabled during external treebanking. Hence, it is vital to exit from the external treebanking tool by clicking on ‘Exit’ in its browser window (or otherwise making it terminate, e.g. through a shell command like ‘killall fftb’) to regain control in the [incr tsdb()] podium.
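Where ‘killall’ is not available, the same effect can be sketched with ‘pkill’, matching the process by exact name. The process name ‘fftb’ is taken from the example above; the function name stop_treebanker is hypothetical, and the sketch assumes that the treebanker runs under the same user account.

```shell
# Hypothetical helper: terminate a lingering treebanker process by exact name.
# Returns 2 when pkill is missing, so that the caller can fall back to killall.
stop_treebanker() {
  name="${1:-fftb}"
  if ! command -v pkill >/dev/null 2>&1; then
    echo "pkill not available; try: killall $name" >&2
    return 2
  fi
  if pkill -x "$name"; then
    echo "terminated $name"
  else
    echo "no running $name process found"
  fi
}

# Example (commented out): stop the default 'fftb' process.
# stop_treebanker
```

Matching with ‘-x’ avoids terminating unrelated processes whose names merely contain the string.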

LogonAnswer (last edited 2016-05-23 22:07:28 by StephanOepen)
