Edge identifiers in full ACE derivations (as reported to [incr tsdb()]) are not unique in the
context of one input to the parser-generator; this means that ‘classic’ [incr tsdb()] treebanking
tools (i.e. tree-based annotation, using syntactic or semantic discriminants) cannot be
used on these profiles.
The ACE parser-generator does not (yet) report the token lattices (before and after
token mapping, i.e. the tsdb(1) fields ‘p-input’ and ‘p-tokens’) to [incr tsdb()];
precise token information is often desirable when exporting and post-processing
results, as well as in regression testing (or cross-platform comparison).
The exact interpretation of the ‘--best’ and ‘--limit’ parameters remains to be
defined for forest-based parsing and generation (unlike in the classic setups, a
‘--limit’ of 0 should probably mean recording no derivations, rather than an
unlimited number of them; arguably, this interpretation should also be applied
retroactively to n-best parsing and generation).
Derivations deposited in [incr tsdb()] profiles upon successful forest disambiguation
fail to preserve the token identifiers (from the entries of the original
forest), which may not matter to many downstream applications but would be
a missing link in the conversion of ERG derivations to PTB-style tokenization.
This page provides (emerging) information on support in the LOGON tree for the Answer Constraint Engine (ACE). As of mid-2013, binaries for parsing, generation, and treebanking with ACE are included with the LOGON tree, albeit (currently) only for 64-bit environments. These binaries are maintained by the ACE developer, Woodley Packard.
$LOGONROOT/parse --binary --terg+tnt/ace --protocol 2 --best 1 --limit 0 --count 8 cb
The answer wrapper script in the LOGON tree currently does not support the full range of standard command-line options for activating the grammars that come with the LOGON distribution; it accepts only ‘--erg’ and ‘--terg’, for the release and trunk versions of the ERG, respectively.
To invoke full-forest treebanking, in the [incr tsdb()] podium, select ‘Trees|Switches|External Treebanking Tool’. While this toggle is in effect, the ‘Trees|Annotate’ and ‘Trees|Update’ commands will invoke the external ACE Full-Forest Treebanker. To give an illusion of tight integration, the following [incr tsdb()] parameters will have an effect on the external treebanking tool: (a) selection of a ‘working set’ of items, through a condition on profile entries (e.g. as determined through ‘Options|TSQL Condition’ or ‘Options|New Condition’); (b) the selection of a ‘gold’ profile (by clicking the middle mouse button in the [incr tsdb()] podium), as the source for update information; and (c) the toggle for batch vs. interactive updates (‘Trees|Switches|Automatic Update’).
Furthermore, the invocation of the Answer treebanker application can be customized through ‘Trees|Variables|Treebanking Tool’ or by setting the [incr tsdb()] variable *redwoods-treebanker-application* in the per-user ‘.tsdbrc’. The default value, for the time being, is "answer --annotate --terg".
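As a minimal sketch of the per-user customization, the shell function below appends a Lisp assignment to an [incr tsdb()] startup file. The function name set_treebanker is hypothetical, and the use of a plain setf form (i.e. the assumption that ‘.tsdbrc’ is read in the package that defines the variable) is not confirmed by this page; the command string in the example call is the default value quoted above.

```shell
# Hypothetical helper: append a treebanker setting to an [incr tsdb()] rc file.
# Assumption: a plain (setf ...) form is valid in '.tsdbrc'.
set_treebanker() {
  rcfile="$1"   # e.g. "$HOME/.tsdbrc"
  tool="$2"     # e.g. "answer --annotate --terg"
  printf '(setf *redwoods-treebanker-application* "%s")\n' "$tool" >> "$rcfile"
}

# Example call (commented out so that nothing is modified by accident):
# set_treebanker "$HOME/.tsdbrc" "answer --annotate --terg"
```

Appending rather than overwriting leaves any existing settings in the file untouched; if the variable is assigned more than once, the last assignment would presumably win.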
In principle, it should work to follow the ‘common’ release procedure for treebank creation, i.e. first call for an automatic update immediately following the parsing, by adding the option ‘--update/external’ to the ‘parse’ command line. This functionality remains to be validated, though (in fact, items that do not auto-update successfully are currently not flagged, i.e. it will not be possible to select remaining unannotated items by virtue of a TSQL condition).
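For illustration, the parse invocation shown earlier on this page could be extended as follows. This is a sketch only: the position of ‘--update/external’ before the profile name ‘cb’ is an assumption, the fallback path for LOGONROOT is a placeholder, and the command is only printed, not executed.

```shell
# Placeholder fallback; in a real LOGON setup LOGONROOT is already set.
LOGONROOT="${LOGONROOT:-/path/to/logon}"

# Same options as the parse example on this page, plus '--update/external'.
# Assumption: the option can be placed anywhere before the profile name.
cmd="$LOGONROOT/parse --binary --terg+tnt/ace --protocol 2 --best 1 --limit 0 --count 8 --update/external cb"

# Dry run: print the command; replace 'echo' by 'eval' to actually run it.
echo "$cmd"
```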
Going Back to Derivation-Based Profiles
Resource limit specifications through command-line options to the LOGON parse or generate scripts (i.e. --time, --memory, and --edges) are not communicated to the ACE parser-generator.
Use of the Answer treebanking tool for updates from an existing ‘gold’ profile presupposes prior normalization of the ‘gold’ profile, i.e. the treebanker will not honor in-profile versioning (through the ‘t-version’ field in the various relations used to record annotations).
During automated updates, the treebanking tool will not explicitly flag items that remain unannotated, i.e. for which the update was not successful. [incr tsdb()] provides the flag ‘Trees|Switches|Update Flag Failures’ (on by default, but currently not communicated to the external treebanking tool) to make it easy to select the sub-set of remaining items that require annotation. (As of July 26, 2013, the treebanking tool flags these by setting t-active = -1, but for some reason [incr tsdb()] is currently unwilling to apply a TSQL condition to the FFTB invocation.)
The <Control-G> or <Control-C> key bindings in the [incr tsdb()] podium are not yet enabled during external treebanking. Hence, it is vital to exit from the external treebanking tool by clicking on Exit in its browser window (or otherwise making it terminate, e.g. through a shell command like ‘killall fftb’) to regain control in the [incr tsdb()] podium.
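If the browser window is no longer available, the shell fallback mentioned above can be wrapped as in the following sketch. The function name stop_fftb is hypothetical, and the process name ‘fftb’ is taken from the ‘killall fftb’ example; pgrep is used here only to avoid an error message when no treebanker is running.

```shell
# Hypothetical helper: terminate a running external treebanker, if any.
stop_fftb() {
  if pgrep -x fftb >/dev/null 2>&1; then
    # Same command as suggested in the text above.
    killall fftb
  else
    echo "no fftb process found"
  fi
}

# Example call, e.g. from a second terminal:
# stop_fftb
```

Once the treebanker process has terminated, control should return to the [incr tsdb()] podium, as described above.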