FFTB: the full forest treebanker

FFTB is a tool for treebanking with DELPH-IN grammars that allows the selection of an arbitrary tree from the "full forest" without enumerating/unpacking all analyses in the parsing stage. FFTB is partly integrated with [incr tsdb()] and the LOGON tree; for details on using FFTB through that toolchain, please see LogonAnswer. There is a wishlist.

Obtaining it

The latest binary versions of all ACE-dependent tools are here:

http://sweaglesw.org/linguistics/acetools/

Treebanking

To see rules that can be applied to a span, middle-click on the first word of the span and drag it to the last word of the span you want to inspect. The span will be marked in red. Only rules that can build that span will be shown below.

Upgrade the treebank

To update your treebank from one grammar version to another, use fftb, e.g.:

fftb -g indra-new.dat profile-parsed-with-new --gold indra-old/tsdb/gold/profile --auto 

according to this message.
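The update command above can be wrapped in a small shell function that checks its inputs before calling fftb. This is only a sketch: `update_treebank` is a hypothetical helper name, and the checks assume that the grammar image is a regular file and that profiles are directories. The fftb flags are exactly those shown in the example above.

```shell
# Sketch only: update_treebank is a hypothetical wrapper, not part of fftb.
update_treebank() {
    if [ "$#" -ne 3 ]; then
        echo "usage: update_treebank GRAMMAR NEW_PROFILE OLD_GOLD" >&2
        return 2
    fi
    grammar=$1      # ACE grammar image built from the new grammar
    new_profile=$2  # profile already parsed with the new grammar
    old_gold=$3     # gold profile annotated under the old grammar

    # Assumption: the grammar image is a file and profiles are directories.
    [ -f "$grammar" ] || { echo "no grammar image: $grammar" >&2; return 1; }
    [ -d "$new_profile" ] || { echo "no profile: $new_profile" >&2; return 1; }
    [ -d "$old_gold" ] || { echo "no gold profile: $old_gold" >&2; return 1; }

    # Same flags as in the example above.
    fftb -g "$grammar" "$new_profile" --gold "$old_gold" --auto
}
```

With the function loaded, the example above would become `update_treebank indra-new.dat profile-parsed-with-new indra-old/tsdb/gold/profile`.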

FFTB on OSX

It is possible to use FFTB independently of the LOGON tree, either on Linux or on Mac OS X. The instructions below explain how to install and launch FFTB on OSX. Note that although the treebanking user interface presented by FFTB will be identical regardless of whether the LOGON tree is used, some of the ancillary steps, such as preparing a full-forest profile to use with FFTB, may be quite different when the LOGON tree is not used. Such methods are experimental and not (yet) discussed here.

The adventurous can attempt to compile and run a standalone fftb on Linux from the SVN repository: http://sweaglesw.org/svn/treebank/trunk

FFTB for bridging analyses (self-help robust treebanking)

As of the 1214 release, the ERG includes so-called bridging rules (disabled by default), which enable a parse to be found for any input string, at the cost of degraded semantic usefulness. There is currently no parse selection model trained to function properly with these rules enabled, so they should be disabled when the ERG is used in an online processing environment. However, in an offline treebanking environment, these rules enable annotators to capture some of the correct semantics for ungrammatical sentences even when a fully correct analysis is not available. In the future, it may be possible to use such annotations to train an automatic parse selection model to work with the bridging rules.

If a profile has been parsed to include an "edge" relation using the bridging rules, and erg-1214-bridge.dat is an ACE grammar image with the relevant rules enabled, it is possible to update the profile from a non-bridged one and continue annotating the rejected sentences with a command like this:

 fftb --suppress-bridges --browser -g erg-1214-bridge.dat --gold path/to/previous-profile path/to/new-profile-with-bridging 

The --suppress-bridges option causes FFTB to assume a discriminant has been selected for each sentence rejecting the bridging rule at the top level span. This should cause the update to behave exactly as if the bridging rules had not been enabled in parsing the new profile. After updating, you can rerun without that option (and also without the --gold ... option), or you can manually disable that discriminant each time you visit a rejected item that you want to reclaim.
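The two passes described above (update with bridges suppressed, then rerun without `--suppress-bridges` and `--gold`) might be sketched as a pair of shell functions. Both function names are hypothetical helpers introduced here for illustration. The flags are only those already mentioned in this section, and the second command is simply the first with the two options removed, as the text describes.

```shell
# Sketch only: both function names are hypothetical helpers, not fftb commands.

# Pass 1: update from the non-bridged gold profile with bridges suppressed.
update_with_bridges_suppressed() {
    grammar=$1; gold=$2; profile=$3
    fftb --suppress-bridges --browser -g "$grammar" --gold "$gold" "$profile"
}

# Pass 2: reopen the updated profile with bridging analyses available,
# dropping both --suppress-bridges and --gold.
reclaim_rejected_items() {
    grammar=$1; profile=$2
    fftb --browser -g "$grammar" "$profile"
}
```

For the example above, this would be `update_with_bridges_suppressed erg-1214-bridge.dat path/to/previous-profile path/to/new-profile-with-bridging`, followed by `reclaim_rejected_items erg-1214-bridge.dat path/to/new-profile-with-bridging`.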

FFTB on remote machine

You can put a proxy server in front of fftb and treebank remotely.

Here is a configuration for nginx, thanks to AlexandreRademaker, with minor tweaks to get it running on Ubuntu. This serves it at localhost:8080/private/. Call fftb without --browser.

### our changes
user www-data;
include /etc/nginx/modules-enabled/*.conf;
# daemon on

worker_processes  1;

error_log  /var/log/nginx/error.log;
pid        /var/run/nginx.pid;

events {
   worker_connections  1024;
}

http {
   merge_slashes off;

   default_type  application/octet-stream; 
   charset   utf-8;
   keepalive_timeout  65;
   server_tokens       off;
   tcp_nopush          off;
   tcp_nodelay         on;

   gzip              on;
   gzip_http_version 1.0;
   gzip_proxied      any;
   gzip_min_length   500;
   gzip_disable      "MSIE [1-6]\.";
   gzip_types        text/plain text/xml text/css
                     text/comma-separated-values
                     text/javascript
                     application/x-javascript
                     application/atom+xml;

   include mime.types;

   upstream fftb9080 {
       server 127.0.0.1:9080;
   }

   server {
       listen      8080;
       server_name localhost;
       charset     utf-8;
       
       location / {
           proxy_pass http://fftb9080;
           ## for password protection
           #auth_basic "Private Forest";
           #auth_basic_user_file /etc/nginx/.htpasswd;
       }
   }
}
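If the commented-out auth_basic lines in the configuration above are enabled, nginx needs a password file at the path given to auth_basic_user_file. One way to create an entry is sketched below, assuming openssl is installed. `add_htpasswd_user` is a hypothetical helper name, and the apr1 format is one of the password formats nginx documents for basic authentication.

```shell
# Sketch only: add_htpasswd_user is a hypothetical helper.
# Appends a "user:hash" line to a password file, using the apr1
# (Apache MD5) hash format.
add_htpasswd_user() {
    file=$1   # e.g. /etc/nginx/.htpasswd, as in the configuration above
    user=$2
    # openssl prompts for the password, so it is not passed on the command line.
    hash=$(openssl passwd -apr1) || return 1
    printf '%s:%s\n' "$user" "$hash" >> "$file"
}
```

Writing under /etc/nginx normally requires root. After adding a user and uncommenting the two auth_basic lines, the configuration can be checked with `nginx -t` before reloading nginx.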

Troubleshooting

Span

The tool lets you select spans with "Is A Constituent", but this very often causes the system to crash when you try to save the tree (seen since at least version 0.9.23). Avoid using it.

LOGNAME

At least when running inside a Docker container, fftb crashes if it can't find the environment variable LOGNAME. To set it, use:

export LOGNAME=myuser
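In a container entrypoint or wrapper script, it may be more convenient to set LOGNAME only when it is missing. The following is a minimal sketch of that idea. The fallback name `fftbuser` is an arbitrary placeholder chosen here, not a name that fftb requires.

```shell
# Look up the current user name; fall back to a placeholder if the
# container's user ID has no name. "fftbuser" is an arbitrary placeholder.
name=$(id -un 2>/dev/null) || name=fftbuser

# Keep LOGNAME if it is already set and non-empty; otherwise use the fallback.
: "${LOGNAME:=$name}"
export LOGNAME
```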

FftbTop (last edited 2019-08-01 08:21:07 by LuisMorgadoCosta)
