This page provides some semi-formal and semi-technical background on what are called Elementary Dependency Structures (EDS, or at times also EDs), a reduction of full MRS formulae to a variable-free form, i.e. a semantic dependency graph. The original motivation for this representation was ease of downstream processing, for an information extraction application, say.

Since its inception in 2002 (Oepen, et al., 2002), EDS has found a broader range of DELPH-IN-internal applications, typically where it is desirable to ‘break down’ a full meaning representation into relevant, variable-free component pieces: designing so-called discriminants for semantics-based treebanking (Oepen & Lønning, 2006), for example; features encoding co-occurrence of semantic predicates in parse disambiguation (Toutanova, et al., 2002; Fujita, et al., 2007; inter alios); generative probability models for semantic transfer (Oepen, et al., 2007); granular semantic parser evaluation (Dridan, 2009; Dridan & Oepen, 2011; MacKinlay, et al., 2011; MacKinlay, et al., 2012; inter alios); further reduction into bi-lexical semantic dependencies (Ivanova, et al., 2012); efficient search in Wikipedia-scale semantic networks (Kouylekov & Oepen, 2014); estimating inter-annotator agreement (Bender, et al., 2015); or inference-based textual entailment (Lien & Kouylekov, 2015).

Outside of DELPH-IN, applications of EDS (or its reduction into bi-lexical dependencies) like the experiments of Jones, et al. (2013) suggest a growing community interest in parsing into graph-shaped target representations. For example, this is seen in the inclusion of EDS-derived bi-lexical dependencies in Task 8 and Task 18 (on Broad-Coverage Semantic Dependency Parsing, or SDP) at the 2014 and 2015 SemEval Conferences, respectively. EDS-derived bi-lexical semantic dependencies continue to be used for data-driven semantic parsing work beyond SemEval 2015, and the full SDP data sets are becoming publicly available through the Linguistic Data Consortium (LDC) in early 2016.

This page was initiated and is maintained by StephanOepen, who is also the original developer of the EDS design and supporting Lisp code (which is part of the LKB, as well as of the stand-alone Lisp MRS library).

A First Example

Following is the Elementary Dependency Structure (EDS) associated with the (preferred) ERG analysis (in the 1111 release version) for the sentence Abrams promised the dog to bark.

  {e2:
   _1:proper_q<0:6>[BV x5]
   x5:named<0:6>(Abrams)[]
   e2:_promise_v_1<7:15>[ARG1 x5, ARG2 x10, ARG3 e16]
   _2:_the_q<16:19>[BV x10]
   x10:_dog_n_1<20:23>[]
   e16:_bark_v_1<27:32>[ARG1 x5]
  }

Semi-formally, the above structure is a directed graph where nodes are labeled (among other things) with semantic predicates (e.g. proper_q or _dog_n_1) and arcs are labeled with semantic argument roles (e.g. ARG1, ARG2, or BV). Node labels can optionally be suffixed with so-called characterization pointers, e.g. <7:15> on the _promise_v_1 predicate; more on these below.

For unique reference (in the textual format depicted above), each node further has a node identifier (e.g. _1 or x5, prefixed to node labels and separated by a colon), which serves to depict reentrant node occurrences in argument positions, i.e. as the target of an incoming dependency arc. While node identifiers in EDS can superficially resemble logical variables in an underlying MRS, they formally have a very different (non-variable) status and could be uniquely renamed without any loss of information or generality.

Consider our example above: the fourth line renders the node identified as e2 and labeled with the predicate _promise_v_1 (at characterization <7:15>). This node has three outgoing dependency arcs, labeled ARG1, ARG2, and ARG3, pointing to nodes x5, x10, and e16, respectively. Outgoing arcs are depicted as a comma-separated list enclosed in a pair of square brackets following each node label; nodes without any outgoing arcs will have an empty such list (as do for example nodes x5 and x10).

Some predicates may be parameterized, i.e. they take one or more constant arguments. In our running example, this is true of the named predicate (the label of dependency node x5), which shows its parameter enclosed in a pair of parentheses, following the predicate and characterization (if any).

Finally, the root node of the dependency graph is identified by the node identifier immediately following the opening curly brace (on the first line), i.e. in our running example node e2 is the root of the graph.
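
The textual format described in this section can be read back into a small graph structure. The following is a minimal sketch in Python, not part of the Lisp code that ships with the LKB; the function name parse_eds and the dictionary layout are hypothetical, and the regular expression only covers the syntax shown on this page (one node per line, parameters with or without double quotes).

```python
import re

# One node line: optional '|' flag, identifier, predicate, then optional
# <from:to> characterization, (parameter), {properties}, and the arc list.
NODE = re.compile(
    r"^\|?\s*(?P<id>[^\s:]+):(?P<pred>[^\s<({\[]+)"
    r"(?:<(?P<start>\d+):(?P<end>\d+)>)?"
    r"(?:\((?P<param>[^)]*)\))?"
    r"(?:\{(?P<props>[^}]*)\})?"
    r"\[(?P<arcs>[^\]]*)\]$"
)


def parse_eds(text):
    """Return (root, nodes); nodes maps each identifier to a dict."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith("{"):
        raise ValueError("expected an opening curly brace")
    # The root identifier follows the brace, e.g. '{e2:' or '{e2: (fragmented)'.
    root = lines[0][1:].split(":", 1)[0].strip()
    nodes = {}
    for line in lines[1:]:
        if line == "}":
            break
        match = NODE.match(line)
        if match is None:
            raise ValueError(f"cannot read node line: {line!r}")
        span = None
        if match["start"] is not None:
            span = (int(match["start"]), int(match["end"]))
        param = match["param"]
        # Each arc is a (role, target identifier) pair; '[]' yields no arcs.
        arcs = [tuple(arc.split()) for arc in match["arcs"].split(",") if arc.strip()]
        nodes[match["id"]] = {
            "predicate": match["pred"],
            "span": span,
            "parameter": param.strip('"') if param is not None else None,
            "properties": match["props"],  # kept as raw text in this sketch
            "arcs": arcs,
        }
    return root, nodes


SAMPLE = """
{e2:
 _1:proper_q<0:6>[BV x5]
 x5:named<0:6>(Abrams)[]
 e2:_promise_v_1<7:15>[ARG1 x5, ARG2 x10, ARG3 e16]
 _2:_the_q<16:19>[BV x10]
 x10:_dog_n_1<20:23>[]
 e16:_bark_v_1<27:32>[ARG1 x5]
}
"""

root, nodes = parse_eds(SAMPLE)
print(root, nodes[root]["arcs"])
```

Keeping arcs as a list of pairs rather than a dictionary is a deliberate choice here, so that the order of arcs in the serialization is preserved.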

Reflections on Selected Phenomena

Note that (unlike many flavours of syntactic dependency graphs now in common use) EDS need not be tree-shaped, i.e. dependency nodes can have more than one incoming arc. In the above example, node x5 is an argument of both node e2 and node e16 (as the ARG1, i.e. ‘deep subject’ in both cases). Furthermore, some of the input tokens from the underlying utterance may not be reflected in the semantic dependency graph. In the above, this is true of the infinitival marker to, for example, which is analyzed as semantically vacuous (i.e. not contributing any meaning of its own).

Conversely, it is possible for multiple EDS nodes to correspond to a single input token; this is observed in nodes _1 and x5 in our above example, making two statements about the interpretation of Abrams, of which one is a syntactically covert quantifier (labeled proper_q) in the underlying MRS.
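
The observations so far (reentrant nodes, semantically vacuous tokens, and several nodes over one token) can be checked mechanically. The sketch below is an illustration only: it hard-codes the first example as a Python dictionary, with characterizations taken from the example and the triples listing on this page, and it assumes that tokens are separated by whitespace, so that the final period belongs to the last token.

```python
import re
from collections import defaultdict

sentence = "Abrams promised the dog to bark."

# identifier -> (predicate, characterization, outgoing arcs)
graph = {
    "_1": ("proper_q", (0, 6), [("BV", "x5")]),
    "x5": ("named", (0, 6), []),
    "e2": ("_promise_v_1", (7, 15),
           [("ARG1", "x5"), ("ARG2", "x10"), ("ARG3", "e16")]),
    "_2": ("_the_q", (16, 19), [("BV", "x10")]),
    "x10": ("_dog_n_1", (20, 23), []),
    "e16": ("_bark_v_1", (27, 32), [("ARG1", "x5")]),
}

# Collect incoming arcs per node; more than one means the graph is not a tree.
incoming = defaultdict(list)
for source, (_, _, arcs) in graph.items():
    for role, target in arcs:
        incoming[target].append((source, role))
reentrant = sorted(node for node, arcs in incoming.items() if len(arcs) > 1)

# Group nodes by characterization to find tokens that carry several nodes.
by_span = defaultdict(list)
for ident, (_, span, _) in graph.items():
    by_span[span].append(ident)
shared = {span: ids for span, ids in by_span.items() if len(ids) > 1}

# Tokens whose character span matches no node are semantically vacuous here.
tokens = {(m.start(), m.end()): m.group() for m in re.finditer(r"\S+", sentence)}
unaligned = [word for span, word in tokens.items() if span not in by_span]

print(reentrant, shared, unaligned)
```

Note that the in-degree count includes the BV arc from the quantifier, so node x10 also has two incoming arcs, even though only x5 is an argument of two different verbs.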

Finally, it is also possible for an EDS to contain dependency nodes that do not directly correspond to a single input word. While this is not true for our first example above, the phenomenon is present in the following example, the EDS analysis of the input Chasing the garden dog helped.

  {e2:
   _1:udef_q<0:22>[BV x5]
   x5:nominalization<0:22>[ARG1 e9]
   e9:_chase_v_1<0:7>[ARG2 x10]
   _2:_the_q<8:11>[BV x10]
   e17:compound<12:22>[ARG1 x10, ARG2 x16]
   _3:udef_q<12:18>[BV x16]
   x16:_garden_n_1<12:18>[]
   x10:_dog_n_1<19:22>[]
   e2:_help_v_1<23:30>[ARG1 x5]
  }

In the above, nodes x5 (the nominalization of the chase event) and e17 (the compounding of garden and dog) do not directly correspond to individual words of the input, but rather are pieces of semantics that are associated with larger phrases and the constructions used to build them (viz. nominalization and compounding). Arguably, the same holds for one of the nodes labeled udef_q (to be read as definite in an underspecified manner), i.e. node _1, which corresponds to the covert quantifier (in the underlying MRS) assumed to bind the nominalization. Node _3 in the above example (also labeled udef_q, i.e. a syntactically covert quantifier), on the other hand, is similar to node _1 (labeled proper_q) in our initial example (i.e. corresponds to a single input word, viz. garden, as indicated by its characterization); in other words, this covert quantifier just binds the left, non-head element of the compound.
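
Characterizations make it possible to tell apart nodes that sit on a single word from nodes that span a larger phrase. The following sketch is a hedged illustration: it lists only node identifiers, predicates, and characterizations as given in the example above, and it assumes whitespace tokenization of the input string (with the final period attached to the last word).

```python
import re

sentence = "Chasing the garden dog helped."
token_spans = {(m.start(), m.end()): m.group()
               for m in re.finditer(r"\S+", sentence)}

# (identifier, predicate, characterization), copied from the example.
nodes = [
    ("_1", "udef_q", (0, 22)),
    ("x5", "nominalization", (0, 22)),
    ("e9", "_chase_v_1", (0, 7)),
    ("_2", "_the_q", (8, 11)),
    ("e17", "compound", (12, 22)),
    ("_3", "udef_q", (12, 18)),
    ("e2", "_help_v_1", (23, 30)),
]


def covered_tokens(span):
    """Return the tokens that lie fully inside a characterization span."""
    start, end = span
    return [word for (s, e), word in token_spans.items()
            if start <= s and e <= end]


# Nodes whose characterization covers more than one token are phrasal.
phrasal = {ident: covered_tokens(span)
           for ident, _, span in nodes
           if len(covered_tokens(span)) > 1}

for ident, words in phrasal.items():
    print(ident, words)
```

Under these assumptions the phrasal nodes are _1, x5, and e17, while _3 (the covert quantifier over garden) is aligned with a single token.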

Further Properties of Dependency Nodes

In the MRS universe, information such as number and tense (which are in large parts determined morphologically, in English at least) is typically represented as so-called variable properties, i.e. pairs of attributes and corresponding values associated to logical variables; in our examples so far, we have suppressed all such information. For some further discussion of MRS variable properties and how they relate to features and types internal to a parsing or generation grammar, please see the RmrsVpm page.

Seeing that EDS is a variable-free representation, variable properties need to be re-associated with dependency nodes, more specifically with the node (corresponding to the MRS elementary predication) for which the variable served as its so-called distinguished variable (Oepen & Lønning, 2006). Thus, each EDS node can optionally be annotated with a variable type and a set of property–value pairs (enclosed in curly braces, in between the node label and outgoing arcs), as in the following EDS for the input the dog barked.

  {e3:
   _1:_the_q<0:3>[BV x6]
   x6:_dog_n_1<4:7>{x NUM sg, IND +}[]
   e3:_bark_v_1<8:15>{e SF prop, TENSE past}[ARG1 x6]
  }

In this serialization of node properties pertaining to the underlying distinguished variable, the variable type (e.g. ‘x’ for instances and ‘e’ for eventualities) is optional and, when present, needs to occupy the first position in the list (in which case there will in total be an odd number of expressions between the pair of curly braces). Empty sets of node properties can be omitted, as was the case in our introductory examples above.
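
The odd-versus-even rule for the optional variable type can be sketched as follows. This is an illustration in Python rather than the actual Lisp reader; the function name parse_properties is hypothetical, and the code assumes that property names and values contain neither whitespace nor commas.

```python
def parse_properties(text):
    """Split '{e SF prop, TENSE past}' into (variable_type, properties)."""
    inner = text.strip()
    if inner.startswith("{") and inner.endswith("}"):
        inner = inner[1:-1]
    items = inner.replace(",", " ").split()
    variable_type = None
    # An odd number of expressions means the first one is the variable type.
    if len(items) % 2 == 1:
        variable_type, items = items[0], items[1:]
    # The remaining expressions alternate between property and value.
    return variable_type, dict(zip(items[0::2], items[1::2]))


print(parse_properties("{x NUM sg, IND +}"))
print(parse_properties("{e SF prop, TENSE past}"))
print(parse_properties("{SF prop}"))
```

A property list without a variable type, as in the last call, yields None in the first position of the result.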

Serialization of node properties is enabled by the boolean configuration option *eds-show-properties-p*, which until late 2015 used to default to false; with emerging use of EDS for generation (see the EdsGeneration page for background) more recently, it seemed convenient to toggle the default behavior, in a sense trading a bit of compactness and readability for additional information content.

On MRS Conversion to EDS Graphs

Breaking Up Things Further: EDS Triples

    proper_q<0:6> BV named<0:6>(Abrams)  
    _promise_v_1<7:15> ARG1 named<0:6>(Abrams)  
    _promise_v_1<7:15> ARG2 _dog_n_1<20:23>  
    _promise_v_1<7:15> ARG3 _bark_v_1<27:32>  
    _the_q<16:19> BV _dog_n_1<20:23>  
    _bark_v_1<27:32> ARG1 named<0:6>(Abrams)  
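
Comparing this listing with the first example suggests that each triple is one dependency arc, with node identifiers replaced by the labels of the source and target nodes. That reading is an inference from the data on this page, not a specification of the :triples format; the following Python sketch (with a hypothetical function name and a hand-built dictionary) reproduces the listing under that assumption.

```python
# identifier -> node label (predicate, characterization, parameter) and arcs
graph = {
    "_1": {"label": "proper_q<0:6>", "arcs": [("BV", "x5")]},
    "x5": {"label": "named<0:6>(Abrams)", "arcs": []},
    "e2": {"label": "_promise_v_1<7:15>",
           "arcs": [("ARG1", "x5"), ("ARG2", "x10"), ("ARG3", "e16")]},
    "_2": {"label": "_the_q<16:19>", "arcs": [("BV", "x10")]},
    "x10": {"label": "_dog_n_1<20:23>", "arcs": []},
    "e16": {"label": "_bark_v_1<27:32>", "arcs": [("ARG1", "x5")]},
}


def to_triples(graph):
    """Emit one 'source-label role target-label' string per arc."""
    return [
        f"{node['label']} {role} {graph[target]['label']}"
        for node in graph.values()
        for role, target in node["arcs"]
    ]


for triple in to_triples(graph):
    print(triple)
```

Because identifiers no longer appear, two distinct nodes with identical labels would become indistinguishable in this form; in the running example that does not happen.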

Graph Connectivity

Optionally, the EDS code can test for graph connectivity, i.e. reachability of nodes from the root. In the default textual EDS serialization format, graphs that contain sub-graphs (which can be individual nodes) that are not connected to the root will be flagged as fragmented, and non-connected nodes will be indicated by an annotation (using the ‘|’ symbol) in the first column of each affected line. Following is the EDS for the input Nearly every dog barked. (from the MRS test suite, using the 1111 release of the ERG), for example:

  {e2: (fragmented)
  |e4:_nearly_x_deg<0:6>[]
   _1:_every_q<7:12>[BV x7]
   x7:_dog_n_1<13:16>[]
   e2:_bark_v_1<17:24>[ARG1 x7]
  }

In this example, node e4 is not reachable from the root node e2, i.e. there is no path through the graph leading to e4. In this case, non-connectivity is owed to a known limitation in the analysis of degree specifiers on quantifiers in the ERG (or, arguably, in the more general MRS framework, as a candidate analysis for this phenomenon would call for the composition of predicates). In current ERG analyses, the degree specifier nearly is only weakly integrated with the underlying MRS, viz. by sharing its label with that of the quantifier, i.e. attaching itself to the same scopal position as the quantifier. Seeing that labels (and hence label equality) are an MRS-internal notion (part of the meta language, but not logical variables that would have a status in any variant of predicate calculus as the object language), it is only natural that the ‘missing’ link between the quantifier and its degree specifier is exposed in EDS.

While degree specification of quantifiers is a known source of non-connectivity in ERG-derived EDSs, there is no principled or technical reason to prevent other instances of non-connectivity—which typically would correspond to ill-formed MRSs. Thus, EDS connectivity is at times used as a wellformedness test in the grammar engineering and treebanking cycle.
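
The connectivity test discussed in this section can be sketched as a breadth-first search starting at the root. This is an illustration in Python, not the Lisp implementation; the function name is hypothetical. It assumes that arcs are followed in both directions, because quantifier nodes never have incoming arcs and are evidently not what the fragmented flag is meant to report.

```python
from collections import deque


def disconnected_nodes(root, nodes, arcs):
    """Return the nodes that cannot be reached from the root."""
    neighbours = {node: set() for node in nodes}
    for source, _role, target in arcs:
        # Treat each arc as undirected for the purpose of connectivity.
        neighbours[source].add(target)
        neighbours[target].add(source)
    seen, queue = {root}, deque([root])
    while queue:
        for node in neighbours[queue.popleft()]:
            if node not in seen:
                seen.add(node)
                queue.append(node)
    return [node for node in nodes if node not in seen]


# Node identifiers and arcs for 'Nearly every dog barked.' as discussed above.
nodes = ["e4", "_1", "x7", "e2"]
arcs = [("_1", "BV", "x7"), ("e2", "ARG1", "x7")]

fragments = disconnected_nodes("e2", nodes, arcs)
print("fragmented" if fragments else "connected", fragments)
```

With purely directed reachability, node _1 would also be reported, since nothing points to the quantifier; the undirected reading flags only e4.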

Software Support in DELPH-IN Tools

The reduction of MRSs to Elementary Dependency Structures is implemented as part of the Common Lisp MRS Library that is part of the LKB and can be linked into PET (activated through the -mrs=eds command line option). Please look near the top of the EDS source code for available parameters. Excessive use of comments in the function ed-select-representative() (in the same file) may also serve to clarify the disambiguation rules that apply in case a single variable (typically a handle) is associated with multiple elementary predications.

In the LKB, EDS views can be requested for parsing and realization results from pop-up menus in several of the graphical viewers. Together with other views on syntactic and semantic aspects of parsing results, a LaTeX rendering of Elementary Dependency Structures can be obtained from the ERG On-line Demonstrator. However, probably the easiest path to obtaining EDS outputs is through batch processing in [incr tsdb()] and its export facilities; please see the ErgProcessing page for instructions.

For access to full EDS functionality, one can invoke the MRS-to-EDS reduction through Lisp function calls like the following:

  MRS(48): (type-of mrs)
  MRS(49): (with-open-file (stream "/tmp/sample.eds" :direction :output
                            :if-exists :supersede)
             (ed-output-psoa mrs :stream stream :format :ascii))

Other valid formats include :triples, :latex, :html, and :lui, and various optional parameters to the function (and global variables) serve to customize the output in format-specific ways.

EDS Configuration Options

Semantic Wellformedness Testing


EdsTop (last edited 2016-01-08 16:48:48 by StephanOepen)
