ERS: High-Level Characterization
In a nutshell, English Resource Semantics (ERS; see ErgSemantics) captures sentence meaning in an abstract representation that is compatible with logic-based approaches to natural language semantics. Here, we interpret ‘sentence meaning’ (in contrast to what is at times called ‘speaker’ or ‘occasion meaning’) as the contribution to semantic interpretation that is wholly determined by the linguistic signal and its grammatical structure. (On ‘speaker’ vs. ‘sentence’ meaning, see Quine 1960 and Grice 1968; for a discussion of these ideas and how they relate to the ERG, see Bender et al. 2015.)
ERS is finely calibrated as an easy-to-use interface representation for parsing and generation, because it abstracts from semantically irrelevant surface variation, is rich and detailed, yet avoids semantic distinctions that are not constrained by the grammar of English. For a semi-formal summary of the ERS meaning representation language, please see ErgSemantics/Basics.
In the following sections, we walk through several core aspects (or ‘layers’) of ERS meaning representation, provide motivating examples and pointers to related sections of the ERG Semantic Documentation, and also seek to reflect on which ERS layers are more fully developed and which remain to be elaborated.
Semantic Predicate–Argument Structure
Predicate–argument structure is expressed in a collection of n-ary predications (or relations) linked together by (typed) variables. Argument sharing across predicates is thus captured through variable equality, for example in non-scopal modification, control constructions, and coordinate structures, all illustrated in Example (1), as well as in other constructions (like relative clauses and certain types of comparatives):
(1) The cheerful children wanted to sing and dance.
In this example, the instance variable x5 (highlighted in red) is the first argument of the (semantics of the) attributive adjective cheerful, of the subject control predicate want, and of both predications corresponding to the coordinated verb phrase.
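The argument sharing in Example (1) can be sketched with a few hand-written predications. This is a simplified illustration rather than actual ERG output: the predicate names and every variable other than x5 are assumptions, and the quantifier and coordination predications are left out.

```python
# Hand-written, simplified predications for
# "The cheerful children wanted to sing and dance."
# Predicate names and all variables other than x5 are illustrative
# assumptions; the quantifier and coordination predications are omitted.
PREDICATIONS = [
    ("_cheerful_a_1", {"ARG0": "e1", "ARG1": "x5"}),
    ("_child_n_1", {"ARG0": "x5"}),
    ("_want_v_1", {"ARG0": "e2", "ARG1": "x5", "ARG2": "h3"}),
    ("_sing_v_1", {"ARG0": "e4", "ARG1": "x5"}),
    ("_dance_v_1", {"ARG0": "e6", "ARG1": "x5"}),
]


def predicates_sharing(variable, role, predications):
    """Return the predicates that have `variable` in the given role."""
    return [pred for pred, args in predications if args.get(role) == variable]


shared = predicates_sharing("x5", "ARG1", PREDICATIONS)
print(shared)
# ['_cheerful_a_1', '_want_v_1', '_sing_v_1', '_dance_v_1']
```

Variable equality alone records that the children are the ones who are cheerful, who want, who sing, and who dance. No separate coreference step is involved.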
There are many syntactically distinct ways of expressing the same underlying predicate–argument structure, as exemplified in Examples (2a)–(2h) below. So-called diathesis alternations like the dative shift (2b), passivization ((2c) and (2d)), or focus movement ((2e) and (2f)) can lead to stark differences in syntactic structure but have no observable effect on predicate–argument structure (and would be expected to exhibit identical truth conditions). Such surface variation is abstracted away at the level of ERS meaning representations, i.e. Examples (2a) through (2h) are all analyzed as semantically equivalent close paraphrases and thus all associated with the predicate–argument structure shown below. This abstraction is one of the properties of ERS that makes it well-suited as the interface representation to parsing and generation, in that downstream processing is expected to be independent of (language-specific) syntax.
(2a) Kim gave Sandy the book. (2b) Kim gave the book to Sandy. (2c) Sandy was given the book by Kim. (2d) The book was given to Sandy by Kim. (2e) The book, Kim gave Sandy. (2f) Sandy, Kim gave the book to. (2g) The book, Sandy was given by Kim. (2h) To Sandy, the book was given by Kim. ...
Similar normalization effects are obtained for other constructions, for example in (3) and (4) below. Lexical knowledge in the ERG enables the distinction between so-called referential vs. expletive usages of some pronouns, of which only the former will correspond to semantic arguments. While technique in (3a) is the syntactic subject of the predicative copula, the paraphrase invoking so-called (expletive) it extraposition in (3b) demonstrates that technique is not a semantic argument of impossible: Intuitively, there is no lack of possibility attributed to the technique instance. Instead, there is a long-distance dependency with the unexpressed syntactic complement of apply in (3a), which is made explicit by variable x6 in the ERS (in red).
(3a) This technique is impossible to apply. (3b) It is impossible to apply this technique.
Another frequent variation in syntactic structure that is normalized at the level of ERS pertains to what at times are called restrictive modifiers, which can take the form of pre- or post-nominal attributive adjectives or relative clauses (i.e. non-local dependencies), for example:
(4a) The barking dog scared me. (4b) The dog that was barking scared me. (4c) The dog barking (behind the fence) scared me. (4d) The dog I think Kim told to bark scared me.
In the ERS analyses for (4a) through (4d), there will always be an instance of the _bark_v_1 relation (albeit with different tense properties on its eventuality variable), where the dog instance (x6) serves as its first argument.
Two variants of the examples in (3a,b) are given in (5), but neither is assigned the same semantics as the examples in (3).
(5a) To apply this technique is impossible. (5b) Applying this technique is impossible.
Instead, for each of (5a,b) the grammar introduces a nominalization EP which takes the semantics of the subject VP as its argument. For (5b) this is perhaps less surprising, since the grammar uniformly adds such an EP for all deverbal gerunds, and this distinct treatment of (5b) is consistent with the lack of full parallelism with (3b), as shown by the ungrammaticality of (6): adjectives such as impossible with an expletive subject take an infinitival VP complement, but not an -ing VP. Thus (5b) is treated as more closely related to Application of this technique is impossible.
(6) *It is impossible applying this technique.
With distinct semantics for (3b) and (5b), a choice must be made for (5a), either to give it a semantics similar or identical to (3b), or rather to (5b). At present, the ERG opts for the latter, but the highly regular availability of both variants (3b) and (5a) for adjectives like impossible (e.g. difficult, fun, convenient, surprising, disastrous) suggests that the decision to add a nominalization EP for the semantics of (5a) may need revisiting.
Quantification and Scope
ERS makes explicit which parts of the linguistic signal express quantification and provides partially-specified information about the scope of quantifiers (and other ‘operator-like’ predications; see below). The representation is underspecified in the sense that we give just one ERS for the following type of examples, even though there are two readings, depending on the relative scope of the quantifiers:
(1) All dogs chased a cat. (a) ∀x dog(x): ∃y cat(y): chase(x,y) (b) ∃y cat(y): ∀x dog(x): chase(x,y)
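The difference between readings (a) and (b) can be checked against a small model. The sketch below assumes a toy world (the individuals and the chasing facts are invented for illustration) in which each dog chased a different cat, so the two readings come apart.

```python
# Toy model (an assumption for illustration): two dogs, two cats, and each
# dog chased a different cat.
DOGS = {"rex", "fido"}
CATS = {"tom", "felix"}
CHASED = {("rex", "tom"), ("fido", "felix")}


def reading_a():
    """Wide-scope universal: for every dog there is some cat it chased."""
    return all(any((d, c) in CHASED for c in CATS) for d in DOGS)


def reading_b():
    """Wide-scope existential: there is one cat that every dog chased."""
    return any(all((d, c) in CHASED for d in DOGS) for c in CATS)


print(reading_a(), reading_b())  # True False
```

A single underspecified ERS is compatible with both evaluations. Choosing between them is left to later processing.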
Nonetheless our representations do give partial information about scope, in keeping with the facts of English. Of particular interest here is the allocation of predications into the restriction and body of quantifiers, which is critical to the correct modeling of their interpretation. For example, the two sentences in (2) differ precisely in this allocation. The quantifier expressed by all, when viewed as a generalized quantifier, expresses a subset relation: For sentences with all to be true, the set described by its restriction must be a subset of the set described by its body.
(2a) All funny jokes are short. (2b) All short jokes are funny.
This difference is captured in the ERS by constraining the restriction (second argument) of each quantifier predication, as can be seen in our representations for these two examples, where handle equality corresponds to logical conjunction of predications. The handles and associated handle constraints shown in red (h6 and h8) in these ERSs relate the restriction argument of all to the appropriate elementary predications.
In fact, the handle topology is the only difference between these two ERSs: They are identical in predicate–argument structure but differ in scopal properties. In the semantics of (2a), funny and joke are conjoined in the restriction of the quantifier (shown as h8 in red), while short is the top-level predication, as indicated by =q equality between its handle h2 and the top handle h1 (in blue). Conversely, the scopal constraints on funny and short are reversed in the semantics for (2b).
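To see why the allocation into restriction and body matters, here is a minimal sketch of the generalized-quantifier reading of all as a subset test. The model is a toy assumption (three jokes, two of them funny, all three short); it is not derived from any ERS.

```python
# Toy model (illustrative assumption): three jokes and two properties.
JOKES = {"j1", "j2", "j3"}
FUNNY = {"j1", "j2"}
SHORT = {"j1", "j2", "j3"}


def all_q(restriction, body):
    """Generalized-quantifier 'all': the restriction is a subset of the body."""
    return restriction <= body


# (2a) All funny jokes are short: funny and joke are conjoined in the restriction.
reading_2a = all_q(FUNNY & JOKES, SHORT)
# (2b) All short jokes are funny: short and joke are conjoined in the restriction.
reading_2b = all_q(SHORT & JOKES, FUNNY)

print(reading_2a, reading_2b)  # True False
```

The same three predicates and the same individuals are involved in both cases. Only which predications are conjoined under the restriction changes, and the truth value changes with it.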
One well-formedness constraint in ERS is that every instance variable (i.e. every variable of type x; see ErgSemantics/Basics) must be bound by a quantifier. Accordingly, ERSs provide quantifiers somewhat ‘generously’, one corresponding to each nominal expression in the surface signal, as well as additional quantifiers in case the semantic contribution of a construction introduces further x-type variables (such as Coordination, Nominalization, or Partitives). Where there is an overt determiner (all, none, the, a, this, etc.) the quantifier will reflect the semantic contribution of that determiner. Where there is not, the quantifier will be one of a collection of abstract predicates, most frequently udef_q (an underspecified quantifier). See the page on Implicit Quantifiers for further details.
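The binding constraint can be sketched as a simple check over a list of predications. The code assumes two conventions for illustration: quantifier predicates end in `_q` and bind the variable in their `ARG0`, and variable types are read off the first letter of the variable name. The predications for the sample sentence are hand-written, and the role names `RSTR` and `BODY` are likewise assumptions.

```python
def unbound_instance_variables(predications):
    """Return the x-type variables that no quantifier binds.

    Each predication is a (predicate, args) pair. By the convention assumed
    here, a quantifier predicate ends in '_q' and binds its ARG0.
    """
    bound = {args["ARG0"] for pred, args in predications if pred.endswith("_q")}
    used = {
        value
        for _, args in predications
        for value in args.values()
        if value.startswith("x")
    }
    return used - bound


# Hand-written predications for a bare plural subject, as in "Dogs bark."
dogs_bark = [
    ("udef_q", {"ARG0": "x3", "RSTR": "h4", "BODY": "h5"}),
    ("_dog_n_1", {"ARG0": "x3"}),
    ("_bark_v_1", {"ARG0": "e2", "ARG1": "x3"}),
]

print(unbound_instance_variables(dogs_bark))      # set()
print(unbound_instance_variables(dogs_bark[1:]))  # {'x3'}
```

Dropping the udef_q predication leaves x3 unbound, which is why a quantifier is supplied even when there is no overt determiner.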
Current NLP tasks rarely exercise the kind of inference enabled by the proper treatment of quantifiers, but as NLP as a field approaches more involved and sensitive question answering and other dialogue-related tasks, we expect this aspect of meaning representation to gain importance (see also Steedman 2012). For applications where some or all of the quantifiers provided by the ERS are not necessary, they can be easily identified and handled according to the needs of the application. A full inventory of the quantifier predicates provided by the grammar can be found in the SEM-I (semantic interface). Note that certain elements sometimes treated as quantifiers in the literature (notably many and the semantic contribution of number names) are treated akin to the predicates introduced by attributive adjectives in the ERG.
Negation, Modality, Operators
The second primary way in which English syntax places constraints on scope involves what we call scopal operators (see ErgSemantics/Basics). Here, we illustrate this with say, probably, and not, all analyzed in ERS as predications that take scopal (i.e. handle-valued) arguments. Their scope relative to each other (and to the main predication with which they co-occur) is fixed by their position in the syntax, and this is recorded in the ERS via the =q handle constraints highlighted in the figure below. These constraints say that _probable_a_1 will be within the scope of _say_v_to (with perhaps a quantifier taking scope in between), and likewise for neg and _probable_a_1, and _rain_v_1 and neg, respectively.
(1) The meteorologist says it probably won't rain.
By using =q constraints rather than direct handle equality, we leave open the possibility of quantifiers coming in between scopal operators and the elements they out-scope because of examples such as:
(2) Every child didn't feed the doves in the park. (3) Cindy didn't light every candle last night. (4) That team will probably win every medal.
[Examples (2) and (3) are due to Lee (2009).]
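What the =q constraints leave open can be sketched as follows. This is a deliberately simplified model, not a scope-resolution algorithm: it treats the operators as a fixed chain, considers a single quantifier, and ignores quantifier restrictions and any other quantifiers in the sentence. The predicate names `_every_q` and `_light_v_1` are assumptions chosen for Example (3).

```python
def scope_orders(operator_chain, quantifier):
    """List the orders in which `quantifier` can be slotted into a fixed chain.

    `operator_chain` lists scopal operators from widest to narrowest scope and
    ends with the main predication. The quantifier may take scope anywhere
    above the main predication; the relative order of the operators is fixed.
    """
    return [
        operator_chain[:i] + [quantifier] + operator_chain[i:]
        for i in range(len(operator_chain))
    ]


# "Cindy didn't light every candle last night."
for order in scope_orders(["neg", "_light_v_1"], "_every_q"):
    print(" > ".join(order))
# _every_q > neg > _light_v_1
# neg > _every_q > _light_v_1
```

With direct handle equality between neg and the verb, the second order would be ruled out. The =q constraint keeps both available.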
Predicates and Argument Identification
ERS does not make explicit lexical semantics beyond what correlates with grammaticized contrasts, e.g. predicate distinctions that co-vary with clear differences in argument frames and types. For example, in the ERS for The bank with the shortest ATM lines is near the river bank, both instances of bank will be represented by _bank_n_of. On the other hand, the examples in (1) show how the distinction between the verb–particle construction, which allows greater flexibility in placement of the up particle, and the ‘standard’ use of up as a directional preposition leads to different ERS representations. This contrast is reflected in different choices for the verbal predicate: two-place _look_v_up, meaning ‘locate’, for (1a) and (1b), vs. ‘plain’ one-place _look_v_1 combined with a two-place directional up predication for (1c):
(1a) Kim looked up the answer. (1b) Kim looked the answer up. (1c) Kim looked up the chimney.
Similarly, the so-called causative–inchoative alternation gives rise to systematic predicate distinctions, corresponding to different usages of, for example, accumulate, age, break, burn, and roughly 400 other verbs in the ERG lexicon:
(2a) Kim broke the window. (2b) The window broke.
ERS provides unique argument identification but not thematic interpretation; for each distinct predicate (or set of predicates from related lexical classes), there is a uniform assignment of semantic roles on the basis of positional argument identification. For example, the ERG provides a systematic predicate alternation between two-place _break_v_cause vs. one-place _break_v_1. Because the bland semantic role labels (ARG1, etc.) are intended to have predicate-specific interpretations, this allows us to differentiate the ARG1 of _break_v_cause (roughly, ‘agent’) from the ARG1 of _break_v_1 (roughly, ‘theme’).
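One way a downstream application might layer thematic glosses on top of the positional roles is a lookup keyed on predicate and role together. The table below is a hypothetical sketch: only the two ARG1 entries come from the text above, and the ARG2 entry is an added assumption.

```python
# Hypothetical gloss table keyed on (predicate, role). The ERG itself does
# not provide such glosses; the ARG2 entry is an assumption.
ROLE_GLOSSES = {
    ("_break_v_cause", "ARG1"): "agent",
    ("_break_v_cause", "ARG2"): "theme",  # assumption: not stated in the text
    ("_break_v_1", "ARG1"): "theme",
}


def gloss(predicate, role):
    """Look up a rough thematic gloss for a predicate-specific role."""
    return ROLE_GLOSSES.get((predicate, role), "unknown")


print(gloss("_break_v_cause", "ARG1"))  # agent
print(gloss("_break_v_1", "ARG1"))      # theme
```

Keying on the pair rather than on ARG1 alone reflects the point that the role labels carry no interpretation independent of the predicate.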
Further ERS Contents
- Tense, aspect
- Plurality, generics, partitives
- Information structure
- Event types/Aktionsart
- Discourse status
- Scope islands
- Restrictive v. non-restrictive modification (rarely grammatically constrained, but constrained in some cases)
Extra-Grammatical Aspects of Meaning
There are many additional layers of meaning representation that are of interest to NLP, but which fall outside the scope of grammar-based processing, because they are not grammatically constrained. Here we outline some of these and describe how ERS can serve as a useful interface for systems or projects annotating (manually or automatically) these additional layers. For more discussion, see Bender et al. 2015.
Below the Level of Composition
On the one hand, there are layers of meaning representation that the grammar does not constrain because they concern only atoms within the composition. Two prominent examples are fine-grained word sense distinctions and named entity types. Regarding the former, ERS only marks those sense distinctions that are morphosyntactically marked (as described above). Since further sense distinctions could never be disambiguated based on grammatical structure alone, ERS instead provides predicate symbols intended to be underspecified representations of a range of more specific word senses. Word sense annotation or disambiguation efforts should find the ERS a useful starting point (as opposed to surface strings), however, both because of the coarse-grained sense distinctions that are provided and because of the normalization across different inflected forms of the same lemma.
For example, the ERS represents both drew and drawing in (1) with the predicate _draw_v_1. Clearly, for hand annotation, an interface relating ERS elementary predications to surface strings would be required, but this is facilitated by the existence of the ‘characterization’ (character offsets) information associated with each predicate symbol. For automatic word sense disambiguation, the explicit identification of arguments (the horse is the ARG2 of _draw_v_1 meaning ‘make a picture’ and the ARG1 of _draw_v_1 meaning ‘pull’) is more helpful than flat word strings.
(1) They drew a horse drawing a wagon.
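The link from predications back to the surface string via character offsets can be sketched in a few lines. The offsets below were counted by hand for this sentence and use start-inclusive, end-exclusive positions; that convention and the tuple layout are assumptions for illustration.

```python
SENTENCE = "They drew a horse drawing a wagon."

# (predicate, start, end): offsets counted by hand for this sentence,
# start-inclusive and end-exclusive (an assumed convention).
DRAW_PREDICATIONS = [
    ("_draw_v_1", 5, 9),
    ("_draw_v_1", 18, 25),
]


def surface_forms(sentence, predications):
    """Pair each predicate with the substring its offsets point to."""
    return [(pred, sentence[start:end]) for pred, start, end in predications]


print(surface_forms(SENTENCE, DRAW_PREDICATIONS))
# [('_draw_v_1', 'drew'), ('_draw_v_1', 'drawing')]
```

An annotation interface could use such pairs to show annotators the inflected form in context while storing sense labels against the normalized predicate.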
Never Grammatically Disambiguated
- Distributive v. collective readings of coordination
- Event quantification
Further Semantic Computation Required
- Quantifier scope resolution
- Coreference resolution
- Focus of negation and other focus-sensitive operators
- Presupposition projection
- Coherence relations/rhetorical structure
- Discourse moves/adjacency pairs
Bender, Emily M., Dan Flickinger, Stephan Oepen, Woodley Packard, and Ann A. Copestake. 2015. Layers of Interpretation: On Grammar and Compositionality. In IWCS, pp. 239–249.
Grice, H. P. 1968. Utterer’s meaning, sentence-meaning, and word-meaning. Foundations of Language 4(3), pp. 225–242.
Lee, Sunyoung. 2009. Interpreting scope ambiguity in first and second language processing: Universal quantifiers and negation. University of Hawaii at Manoa.
Quine, W. V. O. 1960. Word and Object. Cambridge, MA, USA: MIT Press.
Steedman, Mark. 2012. Taking scope: The natural semantics of quantifiers. MIT Press.