Background

This page is an informal, grass-roots effort (with partial support from the WeSearch project) towards an ‘encyclopedia’ of the semantic analyses developed in the English Resource Grammar (ERG), i.e. what one might call documentation of the downstream interface to the ERG parser, or of the input interface to its generator.

This page and its descendants represent work in progress, for the time being at least. Read on with a grain of salt and at your own risk.

Fundamentals

One of our goals in documenting the ERG semantics is to make explicit the important differences in our degree of confidence in individual analyses. In some cases, current semantic analyses reflect a careful design process (possibly building on supporting background literature or revisions of earlier attempts); in other cases, there may be known minor deficiencies; and for yet another group of semantic phenomena, current analyses may be mere placeholders (‘tying things together’ somehow, without a deep commitment to the specifics of the analysis) or plain broken, i.e. formally not well-formed or otherwise ludicrous.

Discovery Procedure

We developed a discovery procedure that starts from grammar entities (phrase structure rules, lexical rules, and lexical types) in the current version of the ERG, to enable a data-driven exploration of the semantic phenomena that have received treatments in the ERG to date. The procedure begins by identifying grammar entities that are likely to contribute to the composition of semantic representations that go beyond the basics. The details of what was considered ‘beyond the basics’ in the discovery of semantic phenomena are summarized on the ErgSemantics/Discovery page, together with some reflections on the effectiveness of the current procedure.

We organize this documentation in terms of what we consider semantic phenomena; the emerging inventory of phenomena is available as the ErgSemantics/Inventory, ordered lexicographically.

ERG Semantic Documentation (ESD) Test Suite

One aspect of the documentation produced in this work is a test suite illustrating each identified phenomenon with one or more short, simple sentences, attempting to balance restricted vocabulary size with the clarity of the intended reading of each example. This test suite can be viewed as an extension of the MRS Test Suite.

Semantic Fingerprints

In capturing semantic phenomena (and hopefully also in future work on automated regression testing) we invoke a notion of semantic fingerprints, i.e. characteristics of the MRS configuration that identify the phenomenon. We utilize a compact template language for MRS fingerprints (similar in form to the MRS LaTeX style) that makes the specification of labels and (characterization) links optional, and further allows wild-carding of predicate symbols and role labels (using ‘_’, i.e. just an underscore). For plain N–N compounding, as in garden dog, for example, we take the semantic fingerprint to look something like the following:

  h0:compound[ARG1 x1, ARG2 x2]
  h0:[ARG0 x1]
  [ARG0 x2]

In other words, the phenomenon is characterized by the appearance of the two-place compound relation, linking together another two EPs in the configuration indicated by the shared label h0 (of the compound head and the two-place modifier relation) and the shared referential indices x1 and x2. We do not include the covert quantifier required when the modifier is a non-quantified nominal, or the =q handle constraint holding between the udef_q and the EP introducing x2 (corresponding to garden in our example), because this part of the semantic analysis of the compound construction follows from the analyses of separate phenomena (though ones that are typically co-present with this type of compounding), i.e. general ERG assumptions about the representations of common nouns and quantifiers.
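As an informal illustration of how such a fingerprint can be read as a pattern over elementary predications, the following Python sketch matches the compound fingerprint against a small EP list. Everything in it is an assumption made for illustration: the EP list is a hand-written, simplified stand-in for an analysis of "The garden dog barked." (not actual ERG output), the data layout and the function names unify, match_ep, and match are hypothetical, and the requirement that each pattern line matches a distinct EP is a choice of this sketch. It is not the matcher behind the fingerprint search interface, and it does not cover wild-carding of role labels.

```python
# Hand-written, simplified EPs for "The garden dog barked." (illustrative only).
EPS = [
    {"label": "h4", "pred": "_the_q", "args": {"ARG0": "x3"}},
    {"label": "h7", "pred": "compound",
     "args": {"ARG0": "e8", "ARG1": "x3", "ARG2": "x9"}},
    {"label": "h10", "pred": "udef_q", "args": {"ARG0": "x9"}},
    {"label": "h13", "pred": "_garden_n_1", "args": {"ARG0": "x9"}},
    {"label": "h7", "pred": "_dog_n_1", "args": {"ARG0": "x3"}},
    {"label": "h1", "pred": "_bark_v_1", "args": {"ARG0": "e2", "ARG1": "x3"}},
]

# One tuple per fingerprint line: (label variable or None,
# predicate or None for a wildcard, {role: variable}).
FINGERPRINT = [
    ("h0", "compound", {"ARG1": "x1", "ARG2": "x2"}),
    ("h0", None, {"ARG0": "x1"}),
    (None, None, {"ARG0": "x2"}),
]


def unify(var, value, bindings):
    """Bind a pattern variable to an MRS variable, or check an existing binding."""
    if var in bindings:
        return bindings if bindings[var] == value else None
    return {**bindings, var: value}


def match_ep(pattern, ep, bindings):
    """Return extended bindings if the EP satisfies the pattern line, else None."""
    label, pred, roles = pattern
    if pred is not None and pred != ep["pred"]:
        return None
    if label is not None:
        bindings = unify(label, ep["label"], bindings)
        if bindings is None:
            return None
    for role, var in roles.items():
        if role not in ep["args"]:
            return None
        bindings = unify(var, ep["args"][role], bindings)
        if bindings is None:
            return None
    return bindings


def match(fingerprint, eps, bindings=None, used=()):
    """Yield every consistent binding, using each EP for at most one pattern line."""
    bindings = {} if bindings is None else bindings
    if not fingerprint:
        yield bindings
        return
    first, rest = fingerprint[0], fingerprint[1:]
    for i, ep in enumerate(eps):
        if i in used:
            continue
        extended = match_ep(first, ep, bindings)
        if extended is not None:
            yield from match(rest, eps, extended, used + (i,))


matches = list(match(FINGERPRINT, EPS))
print(matches[0] if matches else "no match")
```

In this toy input the unlabelled third pattern line is satisfied both by the EP for garden and by the covert quantifier binding the same variable, so the sketch finds the same variable binding twice. This mirrors the point made above that the quantifier is deliberately left out of the fingerprint.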

There is a search interface for ‘fingerprinting’ collections of ERG analyses, i.e. for using the fingerprints (or variants of them) to search for instances of semantic phenomena in either the ESD Test Suite or the DeepBank Treebank.

How to Cite this Work

Links

References

ErgSemantics (last edited 2014-11-04 08:54:07 by StephanOepen)
