Skip to content

FeforStandoffAnnotationInterface

BenjaminWaldron edited this page Jun 14, 2006 · 4 revisions

Time

1200 to ??? Weds 14th.

Topics

Interface between standoff XML interface (SMAF) and the deep parser (deep grammar running on LKB/PET).

Sketch:

* Discuss the standard for the process (x) below:

PREPROCESSING --.--> SMAF XML x> deep parser

* Current SMAF looks something like (see also SmafExample):

{{{<?xml version='1.0' encoding='UTF-8'?>

* SMAF edges may take a variety of types: Eg.

  • token
  • morph
  • named-entity
  • pos
  • oscar

* SMAF edges possess a number of properties: id, standoff from/to, lattice source/target + a content blob

* A content blob consists of

  • either TEXT
  • or one or more of the following:
    • RMRS
    • slots
    • (typed feature structure)

* [SciBorg] We have been playing with the use of a config file to define the mapping (x). This looks something like:

;;;; PROCESSOR settings (LKB) (share with PET?)

;; "instantiate chart" with edges of following form:
;; token edges (edgeType='tok'), possess: a string (tokenStr)
;; non-token edges (edgeType='morph'), may possess: stem + partialTree
;; non-token edges which also give rise to token edge in chart (edgeType='tok+morph'):
;;    union of 'tok' and 'morph' above

token.[] -> edgeType='tok' tokenStr=content
morph.[] -> edgeType='morph' stem=content.stem partialTree=content.partial-tree
pos.[] -> edgeType='morph'
oscar.[] -> edgeType='tok+morph' tokenStr=content.surface

;;;; GRAMMAR specific settings (ERG)

;; map SMAF type into type in grammar's type hierarchy
;; map SMAF RMRS content into lexical entry

;; "slot" definitions

define gMap.type ()
define gMap.pred (synsem lkeys keyrel pred)
define gMap.carg (synsem lkeys keyrel carg) STRING
define gMap.rel (synsem lkeys keyrel)

;; syn(sem) type

oscar.[type='compound'] -> gMap.type='n_proper_nale'
oscar.[type='substance'] -> gMap.type='n_proper_nale'
oscar.[type='element'] -> gMap.type='n_proper_nale'
oscar.[type='namender'] -> gMap.type='n_proper_nale'
oscar.[type='adjective'] -> gMap.type='adj_intrans_nale'

;; semantics 

;; either pure native RMRS

oscar.[] -> gRmrs=content.rmrs

;; or slots (REL + CARG)

oscar.[type='compound'] -> gMap.pred='chem_compound_rel'
oscar.[type='substance'] -> gMap.pred='chem_substance_rel'
oscar.[type='element'] -> gMap.pred='chem_element_rel'
oscar.[type='namender'] -> gMap.pred='named_rel'

oscar.[type='compound'] -> gMap.carg=content.surface
oscar.[type='substance'] -> gMap.carg=content.surface
oscar.[type='element'] -> gMap.carg=content.surface
oscar.[type='namender'] ->  gMap.carg=content.surface

;; or feature structures as in MAF??? no

* Collect and examine some examples...

Clone this wiki locally