lapps.github.io
LAPPS Vocabulary - Current Issues
Last update: November 20th, 2014.
There are now four places where the syntax and semantics of the LAPPS data structures are defined: the source code (mainly the wrappers), the LAPPS Web Services Exchange Vocabulary, the LAPPS JSON schema, and the LAPPS Interchange Format (LIF) specifications. None of these is final, all of them have open issues, and the four sources are not in sync. This page lists open issues for the LAPPS vocabulary: ambiguities, omissions, and inconsistencies relative to the LIF specifications (for those cases where the latter are ahead of the vocabulary).
- Required versus optional attributes
- The vocabulary does not make a distinction between required and optional attributes. We do not yet have a clear idea what attributes should be required. Take for example the Annotation element. We should probably require the id attribute. But we cannot require start and end since under Annotation we may have things like Dependency, for which character offsets are not meaningful.
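One way to picture the distinction is the minimal Python sketch below, in which only `id` is required and the offsets are optional. The class layout and the rule that `start` and `end` must be given together are assumptions made for illustration; neither is defined by the vocabulary.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Annotation:
    """Hypothetical model: `id` is required, character offsets are optional."""
    id: str
    start: Optional[int] = None
    end: Optional[int] = None

    def __post_init__(self) -> None:
        if not self.id:
            raise ValueError("id is required")
        # Assumption: offsets are given together or not at all.
        if (self.start is None) != (self.end is None):
            raise ValueError("start and end must be given together")
        if self.start is not None and self.start > self.end:
            raise ValueError("start must not exceed end")


# An annotation with meaningful character offsets.
token = Annotation(id="tok1", start=0, end=4)
# A Dependency-like annotation for which offsets are not meaningful.
dependency = Annotation(id="dep1")
```

Under this sketch, a subtype such as Token could tighten the rule and require offsets, while Dependency would leave them unset.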
- Lists versus sets
- When the vocabulary says "List of URIs", does this imply that the list is ordered? Shall we use List and Ordered List, or List and Set?
- Adding vocabulary pages for coreference
- See the [Coreference in LIF](http://lapps.github.io/interchange/coref-v3.html) page. We need to add two elements.
- Coreference. Stores all information on coreference. It has two features:
  - mentions. A list of identifiers. Each identifier points at an object of type Annotation or a subtype thereof.
  - representative. An identifier that points to the full form of the elements in the coreference chain, that is, one of the elements of the mentions list.
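As a rough illustration, a Coreference object with these two features might be represented as in the sketch below. The key layout (`@type`, `id`, `features`) and the identifiers `m1` to `m3` are assumptions for the example, not part of the vocabulary; the helper only checks the constraint stated above, namely that the representative is one of the mentions.

```python
import json

# Hypothetical Coreference object; key layout and ids are assumptions.
coreference = {
    "@type": "Coreference",
    "id": "coref1",
    "features": {
        "mentions": ["m1", "m2", "m3"],
        "representative": "m1",
    },
}


def representative_is_mention(coref: dict) -> bool:
    """Return True if the representative is one of the listed mentions."""
    features = coref["features"]
    return features["representative"] in features["mentions"]


print(json.dumps(coreference, indent=2))
print(representative_is_mention(coreference))
```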
- Markable. This is needed in LIF in case we do not have annotation objects that the mentions list can point to. An object then needs to be created from the offsets alone, and we call this a Markable object. It has an extra feature named targets that allows it to point to other annotation objects. It may also need features like ENTITY_MENTION_TYPE in order to store what is available in the output of typical coreference services.
- Adding vocabulary pages for phrase structure
- See the [Phrase structure in LIF](http://lapps.github.io/interchange/phrase_structure-v1.html) page. We need to add two elements.
- PhraseStructure. The container with all phrase structure information. Possibly an immediate subtype of Annotation. It has two features:
  - categorySet. A metadata feature containing a URI for a particular category set. If defined in the LAPPS vocabulary, this URI would be inside http://vocab.lappsgrid.org/ns/types. Question: should this be a list of URIs?
  - constituents. A set of annotation objects of type Constituent.
- Constituent. The list of constituents defines the tree structure of the parse tree. Each constituent has the following features:
  - label. A category label, defined in the URI that is the value of PhraseStructure#categorySet.
  - children. An ordered list of identifiers. Each identifier points to an annotation object of type Constituent.
  - categorySet. A metadata feature containing a URI for a particular category set. If defined in the LAPPS vocabulary, this URI would be inside http://vocab.lappsgrid.org/ns/types.
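To see how label and children together define a tree, consider the small Python sketch below. The identifiers, the labels, and the choice to key constituents by id are all made up for illustration; only the two features label and children come from the description above.

```python
# Hypothetical constituents keyed by id; each has a label and an ordered
# list of child ids that point at other constituents.
constituents = {
    "c0": {"label": "S", "children": ["c1", "c2"]},
    "c1": {"label": "NP", "children": []},
    "c2": {"label": "VP", "children": ["c3"]},
    "c3": {"label": "V", "children": []},
}


def find_root(nodes: dict) -> str:
    """The root is the only constituent that is nobody's child."""
    referenced = {c for n in nodes.values() for c in n["children"]}
    roots = [cid for cid in nodes if cid not in referenced]
    if len(roots) != 1:
        raise ValueError("expected exactly one root constituent")
    return roots[0]


def bracket(nodes: dict, cid: str) -> str:
    """Render the subtree rooted at `cid` in bracketed notation."""
    node = nodes[cid]
    parts = [node["label"]] + [bracket(nodes, c) for c in node["children"]]
    return "(" + " ".join(parts) + ")"


print(bracket(constituents, find_root(constituents)))
# prints: (S (NP) (VP (V)))
```

The sketch suggests why children needs to be an ordered list while constituents can be a set: the tree, including sibling order, is fully recoverable from the children lists alone.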
- Adding vocabulary pages for dependency structure
- See the [Dependency structure in LIF](http://lapps.github.io/interchange/dependencies-v1.html) page. Two new elements:
- DependencyStructure. The container with all dependency structure information. Possibly an immediate subtype of Annotation. It has three features:
  - dependencySet. A metadata feature containing a URI for a particular set of dependency labels. If defined in the LAPPS vocabulary, the URI would be inside http://vocab.lappsgrid.org/ns/types.
  - type. The type of dependencies: basic-dependencies, collapsed-dependencies, etc. Given Steve Cassidy's insistence on having @type as well as type in his JSON-LD, and the possibility that we go along with this, maybe this feature should be called dependencyType.
  - dependencies. A set of annotation objects of type Dependency.
- Dependency. The list of dependencies defines the dependency structure. Each dependency has three features:
  - label. A dependency label, defined in the URI that is the value of DependencyStructure#dependencySet.
  - governor. An identifier pointing at an object of type Annotation or a subtype thereof. Can be null for the root dependency.
  - dependent. An identifier pointing at an object of type Annotation or a subtype thereof.
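The three Dependency features can be sketched in Python as follows. The token identifiers and the labels (`root`, `nsubj`, `dobj`) are illustrative assumptions; the sketch uses `None` for the null governor of the root dependency.

```python
from typing import List, Optional, Tuple

# Hypothetical dependencies; ids and labels are made up for illustration.
dependencies = [
    {"label": "root", "governor": None, "dependent": "tok2"},
    {"label": "nsubj", "governor": "tok2", "dependent": "tok1"},
    {"label": "dobj", "governor": "tok2", "dependent": "tok3"},
]


def root_dependents(deps: List[dict]) -> List[str]:
    """Return the dependents of all dependencies that have no governor."""
    return [d["dependent"] for d in deps if d["governor"] is None]


def dependents_of(deps: List[dict], governor: Optional[str]) -> List[Tuple[str, str]]:
    """Return (label, dependent) pairs for a given governor id."""
    return [(d["label"], d["dependent"]) for d in deps if d["governor"] == governor]


print(root_dependents(dependencies))
print(dependents_of(dependencies, "tok2"))
```

Because each dependency names its own governor and dependent, the structure does not depend on the order of the dependencies list, which is consistent with describing it as a set.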
- The Date object
- Dates have a dateType feature with values like date, datetime and time. For features like this, should we add a metadata feature such as dateTypeSet, whose value is a URI for the type definitions? This question is relevant for other object types as well. Also, we should add a value feature to store the normalized value, as well as other Timex2 and Timex3 features.
- Other Issues
- Thing. The layout of the headers is different from the other pages.
- Date. URLs in sameAs and similarTo are not links.
- Person. URLs in sameAs and similarTo are not links.
- Document. Among the URLs in sameAs, the isocat reference is not a link. Also, Document is where the language property is defined, but in LIF we use @language as a key associated with a text string. We should probably add a Text element.
- Sentence. sentenceType is in red.
- Location. locType is in red.
- TextDocument. Has a different table layout than the other pages.
- AudioDocument. Has a different table layout than the other pages.
- Token. pos needs to be changed to posTag.