Discriminators and the Vocabulary

The main Git repository relevant to the vocabulary and discriminators is https://github.com/lapps/vocabulary-pages. It contains two manually maintained files, lapps.vocabulary and lapps.discriminators, which hold the specification of the vocabulary and the list of discriminators. Both are configuration files that serve as input to transformations that produce HTML pages and Java source code, as depicted below.

The Vocabulary DSL and the Discriminator DSL are Groovy Domain-Specific Languages (DSLs) for the vocabulary and the discriminators.

  1. The Vocabulary DSL executable generates Java classes that need to be exported to the https://github.com/lapps/org.lappsgrid.vocabulary repository, as well as a set of web pages that can then be put online at http://vocab.lappsgrid.org/. In addition, it creates a file that contains the elements of the vocabulary that are always considered discriminators, essentially the names of annotation types. The sources for the Vocabulary DSL are maintained in https://github.com/lappsgrid-incubator/vocabulary-dsl.

  2. The Discriminator DSL executable generates Java classes that need to be exported to the https://github.com/lapps/org.lappsgrid.discriminator repository, and one HTML file that can be added to the vocabulary pages, namely the content of http://vocab.lappsgrid.org/discriminators.html. The sources for the Discriminator DSL are maintained in https://github.com/lappsgrid-incubator/discriminator-dsl.

Both of these repositories let you build a shell script and a jar that the code in https://github.com/lapps/vocabulary-pages uses to create the vocabulary pages and Java source code. Details on how to run the code that creates the HTML pages and Java code are in the README file in https://github.com/lapps/vocabulary-pages.

Vocabulary

Vocabulary pages are defined by top-level elements of the lapps.vocabulary file. Here is the element that defines the http://vocab.lappsgrid.org/NamedEntity page:

NamedEntity {
    parent "Region"
    definition "A phrase that clearly identifies an individual from others that have similar attributes, such as the name of a person, organization, location, artifact, etc. as well as temporal expressions."
    sameAs "$iso/DC-2275"
    discriminator 'ne'
    metadata {
        namedEntityCategorySet {
            type "String or URI"
            description "The set of values that can be used for the category property."
        }
    }
    properties {
        category {
            type "String or URI"
            required true
            description "The type of named entity. Typically one of DATE, PERSON, ORGANIZATION, or LOCATION."
        }
        type {
            type "String or URI"
            description "A type attribute for the entity. For example the type of location or organization."
        }
        gender {
            type "String or URI"
            description "A value such as male, female, unknown. Ideally a URI referencing a pre-defined descriptor."
        }
    }
}

The properties directly under NamedEntity (parent, definition, etc.) form a fixed group. Under metadata and properties you may add any property, as long as its value uses only properties from the set {type, required, description}; required defaults to false. The discriminator property is optional and overrides the default used when discriminators are generated from the vocabulary entry, which is the entry name in lower case. One more property, deprecated, is illustrated below.

Date {
    parent "NamedEntity"
    definition "A reference to a date or period."
    similarTo "http://schema.org/Date"
    sameAs "$iso/DC-6123"
    deprecated NE_DEPRECATED
    properties {
        dateType {
            type "String or URI"
            description "Sub-type information such as date, datetime, time, etc. Ideally a URI referencing a pre-defined descriptor."
        }
    }
}

The deprecated property indicates that the Date annotation type will be removed in a future version of the vocabulary. NE_DEPRECATED is a variable defined earlier in the vocabulary file; it holds the message to be displayed:

NE_DEPRECATED = '''Use <link>NamedEntity</link> with appropriate @category and @type attributes instead. This annotation type will be removed in a future version of the vocabulary.'''
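
The rule stated earlier for entries under metadata and properties (only type, required and description are allowed, and required defaults to false) can be sketched in plain Groovy. This is an illustration only: the helper name normalize and the map representation are assumptions, not code taken from the Vocabulary DSL.

```groovy
// Hypothetical helper, not part of the Vocabulary DSL: checks one property
// specification (given as a map) against the allowed keys and fills in the default.
def ALLOWED = ['type', 'required', 'description'] as Set

def normalize = { String name, Map spec ->
    def unknown = spec.keySet() - ALLOWED
    if (unknown) {
        throw new IllegalArgumentException("Property '${name}' has unsupported keys: ${unknown}")
    }
    // required defaults to false when it is not given
    [required: false] + spec
}

def category = normalize('category', [type: 'String or URI', required: true])
def gender = normalize('gender', [type: 'String or URI'])

assert category.required == true
assert gender.required == false
```

A specification with any other key, for example [type: 'String', units: 'cm'], would be rejected by this sketch with an IllegalArgumentException.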

Discriminators

Some of the discriminators are generated automatically from the vocabulary when you run make vocabulary from the vocabulary-pages repository. For each annotation type, a discriminator is generated from the annotation type name in lower case (unless the discriminator property is used). In addition, some properties are added as discriminators. Which properties those are is currently hard-wired in VocabDsl.groovy; at the moment only token#pos and token#lemma are added.
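
The naming rule described above can be sketched as follows. The input map is made-up sample data, and the code only mirrors the rule as stated in the text; it is not the logic in VocabDsl.groovy.

```groovy
// Hypothetical sample data: annotation type name -> optional discriminator override.
def entries = [
    NamedEntity: 'ne',   // override, as in the NamedEntity entry
    Date       : null,   // no override, so the default applies
    Token      : null
]

// Properties that are also added as discriminators (per the text above).
def extraProperties = ['token#pos', 'token#lemma']

// Use the override when present, otherwise the type name in lower case.
def names = entries.collect { type, override -> override ?: type.toLowerCase() } + extraProperties

assert names == ['ne', 'date', 'token', 'token#pos', 'token#lemma']
```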

Many other non-vocabulary discriminators are defined in lapps.discriminators. Here is an example:

lif {
  uri media('jsonld#lif')
  description "LAPPS Interchange format. (LIF)"
}

Note that media is a closure defined at the top of the file; it expands to the full URI for media types (using another embedded closure). The link target exists (http://vocab.lappsgrid.org/ns/media/jsonld#lif), and the value of description appears on that page.
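
To illustrate how one closure can build on another to expand a short name into a full URI, here is a minimal sketch. The closure name ns and the base URI are assumptions inferred from the link above; the actual definitions in lapps.discriminators may differ.

```groovy
// Assumed base URI, inferred from the link cited above.
def ns = { String path -> "http://vocab.lappsgrid.org/ns/${path}".toString() }

// media delegates to ns, mirroring the "embedded closure" mentioned in the text.
def media = { String name -> ns("media/${name}") }

assert media('jsonld#lif') == 'http://vocab.lappsgrid.org/ns/media/jsonld#lif'
```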

Templates

The Vocabulary and Discriminator DSLs use templates that control the layout of the generated HTML pages. The syntax for the templates is plain Groovy, specifically the syntax of Groovy’s built-in MarkupBuilder class.
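
For readers unfamiliar with MarkupBuilder, the following stand-alone sketch shows the general style of such markup code. The page layout and the entry map are invented for illustration and are not taken from the actual templates.

```groovy
import groovy.xml.MarkupBuilder

def writer = new StringWriter()
def builder = new MarkupBuilder(writer)

// Hypothetical page data; the real templates receive their data from the DSL.
def entry = [name: 'NamedEntity', parent: 'Region']

// Each method call becomes an element; nested closures become child elements.
builder.html {
    head { title(entry.name) }
    body {
        h1(entry.name)
        p("Parent: ${entry.parent}")
    }
}

def page = writer.toString()
assert page.contains('<h1>NamedEntity</h1>')
println page
```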