
Chapter 14
Working with Ontologies

GATE provides an API for modeling and manipulating ontologies and comes with two plugins that provide implementations for the API and several tools for simple editing of ontologies and using them for document annotation. Note that for more complex ontology editing, it may be better to use a tool such as Protégé to first edit the ontology outside GATE.

Ontologies in GATE are classified as language resources. In order to create an ontology language resource, the user must first load a plugin containing an ontology implementation.

The following implementations and ontology related tools are provided as plugins:

GATE ontology support aims to simplify the use of ontologies both within the set of GATE tools and for programmers using the GATE ontology API. The GATE ontology API hides the details of the actual backend implementation and allows a simplified manipulation of ontologies by modeling ontology resources as easy-to-use Java objects. Ontologies can be loaded from and saved to various serialization formats.

GATE's ontology support roughly corresponds, in representation, manipulation and inference, to what is supported in OWL-Lite (see http://www.w3.org/TR/owl-features/). This means that a user can represent information in an ontology that conforms to OWL-Lite and that the GATE ontology model will provide inferred information equivalent to what an OWL-Lite reasoner would provide. The GATE ontology model also attempts, to some extent, to provide useful information for ontologies that do not conform to OWL-Lite: RDFS, OWL-DL, OWL-Full or OWL2 ontologies can be loaded, but GATE might ignore part or all of the contents of those ontologies, or might provide only partial or incorrect inferred facts for them. If an ontology is loaded that contains a restriction not supported by OWL-Lite, like oneOf, unionOf, intersectionOf, or complementOf, the classes to which such restrictions apply will not be found in some situations because the Ontology API has no way of representing such restrictions. For example, such classes will not show up when requesting the direct subclasses of a given class. In other situations, e.g. when retrieved directly using the URI, the class will be found. To avoid such confusing behavior, the Ontology plugin should not be used with ontologies that do not conform to OWL-Lite.

The GATE API tries to prevent clients from modifying an ontology that conforms to OWL-Lite to become OWL-DL or OWL-Full and also tries to prevent or warn about some of the most common errors that would make the ontology inconsistent. However, the current implementation is not able to prevent all such errors and has no way of finding out if an ontology conforms to OWL-Lite or is inconsistent.

14.1 Data Model for Ontologies

14.1.1 Hierarchies of Classes and Restrictions

The class hierarchy (or taxonomy) plays the central role in the ontology data model. It consists of a set of ontology classes (represented by OClass objects in the ontology API) linked by subClassOf, superClassOf and equivalentClassAs relations. Each ontology class is identified by a URI (unless it is a restriction or an anonymous class, see below). The URI of each ontology resource must be unique.

Each class can have a set of superclasses and a set of subclasses; these are used to build the class hierarchy. The subClassOf and superClassOf relations are transitive, and the API provides methods for calculating the transitive closure of each of these relations for a given class. The transitive closure of the set of superclasses of a given class is the set containing all the superclasses of that class, as well as all the superclasses of its direct superclasses, and so on until no more are found. This calculation is finite: its upper bound is the set of all the classes in the ontology. A class that has no superclasses is called a top class. An ontology can have several top classes. Although the GATE ontology API can deal with cycles in the hierarchy graph, these can cause problems for processes using the API and probably indicate an error in the definition of the ontology. Other components of GATE, such as the ontology editor, cannot deal with cyclic class structures and will terminate with an error. Care should be taken to avoid such situations.
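The closure calculation described above can be sketched in plain Java. This is only an illustration under stated assumptions: class URIs are modelled as strings, the direct-superclass links are held in a map, and the names SuperclassClosureSketch, allSuperclasses and isTopClass are hypothetical; none of this is the GATE ontology API itself.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative stand-in for a class hierarchy; strings replace OClass objects. */
class SuperclassClosureSketch {

    /**
     * Returns every class reachable from start by following direct-superclass links.
     * Each class is expanded at most once, so the loop terminates even if the
     * hierarchy (incorrectly) contains a cycle.
     */
    static Set<String> allSuperclasses(String start, Map<String, List<String>> directSupers) {
        Set<String> closure = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(directSupers.getOrDefault(start, List.of()));
        while (!pending.isEmpty()) {
            String current = pending.pop();
            if (closure.add(current)) {
                // First visit: queue the direct superclasses of this class as well.
                pending.addAll(directSupers.getOrDefault(current, List.of()));
            }
        }
        return closure;
    }

    /** A top class is a class with no superclasses at all. */
    static boolean isTopClass(String cls, Map<String, List<String>> directSupers) {
        return directSupers.getOrDefault(cls, List.of()).isEmpty();
    }
}
```

With the links City → Location and Location → Entity, allSuperclasses("City", ...) collects Location and Entity, and Entity is reported as a top class. If the hierarchy contains a cycle, the starting class ends up in its own closure, which is one way to detect the error condition mentioned above.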

A pair of ontology classes can also have an equivalentClassAs relation, which indicates that the two classes are virtually the same and all their properties and instances should be shared.

A restriction (represented by Restriction objects in the GATE ontology API) is an anonymous class (i.e., the class is not identified by a URI/IRI) and is set on an object or a datatype property to restrict some instances of the specified domain of the property to have only certain values (also known as a value constraint) or a certain number of values (also known as a cardinality restriction) for the property. Thus for each restriction there exist at least three triples in the repository: one that defines the resource as a restriction, another that indicates the property on which the restriction is specified, and a third that indicates the constraint set on the cardinality or value of the property. There are six types of restrictions:

  1. Cardinality Restriction (owl:cardinalityRestriction): the only valid values for this restriction in OWL-Lite are 0 and 1. A cardinality restriction set to either 0 or 1 implies both a MinCardinality Restriction and a MaxCardinality Restriction set to the same value.

  2. MinCardinality Restriction (owl:minCardinalityRestriction)

  3. MaxCardinality Restriction (owl:maxCardinalityRestriction)

  4. HasValue Restriction (owl:hasValueRestriction)

  5. AllValuesFrom Restriction (owl:allValuesFromRestriction)

  6. SomeValuesFrom Restriction (owl:someValuesFromRestriction)

Please visit the OWL Reference for more detailed information on restrictions.
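The relationship between the three cardinality restrictions listed above can be made concrete with a small sketch. The class CardinalitySketch and its Bounds type are hypothetical names introduced for illustration; they are not part of the GATE ontology API.

```java
/** Illustrative model of cardinality bounds; not the GATE Restriction classes. */
class CardinalitySketch {

    /** Hypothetical value type: the minimum and maximum number of values allowed. */
    static final class Bounds {
        final int min;
        final int max;

        Bounds(int min, int max) {
            this.min = min;
            this.max = max;
        }

        /** True if an instance with this many property values satisfies the bounds. */
        boolean allows(int valueCount) {
            return valueCount >= min && valueCount <= max;
        }
    }

    /**
     * An exact cardinality of n behaves like a MinCardinality of n combined with
     * a MaxCardinality of n. OWL-Lite only permits the values 0 and 1.
     */
    static Bounds exactly(int n) {
        if (n != 0 && n != 1) {
            throw new IllegalArgumentException("OWL-Lite allows only 0 or 1, got " + n);
        }
        return new Bounds(n, n);
    }
}
```

For example, exactly(1) accepts an instance with one value for the property and rejects instances with zero or two values.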

14.1.2 Instances

Instances, also often called individuals, are objects that belong to classes. Like named classes, each instance is identified by a URI. Each instance can belong to one or more classes and can have properties with values. Two instances can have the sameInstanceAs relation, which indicates that the property values assigned to both instances should be shared and that all the properties applicable to one instance are also valid for the other. In addition, there is a differentInstanceAs relation, which declares the instances as disjoint.

Instances are represented by OInstance objects in the API. API methods are provided for getting all the instances in an ontology, all the ones that belong to a given class, and all the property values for a given instance. There is also a method to retrieve a list of classes that the instance belongs to, using either transitive or direct closure.

14.1.3 Hierarchies of Properties

The last part of the data model is made up of hierarchies of properties that can be associated with objects in the ontology. The specification of the type of objects that properties apply to is done through the means of domains. Similarly, the types of values that a property can take are restricted through the definition of a range. A property with a domain that is an empty set can apply to instances of any type (i.e. there are no restrictions given). Like classes, properties can also have superPropertyOf, subPropertyOf and equivalentPropertyAs relations among them.

GATE supports the following property types:

  1. Annotation Property:

    An annotation property is associated with an ontology resource (i.e. a class, property or instance) and can have a Literal as value. A Literal is a Java object that can refer to the URI of any ontology resource or a string (http://www.w3.org/2001/XMLSchema#string) with the specified language or a data type (discussed below) with a compatible value. Two annotation properties can not be declared as equivalent. It is also not possible to specify a domain or range for an annotation property or a super or subproperty relation between two annotation properties. Five annotation properties, predefined by OWL, are made available to the user whenever a new ontology instance is created:

    • owl:versionInfo,

    • rdfs:label,

    • rdfs:comment,

    • rdfs:seeAlso, and

    • rdfs:isDefinedBy.

    In other words, even when the user creates an empty ontology, these annotation properties are created automatically and available to users.

  2. Datatype Property:

    A datatype property is associated with an ontology instance and can have a Literal value that is compatible with its data type. A data type can be one of the pre-defined data types in the GATE ontology API:


    A set of ontology classes can be specified as a property’s domain; in that case the property can be associated with the instance belonging to all of the classes specified in that domain only (the intersection of the set of domain classes).

    Datatype properties can have other datatype properties as subproperties.

  3. Object Property:

    An object property is associated with an ontology instance and has an instance as value. A set of ontology classes can be specified as property’s domain and range. Then the property can only be associated with the instances belonging to all of the classes specified as the domain. Similarly, only the instances that belong to all the classes specified in the range can be set as values.

    Object properties can have other object properties as subproperties.

  4. RDF Property:

    RDF properties are more general than datatype or object properties. The GATE ontology API uses RDFProperty objects to hold datatype properties, object properties, annotation properties or actual RDF properties (rdf:Property).

    Note: The use of RDFProperty objects for creating, or manipulating RDF properties is carried over from previous implementations for compatibility reasons but should be avoided.
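The domain and range rule described for datatype and object properties (an instance must belong to all of the listed classes, and an empty set imposes no restriction) can be sketched as follows. This is a minimal illustration, assuming classes are identified by strings and that the set of types passed in already includes inherited types; DomainRangeSketch and its methods are hypothetical names, not GATE API calls.

```java
import java.util.Set;

/** Illustrative only: strings stand in for OClass and OInstance objects. */
class DomainRangeSketch {

    /**
     * An instance satisfies a domain (or range) when it belongs to every class
     * listed in it. An empty set places no restriction, so any instance qualifies.
     */
    static boolean satisfies(Set<String> instanceTypes, Set<String> requiredClasses) {
        return instanceTypes.containsAll(requiredClasses);
    }

    /**
     * An object property value is acceptable when the subject fits the domain
     * and the value instance fits the range.
     */
    static boolean mayAssert(Set<String> subjectTypes, Set<String> domain,
                             Set<String> valueTypes, Set<String> range) {
        return satisfies(subjectTypes, domain) && satisfies(valueTypes, range);
    }
}
```

For a property with domain {Person, Employee}, an instance typed only as Person does not qualify, while an instance typed as both Person and Employee does.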

All properties (except the annotation properties) can be marked as functional properties, which means that for a given instance in their domain, they can take at most one value, i.e. they define a function in the algebraic sense. Properties inverse to functional properties are marked as inverse functional. If one likens ontology properties to algebraic relations, the semantics of these become apparent.

14.1.4 URIs

URIs are used to identify resources (instances, classes, properties) in an ontology. All URIs that identify classes, instances, or properties in an ontology must consist of two parts:

URIs uniquely identify resources: each resource can have at most one URI and each URI can be associated with at most one resource.

URIs are represented by OURI objects in the API. The Ontology object provides factory methods to create OURIs from a complete URI string or by appending a name to the default namespace of the ontology. However, it is the responsibility of the caller to ensure that any strings passed to these factory methods do in fact represent valid URIs. GATE provides some helper methods in the OUtils class to help with encoding and decoding URI strings.
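Because the caller is responsible for passing valid URI strings, it can be useful to check a string before handing it to a factory method. The sketch below uses only the standard java.net.URI class as a stand-in check; UriCheckSketch, isAbsoluteUri and inNamespace are hypothetical helper names and do not reproduce the OUtils helpers mentioned above.

```java
import java.net.URI;
import java.net.URISyntaxException;

/** Illustrative pre-check for URI strings using the standard library only. */
class UriCheckSketch {

    /** Returns true if the string parses as a URI and has a scheme. */
    static boolean isAbsoluteUri(String candidate) {
        try {
            return new URI(candidate).isAbsolute();
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /**
     * Appends a name to a namespace and rejects the result if it is not a
     * valid absolute URI (for example because the name contains a space).
     */
    static String inNamespace(String namespace, String localName) {
        String uri = namespace + localName;
        if (!isAbsoluteUri(uri)) {
            throw new IllegalArgumentException("Not a valid URI: " + uri);
        }
        return uri;
    }
}
```

Here inNamespace("http://example.org/onto#", "Person") yields a usable URI string, whereas a name such as "New York" is rejected because java.net.URI does not accept an unencoded space; such names would need to be encoded first.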

14.2 Ontology Event Model

An Ontology Event Model (OEM) is implemented and incorporated into the new GATE ontology API. Under the new OEM, events are fired when a resource is added, modified or deleted from the ontology.

An interface called OntologyModificationListener is created with five methods (see below) that need to be implemented by the listeners of ontology events.

public void resourcesRemoved(Ontology ontology, String[] resources);

This method is invoked whenever an ontology resource (a class, property or instance) is removed from the ontology. Deleting one resource can also result in the deletion of other dependent resources. For example, deleting a class should also delete all its instances (more details on how deletion works are explained later). The second parameter, an array of strings, provides a list of URIs of resources deleted from the ontology.

public void resourceAdded(Ontology ontology, OResource resource);

This method is invoked whenever a new resource is added to the ontology. The parameters provide references to the ontology and the resource being added to it.

public void ontologyRelationChanged(Ontology ontology, OResource resource1,
                                   OResource resource2, int eventType);

This method is invoked whenever a relation between two resources (e.g. OClass and OClass, RDFProperty and RDFProperty, etc.) is changed. Example events are the addition or removal of a subclass or a subproperty, two classes or properties being set as equivalent or different, and two instances being set as same or different. The first parameter is the reference to the ontology, the next two parameters are the resources being affected and the final parameter is the event type. Please refer to the list of events specified below for the different types of events.

public void resourcePropertyValueChanged(Ontology ontology,
                                         OResource resource,
                                         RDFProperty property,
                                         Object value, int eventType);

This method is invoked whenever a property value is added to or removed from a resource. The first parameter provides a reference to the ontology in which the event took place. The second provides a reference to the resource affected, the third provides a reference to the property for which the value is added or removed, the fourth is the actual value being set on the resource and the fifth identifies the type of event.

public void ontologyReset(Ontology ontology);

This method is called whenever the ontology is reset, in other words when all resources of the ontology are deleted using the ontology.cleanup method.

The OConstants class defines the static constants, listed below, for various event types.

public static final int OCLASS_ADDED_EVENT;
public static final int ANONYMOUS_CLASS_ADDED_EVENT;
public static final int HAS_VALUE_RESTRICTION_ADDED_EVENT;
public static final int SUB_CLASS_ADDED_EVENT;
public static final int SUB_CLASS_REMOVED_EVENT;
public static final int EQUIVALENT_CLASS_EVENT;
public static final int ANNOTATION_PROPERTY_ADDED_EVENT;
public static final int DATATYPE_PROPERTY_ADDED_EVENT;
public static final int OBJECT_PROPERTY_ADDED_EVENT;
public static final int TRANSTIVE_PROPERTY_ADDED_EVENT;
public static final int SYMMETRIC_PROPERTY_ADDED_EVENT;
public static final int OBJECT_PROPERTY_VALUE_ADDED_EVENT;
public static final int RDF_PROPERTY_VALUE_ADDED_EVENT;
public static final int RDF_PROPERTY_VALUE_REMOVED_EVENT;
public static final int EQUIVALENT_PROPERTY_EVENT;
public static final int OINSTANCE_ADDED_EVENT;
public static final int DIFFERENT_INSTANCE_EVENT;
public static final int SAME_INSTANCE_EVENT;
public static final int RESOURCE_REMOVED_EVENT;
public static final int SUB_PROPERTY_ADDED_EVENT;
public static final int SUB_PROPERTY_REMOVED_EVENT;

An ontology is responsible for firing the various ontology events. Objects wishing to listen to ontology events must implement the methods above and must be registered with the ontology using the following method.

addOntologyModificationListener(OntologyModificationListener oml);

The following method cancels the registration.

removeOntologyModificationListener(OntologyModificationListener oml);
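The register, notify and unregister cycle can be illustrated with a deliberately simplified model. Everything in the sketch below is a hypothetical stand-in written for this illustration: the two-method Listener interface, the TinyOntology class and the RecordingListener are not the GATE types, and the real OntologyModificationListener has the five methods listed above. The sketch only shows the pattern, including a class removal that reports the URIs of the class and its instances in a single event.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Simplified stand-in for the listener pattern; these are NOT the GATE types. */
class OntologyEventSketch {

    /** Hypothetical two-method listener (the real interface has five methods). */
    interface Listener {
        void resourceAdded(String uri);
        void resourcesRemoved(String[] uris);
    }

    /** Hypothetical miniature ontology holding classes and their instances. */
    static class TinyOntology {
        private final Map<String, Set<String>> instancesByClass = new LinkedHashMap<>();
        private final List<Listener> listeners = new ArrayList<>();

        void addListener(Listener l) { listeners.add(l); }
        void removeListener(Listener l) { listeners.remove(l); }

        void addClass(String classUri) {
            if (instancesByClass.putIfAbsent(classUri, new LinkedHashSet<>()) == null) {
                for (Listener l : listeners) l.resourceAdded(classUri);
            }
        }

        void addInstance(String classUri, String instanceUri) {
            Set<String> instances = instancesByClass.get(classUri);
            if (instances == null) throw new IllegalArgumentException("Unknown class: " + classUri);
            if (instances.add(instanceUri)) {
                for (Listener l : listeners) l.resourceAdded(instanceUri);
            }
        }

        /** Removing a class also removes its instances; one event lists all removed URIs. */
        void removeClass(String classUri) {
            Set<String> instances = instancesByClass.remove(classUri);
            if (instances == null) return;
            List<String> removed = new ArrayList<>();
            removed.add(classUri);
            removed.addAll(instances);
            for (Listener l : listeners) l.resourcesRemoved(removed.toArray(new String[0]));
        }
    }

    /** Listener that records what it was told, for later inspection. */
    static class RecordingListener implements Listener {
        final List<String> log = new ArrayList<>();
        public void resourceAdded(String uri) { log.add("added " + uri); }
        public void resourcesRemoved(String[] uris) { log.add("removed " + String.join(",", uris)); }
    }
}
```

A listener registered before a class and an instance are added receives two "added" notifications, then a single "removed" notification naming both URIs when the class is deleted. After it is unregistered it receives nothing further.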

14.2.1 What Happens when a Resource is Deleted?

Resources in an ontology are connected with each other. For example, one class can be a subclass or superclass of other classes. A resource can have multiple properties attached to it. Because of these relations, a change in one resource can affect other resources in the ontology. Below we describe what happens (i.e. what the GATE ontology API does) when a resource is deleted.

14.3 The Ontology Plugin

The plugin Ontology contains the current ontology API implementation. It is based on a backend that uses Sesame version 2 and OWLIM version 3.

The Ontology plugin depends on libraries that are not available in the Central Maven Repository, so the plugin must be downloaded and installed separately. You can download released versions of the plugin from GitHub, and the latest snapshot version from our snapshot repository. Unpacking the downloaded zip file will create a new directory gateplugin-Ontology-version, and that directory should be loaded as a CREOLE plugin: open the plugin manager, click the “+” button at the top left, switch to the “directory URL” tab, and select the ontology plugin directory you just unpacked. This will add the plugin to the known plugins list, and you can then select “load now” and/or “load always” as appropriate.

Once the plugin is loaded, the context menu for Language Resources will include the following ontology language resources:

Each of these language resources is explained in more detail in the following sections.

To make the plugin available to your GATE Embedded application, load the plugin prior to creating one of the ontology language resources using code similar to the following:

import java.io.File;
import gate.Gate;
import gate.creole.Plugin;

// Find the directory for the Ontology plugin
File ontologyPlugin = new File("/path/to/gateplugin-Ontology");
// Load the plugin from that directory
Gate.getCreoleRegister().registerPlugin(new Plugin.Directory(
      ontologyPlugin.toURI().toURL()));

Alternatively, if you load a saved application state that was saved with the plugin loaded, then it will be re-loaded automatically as part of that process.

14.3.1 Upgrading from previous versions of GATE

If you have a saved GATE application from GATE version 8.4.1 or earlier that uses the Ontology plugin that was built in to GATE at that time, you will need to upgrade your application to make it work with GATE 8.5 and later.

With the Ontology plugin loaded, there will be an “Ontologies” sub-menu in the GATE Developer “Tools” menu, with an entry to “Upgrade old saved application”. Select this option and locate the existing xgapp file in the file chooser. The upgrade backs up the old xgapp file with a “.onto.bak” extension and replaces all references to the old “built-in” Ontology plugin with the version of the plugin you currently have loaded.

Note that this procedure is specific to the Ontology plugin and is in addition to the standard upgrade procedure detailed in section 3.9.5 used for the other standard GATE plugins. To fully upgrade an application that uses ontologies you must run both the standard upgrader and the ontology plugin upgrader in sequence in order to obtain a final xgapp that will work with GATE 8.5. You can run the two upgrades in either order on the same file, but we suggest running the standard upgrader first (skipping the Ontology plugin) and then the ontology plugin upgrader, which will leave your original pre-8.5 application backed up with a “.bak” extension.

14.3.2 The OWLIMOntology Language Resource

The OWLIMOntology language resource is the main ontology language resource provided by the plugin and creates an in-memory store backed by files in a directory on the file system to hold the ontology data.

To create a new OWLIM Ontology resource, select ‘OWLIM Ontology’ from the right-click ‘New’ menu for language resources. A dialog as shown in Figure 14.1 appears with the following parameters to fill in or change:


Figure 14.1: The New OWLIM Ontology Dialog

Note: you can create a language resource such as OWLIM Ontology from GATE Developer successfully, but you will not be able to browse or edit the ontology unless you loaded the Ontology Tools plugin beforehand.

Additional ontology data can be loaded into an existing ontology language resource by selecting the ‘Load’ option from the language resource’s context menu. This opens the dialog shown in Figure 14.2. The parameters in this dialog correspond to the parameters in the dialog for creating a new ontology, with the addition of one new parameter: ‘load as import’. If this parameter is checked, the ontology data is loaded specifically as an ontology import. Ontology imports can be excluded from what is saved at a later time.


Figure 14.2: The Load Ontology Dialog

Figure 14.3 shows the ontology save dialog that is shown when the option ‘Save as…’ is selected from the language resource’s context menu. The parameter ‘include imports’ allows the user to specify if the data that has been loaded through imports should be included in the saved data or not.


Figure 14.3: The Save Ontology Dialog

14.3.3 The ConnectSesameOntology Language Resource

This ontology language resource can be created from either a directory on the local file system that holds an ontology backing store (as created in the ‘data directory’ for the ‘OWLIM Ontology’ language resource), or from a Sesame repository on a server that holds an OWLIM ontology store.

This is very useful when using very large ontologies with GATE. Loading a very large ontology from a serialized format takes a significant amount of time because the file has to be deserialized and all implied facts have to get generated. Once an ontology has been loaded into a persisting OWLIMOntology language resource, the ConnectSesameOntology language resource can be used with the directory created to re-connect to the already de-serialized and inferred data much faster.

Figure 14.4 shows the dialog for creating a ConnectSesameOntology language resource.

Note that this ontology language resource is only supported when connected with an OWLIM3 repository configured to use the owl-max ruleset and with partialRDFS optimizations disabled! Connecting to any other repository is experimental and for expert users only! Also note that connecting to a repository that is already in use by GATE or any other application is not supported and might result in unwanted or erroneous behavior!


Figure 14.4: The New ConnectSesameOntology Dialog

14.3.4 The CreateSesameOntology Language Resource

This ontology language resource can be directly created from a Sesame2 repository configuration file. This is an experimental language resource intended for expert users only. This can be used to create any kind of Sesame2 repository, but the only repository configuration supported by GATE and the GATE ontology API is an OWLIM repository configured to use the owl-max ruleset and with partialRDFS optimizations disabled. The dialog for creating this language resource is shown in Figure 14.5.


Figure 14.5: The New CreateSesameOntology Dialog

14.3.5 The OWLIM2 Backwards-Compatible Language Resource

This language resource is shown as “OWLIM Ontology DEPRECATED” in the ‘New Language Resource’ submenu from the ‘File’ menu. It provides the “OWLIM Ontology” language resource in a way that attempts maximum backwards-compatibility with the ontology language resource provided by prior versions or the Ontology_OWLIM2 language resource. This means that the class name is identical to that of those language resources (gate.creole.ontology.owlim.OWLIMOntologyLR) and the parameters are made compatible: the parameter defaultNameSpace is added as an alias for the parameter baseURI. The methods setPersistsLocation and getPersistLocation are also available for legacy Java code that expects them, but the persist location set that way is not actually used.

In addition, this language resource will still automatically add the resource name of a resource as the String value for the annotation property “label”.

14.3.6 Using Ontology Import Mappings

If an ontology is loaded that contains the URIs of imported ontologies using owl:imports, the plugin will try to automatically resolve those URIs to URLs and load the ontology file to be imported from the location corresponding to the URL. This is done transitively, i.e. import specifications contained in freshly imported ontologies are resolved too.

In some cases one might want to suppress the import of certain ontologies, or one might want to load the data from a different location, e.g. from a file on the local file system instead. With the OWLIMOntology language resource this can be achieved by specifying an import mappings file when creating the ontology.

An import mappings file (see Figure 14.6 for an example) is a plain text file that maps specific import URIs to URLs or to nothing at all. Each line that is not empty and does not start with a hash (#), which indicates a comment line, must contain a URI. If the URI is not followed by anything, this URI will be ignored when processing imports. If the URI is followed by something, this is interpreted as a URL that is used for resolving the import of the URI. Local files can be specified as file: URLs or by just giving the absolute or relative pathname of the file in Linux path notation (forward slashes as path separators). At the moment, filenames with embedded whitespace are not supported. If a pathname is relative, it will be resolved relative to the directory that contains the mappings file.

# map this import to another web url
http://proton.semanticweb.org/2005/04/protont http://mycompany.com/owl/protont.owl

# map this import to a file in the same directory as the mappings file
http://proton.semanticweb.org/2005/04/protons protons.owl

# ignore this import

Figure 14.6: An example import mappings file
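The line format described above is simple enough to sketch as a small parser. This is not the plugin's own implementation, only an illustration of the rules as stated: blank lines and lines starting with # are skipped, a URI on its own means "ignore this import", and a second token is either a URL or a pathname resolved against the directory of the mappings file. The names ImportMappingsSketch, parse and resolve are hypothetical, and treating any token that starts with a scheme prefix (such as http: or file:) as a URL is an assumption of this sketch.

```java
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

/** Illustrative parser for the import mappings format; not the plugin's code. */
class ImportMappingsSketch {

    /**
     * Parses mapping lines. Key: the import URI. Value: the replacement
     * location, or Optional.empty() when the import should be ignored.
     */
    static Map<String, Optional<String>> parse(List<String> lines, Path mappingsDir) {
        Map<String, Optional<String>> mappings = new LinkedHashMap<>();
        for (String raw : lines) {
            String line = raw.trim();
            if (line.isEmpty() || line.startsWith("#")) {
                continue; // blank line or comment
            }
            String[] parts = line.split("\\s+", 2);
            if (parts.length == 1) {
                mappings.put(parts[0], Optional.empty()); // URI alone: ignore this import
            } else {
                mappings.put(parts[0], Optional.of(resolve(parts[1], mappingsDir)));
            }
        }
        return mappings;
    }

    /** Leaves URLs untouched; resolves plain pathnames against the mappings directory. */
    static String resolve(String target, Path mappingsDir) {
        if (target.matches("^[a-zA-Z][a-zA-Z0-9+.-]*:.*")) {
            return target; // has a scheme, e.g. http: or file:
        }
        // Path.resolve returns an absolute target unchanged and joins a relative one.
        return mappingsDir.resolve(target).normalize().toString();
    }
}
```

Applied to the two mapping lines of Figure 14.6 with a mappings file located in /data/onto, the first import maps to the given web URL unchanged and the second resolves to protons.owl inside /data/onto.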

14.3.7 Using BigOWLIM

The GATE ontology plugin is based on SwiftOWLIM for storing the ontology and managing inference. SwiftOWLIM is an in-memory store and the maximum size of ontologies that can be stored is limited by the available memory.

BigOWLIM (see http://www.ontotext.com/owlim/big/) can handle huge ontologies and is not limited by available memory. BigOWLIM is a commercial product and needs to be separately obtained and installed for use with the GATE ontology plugin. See the BigOWLIM installation guide on how to set up BigOWLIM on a Tomcat server and how to create BigOWLIM on the server with the Sesame console program.

The ontology plugin can easily and without any additional installation be used with BigOWLIM repositories by using the ConnectSesameOntology LR (see section 14.3.3) to connect to a BigOWLIM repository on a remote Tomcat server.

14.3.8 The sesameCLI command line interface

The script sesameCLI is located in the bin subdirectory of the Ontology plugin directory and provides basic functionality for creating repositories, importing, exporting, querying and updating of GATE ontologies, either on a saved local file repository (saved with the persistent parameter of the OWLIM Ontology LR set to true) or a repository on a server from the command line. It can be used on any machine that supports bash scripts.

To show usage information, run the command with the --help option. Some options can be specified in a long form using double hyphens or in a single-letter form using a single hyphen; for example, -e can be used in place of --do, or -u in place of --serverURL.

The main option is --do, which specifies which action should be carried out. For all actions the ontology must be specified as a combination of either the URL of a Sesame web server with serverURL or the path of a local Sesame repository directory with sesameDir, and the name of the repository with --id.

The --do option supports the following values:


Clear the repository and remove all triples from it.


Perform an ASK query. The result of the ASK query is printed to standard output.


Perform a SELECT query. The result of the query is printed in tabular form to standard output. The default column separation character is a tab and if the column separator or a new line character occurs in a value it is changed to a space.


Perform a SPARQL update query (INSERT, DELETE).


Import data into the repository from a file.


Export data from the repository into a file.


Create a new repository using a TURTLE repository configuration file.


Delete a repository. Note that due to a Sesame limitation, the actual files for the repository may not be removed from the disk for remote ontologies on a server.


Print the list of all repository names to standard output.
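The tabular output described for SELECT queries (one row per line, columns joined by the separator, and any separator or newline character inside a value replaced by a space) can be sketched as below. This is an illustration of the stated formatting rule only, not code from the sesameCLI script; SelectOutputSketch and formatRow are hypothetical names.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative row formatter for tab-separated query output. */
class SelectOutputSketch {

    /**
     * Joins the values of one result row with the separator. Occurrences of the
     * separator or of a newline inside a value are replaced by a space so that
     * every row stays on one line with a fixed number of columns.
     */
    static String formatRow(List<String> values, char separator) {
        return values.stream()
            .map(v -> v.replace(separator, ' ').replace('\n', ' '))
            .collect(Collectors.joining(String.valueOf(separator)));
    }
}
```

Sanitising the values this way keeps the output safe to split on the separator again, at the cost of not being able to recover the original whitespace inside a value.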

The sesameCLI command line tool is meant as an easy way to perform some basic operations from the command line and for basic testing. The functions it supports and its command line options may change in future versions.

14.4 GATE Ontology Editor

GATE’s ontology support also includes a viewer/editor that can be used within GATE Developer to navigate an ontology and quickly inspect the information relating to any of the objects defined in it—classes and restrictions, instances and their properties. Also, resources can be deleted and new resources can be added through the viewer.

The ontology viewer is part of the “Ontology Tools” plugin, which is visible by default in the GATE plugin manager. However, you will also need to load the ontology implementation plugin (see section 14.3) in order to be able to load the ontology LRs you want to view.

Note: To make it possible to show a loaded ontology in the ontology editor, the Ontology Tools plugin must be loaded before the ontology language resource is created.


Figure 14.7: The GATE Ontology Viewer

The viewer is divided into two areas. The one on the left shows separate tabs for the hierarchy of classes and instances and for (as of GATE 4) the hierarchy of properties. The view on the right-hand side shows the details pertaining to the object currently selected in one of those two tabs.

The first tab on the left view displays a tree which shows all the classes and restrictions defined in the ontology. The tree can have several root nodes, one for each top class in the ontology. The same tree also shows the instances of each class. Note: Instances that belong to several classes are shown as children of all the classes they belong to.

The second tab on the left view displays a tree of all the properties defined in the ontology. This tree can also have several root nodes, one for each top property in the ontology. Different types of properties are distinguished by using different icons.

Whenever an item is selected in the tree view, the right-hand view is populated with the details that are appropriate for the selected object. For an ontology class, the details include brief information about the resource (such as the URI and the type of the selected class), the set of direct superclasses, the set of all superclasses using the transitive closure, the set of direct subclasses, the set of all the subclasses, the set of equivalent classes, the set of applicable property types, the set of property values set on the selected class, and the set of instances that belong to the selected class. For a restriction, in addition to the above information, it displays the property to which the restriction applies and the type of the restriction.

For an instance, the details displayed include the brief information about the instance, set of direct types (the list of classes this instance is known to belong to), the set of all types this instance belongs to (through the transitive closure of the set of direct types), the set of same instances, the set of different instances and the values for all the properties that are set.

When a property is selected, different information is displayed in the right-hand view according to the property type. It includes the brief information about the property itself, set of direct superproperties, the set of all superproperties (obtained through the transitive closure), the set of direct subproperties, the set of all subproperties (obtained through the transitive closure), the set of equivalent properties, and domain and range information.

As mentioned in the description of the data model, properties are not directly linked to the classes, but rather define their domain of applicability through a set of domain restrictions. This means that the list of properties should not really be listed as a detail for class objects but only for instances. It is however quite useful to have an indication of the types of properties that could apply to instances of a given class. Because of the semantics of property domains, it is not possible to calculate precisely the list of applicable properties for a given class, but only an estimate of it. If a property for instance requires its domain instances to belong to two different classes then it cannot be known with certitude whether it is applicable to either of the two classes—it does not apply to all instances of any of those classes, but only to those instances the two classes have in common. Because of this, such properties will not be listed as applicable to any class.
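The estimate described above can be illustrated with a short sketch. It follows the rule as stated (properties with two or more domain classes are not listed for any class) and additionally assumes that a property with a single domain class is listed for that class and its subclasses; that second rule is an assumption made for this example, not a documented detail of the viewer. ApplicablePropertiesSketch and applicableTo are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative estimate of the properties to list for a class. */
class ApplicablePropertiesSketch {

    /**
     * classAndSupers: the class itself plus all of its superclasses.
     * domains: property name mapped to its set of domain classes.
     */
    static List<String> applicableTo(Set<String> classAndSupers,
                                     Map<String, Set<String>> domains) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Set<String>> entry : domains.entrySet()) {
            Set<String> domain = entry.getValue();
            if (domain.isEmpty()) {
                result.add(entry.getKey()); // no domain restriction: applies everywhere
            } else if (domain.size() == 1 && classAndSupers.containsAll(domain)) {
                result.add(entry.getKey()); // single domain class covering this class
            }
            // Two or more domain classes: not listed for any class.
        }
        return result;
    }
}
```

For a class City with superclass Location, a property with an empty domain and a property with domain {Location} are both listed, while a property whose domain is {Location, Organization} is not.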

The information listed in the details pane is organised in sub-lists according to the type of the items. Each sub-list can be collapsed or expanded by clicking on the little triangular button next to the title. The ontology viewer is dynamic and will update the information displayed whenever the underlying ontology is changed through the API.

When you double-click on any resource in the details table, that resource is selected in the class tree or the property tree and its details are shown in the details table. To change a property value, double-click on the value (second column); a window appears asking for the new value. A button (with a red X caption) is provided next to each property value; clicking it deletes that value.

A new toolbar has been added at the top of the ontology viewer, which contains the following buttons to add and delete ontology resources:

The tree components allow the user to select more than one node, but the details table on the right-hand side of the GATE Developer GUI only shows the details of the first selected node. The buttons in the toolbar are enabled and disabled based on users’ selection of nodes in the tree.

  1. Creating a new top class:

    A window appears which asks the user to provide details for its namespace (default name space if specified), and class name. If there is already a class with same name in ontology, GATE Developer shows an appropriate message.

  2. Creating a new subclass:

    A class can have multiple superclasses. Therefore, when multiple classes are selected in the ontology tree and the ‘SC’ button is clicked, all the selected classes are taken as the superclasses of the new class. The user is then asked for the namespace and class name of the new class.

  3. Creating a new instance:

    An instance can belong to more than one class. Therefore, when multiple classes are selected in the ontology tree and the ‘I’ button is clicked, all the selected classes are taken as the types of the new instance. The user is then prompted to provide details such as the namespace and instance name.

  4. Creating a new restriction:

    As described above, a restriction is a type of anonymous class and is specified on a property, with a constraint set on either the number of values the property can take or the type of value that instances are allowed to have for that property. Clicking on the blue ‘R’ square button shows a window for creating a new restriction, in which the user can select the type of restriction, the property and a value constraint. Because restrictions are anonymous classes, the user does not have to specify a URI for them; they are named automatically by the system.

  5. Creating a new property:

    The editor allows five different types of properties to be created:

    • Annotation property: Since an annotation property cannot have any domain or range constraints, clicking on the new annotation property button brings up a dialog that asks the user for information such as the namespace and the annotation property name.

    • Datatype property: A datatype property can have one or more ontology classes as its domain and one of the pre-defined datatypes as its range. Selecting one or more classes and clicking on the new Datatype property icon brings up a window where the selected classes in the tree are taken as the property’s domain. The user is then asked to provide information such as the namespace and the property name. A drop-down box allows the user to select one of the datatypes from the list.

    • Object, Symmetric and Transitive properties: These properties can have one or more classes as their domain and range. For a symmetric property the domain and range are the same. Clicking on any of these options brings up a window where the user is asked to provide information such as the namespace and the property name. The user is also given two buttons to select one or more classes as values for the domain and range.

  6. Removing the selected resources:

    All the selected nodes are removed when the user clicks on the ‘X’ button. Please note that since ontology resources are related in various ways, deleting a resource can affect other resources in the ontology; for example, deleting a resource can cause other resources in the same ontology to be deleted too.

  7. Searching in ontology:

    The Search button allows users to search for resources in the ontology. A window pops up with an input text field that allows incremental searching. In other words, as the user types in the name of the resource, the drop-down list refreshes itself to contain only the resources whose names start with the typed string. Selecting one of the resources in this list and pressing OK selects that resource in the editor. The Search function also allows selecting resources by the property values set on them.

  8. Refreshing the ontology:

    The refresh button reloads the ontology and updates the editor.

  9. Setting properties on instances/classes:

    Right-clicking on an instance brings up a menu that lists the properties that are inherited and applicable to its classes. Selecting a specific property from the menu allows the user to provide a value for that property. For example, if the property is an Object property, a new window appears which allows the user to select one or more instances which are compatible with the range of the selected property. The selected instances are then set as property values. For classes, all the properties (e.g. annotation and RDF properties) are listed on the menu.

  10. Setting relations among resources:

    Two or more classes, or two or more properties, can be set as equivalent; similarly two or more instances can be marked as the same. Right-clicking on a resource brings up a menu with an appropriate option (Equivalent Class for ontology classes, Same As Instance for instances and Equivalent Property for properties) which when clicked then brings up a window with a drop down box containing a list of resources that the user can select to specify them as equivalent or the same.

14.5 Ontology Annotation Tool [#]

The Ontology Annotation Tool (OAT) is a GATE plugin available from the Ontology Tools plugin set, which enables a user to manually annotate a text with respect to one or more ontologies. The required ontology must be selected from a pull-down list of available ontologies.

The OAT tool supports annotation with information about the ontology classes, instances and properties.

14.5.1 Viewing Annotated Text [#]

Ontology-based annotations in the text can be viewed by selecting the desired classes or instances in the ontology tree in GATE Developer (see Figure 14.8). By default, when a class is selected, all of its sub-classes and instances are also automatically selected and their mentions are highlighted in the text. There is an option to disable this default behaviour (see Section 14.5.4).


Figure 14.8: Viewing Ontology-Based Annotations

Figure 14.8 shows the mentions of each class and instance in a different colour. These colours can be customised by the user by clicking on the class/instance names in the ontology tree. It is also possible to expand and collapse branches of the ontology.

14.5.2 Editing Existing Annotations [#]


Figure 14.9: Editing Existing Annotations

In order to view the class/instance of a highlighted annotation in the text (e.g., United States - see Figure 14.9), hover the mouse over it and an edit dialogue will appear. It shows the current class or instance (Country in our example) and allows the user to delete it or change it. To delete an existing annotation, press the Delete button.

A class or instance can be changed by starting to type the name of the new class in the combo-box, which then displays a list of the available classes and instances whose names start with the typed string. For example, if we want to change the type from Country to Location, we can type ‘Lo’ and all classes and instances whose names start with Lo will be displayed. The more characters are typed, the fewer matching classes remain in the list. As soon as the desired class appears in the list, it can be chosen by clicking on it.
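The narrowing behaviour of the combo-box can be sketched with a simple prefix filter. This is only an illustration, not the OAT implementation: `IncrementalSearchSketch` and `matches` are hypothetical names, and both the case-insensitive comparison and the alphabetical ordering are assumptions.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical illustration, not part of the GATE API.
final class IncrementalSearchSketch {

  // Returns, in alphabetical order, the names that start with the text
  // typed so far (compared case-insensitively, an assumption).
  static List<String> matches(List<String> names, String typed) {
    String prefix = typed.toLowerCase(Locale.ROOT);
    return names.stream()
        .filter(name -> name.toLowerCase(Locale.ROOT).startsWith(prefix))
        .sorted()
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> names = List.of("Location", "Country", "Lodging", "Person");
    System.out.println(matches(names, "Lo"));  // two candidates remain
    System.out.println(matches(names, "Loc")); // one candidate remains
  }
}
```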

It is possible to apply the changes to all occurrences of the same string and the same previous class/instance, not just to the current one. This is useful when annotating long texts. The user needs to make sure that they still check the classes and instances of annotations further down in the text, in case the same string has a different meaning (e.g., bank as a building vs. bank as a river bank).
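As a rough sketch of what applying a change to all occurrences involves, the following code collects the start offset of every occurrence of a string in a text. It is not the OAT implementation; `ApplyToAllSketch` and `startOffsets` are hypothetical names, and exact, case-sensitive, non-overlapping matching is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration, not part of the GATE API.
final class ApplyToAllSketch {

  // Returns the start offset of every non-overlapping occurrence of
  // 'mention' in 'text', scanning from left to right.
  static List<Integer> startOffsets(String text, String mention) {
    List<Integer> offsets = new ArrayList<>();
    if (mention.isEmpty()) {
      return offsets; // nothing sensible to match
    }
    int from = 0;
    while (true) {
      int at = text.indexOf(mention, from);
      if (at < 0) {
        break;
      }
      offsets.add(at);
      from = at + mention.length();
    }
    return offsets;
  }
}
```

For the text "The bank by the river bank." and the string "bank", both occurrences are returned, which is exactly the situation in which the annotations further down the text still need to be checked by hand.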

The edit dialogue also allows annotation boundaries to be corrected: the user can expand or shrink the annotation’s offsets by clicking on the relevant arrow buttons.

OAT also allows users to assign property values as annotation features to existing class and instance annotations. For a class annotation, all annotation properties from the ontology are displayed in the table. For an instance annotation, all properties from the ontology that are applicable to the selected instance are shown in the table. The table also shows the existing features of the selected annotation. The user can then add, delete or edit any value of the selected feature. A property can be given an arbitrary number of values: by clicking on the editList button, the user can add, remove or edit the values of the property. For object properties, users can only select values from a pre-selected list (i.e. the instances which satisfy the selected property’s range constraints).

14.5.3 Adding New Annotations [#]


Figure 14.10: Add New Annotation

New annotations can be added in two ways: using a dialogue (see Figure 14.10) or by selecting the text and clicking on the desired class or instance in the ontology tree.

When adding a new annotation using the dialogue, select some text and, if the mouse is not moved, a dialogue will appear after a very short while (see Figure 14.10). Start typing the name of the desired class or instance until you see it listed in the combo-box, then select it with the mouse. This operation is the same as changing the class/instance of an existing annotation. One has the option of applying this choice to the current selection only or to all mentions of the selected string in the current document (Apply to All check box).

The user can also create an instance from the selected text. If the ‘create instance’ checkbox is checked before the class is selected, the selected text is annotated with the selected class and a new instance of that class is created, named after the selected text (provided the ontology does not already contain an instance with that name).

14.5.4 Options [#]


Figure 14.11: Tool Options

There are several options that control the OAT behaviour (see Figure 14.11):

14.6 Relation Annotation Tool [#]

This tool is designed to annotate a document with ontology instances and to create relations between annotations using ontology object properties. It is similar to, and compatible with, OAT (see Section 14.5), but focuses on relations between annotations.

To use it, load the Ontology Tools plugin, then load a document and an ontology. Open the document and, in the document editor, click on the button named ‘RAT-C’ (Relation Annotation Tool Class view), which will also display the ‘RAT-I’ view (Relation Annotation Tool Instance view).

14.6.1 Description of the two views


Figure 14.12: Relation Annotation Tool vertical and horizontal document views

The right vertical view shows the loaded ontologies as trees.

To show/hide the annotations in the document, use the class checkbox. The selection of a class and the ticking of a checkbox are independent and work the same as in the annotation sets view.

To change the annotation set used to load/save the annotations, use the drop down list at the bottom of the vertical view.

To reduce the number of elements displayed, classes in the tree can be hidden or shown using the context menu on the selected classes. The setting is saved in the user preferences.

The bottom horizontal view shows two tables: one for instances and one for properties. The instances table shows the instances, and their labels, of the class selected in the ontology trees; the properties table shows the property values of the instance selected in the instances table.

Two buttons allow you to add the text selected in the document either as a new instance or as a new label for the selected instance.

To filter on instance labels, use the filter text field. You can clear the field with the X button at the end of the field.

You can use ‘Show In Ontology Editor’ on the context menu of an instance in the instance table. Then in the ontology editor you can add class or object properties.

14.6.2 Create new annotation and instance from text selection

14.6.3 Create new annotation and add label to existing instance from text selection

14.6.4 Create and set properties for annotation relation

14.6.5 Delete instance, label or property

14.6.6 Differences with OAT and Ontology Editor

This tool is very close to OAT, with the following differences: it has no annotation editor popup and uses the tables in the bottom view instead; it supports multiple ontologies; and it supports only instance annotation, not class annotation.

To make OAT compatible with this tool you must use ‘Mention’ as the annotation type, and ‘class’ and ‘inst’ as the feature names. These are the defaults in OAT. You must also select the same annotation set in the drop-down list at the bottom right corner.

You should also enable the option ‘Selected Text As Property Value’ in the Options panel of OAT, so that a label is added from the selected text for each instance.

The ontology editor is useful for checking that an instance has been correctly added to the ontology and for adding a new annotation relation as an object property.

14.7 Using the ontology API [#]

The following code demonstrates how to use the GATE API to create an instance of the OWLIM Ontology language resource.

// imports required by the code fragment below
import gate.Factory;
import gate.FeatureMap;
import gate.Gate;
import gate.creole.Plugin;
import gate.creole.ontology.*;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileWriter;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// step 1: initialize GATE
if(!Gate.isInitialized()) { Gate.init(); }

// step 2: load the Ontology plugin that contains the implementation
File ontoHome = new File("/path/to/gateplugin-Ontology");
Gate.getCreoleRegister().registerPlugin(
    new Plugin.Directory(ontoHome.toURI().toURL()));

// step 3: set the parameters
FeatureMap fm = Factory.newFeatureMap();
fm.put("rdfXmlURL", urlOfTheOntology);
fm.put("baseURI", theBaseURI);
fm.put("mappingsURL", urlOfTheMappingsFile);
// .. any other parameters

// step 4: finally create an instance of ontology
// (the resource class name below is assumed to be that of the OWLIM
// ontology LR; check the plugin's creole.xml if it differs)
Ontology ontology = (Ontology)
  Factory.createResource("gate.creole.ontology.impl.sesame.OWLIMOntology",
                   fm);

// retrieving a list of top classes
Set<OClass> topClasses = ontology.getOClasses(true);

// for all top classes, print the URI or blank node ID of each of
// their direct sub classes in turtle format
for(OClass c : topClasses) {
   Set<OClass> dcs = c.getSubClasses(OConstants.Closure.DIRECT_CLOSURE);
   for(OClass sClass : dcs) {
        System.out.println(sClass.getONodeID().toTurtle());
   }
}

// creating a new class from a full URI
OURI aURI1 = ontology.createOURI("http://sample.en/owlim#Organization");
OClass organizationClass = ontology.addOClass(aURI1);

// create a new class from a name and the default name space set for
// the ontology
OURI aURI2 = ontology.createOURIForName("someOtherName");
OClass someOtherClass = ontology.addOClass(aURI2);

// set the label for the class
someOtherClass.setLabel("some other name", OConstants.ENGLISH);

// creating a new Datatype property called name
// with domain set to Organization
// with datatype set to string
OURI dURI = ontology.createOURI("http://sample.en/owlim#Name");
Set<OClass> domain = new HashSet<OClass>();
domain.add(organizationClass);
DatatypeProperty dp =
  ontology.addDatatypeProperty(dURI, domain, DataType.getStringDataType());

// creating a new instance of class organization called IBM
OURI iURI = ontology.createOURI("http://sample.en/owlim#IBM");
OInstance ibm = ontology.addOInstance(iURI, organizationClass);

// assigning a Datatype property, name to ibm
ibm.addDatatypePropertyValue(dp,
    new Literal("IBM Corporation", dp.getDataType()));

// get all the set values of all Datatype properties on the instance ibm
Set<DatatypeProperty> dps = ontology.getDatatypeProperties();
for(DatatypeProperty prop : dps) {
 List<Literal> values = ibm.getDatatypePropertyValues(prop);
 System.out.println("DP : "+prop.getOURI());
 for (Literal l : values) {
   System.out.println("Value : "+l.getValue());
   System.out.println("Datatype : "+ l.getDataType().getXmlSchemaURI());
 }
}

// export data to a file in Turtle format
BufferedWriter writer = new BufferedWriter(new FileWriter(someFile));
ontology.writeOntologyData(writer, OConstants.OntologyFormat.TURTLE);
writer.close();

14.8 Ontology-Aware JAPE Transducer [#]

One of the GATE components that makes use of the ontology support is the JAPE transducer (see Chapter 8). Combining the power of ontologies with JAPE’s pattern matching mechanisms can ease the creation of applications.

In order to use ontologies with JAPE, one needs to load an ontology in GATE before loading the JAPE transducer. Once the ontology is known to the system, it can be set as the value for the optional ontology parameter for the JAPE grammar. Doing so alters slightly the way the matching occurs when the grammar is executed. If a transducer is ontology-aware (i.e. it has a value set for the ’ontology’ parameter) it will treat all occurrences of the feature named class differently from the other features of annotations. The values for the feature class on any type of annotation will be considered as referring to classes in the ontology as follows:

For example, if the default namespace of the ontology is http://gate.ac.uk/example# then a class feature with the value “Person” refers to the http://gate.ac.uk/example#Person class in the ontology. If the ontology imports other ontologies then it may be useful to define templates for the various namespace URIs to avoid excessive repetition. There is an example of this for the PROTON ontology in section 8.1.6.

In ontology-aware mode the matching between two class values will not be based on simple equality but rather on hierarchical compatibility. For example, if the ontology contains a class named ‘Politician’, which is a sub class of the class ‘Person’, then a pattern of {Entity.class == "Person"} will successfully match an annotation of type Entity with a feature class having the value ‘Politician’. If the JAPE transducer were not ontology-aware, such a test would fail.

This behaviour allows a larger degree of generalisation when designing a set of rules. Rules that apply to several types of entities mentioned in the text can be written using the most generic class they apply to and need not be repeated for each subtype of entity. One could have rules applying to Locations without needing to know whether a particular location happens to be a country or a city.
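The idea of hierarchical compatibility can be sketched in plain Java. The code below is not taken from the JAPE implementation: `ClassMatchSketch` and `matches` are hypothetical names, and the hierarchy is simplified to a single direct superclass per class, which is an assumption made for brevity.

```java
import java.util.Map;

// Hypothetical illustration, not part of the GATE or JAPE API.
final class ClassMatchSketch {

  // Returns true if 'annotationClass' is 'patternClass' itself or one of
  // its (transitive) subclasses. 'superOf' maps each class name to the
  // name of its single direct superclass.
  static boolean matches(Map<String, String> superOf,
                         String patternClass,
                         String annotationClass) {
    String current = annotationClass;
    // The step limit stops the walk if the hierarchy contains a cycle.
    for (int steps = 0; current != null && steps <= superOf.size(); steps++) {
      if (current.equals(patternClass)) {
        return true;
      }
      current = superOf.get(current); // move one level up, null at the top
    }
    return false;
  }

  public static void main(String[] args) {
    Map<String, String> superOf =
        Map.of("Politician", "Person", "Person", "Entity");
    // A pattern for Person accepts an annotation whose class is Politician.
    System.out.println(matches(superOf, "Person", "Politician"));
    // The reverse does not hold: Person is not a kind of Politician.
    System.out.println(matches(superOf, "Politician", "Person"));
  }
}
```

Plain equality corresponds to stopping after the first comparison, which is how a transducer without an ontology would treat the class feature.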

If a domain ontology is available at the time of building an application, using it in conjunction with the JAPE transducers can significantly simplify the set of grammars that need to be written.

The ontology does not normally affect actions on the right hand side of JAPE rules, but when Java is used on the right hand side, then the ontology becomes accessible via a local variable named ontology, which may be referenced from within the right-hand-side code.

In Java code, the class feature should be referenced using the static final variable, LOOKUP_CLASS_FEATURE_NAME, that is defined in gate.creole.ANNIEConstants.

14.9 Annotating Text with Ontological Information [#]

The ontology-aware JAPE transducer enables the text to be linked to classes in an ontology by means of annotations. Essentially this means that each annotation can have a class and ontology feature. To add the relevant class feature to an annotation is very easy: simply add a feature ‘class’ with the classname as its value. To add the relevant ontology, use ontology.getURL().

Below is a sample rule which looks for a location annotation and identifies it as a ‘Mention’ annotation with the class ‘Location’ and the ontology loaded with the ontology-aware JAPE transducer (via the runtime parameter of the transducer).

Rule: Location

({Location}):mention

-->
:mention{
  // create the ontology and class features
  FeatureMap features = Factory.newFeatureMap();
  features.put("ontology", ontology.getURL());
  features.put("class", "Location");

  // create the new annotation in the transducer's output annotation set
  try {
    outputAS.add(mentionAnnots.firstNode().getOffset(),
       mentionAnnots.lastNode().getOffset(), "Mention", features);
  }
  catch(InvalidOffsetException e) {
    throw new JapeException(e);
  }
}

14.10 Populating Ontologies [#]

Another typical application that combines the use of ontologies with NLP techniques is finding mentions of entities in text. The scenario is that one has an existing ontology and wants to use Information Extraction to populate it with instances whenever entities belonging to classes in the ontology are mentioned in the input texts.

Let us assume we have an ontology and an IE application that marks the input text with annotations of type ‘Mention’ having a feature ‘class’ specifying the class of the entity mentioned. The task we are seeking to solve is to add instances in the ontology for every Mention annotation.

The example presented here is based on a JAPE rule that uses Java code on the action side in order to access directly the GATE ontology API:

 1  Rule: FindEntities
 2  ({Mention}):mention
 3  -->
 4  :mention{
 5    //find the annotation matched by LHS
 6    //we know the annotation set returned
 7    //will always contain a single annotation
 8    Annotation mentionAnn = mentionAnnots.iterator().next();
 9
10    //find the class of the mention
11    String className = (String)mentionAnn.getFeatures().
12      get(gate.creole.ANNIEConstants.LOOKUP_CLASS_FEATURE_NAME);
13    // should normalize class name and avoid invalid class names here!
14    OClass aClass = ontology.getOClass(ontology.createOURIForName(className));
15    if(aClass == null) {
16      System.err.println("Error class \"" + className + "\" does not exist!");
17      return;
18    }
19
20    //find the text covered by the annotation
21    String theMentionText = gate.Utils.stringFor(doc, mentionAnn);
22
23    // when creating a URI from text that came from a document you must take care
24    // to ensure that the name does not contain any characters that are illegal
25    // in a URI.  The following method does this nicely for English but you may
26    // want to do your own normalization instead if you have nonEnglish text.
27    String mentionName = OUtils.toResourceName(theMentionText);
28
29    // get the property to store mention texts for mention instances
30    DatatypeProperty prop =
31      ontology.getDatatypeProperty(ontology.createOURIForName("mentionText"));
32
33    OURI mentionURI = ontology.createOURIForName(mentionName);
34    // if that mention instance does not already exist, add it
35    if (!ontology.containsOInstance(mentionURI)) {
36      OInstance inst = ontology.addOInstance(mentionURI, aClass);
37      // add the actual mention text to the instance
38      try {
39        inst.addDatatypePropertyValue(prop,
40          new Literal(theMentionText, OConstants.ENGLISH));
41      }
42      catch(InvalidValueException e) {
43        throw new JapeException(e);
44      }
45    }
46  }

This will match each annotation of type Mention in the input and assign it to a label ‘mention’. That label is then used in the right hand side to find the annotation that was matched by the pattern (lines 5–8); the value for the class feature of the annotation is used to identify the ontological class name (lines 10–12), which is then looked up in the ontology, with the rule giving up if no such class exists (lines 13–18); and the annotation span is used to extract the text covered in the document (lines 20–21), which is converted into a name that is safe to use in a URI (lines 23–27). Once all these pieces of information are available, the addition to the ontology can be done. First the datatype property used to store the mention text is retrieved (lines 29–31); then, if no instance with that URI exists yet, a new instance of the class is created and the mention text is attached to it (lines 33–45).

Besides JAPE, another tool that could play a part in this application is the Ontological Gazetteer (see Section 13.3), which can be useful in bootstrapping the IE application that finds entity mentions.

The solution presented here is purely pedagogical as it does not address many issues that would be encountered in a real life application solving the same problem. For instance, it is naïve to assume that the name for the entity would be exactly the text found in the document. In many cases entities have several aliases – for example the same person name can be written in a variety of forms depending on whether titles, first names, or initials are used. A process of name normalisation would probably need to be employed in order to make sure that the same entity, regardless of the textual form it is mentioned in, will always be linked to the same ontology instance.
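To give an idea of what a name normalisation step might look like, here is a deliberately naive sketch. It is not part of GATE: `NameNormalisationSketch`, `normalise` and the short list of English titles are hypothetical, and the approach (lower-casing, dropping one leading title, replacing non-alphanumeric runs) is an assumption that handles only the simplest ASCII cases. It does not resolve initials or first-name variants.

```java
import java.util.Locale;

// Hypothetical illustration, not part of the GATE API.
final class NameNormalisationSketch {

  // Maps different surface forms of a name to one key, so that they can
  // be linked to the same ontology instance.
  static String normalise(String mention) {
    String s = mention.trim().toLowerCase(Locale.ROOT);
    // drop a single leading title such as "Dr." or "Mrs"
    s = s.replaceAll("^(mr|mrs|ms|dr|prof)\\.?\\s+", "");
    // replace every run of other characters with one underscore
    s = s.replaceAll("[^a-z0-9]+", "_");
    // strip underscores left at either end
    return s.replaceAll("^_+|_+$", "");
  }

  public static void main(String[] args) {
    // Both forms yield the same key and would share one instance.
    System.out.println(normalise("Dr. John  Smith"));
    System.out.println(normalise("john smith"));
  }
}
```

The resulting key could be passed to `ontology.createOURIForName` in place of the raw mention text, while the original text is still stored as a property value as in the rule above.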

For a detailed description of the GATE ontology API, please consult the JavaDoc documentation.