Chapter 12
Developing GATE
This chapter describes ways of getting involved in and contributing to the GATE project. Sections 12.1 and 12.2 are good places to start. Sections 12.3 and 12.5 describe protocol and provide information for committers; we cover creating new plugins and updating this user guide. See Section 12.2 for information on becoming a committer.
12.1 Reporting Bugs and Requesting Features
The source code and issue trackers for GATE can be found on GitHub. The code is split across many repositories:
- gate-core: the core GATE Embedded library and GATE Developer GUI.
- gate-top: miscellaneous small components including
  - gate-plugin-base: Maven parent POM shared by all GATE plugins (ours and those developed by third parties)
  - gate-maven-plugin: Maven plugin used by the base POM to build the artifacts required for a JAR to be a GATE plugin
  - gate-plugin-test-utils: utilities used in plugin unit tests
  - archetypes: to simplify the generation of new plugins
- gate-spring: the helper classes for using GATE in the Spring Framework (see Section 7.15)
- gateplugin-*: the standard GATE plugins
Use the GitHub issue tracker for the appropriate repository to report bugs or submit feature requests – gate-core for bugs in the core library or which cut across many plugins, or the relevant gateplugin- repository for bugs in a specific plugin. When reporting bugs, please give as much detail as possible. Include the GATE version number and build number, the platform on which you observed the bug, and the version of Java you were using (8u171, 10.0.1, etc.). Include steps to reproduce the problem, and a full stack trace of any exceptions, including ‘Caused by …’. You may wish to first check whether the bug is already fixed in the latest snapshot build (available from https://jenkins.gate.ac.uk). You may also request new features.
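The Java version and platform details requested above can be read from standard Java system properties. The following is a minimal sketch rather than anything shipped with GATE: the class name BugReportEnvironment is hypothetical, and it does not report the GATE version or build number, which you would still need to add by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of GATE): gathers JVM and platform details for a bug report. */
final class BugReportEnvironment {

  /** Returns the system properties most relevant to a bug report, in a stable order. */
  static Map<String, String> collect() {
    Map<String, String> details = new LinkedHashMap<>();
    String[] keys = {"java.version", "java.vendor", "os.name", "os.version", "os.arch"};
    for (String key : keys) {
      // Fall back to "unknown" if the JVM does not define the property.
      details.put(key, System.getProperty(key, "unknown"));
    }
    return details;
  }

  public static void main(String[] args) {
    // Print one "key: value" line per property, ready to paste into an issue.
    collect().forEach((key, value) -> System.out.println(key + ": " + value));
  }
}
```

Pasting this output into the issue alongside the stack trace covers the Java version and platform in one step.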
12.2 Contributing Patches
Patches may be submitted via the usual GitHub pull request mechanism. Create a fork of the relevant GitHub repository, commit your changes there, then submit a pull request. Note that gate-core is intended to be compatible with Java 8, so if you regularly develop using a later version of Java it is very important to compile and test your patches on Java 8. Patches that use features from a later version of Java and do not compile and run on Java 8 will not be accepted.
When you submit a pull request you will be asked to sign a contributor licence agreement if you do not already have one on file. This is to ensure that we at the University of Sheffield have permission to use the code you contribute.
12.3 Creating New Plugins
GATE provides a flexible structure where new resources can be plugged in very easily. There are three types of resources: Language Resource (LR), Processing Resource (PR) and Visual Resource (VR). In the following subsections we describe the necessary steps to write new PRs and VRs, and to add plugins to the nightly build. The guide on writing new LRs will be available soon.
You can quickly create a new plugin project structure using the Maven archetype described in section 7.12.
12.3.1 What to Call your Plugin
Plugins in GATE have two kinds of “name”: the Maven artifact ID (which is what you use when adding the plugin to the plugin manager or loading it via the API) and the <name> in the POM file (which is what is displayed in the plugin manager). The artifact ID should follow normal Maven conventions and be written in “lower-case-with-hyphens”. The human-readable name in the POM file can be anything, but conventionally we use the form “Function: Detail”, for example “Language: Arabic” or “Tagger: Numbers”. This naturally groups similar plugins together in the plugin manager list when it is sorted alphabetically. Before naming your plugin, look at the existing plugins and see where it might group well.
Core GATE plugins use the Maven group ID uk.ac.gate.plugins. If you are not part of the core GATE development team you should use your own group ID, typically based on the reversed form of a DNS domain name you control (e.g. com.example if you owned example.com).
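The naming conventions above can be illustrated with a short sketch. It is not part of GATE: the class PluginNames and its methods are hypothetical, the regular expression is only one reading of “lower-case-with-hyphens” (it also allows digits, which is an assumption), and “Tagger: Example” is an invented display name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of GATE) illustrating the plugin naming conventions. */
final class PluginNames {

  // Assumption: groups of lower-case letters or digits, separated by single hyphens.
  private static final Pattern ARTIFACT_ID = Pattern.compile("[a-z0-9]+(-[a-z0-9]+)*");

  /** True if the artifact ID follows the lower-case-with-hyphens convention. */
  static boolean isConventionalArtifactId(String artifactId) {
    return ARTIFACT_ID.matcher(artifactId).matches();
  }

  /** Returns a sorted copy, mimicking an alphabetically sorted plugin list. */
  static List<String> sortedForDisplay(List<String> displayNames) {
    List<String> sorted = new ArrayList<>(displayNames);
    Collections.sort(sorted);
    return sorted;
  }

  public static void main(String[] args) {
    System.out.println(isConventionalArtifactId("tagger-measurements")); // true
    System.out.println(isConventionalArtifactId("Tagger_Measurements")); // false

    // Names sharing a "Function:" prefix end up next to each other once sorted.
    List<String> names = Arrays.asList("Tagger: Numbers", "Language: Arabic", "Tagger: Example");
    System.out.println(sortedForDisplay(names));
    // [Language: Arabic, Tagger: Example, Tagger: Numbers]
  }
}
```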
12.3.2 Writing a New PR
Class Definition
Below we show a template class definition, which can be used in order to write a new Processing Resource.
package example;

import gate.*;
import gate.creole.*;
import gate.creole.metadata.*;

/**
 * Processing Resource. The @CreoleResource annotation marks this
 * class as a GATE Resource, and gives the information GATE needs
 * to configure the resource appropriately.
 */
@CreoleResource(name = "Example PR",
    comment = "An example processing resource")
public class NewPlugin extends AbstractLanguageAnalyser {

  /*
   * This method is called whenever an object of this class is
   * created, either from the GATE Developer GUI or via the
   * Factory.createResource() method.
   */
  public Resource init() throws ResourceInstantiationException {
    // initialize all required variables here, and throw an
    // exception if the value of any mandatory parameter
    // is not provided

    if(this.rulesURL == null)
      throw new ResourceInstantiationException("rules URL null");

    return this;
  }


  /*
   * This method should provide the actual functionality of the PR
   * (it is where the main execution begins). It is called when
   * the user clicks the "RUN" button in the GATE Developer GUI's
   * application window.
   */
  public void execute() throws ExecutionException {
    // write code here
  }

  /* This method is called to reinitialize the resource. */
  public void reInit() throws ResourceInstantiationException {
    // reinitialization code
  }

  /*
   * There are two types of parameters:
   * 1. Init-time parameters - values for these parameters must be
   *    provided when a new resource is initialized, and are not
   *    supposed to be changed afterwards.
   * 2. Runtime parameters - values for these parameters are provided
   *    at the time of executing the PR, and can be changed before
   *    starting the execution (i.e. before you click on the "RUN"
   *    button in GATE Developer).
   * A parameter myParam is specified by a pair of methods getMyParam
   * and setMyParam (with the first letter of the parameter name
   * capitalized in the normal Java Beans style), with the setter
   * annotated with a @CreoleParameter annotation.
   *
   * For example, to set a value for outputAnnotationSetName:
   */
  private String outputAnnotationSetName;

  // getter and setter methods

  /* get<parameter name with first letter capitalized> */
  public String getOutputAnnotationSetName() {
    return outputAnnotationSetName;
  }

  /* The setter method is annotated to tell GATE that it defines an
   * optional runtime parameter.
   */
  @Optional
  @RunTime
  @CreoleParameter(
      comment = "name of the annotationSet used for output")
  public void setOutputAnnotationSetName(String setName) {
    this.outputAnnotationSetName = setName;
  }

  /** Init-time parameter */
  private ResourceReference rulesURL;

  // getter and setter methods
  public ResourceReference getRulesURL() {
    return rulesURL;
  }

  /* This parameter is not annotated @RunTime or @Optional, so it is a
   * required init-time parameter.
   */
  @CreoleParameter(
      comment = "example of an inittime parameter",
      defaultValue = "resources/morph/default.rul")
  public void setRulesURL(ResourceReference rulesURL) {
    this.rulesURL = rulesURL;
  }
}
Use ResourceReference for things like configuration files. The defaultValue is a path relative to the plugin’s src/main/resources folder, but users can use normal URLs to refer to files outside the plugin’s JAR. Resource files like this should be put into a resources folder (i.e. src/main/resources/resources) as GATE Developer has special support for copying the resources folder out of a plugin to give the user an editable copy of the resource files.
Context Menu
Each resource (LR or PR) has some predefined actions associated with it. These actions appear in a context menu that GATE Developer shows when the user right clicks on the resource. For example, if the selected resource is a Processing Resource, there will be at least four actions available in its context menu: 1. Close 2. Hide 3. Rename and 4. Reinitialize. Further actions can be added by implementing the gate.gui.ActionsPublisher interface in either the LR/PR itself or in any associated VR. The implementing class must then provide the following method.
public List getActions() {
  return actions;
}
Here the variable actions should contain a list of instances of type javax.swing.AbstractAction. A string passed in the constructor of an AbstractAction object appears in the context menu. Adding a null element adds a separator in the menu.
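As an illustration, the actions list might be built as in the sketch below. To keep the example self-contained it does not depend on GATE: the class ExampleActions and the two menu entries are hypothetical, and in a real plugin the class would be your LR, PR or VR and would declare that it implements gate.gui.ActionsPublisher. The element type List<Action> is also an assumption; the snippet above uses a raw List.

```java
import java.awt.event.ActionEvent;
import java.util.ArrayList;
import java.util.List;
import javax.swing.AbstractAction;
import javax.swing.Action;

/**
 * Hypothetical sketch: in a real plugin this would be the LR/PR (or its VR)
 * and would declare "implements gate.gui.ActionsPublisher".
 */
final class ExampleActions {

  private final List<Action> actions = new ArrayList<>();

  ExampleActions() {
    // The string passed to the constructor is the text shown in the menu.
    actions.add(new AbstractAction("Export Results") {
      @Override
      public void actionPerformed(ActionEvent e) {
        System.out.println("Export Results selected");
      }
    });

    // A null element is rendered as a separator.
    actions.add(null);

    actions.add(new AbstractAction("Show Statistics") {
      @Override
      public void actionPerformed(ActionEvent e) {
        System.out.println("Show Statistics selected");
      }
    });
  }

  public List<Action> getActions() {
    return actions;
  }
}
```

Building the list once in the constructor means the same Action instances are returned on every call.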
Listeners
There are at least four important listeners which should be implemented in order to listen to the various relevant events happening in the background. These include:
- CreoleListener
  The CREOLE register keeps information about instances of various resources and refreshes itself on new additions and deletions. To listen to these events, a class should implement gate.event.CreoleListener, which requires the following methods:
  - public void resourceLoaded(CreoleEvent creoleEvent);
  - public void resourceUnloaded(CreoleEvent creoleEvent);
  - public void resourceRenamed(Resource resource, String oldName, String newName);
  - public void datastoreOpened(CreoleEvent creoleEvent);
  - public void datastoreCreated(CreoleEvent creoleEvent);
  - public void datastoreClosed(CreoleEvent creoleEvent);
- DocumentListener
  A traditional GATE document contains text and a set of annotationSets. To be notified about changes in any of these, a class should implement gate.event.DocumentListener, which requires the following methods:
  - public void contentEdited(DocumentEvent event);
  - public void annotationSetAdded(DocumentEvent event);
  - public void annotationSetRemoved(DocumentEvent event);
- AnnotationSetListener
  As the name suggests, an AnnotationSet is a set of annotations. To listen to the addition and deletion of annotations, a class should implement gate.event.AnnotationSetListener and therefore the following methods:
  - public void annotationAdded(AnnotationSetEvent event);
  - public void annotationRemoved(AnnotationSetEvent event);
- AnnotationListener
  Each annotation has a featureMap associated with it, which contains a set of feature names and their respective values. To listen to changes in an annotation, a class should implement gate.event.AnnotationListener and the following method:
  - public void annotationUpdated(AnnotationEvent event);
12.3.3 Writing a New VR
Each resource (PR and LR) can have its own associated visual resource. When the resource is double clicked, its visual resource appears in GATE Developer. The GATE Developer GUI is divided into three visible parts (see Figure 12.1). One of them contains a tree that shows the loaded instances of resources. The one below this is used for various purposes, such as displaying document features and indicating that execution is in progress. This part of the GUI is referred to as ‘small’. The third and largest part of the GUI is referred to as ‘large’. One can specify which of these two should be used for displaying a new visual resource in the creole.xml.
Class Definition
Below we show a template class definition, which can be used in order to write a new Visual Resource.

import gate.*;
import gate.creole.*;
import gate.creole.metadata.*;

/*
 * An example Visual Resource for the New Plugin.
 * Note that here we extend the AbstractVisualResource class.
 * The @CreoleResource annotation associates this VR with the
 * underlying PR type it displays.
 */
@CreoleResource(name = "Visual resource for new plugin",
    guiType = GuiType.LARGE,
    resourceDisplayed = "example.NewPlugin",
    mainViewer = true)
public class NewPluginVR extends AbstractVisualResource {

  /*
   * An init method called when the GUI is initialized for
   * the first time
   */
  public Resource init() {
    // initialize GUI components
    return this;
  }

  /*
   * Here target is the PR class to which this Visual Resource
   * belongs. This method is called after the init() method.
   */
  public void setTarget(Object target) {
    // check if the target is an instance of what you expected
    // and initialize local data structures if required
  }
}
Every document has its own document viewer associated with it. It comes with a single component that shows the text of the original document. GATE provides a way to attach new GUI plugins to the document viewer. The AnnotationSet viewer, AnnotationList viewer and Co-Reference editor are examples of DocumentViewer plugins shipped as part of the core GATE build. These plugins can be displayed either on the right or on top of the document viewer. They can also replace the text viewer in the centre (see Figure 12.1). A separate button is added at the top of the document viewer, which can be pressed to display the GUI plugin.
Below we show a template class definition, which can be used to develop a new DocumentViewer plugin.
import java.awt.Component;
import javax.swing.JPanel;
import gate.creole.metadata.CreoleResource;
import gate.gui.docview.AbstractDocumentView;

/*
 * Note that the class needs to extend the AbstractDocumentView class
 */
@CreoleResource
public class DocumentViewerPlugin extends AbstractDocumentView {

  /* The top level GUI component (a plain JPanel in this template) */
  private JPanel mainPanel;

  /* Implementers should override this method and use it for
   * populating the GUI.
   */
  public void initGUI() {
    // write code to initialize GUI
    mainPanel = new JPanel();
  }

  /* Returns the type of this view */
  public int getType() {
    // it can be any of the following constants
    // from gate.gui.docview.DocumentView:
    // CENTRAL, VERTICAL, HORIZONTAL
    return HORIZONTAL; // chosen here only as an example
  }

  /* Returns the actual UI component this view represents. */
  public Component getGUI() {
    // return the top level GUI component
    return mainPanel;
  }

  /* This method is called whenever the view becomes active. */
  public void registerHooks() {
    // register listeners
  }

  /* This method is called whenever the view becomes inactive. */
  public void unregisterHooks() {
    // do nothing
  }
}
12.3.4 Writing a ‘Ready Made’ Application
Often a CREOLE plugin may contain an example application to showcase the PRs it contains. These ‘ready made’ applications can be made easily available through GATE Developer by creating a simple PackagedController subclass. In essence such a subclass simply references a saved application and provides details that can be used to create a menu item to load the application.
The following example shows how the example application in the tagger-measurements plugin is added to the menus in GATE Developer.
@CreoleResource(
    icon = "measurements", autoinstances = @AutoInstance(parameters = {
      @AutoInstanceParam(name="pipelineURL",
        value="resources/annie-measurements.xgapp"),
      @AutoInstanceParam(name="menu", value="ANNIE")}))
public class ANNIEMeasurements extends PackagedController {

}
The menu parameter is used to specify the folder structure in which the menu item will be placed. Typically its value will just be a single menu name, but it can be a semicolon-separated list of names, which will map to a series of sub-menus. For example "Languages;German" would create a “Languages” menu with a “German” sub-menu, which in turn would contain the menu item for this application.
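The mapping from a menu value to a chain of sub-menus can be sketched as follows. This is not GATE's own parsing code: the class MenuPath is hypothetical, and trimming whitespace and skipping empty names are assumptions made for the illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper (not part of GATE): splits a "menu" value into menu names. */
final class MenuPath {

  /** Returns the menu names from outermost to innermost. */
  static List<String> parse(String menu) {
    List<String> names = new ArrayList<>();
    for (String part : menu.split(";")) {
      String name = part.trim();
      // Assumption: empty segments (e.g. from a trailing semicolon) are ignored.
      if (!name.isEmpty()) {
        names.add(name);
      }
    }
    return names;
  }

  public static void main(String[] args) {
    System.out.println(parse("ANNIE"));            // [ANNIE]
    System.out.println(parse("Languages;German")); // [Languages, German]
  }
}
```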
12.3.5 Distributing Your New Plugins
Since GATE 8.5 plugins are distributed via the normal Maven repository mechanism. Release versions of most core plugins are in the Central Repository and snapshot versions are released via our own Maven repository at http://repo.gate.ac.uk/content/groups/public, along with releases of a few plugins whose dependencies are not in Central.
There are several routes by which you can release your own plugins into the Central Repository; the simplest is to use the Sonatype OSSRH system (which is how we release gate-core and the standard plugins).
For snapshots you can host your own Maven repository, or use the OSSRH snapshot repository. In order to use plugins from a repository other than Central or the GATE team repository mentioned above, you must tell Maven where to find it by creating a file called settings.xml in the .m2 folder under your home directory – GATE will respect any repositories you have configured in your Maven settings.
<settings xmlns="http://maven.apache.org/SETTINGS/1.0.0"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="http://maven.apache.org/SETTINGS/1.0.0
                        http://maven.apache.org/xsd/settings-1.0.0.xsd">

  <profiles>
    <profile>
      <id>my-custom-repo</id>
      <repositories>
        <repository>
          <id>my-repo</id>
          <name>My Personal Repo</name>
          <url>http://repo.example.com/</url>
          <layout>default</layout>
          <releases><enabled>true</enabled></releases>
          <snapshots><enabled>true</enabled></snapshots>
        </repository>
      </repositories>
    </profile>
  </profiles>

  <activeProfiles>
    <activeProfile>my-custom-repo</activeProfile>
  </activeProfiles>
</settings>
12.4 Adding your plugin to the default list
The GATE plugin manager has a list of “default” plugins that are automatically listed in the manager whenever GATE Developer is started. This list is itself maintained in another GitHub repository https://github.com/GateNLP/gate-metadata, with a separate file for each version of GATE. If you have developed and released a plugin that you believe is of wider interest to the GATE user community you can request that it be added to the default list. This is done through the normal GitHub pull request mechanism – fork the gate-metadata repository and commit your change to all the versioned plugins-NNN.tsv files for versions of GATE with which your plugin is compatible, then submit a pull request asking us to merge your change into the master list.
The same procedure applies when you release an updated version of your plugin – update your forked copy of the TSV files and submit another pull request.
12.5 Updating this User Guide
The GATE User Guide is maintained on GitHub at https://github.com/GateNLP/userguide. If you are a developer at Sheffield you do not need to check out the userguide explicitly, as it will appear under the tao directory when you check out sale.
The user guide is written in LaTeX and translated to PDF using pdflatex and to HTML using tex4ht. The main file that ties it all together is tao_main.tex, which defines the various macros used in the rest of the guide and \inputs the other .tex files, one per chapter.
12.5.1 Building the User Guide
You will need:
- A standard POSIX shell environment including GNU Make. On Windows this generally means Cygwin, on Mac OS X the Xcode developer tools and on Unix the relevant packages from your distribution.
- A copy of the userguide sources (see above).
- A LaTeX installation, including pdflatex if you want to build the PDF version, and tex4ht if you want to build the HTML. MiKTeX should work for Windows, texlive (available in MacPorts) for Mac OS X, or your choice of package for Unix.
- The BibTeX database big.bib. It must be located in the directory above where you have checked out the userguide, i.e. if the guide sources are in /home/bob/github/userguide then big.bib needs to go in /home/bob/github. Sheffield developers will find that it is already in the right place, under sale; others will need to download it from http://gate.ac.uk/sale/big.bib.
- The file http://gate.ac.uk/sale/utils.tex.
- A bit of luck.
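The placement rule for big.bib in the list above (one directory above the userguide checkout) can be checked before starting a build. The following is a minimal sketch using only the standard java.nio.file API; the class BigBibLocation is hypothetical and is not part of the guide's build scripts.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper: works out where big.bib is expected for a given checkout. */
final class BigBibLocation {

  /** Returns the expected big.bib path: the parent of the userguide checkout. */
  static Path expectedBigBib(Path userguideCheckout) {
    Path parent = userguideCheckout.toAbsolutePath().normalize().getParent();
    if (parent == null) {
      throw new IllegalArgumentException(
          "checkout has no parent directory: " + userguideCheckout);
    }
    return parent.resolve("big.bib");
  }

  public static void main(String[] args) {
    // Pass the checkout directory as the first argument; defaults to the current directory.
    Path checkout = Paths.get(args.length > 0 ? args[0] : ".");
    Path bib = expectedBigBib(checkout);
    System.out.println(bib + (Files.exists(bib) ? " (found)" : " (missing)"));
  }
}
```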
Once these are all assembled it should be a case of running make to perform the actual build. To build the PDF run make tao.pdf, for the single-page HTML run make index.html, and for the multi-page HTML run make split.html.
The PDF build generally works without problems, but the HTML build is known to hang on some machines for no apparent reason. If this happens to you try again on a different machine.
12.5.2 Making Changes to the User Guide
To make changes to the guide simply edit the relevant .tex files, make sure the guide still builds (at least the PDF version), and check in your changes to the source files only. Please do not check in your own built copy of the guide; the official user guide builds are produced by a Jenkins continuous integration server in Sheffield.
For non-Sheffield developers we welcome documentation patches through the normal GitHub pull request mechanism.
If you add a section or subsection you should use the \sect or \subsect commands rather than the normal LaTeX \section or \subsection. These shorthand commands take an optional first parameter, which is the label to use for the section and should follow the pattern of existing labels. The label is also set as an anchor in the HTML version of the guide. For example a new section for the ‘Fish’ plugin would go in misc-creole.tex with the label sec:misc-creole:fish, and would have the persistent URL http://gate.ac.uk/userguide/sec:misc-creole:fish.
If your changes are to document a bug fix or a new (or removed) feature then you should also add an entry to the change log in recent-changes.tex. You should include a reference to the full documentation for your change, in the same way as the existing changelog entries do. You should find yourself adding to the changelog every time except where you are just tidying up or rewording existing documentation. Unlike in the other source files, if you add a section or subsection to the change log you should use the \rcSect or \rcSubsect commands. Recent changes appear both in the introduction and the appendix, so these commands enable nesting to be done appropriately.
Section/subsection labels should comprise ‘sec’ followed by the chapter label and a descriptive section identifier, each colon-separated. New chapter labels should begin ‘chap:’.
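The label pattern, and the persistent URL derived from it in the ‘Fish’ example, can be written out as a small sketch. The class GuideLabels and its methods are hypothetical names used only for illustration; the base URL is taken from the example above.

```java
/** Hypothetical helper: builds user guide section labels and their persistent URLs. */
final class GuideLabels {

  // Base URL as it appears in the 'Fish' example.
  static final String BASE_URL = "http://gate.ac.uk/userguide/";

  /** Joins "sec", the chapter label and the section identifier with colons. */
  static String sectionLabel(String chapterLabel, String sectionId) {
    return String.join(":", "sec", chapterLabel, sectionId);
  }

  /** The HTML build uses the label as an anchor, which gives the persistent URL. */
  static String persistentUrl(String label) {
    return BASE_URL + label;
  }

  public static void main(String[] args) {
    String label = sectionLabel("misc-creole", "fish");
    System.out.println(label);                // sec:misc-creole:fish
    System.out.println(persistentUrl(label)); // http://gate.ac.uk/userguide/sec:misc-creole:fish
  }
}
```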
Try to avoid changing chapter/section/subsection labels where possible, as this may break links to the section. If you need to change a label, add it in the file ‘sections.map’. Entries in this file are formatted one per line, with the old section label followed by a tab followed by the new section label.
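To make the sections.map format concrete, the sketch below reads entries of the form old label, tab, new label and resolves a label through them. It is not the code the guide's build actually uses: the class SectionsMap is hypothetical, the labels in main are invented, and ignoring blank lines is an assumption.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for sections.map entries: old label, a tab, then the new label. */
final class SectionsMap {

  /** Parses the given lines into an old-label to new-label map. */
  static Map<String, String> parse(Iterable<String> lines) {
    Map<String, String> renames = new LinkedHashMap<>();
    for (String line : lines) {
      if (line.trim().isEmpty()) {
        continue; // assumption: blank lines are ignored
      }
      String[] fields = line.split("\t");
      if (fields.length != 2) {
        throw new IllegalArgumentException("expected old<TAB>new, got: " + line);
      }
      renames.put(fields[0], fields[1]);
    }
    return renames;
  }

  /** Returns the new label if the given one was renamed, otherwise the label itself. */
  static String resolve(Map<String, String> renames, String label) {
    return renames.getOrDefault(label, label);
  }

  public static void main(String[] args) {
    // Invented labels, for illustration only.
    Map<String, String> renames =
        parse(Arrays.asList("sec:misc-creole:fish\tsec:misc-creole:fish-tagger"));
    System.out.println(resolve(renames, "sec:misc-creole:fish")); // sec:misc-creole:fish-tagger
    System.out.println(resolve(renames, "sec:misc-creole:bird")); // sec:misc-creole:bird
  }
}
```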
The quote marks used should be ‘ and ’.
Titles should be in title case (capitalise the first word, nouns, pronouns, verbs, adverbs and adjectives but not articles, conjunctions or prepositions). When referring to a numbered chapter, section, subsection, figure or table, capitalise it, e.g. ‘Section 3.1’. When merely using the words chapter, section, subsection, figure or table, e.g. ‘the next chapter’, do not capitalise them. Proper nouns should be capitalised (‘Java’, ‘Groovy’), as should strings where the capitalisation is significant, but not terms like ‘annotation set’ or ‘document’.
The user guide is rebuilt automatically whenever changes are checked in, so your change should appear in the online version of the guide within 20 or 30 minutes.