Chapter 12
Developing GATE
This chapter describes ways of getting involved in and contributing to the GATE project. Sections 12.1 and 12.2 are good places to start. Sections 12.3 and 12.4 describe protocol and provide information for committers; we cover creating new plugins and updating this user guide. See Section 12.2 for information on becoming a committer.
12.1 Reporting Bugs and Requesting Features
The GATE bug tracker is hosted on SourceForge. When reporting bugs, please give as much detail as possible. Include the GATE version number and build number, the platform on which you observed the bug, and the version of Java you were using (e.g. 1.6.0_03). Include steps to reproduce the problem, and a full stack trace of any exceptions, including ‘Caused by …’. You may wish to check first whether the bug is already fixed in the latest nightly build. The same tracker can be used to request new features.
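As a convenience, the Java and platform details requested above can be read from standard JVM system properties. The following is only a sketch: the class name BugReportEnvironment is hypothetical and not part of GATE, and the GATE version and build number still have to be added by hand.

```java
/** Collects the Java and platform details that a bug report should include. */
final class BugReportEnvironment {

  /** Returns a short multi-line summary built from standard JVM system properties. */
  static String describe() {
    StringBuilder sb = new StringBuilder();
    sb.append("Java version: ").append(System.getProperty("java.version")).append('\n');
    sb.append("Java vendor: ").append(System.getProperty("java.vendor")).append('\n');
    sb.append("Platform: ")
      .append(System.getProperty("os.name")).append(' ')
      .append(System.getProperty("os.version")).append(" (")
      .append(System.getProperty("os.arch")).append(")\n");
    return sb.toString();
  }

  public static void main(String[] args) {
    // Paste the printed text into the bug report.
    System.out.print(describe());
  }
}
```

Running the class with the same Java installation that runs GATE gives the values relevant to the report.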
12.2 Contributing Patches
Patches may be submitted on SourceForge. The best format for patches is an SVN diff against the latest Subversion revision. Save the diff as a file and attach it; do not paste it into the bug report. Note that we generally do not accept patches against earlier versions of GATE. Also, GATE is intended to be compatible with Java 6, so if you regularly develop using a later version of Java it is very important to compile and test your patches on Java 6. Patches that use features from a later version of Java and do not compile and run on Java 6 will not be accepted.
If you intend to submit larger changes, you might prefer to become a committer! We welcome input to the development process of GATE. The code is hosted on SourceForge, providing anonymous Subversion access (see Section 2.2.3). We’re happy to give committer privileges to anyone with a track record of contributing good code to the project. We also make the current version available nightly on the ftp site.
12.3 Creating New Plugins
GATE provides a flexible structure where new resources can be plugged in very easily. There are three types of resources: Language Resource (LR), Processing Resource (PR) and Visual Resource (VR). In the following subsections we describe the necessary steps to write new PRs and VRs, and to add plugins to the nightly build. The guide on writing new LRs will be available soon.
12.3.1 What to Call your Plugin
The plugins are many and the list is constantly expanding. The naming convention aims to impose order and group plugins in a readable manner. When naming new plugins, please adhere to the following guidelines:
- Words comprising plugin names should be capitalized and separated by underscores Like_So. This means that they will format nicely in GATE Developer. For example, ‘Inter_Annotator_Agreement’.
- Plugin names should begin with the word that best describes their function. Practically, this means that words are often reversed from the usual order, for example, the Chemistry Tagger plugin should be called ‘Tagger_Chemistry’. This means that for example parsers will group together alphabetically and thus will be easy to find when someone is looking for parsers. Before naming your plugin, look at the existing plugins and see where it might group well.
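The two guidelines above can be checked mechanically. The sketch below is illustrative only: PluginNames is a hypothetical helper, not a GATE class, and the regular expression is one possible reading of the ‘Like_So’ convention (capitalized words joined by single underscores). It is written to compile on Java 6.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that checks a plugin name against the naming guidelines. */
final class PluginNames {

  // Capitalized words joined by single underscores, e.g. Tagger_Chemistry.
  private static final Pattern LIKE_SO =
      Pattern.compile("[A-Z][A-Za-z0-9]*(_[A-Z][A-Za-z0-9]*)*");

  /** Returns true if the name consists of capitalized words separated by underscores. */
  static boolean followsConvention(String name) {
    return LIKE_SO.matcher(name).matches();
  }

  /** Returns the leading word, which decides where the plugin sorts alphabetically. */
  static String group(String name) {
    int i = name.indexOf('_');
    return i < 0 ? name : name.substring(0, i);
  }
}
```

For instance, followsConvention accepts ‘Inter_Annotator_Agreement’ and rejects ‘Chemistry Tagger’, while group returns ‘Tagger’ for ‘Tagger_Chemistry’, so that plugin lists next to other taggers.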
12.3.2 Writing a New PR
Class Definition
Below we show a template class definition, which can be used in order to write a new Processing Resource.
package example;

import java.net.URL;

import gate.*;
import gate.creole.*;
import gate.creole.metadata.*;

/**
 * Processing Resource. The @CreoleResource annotation marks this
 * class as a GATE Resource, and gives the information GATE needs
 * to configure the resource appropriately.
 */
@CreoleResource(name = "Example PR",
    comment = "An example processing resource")
public class NewPlugin extends AbstractLanguageAnalyser {

  /*
   * This method gets called whenever an object of this
   * class is created, either from the GATE Developer GUI or
   * via the Factory.createResource() method.
   */
  public Resource init() throws ResourceInstantiationException {
    // initialize all required variables here, and throw an
    // exception if the value for any of the mandatory
    // parameters is not provided

    if(this.rulesURL == null)
      throw new ResourceInstantiationException("rules URL null");

    return this;
  }


  /*
   * This method should provide the actual functionality of the PR
   * (it is where the main execution begins). It gets called when
   * the user clicks on the "RUN" button in the GATE Developer
   * GUI's application window.
   */
  public void execute() throws ExecutionException {
    // write code here
  }

  /* This method is called to reinitialize the resource. */
  public void reInit() throws ResourceInstantiationException {
    // reinitialization code
  }

  /*
   * There are two types of parameters:
   * 1. Init-time parameters - values for these parameters need to be
   *    provided at the time of initializing a new resource and these
   *    values are not supposed to be changed.
   * 2. Runtime parameters - values for these parameters are provided
   *    at the time of executing the PR. They can be changed before
   *    starting the execution
   *    (i.e. before you click on the "RUN" button in GATE Developer).
   * A parameter myParam is specified by a pair of methods getMyParam
   * and setMyParam (with the first letter of the parameter name
   * capitalized in the normal Java Beans style), with the setter
   * annotated with a @CreoleParameter annotation.
   *
   * For example, to set a value for outputAnnotationSetName:
   */
  String outputAnnotationSetName;

  // getter and setter methods

  /* get<parameter name with first letter capitalized> */
  public String getOutputAnnotationSetName() {
    return outputAnnotationSetName;
  }

  /* The setter method is annotated to tell GATE that it defines an
   * optional runtime parameter.
   */
  @Optional
  @RunTime
  @CreoleParameter(
      comment = "name of the annotationSet used for output")
  public void setOutputAnnotationSetName(String setName) {
    this.outputAnnotationSetName = setName;
  }

  /** Init-time parameter */
  URL rulesURL;

  // getter and setter methods
  public URL getRulesURL() {
    return rulesURL;
  }

  /* This parameter is not annotated @RunTime or @Optional, so it is a
   * required init-time parameter.
   */
  @CreoleParameter(
      comment = "example of an init-time parameter",
      defaultValue = "resources/morph/default.rul")
  public void setRulesURL(URL rulesURL) {
    this.rulesURL = rulesURL;
  }
}
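The getter/setter naming rule described in the listing's comments (parameter myParam, methods getMyParam and setMyParam) can be illustrated with plain Java reflection. This is a sketch of the convention only and makes no claim about how GATE itself applies parameter values; BeanParameterDemo and ExampleBean are hypothetical stand-ins and do not use any GATE classes.

```java
import java.lang.reflect.Method;

/** Illustrates the Java Beans naming rule used for CREOLE parameters. */
final class BeanParameterDemo {

  /** Stand-in for a resource with one parameter; not a GATE class. */
  static class ExampleBean {
    private String outputAnnotationSetName;

    public String getOutputAnnotationSetName() {
      return outputAnnotationSetName;
    }

    public void setOutputAnnotationSetName(String setName) {
      this.outputAnnotationSetName = setName;
    }
  }

  /** Derives the setter name: "myParam" becomes "setMyParam". */
  static String setterName(String parameterName) {
    return "set" + Character.toUpperCase(parameterName.charAt(0))
        + parameterName.substring(1);
  }

  /** Finds the public setter for the named parameter and calls it with the value. */
  static void setParameter(Object target, String parameterName, Object value) {
    try {
      Method setter =
          target.getClass().getMethod(setterName(parameterName), value.getClass());
      setter.invoke(target, value);
    } catch (Exception e) {
      throw new IllegalArgumentException(
          "No usable setter for parameter " + parameterName, e);
    }
  }
}
```

Calling setParameter(bean, "outputAnnotationSetName", "Results") on an ExampleBean has the same effect as calling bean.setOutputAnnotationSetName("Results") directly, which is why the parameter name and the method names must agree.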
PR Creole Entry
The creole.xml file simply needs to tell GATE which JAR file to look in to find the PR.
<?xml version="1.0"?>
<CREOLE-DIRECTORY>
<JAR SCAN="true">newplugin.jar</JAR>
</CREOLE-DIRECTORY>
Alternatively the configuration can be given in the XML file directly instead of using source annotations. Section 4.7 gives the full details.
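To double-check a creole.xml before loading the plugin, the file can be read with the standard Java XML parser. The sketch below lists the JAR entries marked SCAN="true"; CreoleXmlInspector is a hypothetical helper and not part of GATE, and it only looks at the JAR element and SCAN attribute shown in the listing above.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that reports which JAR files a creole.xml asks GATE to scan. */
final class CreoleXmlInspector {

  /** Returns the text of every JAR element that has SCAN="true". */
  static List<String> scannedJars(String creoleXml) {
    List<String> result = new ArrayList<String>();
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(creoleXml)));
      NodeList jars = doc.getElementsByTagName("JAR");
      for (int i = 0; i < jars.getLength(); i++) {
        Element jar = (Element) jars.item(i);
        if ("true".equalsIgnoreCase(jar.getAttribute("SCAN"))) {
          result.add(jar.getTextContent().trim());
        }
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse creole.xml", e);
    }
    return result;
  }
}
```

Given the creole.xml shown above, scannedJars returns a list containing only newplugin.jar.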
Context Menu
Each resource (LR, PR) has some predefined actions associated with it. These actions appear in a context menu in GATE Developer when the user right-clicks on any of the resources. For example, if the selected resource is a Processing Resource, there will be at least four actions available in its context menu: 1. Close 2. Hide 3. Rename and 4. Reinitialize. Further actions can be added by implementing the gate.gui.ActionsPublisher interface in either the LR/PR itself or in any associated VR. The implementing class must then provide the following method.
  public List getActions() {
    return actions;
  }
Here the variable actions should contain a list of instances of type javax.swing.AbstractAction. A string passed in the constructor of an AbstractAction object appears in the context menu. Adding a null element adds a separator in the menu.
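A list of this shape can be built with the standard Swing classes alone. The sketch below is only an illustration: ExampleActions and the two menu labels are hypothetical, and the class would still need to be wired into a resource that implements gate.gui.ActionsPublisher.

```java
import java.awt.event.ActionEvent;
import java.util.ArrayList;
import java.util.List;
import javax.swing.AbstractAction;
import javax.swing.Action;

/** Builds a list of menu actions of the kind getActions() is expected to return. */
final class ExampleActions {

  static List<Action> build() {
    List<Action> actions = new ArrayList<Action>();

    // The constructor argument is the text shown in the context menu.
    actions.add(new AbstractAction("Export Results") {
      public void actionPerformed(ActionEvent e) {
        System.out.println("Export Results selected");
      }
    });

    // A null element is rendered as a separator.
    actions.add(null);

    actions.add(new AbstractAction("Show Statistics") {
      public void actionPerformed(ActionEvent e) {
        System.out.println("Show Statistics selected");
      }
    });
    return actions;
  }
}
```

A resource could create this list once, for example in init(), store it in the actions field and return it from getActions().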
Listeners
There are at least four important listeners which should be implemented in order to listen to the various relevant events happening in the background. These include:
- CreoleListener
  The CREOLE register keeps information about instances of various resources and refreshes itself on new additions and deletions. To listen to these events, a class should implement gate.event.CreoleListener, which requires the following methods:
  - public void resourceLoaded(CreoleEvent creoleEvent);
  - public void resourceUnloaded(CreoleEvent creoleEvent);
  - public void resourceRenamed(Resource resource, String oldName, String newName);
  - public void datastoreOpened(CreoleEvent creoleEvent);
  - public void datastoreCreated(CreoleEvent creoleEvent);
  - public void datastoreClosed(CreoleEvent creoleEvent);
- DocumentListener
  A traditional GATE document contains text and a set of annotationSets. To be notified about changes in any of these, a class should implement gate.event.DocumentListener, which requires the following methods:
  - public void contentEdited(DocumentEvent event);
  - public void annotationSetAdded(DocumentEvent event);
  - public void annotationSetRemoved(DocumentEvent event);
- AnnotationSetListener
  As the name suggests, an AnnotationSet is a set of annotations. To listen to the addition and deletion of annotations, a class should implement gate.event.AnnotationSetListener and therefore the following methods:
  - public void annotationAdded(AnnotationSetEvent event);
  - public void annotationRemoved(AnnotationSetEvent event);
- AnnotationListener
  Each annotation has a featureMap associated with it, which contains a set of feature names and their respective values. To listen to changes in an annotation, a class should implement gate.event.AnnotationListener and therefore the following method:
  - public void annotationUpdated(AnnotationEvent event);
12.3.3 Writing a New VR
Each resource (PR and LR) can have its own associated visual resource. When the resource is double-clicked, its visual resource appears in GATE Developer. The GATE Developer GUI is divided into three visible parts (see Figure 12.1). One of them contains a tree that shows the loaded instances of resources. The one below it serves various purposes, such as displaying document features and indicating that execution is in progress; this part of the GUI is referred to as ‘small’. The third and largest part of the GUI is referred to as ‘large’. One can specify in the creole.xml which of these two should be used for displaying a new visual resource.
Class Definition
Below we show a template class definition, which can be used in order to write a new Visual Resource.
package example;

import gate.*;
import gate.creole.*;
import gate.creole.metadata.*;

/*
 * An example Visual Resource for the New Plugin.
 * Note that here we extend the AbstractVisualResource class.
 * The @CreoleResource annotation associates this VR with the
 * underlying PR type it displays.
 */
@CreoleResource(name = "Visual resource for new plugin",
    guiType = GuiType.LARGE,
    resourceDisplayed = "example.NewPlugin",
    mainViewer = true)
public class NewPluginVR extends AbstractVisualResource {

  /*
   * An init method called when the GUI is initialized for
   * the first time.
   */
  public Resource init() {
    // initialize GUI components
    return this;
  }

  /*
   * Here target is the PR instance to which this Visual Resource
   * belongs. This method is called after the init() method.
   */
  public void setTarget(Object target) {
    // check if the target is an instance of what you expected
    // and initialize local data structures if required
  }
}
Every document has its own document viewer associated with it. It comes with a single component that shows the text of the original document. GATE provides a way to attach new GUI plugins to the document viewer. The AnnotationSet viewer, the AnnotationList viewer and the Co-Reference editor are examples of DocumentViewer plugins shipped as part of the core GATE build. These plugins can be displayed either on the right or on top of the document viewer. They can also replace the text viewer in the center (see Figure 12.1). A separate button is added at the top of the document viewer which can be pressed to display the GUI plugin.
Below we show a template class definition, which can be used to develop a new DocumentViewer plugin.
package example;

import java.awt.Component;
import javax.swing.JPanel;

import gate.creole.metadata.CreoleResource;
import gate.gui.docview.AbstractDocumentView;

/*
 * Note that the class needs to extend the AbstractDocumentView class.
 */
@CreoleResource
public class DocumentViewerPlugin extends AbstractDocumentView {

  /* Top-level component of this view, created in initGUI(). */
  private JPanel mainPanel;

  /* Implementers should override this method and use it for
   * populating the GUI.
   */
  public void initGUI() {
    // write code to initialize GUI
    mainPanel = new JPanel();
  }

  /* Returns the type of this view. */
  public int getType() {
    // it can be any of the following constants
    // from gate.gui.docview.DocumentView:
    // CENTRAL, VERTICAL, HORIZONTAL
    return CENTRAL;
  }

  /* Returns the actual UI component this view represents. */
  public Component getGUI() {
    // return the top-level GUI component
    return mainPanel;
  }

  /* This method is called whenever the view becomes active. */
  public void registerHooks() {
    // register listeners
  }

  /* This method is called whenever the view becomes inactive. */
  public void unregisterHooks() {
    // unregister the listeners registered in registerHooks()
  }
}
12.3.4 Writing a ‘Ready Made’ Application
Often a CREOLE plugin may contain an example application to showcase the PRs it contains. These ‘ready made’ applications can be made easily available through GATE Developer by creating a simple PackagedController subclass. In essence such a subclass simply references a saved application and provides details that can be used to create a menu item to load the application.
The following example shows how the example application in the Tagger_Measurement plugin is added to the menus in GATE Developer.
@CreoleResource(
    icon = "measurements", autoinstances = @AutoInstance(parameters = {
      @AutoInstanceParam(name="pipelineURL",
          value="resources/annie-measurements.xgapp"),
      @AutoInstanceParam(name="menu", value="ANNIE")}))
public class ANNIEMeasurements extends PackagedController {

}
The menu parameter is used to specify the folder structure in which the menu item will be placed. This is a list and works in the same fashion as adding tools to the Tools menu (see Section 4.8.1).
12.3.5 Distributing Your New Plugins
Adding Plugins to the Nightly Build
Each new resource added as a plugin should have its own subfolder under the %GATEHOME%/plugins folder with an associated creole.xml file. A plugin can have one or more resources declared in its creole.xml file and/or using source-level annotations as described in Section 4.7.
If you add a new plugin and want it to be part of the build process, you should create a build.xml file with targets ‘build’, ‘test’, ‘distro.prepare’, ‘javadoc’ and ‘clean’. The build target should build the JAR file, test should run any unit tests, distro.prepare should clean up any intermediate files (e.g. the classes/ directory) and leave just what’s in Subversion, plus the compiled JAR file and javadocs. The clean target should clean up everything, including the compiled JAR and any generated sources, etc. You should also add your plugin to ‘plugins.to.build’ in the top-level build.xml to include it in the build. This is by design - not all the plugins have build files, and of the ones that do, not all are suitable for inclusion in the nightly build (viz. SUPPLE, Section 18.3).
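Before adding a plugin to ‘plugins.to.build’, it can be useful to confirm that its build.xml declares all five targets named above. The following sketch does this with the standard Java XML parser; BuildFileChecker is a hypothetical helper and not part of the GATE build, and it only checks that targets with these names exist, not what they do.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical helper that reports which required Ant targets a build.xml lacks. */
final class BuildFileChecker {

  private static final String[] REQUIRED =
      {"build", "test", "distro.prepare", "javadoc", "clean"};

  /** Returns the required target names that are not declared, in the order listed above. */
  static List<String> missingTargets(String buildXml) {
    Set<String> found = new HashSet<String>();
    try {
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(buildXml)));
      NodeList targets = doc.getElementsByTagName("target");
      for (int i = 0; i < targets.getLength(); i++) {
        found.add(((Element) targets.item(i)).getAttribute("name"));
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not parse build.xml", e);
    }
    List<String> missing = new ArrayList<String>();
    for (String name : REQUIRED) {
      if (!found.contains(name)) {
        missing.add(name);
      }
    }
    return missing;
  }
}
```

An empty result means all five targets are present.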
Note that if you currently build GATE by running ‘ant jar’, this does not build the plugins. Running just ‘ant’ or ‘ant all’ will do so.
Hosting A Plugin Repository
If you don’t wish to add your new plugin to the main GATE distribution then the easiest way to distribute it to other GATE users is by hosting a plugin repository.
A plugin repository is a simple XML file that points to one or more CREOLE plugins which can be downloaded and installed via the GATE plugin manager (see Section 3.6). The XML is structured as follows:
<UpdateSite>
<CreolePlugin url="http://example.url.com/plugins/sample1/" />
<CreolePlugin url="sample2/" downloadURL="http://example.url.com/sample2.zip" />
</UpdateSite>
The structure of this file is fairly self-explanatory. Each CreolePlugin element must contain a url attribute which points to a CREOLE directory, i.e. a directory which contains a creole.xml file as described in Section 4.7. Note that for plugins distributed via this method the ID and VERSION attributes of the CREOLE-DIRECTORY element must be provided. The URL can be either absolute (as in the first example) or relative; relative URLs are resolved against the location of the XML file. Each CreolePlugin can also, optionally, contain a downloadURL attribute. If present, this should point to a zip file containing a compiled copy of the plugin. If the downloadURL is not present, the zip is assumed to be a file called creole.zip in the directory referenced by the url attribute.
Regardless of the location of the zip file containing the plugin, it should, at the top level, contain a single directory which in turn contains the full plugin including creole.xml etc.
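The resolution rules described above can be tried out with java.net.URI. The sketch below is not the plugin manager's code: UpdateSiteReader is a hypothetical helper, the location of the repository file is supplied by the caller, and resolving a relative downloadURL against that location is an assumption, since the text only states this for the url attribute.

```java
import java.io.StringReader;
import java.net.URI;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical reader for a plugin repository file. */
final class UpdateSiteReader {

  /** Maps each plugin's resolved CREOLE directory URL to its resolved zip URL. */
  static Map<String, String> read(String siteXml, String siteLocation) {
    Map<String, String> plugins = new LinkedHashMap<String, String>();
    try {
      URI base = new URI(siteLocation);
      Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(siteXml)));
      NodeList nodes = doc.getElementsByTagName("CreolePlugin");
      for (int i = 0; i < nodes.getLength(); i++) {
        Element plugin = (Element) nodes.item(i);
        // Relative url values are resolved against the repository file.
        URI directory = base.resolve(plugin.getAttribute("url"));
        // getAttribute returns "" when the attribute is absent.
        String download = plugin.getAttribute("downloadURL");
        URI zip = download.length() == 0
            ? directory.resolve("creole.zip")  // default location
            : base.resolve(download);          // assumption: same base as url
        plugins.put(directory.toString(), zip.toString());
      }
    } catch (Exception e) {
      throw new IllegalArgumentException("Could not read plugin repository", e);
    }
    return plugins;
  }
}
```

If the example repository above were served from http://example.url.com/plugins/site.xml (a made-up location), the first plugin would be downloaded from http://example.url.com/plugins/sample1/creole.zip, and the second would resolve to the directory http://example.url.com/plugins/sample2/ with its zip at http://example.url.com/sample2.zip.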
12.4 Updating this User Guide
The GATE User Guide is maintained in the GATE Subversion repository at SourceForge. If you are a developer at Sheffield you do not need to check out the userguide explicitly, as it will appear under the tao directory when you check out sale. For others, you can check it out as follows:
svn checkout https://svn.sourceforge.net/svnroot/gate/userguide/trunk userguide
The user guide is written in LaTeX and translated to PDF using pdflatex and to HTML using tex4ht. The main file that ties it all together is tao_main.tex, which defines the various macros used in the rest of the guide and \inputs the other .tex files, one per chapter.
12.4.1 Building the User Guide
You will need:
- A standard POSIX shell environment including GNU Make. On Windows this generally means Cygwin, on Mac OS X the Xcode developer tools, and on Unix the relevant packages from your distribution.
- A copy of the userguide sources (see above).
- A LaTeX installation, including pdflatex if you want to build the PDF version, and tex4ht if you want to build the HTML. MiKTeX should work for Windows, texlive (available in MacPorts) for Mac OS X, or your choice of package for Unix.
- The BibTeX database big.bib. It must be located in the directory above where you have checked out the userguide, i.e. if the guide sources are in /home/bob/svn/userguide then big.bib needs to go in /home/bob/svn. Sheffield developers will find that it is already in the right place, under sale; others will need to download it from http://gate.ac.uk/sale/big.bib.
- The file http://gate.ac.uk/sale/utils.tex.
- A bit of luck.
Once these are all assembled it should be a case of running make to perform the actual build. To build the PDF do make tao.pdf, for the one page HTML do make index.html and for the several pages HTML do make split.html.
The PDF build generally works without problems, but the HTML build is known to hang on some machines for no apparent reason. If this happens to you try again on a different machine.
12.4.2 Making Changes to the User Guide
To make changes to the guide simply edit the relevant .tex files, make sure the guide still builds (at least the PDF version), and check in your changes to the source files only. Please do not check in your own built copy of the guide; the official user guide builds are produced by a Hudson continuous integration server in Sheffield.
If you add a section or subsection you should use the \sect or \subsect commands rather than the normal LaTeX \section or \subsection. These shorthand commands take an optional first parameter, which is the label to use for the section and should follow the pattern of existing labels. The label is also set as an anchor in the HTML version of the guide. For example a new section for the ‘Fish’ plugin would go in misc-creole.tex with a heading of:
  \sect[sec:misc-creole:fish]{The Fish Plugin}
and would have the persistent URL http://gate.ac.uk/userguide/sec:misc-creole:fish.
If your changes document a bug fix or a new (or removed) feature then you should also add an entry to the change log in recent-changes.tex. You should include a reference to the full documentation for your change, in the same way as the existing changelog entries do. You should find yourself adding to the changelog every time except where you are just tidying up or rewording existing documentation. Unlike in the other source files, if you add a section or subsection to recent-changes.tex you should use the \rcSect or \rcSubsect commands. Recent changes appear both in the introduction and the appendix, so these commands enable nesting to be done appropriately.
Section/subsection labels should comprise ‘sec’ followed by the chapter label and a descriptive section identifier, each colon-separated. New chapter labels should begin ‘chap:’.
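The label pattern is simple enough to express as string concatenation. The sketch below is illustrative only: GuideLabels is a hypothetical helper, and the persistent URL prefix is taken from the ‘Fish’ example earlier in this section.

```java
/** Hypothetical helper that builds user guide labels and their persistent URLs. */
final class GuideLabels {

  /** Chapter labels begin with "chap:". */
  static String chapterLabel(String chapter) {
    return "chap:" + chapter;
  }

  /** Section labels are "sec", the chapter label and a section identifier, colon-separated. */
  static String sectionLabel(String chapter, String section) {
    return "sec:" + chapter + ":" + section;
  }

  /** The label doubles as the anchor in the HTML version of the guide. */
  static String persistentUrl(String label) {
    return "http://gate.ac.uk/userguide/" + label;
  }
}
```

For example, sectionLabel("misc-creole", "fish") gives sec:misc-creole:fish, and passing that label to persistentUrl gives http://gate.ac.uk/userguide/sec:misc-creole:fish.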
Try to avoid changing chapter/section/subsection labels where possible, as this may break links to the section. If you need to change a label, add it in the file ‘sections.map’. Entries in this file are formatted one per line, with the old section label followed by a tab followed by the new section label.
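The format of ‘sections.map’ (old label, a tab, new label, one entry per line) can be checked with a few lines of Java before committing. This is a sketch under that stated format only; SectionsMap is a hypothetical helper, the labels in the usage note are made up, and the real build may process the file differently.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical parser for the contents of sections.map. */
final class SectionsMap {

  /** Returns old label to new label, in file order; rejects malformed lines. */
  static Map<String, String> parse(String contents) {
    Map<String, String> renames = new LinkedHashMap<String, String>();
    for (String line : contents.split("\r?\n")) {
      if (line.trim().length() == 0) {
        continue; // ignore blank lines
      }
      String[] parts = line.split("\t");
      if (parts.length != 2) {
        throw new IllegalArgumentException("Expected old<TAB>new but got: " + line);
      }
      renames.put(parts[0].trim(), parts[1].trim());
    }
    return renames;
  }
}
```

Parsing a line such as sec:misc-creole:fish, a tab, then sec:misc-creole:fishtagger yields a single entry mapping the old label to the new one; a line that uses spaces instead of a tab is rejected.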
The quote marks used should be ‘ and ’.
Titles should be in title case (capitalise the first word, nouns, pronouns, verbs, adverbs and adjectives but not articles, conjunctions or prepositions). When referring to a numbered chapter, section, subsection, figure or table, capitalise it, e.g. ‘Section 3.1’. When merely using the words chapter, section, subsection, figure or table, e.g. ‘the next chapter’, do not capitalise them. Proper nouns should be capitalised (‘Java’, ‘Groovy’), as should strings where the capitalisation is significant, but not terms like ‘annotation set’ or ‘document’.
The user guide is rebuilt automatically whenever changes are checked in, so your change should appear in the online version of the guide within 20 or 30 minutes.