Appendix C
Ant Tasks for GATE
This appendix describes the Ant tasks provided by GATE that you can use in your own build files. The tasks require Ant 1.7 or later.
C.1 Declaring the Tasks
To use the GATE Ant tasks in your build file you must include the following <typedef> (where ${gate.home} is the location of your GATE installation):
<typedef resource="gate/util/ant/antlib.xml">
  <classpath>
    <pathelement location="${gate.home}/bin/gate.jar" />
    <fileset dir="${gate.home}/lib" includes="*.jar" />
  </classpath>
</typedef>
If you have problems with library conflicts you should be able to reduce the JAR files included from the lib directory to just jdom, xstream and jaxen (plus stax-api and wstx-lgpl if you are running on Java 5, but these are not required on Java 6).
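A reduced classpath along these lines might look like the following sketch. The file name patterns are assumptions for illustration; check them against the JAR names actually present in your lib directory.

```xml
<typedef resource="gate/util/ant/antlib.xml">
  <classpath>
    <pathelement location="${gate.home}/bin/gate.jar" />
    <!-- Assumed name patterns: adjust to match the JARs shipped with your GATE version -->
    <fileset dir="${gate.home}/lib"
             includes="jdom*.jar, xstream*.jar, jaxen*.jar" />
  </classpath>
</typedef>
```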
C.2 The packagegapp task - bundling an application with its dependencies
C.2.1 Introduction
GATE saved application states (GAPP files) are an XML representation of the state of a GATE application. One of the features of a GAPP file is that it holds references to the external resource files used by the application as paths relative to the location of the GAPP file itself. This is useful in many cases but if you want to package up a copy of an application to send to a third party or to use in a web application, etc., then you need to be very careful to save the file in a directory above all its resources, and package the resources up with the GAPP file at the same relative paths. If the application refers to resources outside its own file tree (i.e. with relative paths that include ..) then you must either maintain this structure or manually edit the XML to move the resource references around and copy the files to the right places to match. This can be quite tedious and error-prone...
The packagegapp Ant task aims to automate this process. It extracts all the relative paths from a GAPP file, writes a modified version of the file with these paths rewritten to point to locations below the new GAPP file location (i.e. with no .. path segments) and copies the referenced files to their rewritten locations. The result is a directory structure that can be easily packaged into a zip file or similar and moved around as a self-contained unit.
This Ant task is the underlying driver for the “Export for Teamware” option described in section 3.24. Export for Teamware does the equivalent of:
<packagegapp src="sourceFile.gapp"
             destfile="{tempdir}/application.xgapp"
             copyPlugins="yes"
             copyResourceDirs="yes"
             onUnresolved="recover" />
followed by packaging the temporary directory into a zip file. These options are explained in detail below.
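As a rough sketch of the same sequence in your own build file (not the actual Teamware export code), the target below packages into a scratch directory and then zips it. The target name, the tempdir property, the scratch directory name and the zip file name are all assumptions chosen for illustration, and the <typedef> from section C.1 is assumed to be declared already.

```xml
<target name="export-zip">
  <!-- Assumption: a scratch directory under the JVM temp dir; the name is illustrative -->
  <property name="tempdir" location="${java.io.tmpdir}/gapp-export" />
  <delete dir="${tempdir}" />
  <mkdir dir="${tempdir}" />

  <!-- Same attributes as the invocation shown above -->
  <packagegapp src="sourceFile.gapp"
               destfile="${tempdir}/application.xgapp"
               copyPlugins="yes"
               copyResourceDirs="yes"
               onUnresolved="recover" />

  <!-- Package the whole scratch directory, then clean up -->
  <zip destfile="application.zip" basedir="${tempdir}" />
  <delete dir="${tempdir}" />
</target>
```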
The packagegapp task requires Ant 1.7 or later.
C.2.2 Basic Usage
In many cases, the following simple invocation will do what you want:
<packagegapp src="original.xgapp"
             destfile="package/target.xgapp" />
Note that the parent directory of the destfile (in this case package) must already exist. It will not be created automatically.
This will perform the following steps:
1. Read in the original.xgapp file and extract all the relative paths it contains.
2. For each plugin referred to by a relative path, foo/bar/MyPlugin, rewrite the plugin location to be plugins/MyPlugin (relative to the location of the destfile).
3. For each resource file referred to by the GAPP, see if it lives under the original location of one of the plugins moved in the previous step. If so, rewrite its location relative to the new location of the plugin.
4. If there are any relative resource paths that are not accounted for by the above rule (i.e. they do not live inside a referenced plugin), the build fails (see section C.2.3 for how to change this behaviour).
5. Write out the modified GAPP to the destfile.
6. Recursively copy the whole content of each of the plugins from step 2 to their new locations [1].
This means that all the relative paths in the new GAPP file (package/target.xgapp) will point to plugins/Something. You can now bundle up the whole package directory and take it elsewhere.
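Putting the pieces together, a complete build file for the basic case might look like the sketch below. The project name, target name, archive name and the gate.home location are assumptions; only the <typedef> and <packagegapp> elements come from this appendix.

```xml
<project name="package-app" default="package" basedir=".">
  <!-- Assumption: point this at your own GATE installation -->
  <property name="gate.home" location="/opt/gate" />

  <typedef resource="gate/util/ant/antlib.xml">
    <classpath>
      <pathelement location="${gate.home}/bin/gate.jar" />
      <fileset dir="${gate.home}/lib" includes="*.jar" />
    </classpath>
  </typedef>

  <target name="package">
    <!-- The parent directory of destfile is not created automatically -->
    <mkdir dir="package" />
    <packagegapp src="original.xgapp"
                 destfile="package/target.xgapp" />
    <!-- Bundle the self-contained directory; a zip task would work equally well -->
    <tar destfile="my-app.tar.gz" basedir="package" compression="gzip" />
  </target>
</project>
```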
C.2.3 Handling non-plugin resources
By default, the task only handles relative resource paths that point within one of the plugins that the GAPP refers to. However, many applications refer to resources that live outside the plugin directories, for example custom JAPE grammars, gazetteer lists, etc. The task provides two approaches to support this: it can handle the unresolved references automatically, or you can provide your own “hints” to augment the default plugin-based ones.
Resolving unresolved resources
By default, the build will fail if there are any relative paths that cannot be accounted for by the plugins (or the explicit hints, see “Providing mapping hints” below). However, this is configurable using the onUnresolved attribute, which can take the following values:
- fail (default): the build fails if an unresolved relative path is found.
- absolute: unresolved relative paths are left pointing to the same location as in the original file, but as an absolute rather than a relative URL. The same file will be used even if you move the GAPP file to a different directory. This option is useful if the resource in question is visible at the same absolute location on the machine where you will be putting the packaged file (for example a very large dictionary or ontology held on a network share).
- recover: attempt to recover gracefully (see below).
With onUnresolved="recover", unresolved resources are relocated to a directory named application-resources under the target GAPP file location. Resources from the same original directory are copied to the same subdirectory of application-resources, while files from different original directories are copied to different subdirectories. Typically, a resource whose original location was .../myresources/grammar/clever.jape would be given the target location application-resources/grammar/clever.jape, but if the application also referred to (say) .../otherresources/grammar/clean.jape then this would be mapped into application-resources/grammar-1 to avoid a name clash.
Example:
<packagegapp src="original.xgapp" destfile="package/target.xgapp"
             onUnresolved="recover" />
Providing mapping hints
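For comparison, a sketch of the absolute setting, which may suit an application whose only non-plugin resource is a large file on a share mounted at the same path on the target machine:

```xml
<!-- Unresolved relative paths are rewritten as absolute URLs; nothing outside the plugins is copied -->
<packagegapp src="original.xgapp" destfile="package/target.xgapp"
             onUnresolved="absolute" />
```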
By default, the task knows how to handle resources that live inside plugins. You can think of this as a “hint” /foo/bar/MyPlugin -> plugins/MyPlugin, saying that whenever the mapper finds a resource path of the form /foo/bar/MyPlugin/X, it will relocate it to plugins/MyPlugin/X relative to the output GAPP file. You can specify your own hints, which will be used in the same way.
<packagegapp src="original.xgapp" destfile="package/target.xgapp">
  <hint from="${user.home}/my-app-v1" to="resources/my-app" />
  <hint from="/share/data/bigfiles" absolute="yes" />
</packagegapp>
In this example, ~/my-app-v1/grammar/main.jape would be mapped to resources/my-app/grammar/main.jape (as always, relative to the output GAPP file). You can also hint that certain resources should be converted to absolute paths rather than being packaged with the application, using absolute="yes". The from and to values refer to directories - you cannot hint a single file, nor put two files from the same original directory into different directories in the packaged GAPP.
Explicit hints override the default plugin-based hints. For example, given the hint from="${gate.home}/plugins/ANNIE/resources" to="resources/ANNIE", resources within the ANNIE plugin would be mapped into resources/ANNIE, but the plugin's creole.xml itself would still be mapped into plugins/ANNIE.
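Written out as a build file fragment, the ANNIE override described above would be something like this sketch (it assumes the GAPP loads ANNIE from ${gate.home}/plugins/ANNIE):

```xml
<packagegapp src="original.xgapp" destfile="package/target.xgapp">
  <!-- Files under ANNIE/resources go to resources/ANNIE;
       everything else in the plugin, including creole.xml,
       still follows the default hint and goes to plugins/ANNIE -->
  <hint from="${gate.home}/plugins/ANNIE/resources" to="resources/ANNIE" />
</packagegapp>
```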
As well as providing the hints inline in the build file, you can also read them from a file in the normal Java Properties format [2], using
<hint file="hints.properties" />
The keys in the property file are the from paths (in this case, relative paths are resolved against the project base directory, as with the location attribute of a property task) and the values are the to paths relative to the output file location.
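As an illustration, the sketch below shows a hypothetical hints.properties (its contents appear in the comment) next to the task that loads it. The directory names are invented for the example. Whether an absolute="yes" hint can be expressed in the properties file is not covered here, so that case is left to inline hints.

```xml
<!-- Hypothetical contents of hints.properties, one "from=to" pair per line:

       my-app-v1=resources/my-app
       shared/gazetteers=resources/gazetteers

     The keys are resolved against the project base directory;
     the values are relative to the output GAPP file. -->
<packagegapp src="original.xgapp" destfile="package/target.xgapp">
  <hint file="hints.properties" />
</packagegapp>
```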
C.2.4 Streamlining your plugins
By default, the task will recursively copy the whole content of every plugin into the target directory. In most cases this is OK, but your plugins may contain many extraneous resources that are not used by your application. In this case you can specify copyPlugins="no":
<packagegapp src="original.xgapp" destfile="package/target.xgapp"
             copyPlugins="no" />
In this mode, the packager task will copy only the following files from each plugin:
- creole.xml
- any JAR files referenced from <JAR> elements in creole.xml
In addition it will of course copy any files directly referenced by the GAPP, but not files referenced indirectly (the classic examples being .lst files used by a gazetteer .def, or the individual phases of a multiphase JAPE grammar) or files that are referenced by the creole.xml itself as AUTOINSTANCE parameters (e.g. the annotation schemas in ANNIE). You will need to name these extra files explicitly as extra resources (see the next section).
C.2.5 Bundling extra resources
Apart from plugins (when you don’t use copyPlugins="no"), the only files copied into the target directory are those that are referenced directly from the GAPP file. This is often but not always sufficient. For example, if your application contains a multiphase JAPE transducer then packagegapp will include the main JAPE file but not the individual phase files. The task provides two ways to include extra files in the package:
- If you set the attribute copyResourceDirs="yes" on the packagegapp task then whenever the task packages a referenced resource file it will also recursively include the whole contents of the directory containing that file in the output package. You probably don’t want to use this option if you have resource files in a directory shared with other files (e.g. your home directory...).
- To include specific extra resources you can use an <extraresourcespath> (see below).
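A minimal sketch of the first option follows. Combining it with onUnresolved="recover" is an assumption made here so that resource directories outside the plugins do not fail the build; it mirrors the attribute combination shown for the Teamware export in section C.2.1.

```xml
<!-- Every directory that contains a referenced resource file is copied recursively -->
<packagegapp src="original.xgapp" destfile="package/target.xgapp"
             copyResourceDirs="yes"
             onUnresolved="recover" />
```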
The <extraresourcespath> element lets you name individual extra files that should be included in the package:
<packagegapp src="original.xgapp" destfile="package/target.xgapp">
  <extraresourcespath>
    <pathelement location="${user.home}/common-files/README" />
    <fileset dir="${user.home}/my-app-v1" includes="grammar/*.jape" />
  </extraresourcespath>
</packagegapp>
As the name suggests, this is a path-like structure and supports all the usual elements and attributes of an Ant <path>, including multiple nested fileset, filelist, pathelement and other path elements. For specific types of indirect references, there are helper elements that can be included under extraresourcespath. Currently the only one of these is gazetteerlists, which takes the path to a gazetteer definition file and returns the set of .lst files the definition uses:
<gazetteerlists definition="my/resources/lists.def" encoding="UTF-8" />
Other helpers (e.g. for multiphase JAPE) may be implemented in future.
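To show where the helper sits, here is a sketch that combines copyPlugins="no" with an <extraresourcespath> naming the indirectly referenced files. The grammar directory is a hypothetical location, and onUnresolved="recover" is an assumption added so that files outside the plugins are relocated rather than failing the build.

```xml
<packagegapp src="original.xgapp" destfile="package/target.xgapp"
             copyPlugins="no"
             onUnresolved="recover">
  <extraresourcespath>
    <!-- The .lst files used by this gazetteer definition -->
    <gazetteerlists definition="my/resources/lists.def" encoding="UTF-8" />
    <!-- Hypothetical directory holding the phases of a multiphase JAPE grammar -->
    <fileset dir="my/resources/grammar" includes="*.jape" />
  </extraresourcespath>
</packagegapp>
```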
You can also refer to a path defined elsewhere in the usual way:
<path id="extra.files">
  ...
</path>

<packagegapp ...>
  <extraresourcespath refid="extra.files" />
</packagegapp>
Resources declared in the extraresourcespath and directories included using copyResourceDirs are treated exactly the same as resources that are referenced by the GAPP file - their target locations in the package are determined by the mapping hints, default plugin-based hints, and the onUnresolved setting as above. If you want to put extra resource files at specific locations in the package tree, independent of the mapping hints mechanism, you should do this with a separate <copy> task after the <packagegapp> task has done its work.
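For instance, to place a README at the top of the package regardless of the mapping hints, a sketch along these lines should work (the README path is taken from the earlier example, and the package directory is assumed to exist):

```xml
<packagegapp src="original.xgapp" destfile="package/target.xgapp" />

<!-- Runs after packaging, so the file lands exactly where you put it -->
<copy file="${user.home}/common-files/README" todir="package" />
```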
C.3 The expandcreoles task - merging annotation-driven config into creole.xml
The expandcreoles task reads a number of creole.xml files from plugins, processes any @CreoleResource and @CreoleParameter annotations on the declared resource classes, and merges this configuration with the original XML configuration into a new copy of the creole.xml. It is not necessary to do this in the normal use of GATE, and this task is documented here only for completeness. It is intended for use with non-GATE tools that can process the creole.xml file format to extract information about plugins (the prime use case for this is to generate the GATE plugins information page automatically from the plugin definitions).
The typical usage of this task (taken from the GATE build.xml) is:
<expandcreoles todir="build/plugins" gatehome="${basedir}">
  <fileset dir="plugins" includes="*/creole.xml" />
</expandcreoles>
This will initialise GATE with the given GATE_HOME directory, then read each file from the nested fileset, parse it as a creole.xml, expand it from any annotation configuration, and write it out to a file under build/plugins. Each output file will be generated at the same location relative to the todir as the original file was relative to the dir of its fileset.
[1] This is done with an Ant copy task and so is subject to the normal defaultexcludes.
[2] The hint tag supports all the attributes of the standard Ant property tag, so it can load the hints from a file on disk or from a resource in a JAR file.