Chapter 2
Installing and Running GATE
2.1 Downloading GATE
To download GATE, point your web browser at http://gate.ac.uk/download/.
2.2 Installing and Running GATE
GATE will run anywhere that supports Java 5 or later, including Solaris, Linux, Mac OS X and Windows platforms. We don’t run tests on other platforms, but have had reports of successful installs elsewhere.
2.2.1 The Easy Way
The easy way to install is to use one of the platform-specific installers (created using the excellent IzPack). Download a ‘platform-specific installer’ and follow the instructions it gives you. Once the installation is complete, you can start GATE Developer using gate.exe (Windows) or GATE.app (Mac) in the top-level installation directory, or gate.sh in the bin directory (other platforms).
Note for Mac users: on 64-bit-capable systems, GATE.app will run as a 64-bit application. It will use the first listed 64-bit JVM in your Java Preferences, even if your highest priority JVM is a 32-bit one. Thus if you want to run using Java 5 rather than 6 you must ensure that “J2SE 5.0 64-bit” is listed ahead of “Java SE 6 64-bit”.
2.2.2 The Hard Way (1)
Download the Java-only release package or the binary build snapshot, and follow the instructions below.
Prerequisites:
- A conforming Java 2 environment, available free from Sun Microsystems or from your UNIX supplier:
  - version 1.4.2 or above for GATE 3.1;
  - version 5.0 for GATE 4.0 beta 1 or later.
  (We test on various Sun JDKs on Solaris, Linux and Windows XP.)
- Binaries from the GATE distribution you downloaded: gate.jar, lib/ext/guk.jar (Unicode editing support) and a suitable script to start Ant, e.g. ant.sh or ant.bat. These are held in a directory called bin like this:
  .../bin/
    gate.jar
    ant.sh
    ant.bat
  You will also need the lib directory, containing various libraries that GATE depends on.
- An open mind and a sense of humour.
Using the binary distribution:
- Unpack the distribution, creating a directory containing jar files and scripts.
- To run GATE Developer: on Windows, start a Command Prompt window, change to the directory where you unpacked the GATE distribution and run ‘bin/ant.bat run’; on UNIX or Mac, open a terminal window and run ‘bin/ant run’.
- To embed GATE as a library (GATE Embedded), put gate.jar and all the libraries in the lib directory in your CLASSPATH and tell Java that guk.jar is an extension (-Djava.ext.dirs=path-to-guk.jar).
The Ant scripts that start GATE Developer (ant.bat or ant) require you to set the JAVA_HOME environment variable to point to the top level directory of your Java installation. The value of GATE_CONFIG is passed to the system by the scripts using either a -i command-line option, or the Java property gate.config.
2.2.3 The Hard Way (2): Subversion
The GATE code is maintained in a Subversion repository. You can use a Subversion
client to check out the source code – the most up-to-date version of GATE is the trunk:
svn checkout https://gate.svn.sourceforge.net/svnroot/gate/gate/trunk gate
Once you have checked out the code you can build GATE using Ant (see Section 2.5).
You can browse the complete Subversion repository online at http://gate.svn.sourceforge.net/.
2.3 Using System Properties with GATE
During initialisation, GATE reads several Java system properties in order to decide where to find its configuration files.
Here is a list of the properties used, their default values and their meanings:
- gate.home
- sets the location of the GATE install directory. This should point to the top level directory of your GATE installation. This is the only property that is required. If this is not set, the system will display an error message and then attempt to guess the correct value.
- gate.plugins.home
- points to the location of the directory containing installed plugins (a.k.a. CREOLE directories). If this is not set then the default value of {gate.home}/plugins is used.
- gate.site.config
- points to the location of the configuration file containing the site-wide options. If not set this will default to {gate.home}/gate.xml. The site configuration file must exist!
- gate.user.config
- points to the file containing the user’s options. If not specified, or if the specified file does not exist at startup time, the default value of gate.xml (.gate.xml on Unix platforms) in the user’s home directory is used.
- gate.user.session
- points to the file containing the user’s saved session. If not specified, the default value of gate.session (.gate.session on Unix) in the user’s home directory is used. When starting up GATE Developer, the session is reloaded from this file if it exists, and when exiting GATE Developer the session is saved to this file (unless the user has disabled ‘save session on exit’ in the configuration dialog). The session is not used when using GATE Embedded.
- load.plugin.path
- is a path-like structure, i.e. a list of URLs separated by ‘;’. All directories listed here will be loaded as CREOLE plugins during initialisation. This has similar functionality to the -d command line option.
- gate.builtin.creole.dir
- is a URL pointing to the location of GATE’s built-in CREOLE directory. This is the location of the creole.xml file that defines the fundamental GATE resource types, such as documents, document format handlers, controllers and the basic visual resources that make up GATE. The default points to a location inside gate.jar and should not generally need to be overridden.
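The default locations in the list above can be summarised in a short sketch. This is illustrative Java only, not GATE's own initialisation code: the class name GatePropertyDefaults, the resolve method and the example paths are all hypothetical. It also ignores the case where a specified user config file does not exist at startup, and it uses ‘/’ as the separator for readability.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: models the documented defaults, not GATE's real code.
class GatePropertyDefaults {

    /**
     * Returns the effective location properties, filling in the documented
     * defaults for any property that was not explicitly set.
     */
    static Map<String, String> resolve(Map<String, String> explicitlySet,
                                       String userHome, boolean unix) {
        String gateHome = explicitlySet.get("gate.home");
        if (gateHome == null) {
            // gate.home is the only required property in the list above.
            throw new IllegalArgumentException("gate.home must be set");
        }
        Map<String, String> effective = new HashMap<>(explicitlySet);
        effective.putIfAbsent("gate.plugins.home", gateHome + "/plugins");
        effective.putIfAbsent("gate.site.config", gateHome + "/gate.xml");
        // On Unix the per-user files are dot-files in the home directory.
        String dot = unix ? "." : "";
        effective.putIfAbsent("gate.user.config", userHome + "/" + dot + "gate.xml");
        effective.putIfAbsent("gate.user.session", userHome + "/" + dot + "gate.session");
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> set = new HashMap<>();
        set.put("gate.home", "/opt/gate"); // example path, an assumption
        Map<String, String> effective = resolve(set, "/home/alice", true);
        for (Map.Entry<String, String> e : effective.entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Any property that is set explicitly is left untouched, which mirrors the "if not set, the default is used" wording of each entry.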
When using GATE Embedded, you can set the values for these properties before you call Gate.init(). Alternatively, you can set the values programmatically using the static methods setGateHome(), setPluginsHome(), setSiteConfigFile(), etc. before calling Gate.init(). See the Javadoc documentation for details. If you want to set these values from the command line you can use the following syntax for setting gate.home for example:
java -Dgate.home=/my/new/gate/home/directory -cp... gate.Main
When running GATE Developer, you can set the properties by creating a file build.properties in the top level GATE directory. In this file, any system properties which are prefixed with ‘run.’ will be passed to GATE. For example, to set an alternative user config file, put the following line in build.properties (see footnote 1):
run.gate.user.config=${user.home}/alternative-gate.xml
This facility is not limited to the GATE-specific properties listed above. For example, the following line changes the default temporary directory for GATE (note the use of forward slashes, even on Windows platforms):
run.java.io.tmpdir=d:/bigtmp
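The effect of the ‘run.’ prefix can be pictured with a small Java sketch. It is only a model of the behaviour described above, not the mechanism GATE's Ant build actually uses: the class name RunPrefixedProperties and the method forGate are hypothetical, and Ant-style expansion of references such as ${user.home} is not modelled.

```java
import java.io.StringReader;
import java.util.Properties;

// Hypothetical model of "properties prefixed with 'run.' are passed to GATE".
class RunPrefixedProperties {

    /** Returns the 'run.'-prefixed entries with the prefix stripped. */
    static Properties forGate(Properties buildProperties) {
        Properties passed = new Properties();
        for (String name : buildProperties.stringPropertyNames()) {
            if (name.startsWith("run.") && name.length() > "run.".length()) {
                passed.setProperty(name.substring("run.".length()),
                                   buildProperties.getProperty(name));
            }
        }
        return passed;
    }

    public static void main(String[] args) throws Exception {
        // Example build.properties content; the user config path is an assumption.
        String text = "run.gate.user.config=/home/alice/alternative-gate.xml\n"
                    + "run.java.io.tmpdir=d:/bigtmp\n"
                    + "runtime.max.memory=1048m\n";
        Properties build = new Properties();
        build.load(new StringReader(text));
        // runtime.max.memory is not passed on: it lacks the literal 'run.' prefix.
        forGate(build).list(System.out);
    }
}
```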
2.4 Configuring GATE
When GATE Developer is started, or when Gate.init() is called from GATE Embedded, GATE loads various sorts of configuration data stored as XML in files generally called something like gate.xml or .gate.xml. This data holds information such as:
- whether to save settings on exit;
- whether to save session on exit;
- what fonts GATE Developer should use;
- plugins to load at start;
- colours of the annotations;
- locations of files for the file chooser;
- and many other GUI-related options.
This type of data is stored at two levels (in order from general to specific):
- the site-wide level, which by default is located in the gate.xml file in the top level directory of the GATE installation (i.e. the GATE home). This location can be overridden by the Java system property gate.site.config;
- the user level, which lives in the user’s HOME directory on UNIX or their profile directory on Windows (note that parts of this file are overwritten when saving user settings). The default location for this file can be overridden by the Java system property gate.user.config.
Where configuration data appears on several different levels, the more specific ones overwrite the more general. This means that you can set defaults for all GATE users on your system, for example, and allow individual users to override those defaults without interfering with others.
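The override rule can be sketched as a two-level merge in which user-level values replace site-level ones. This is a minimal illustration in Java, not GATE's configuration code: the class name LayeredConfig and the option names are hypothetical, and real gate.xml files are XML rather than flat key/value maps.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical model of "more specific levels overwrite more general ones".
class LayeredConfig {

    /** Starts from the site-wide options and lets user options override them. */
    static Map<String, String> merge(Map<String, String> site,
                                     Map<String, String> user) {
        Map<String, String> effective = new LinkedHashMap<>(site);
        effective.putAll(user); // user-level values win on conflict
        return effective;
    }

    public static void main(String[] args) {
        Map<String, String> site = new LinkedHashMap<>();
        site.put("lookAndFeel", "Metal");      // hypothetical option names
        site.put("saveSessionOnExit", "true");

        Map<String, String> user = new LinkedHashMap<>();
        user.put("lookAndFeel", "Nimbus");     // this user overrides one default

        System.out.println(merge(site, user));
    }
}
```

The site map is copied rather than modified, so one user's overrides do not affect the defaults seen by others.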
Configuration data can be set from the GATE Developer GUI via the ‘Options’ menu then ‘Configuration’. The user can change the appearance of the GUI in the ‘Appearance’ tab, which includes the options of font and the ‘look and feel’. The ‘Advanced’ tab enables the user to include annotation features when saving the document and preserving its format, to save the selected Options automatically on exit, and to save the session automatically on exit. The ‘Input Methods’ submenu from the ‘Options’ menu enables the user to change the default language for input. These options are all stored in the user’s .gate.xml file.
When using GATE Embedded, you can also set the site config location using Gate.setSiteConfigFile(File) prior to calling Gate.init().
2.5 Building GATE
Note that you don’t need to build GATE unless you’re doing development on the system itself.
Prerequisites:
- A conforming Java environment as above.
- A copy of the GATE sources and the build scripts – either the SRC distribution package from the nightly snapshots or a copy of the code obtained through Subversion (see Section 2.2.3).
- An appreciation of natural beauty.
GATE now includes a copy of the ANT build tool which can be accessed through the scripts included in the bin directory (use ant.bat for Windows 98 or ME, ant.cmd for Windows NT, 2000 or XP, and ant.sh for Unix platforms).
To build gate, cd to gate and:
- Type:
  bin/ant
- [optional] To test the system:
  bin/ant test
- [optional] To make the Javadoc documentation:
  bin/ant doc
- You can also run GATE Developer using Ant, by typing:
  bin/ant run
- To see a full list of options type:
  bin/ant help
(The details of the build process are all specified by the build.xml file in the gate directory.)
You can also use a development environment like Borland JBuilder (click on the gate.jpx file), but note that it’s still advisable to use ant to generate documentation, the jar file and so on. Also note that the run configurations have the location of a gate.xml site configuration file hard-coded into them, so you may need to change these for your site.
2.5.1 Using GATE with Maven
This section is based on contributions by Marin Nozhchev (Ontotext) and Benson Margulies (Basis Technology Corp).
Artifacts for GATE are available from Ontotext’s public Maven repository at http://maven.ontotext.com/archiva/repository/public. To use them in your own Maven-based project you need to include the repository definition in your POM’s <repositories> section:
<repository>
  <id>gate-ontototext</id>
  <url>http://maven.ontotext.com/archiva/repository/public</url>
  <snapshots>
    <!-- set this to true to use snapshot builds of GATE -->
    <enabled>false</enabled>
  </snapshots>
  <releases>
    <enabled>true</enabled>
    <checksumPolicy>fail</checksumPolicy>
  </releases>
</repository>
There is a choice of two different top-level artifacts you could use as dependencies, both in the group gate:
- gate-core
- just gate.jar and its minimal dependencies, sufficient to initialize GATE, load and save documents and corpora as XML, etc. The POM lists many other dependencies which are marked as optional, so you can pick and choose which parts of GATE you wish to depend on.
- gate
- gate.jar plus all the dependencies that are typically found in GATE’s lib directory in a standard release download.
<dependency>
  <groupId>gate</groupId>
  <artifactId>gate</artifactId>
  <version>5.2.1</version>
  <!-- or 6.0-SNAPSHOT, etc. -->
</dependency>
In both cases this will include the GATE libraries only (and relevant dependencies) — you must separately obtain the appropriate versions of any GATE plugins your application requires (including ANNIE), typically by downloading the standard GATE release or by fetching them from the relevant place in subversion. For example if you are using GATE 5.2.1 via a Maven dependency then you can obtain the correct ANNIE plugin from https://gate.svn.sourceforge.net/svnroot/gate/gate/tags/release-5.2.1/.
2.6 Uninstalling GATE
If you have used the installer, run the uninstaller (found in the Uninstaller directory of your installation), or just delete the whole of the installation directory (the one containing bin, lib, Uninstaller, etc.). The installer doesn’t install anything outside this directory, but for completeness you might also want to delete the settings files GATE creates in your home directory (.gate.xml and .gate.session).
2.7 Troubleshooting
2.7.1 I don’t see the Java console messages under Windows
Note that the gate.bat script uses javaw.exe to run GATE which means that you will see no console for the java process. If you have problems starting GATE and you would like to be able to see the console to check for messages then you should edit the gate.bat script and replace javaw.exe with java.exe in the definition of the JAVA environment variable.
2.7.2 When I execute GATE, nothing happens
You might get some clues if you start GATE from the command line (for example, with ‘bin/ant run’ as described in Section 2.2.2), which will allow you to see all error messages GATE generates.
2.7.3 On Ubuntu, GATE is very slow or doesn’t start
GATE and many other Java applications are known not to work with GCJ, the open-source Java SDK, or with other non-Sun Java SDKs.
Make sure you have the official version of Java installed. Provided by Sun, the package is named ‘sun-java6-jdk’ in Synaptic. GATE also works with Java version 5, so ‘sun-java5-jdk’ is an alternative.
To install it, run in a terminal:
Make sure that your default Java version is the one from Sun. You can do this by running:
This will list the installed Java VMs. You should see ‘java-6-sun’ as one of the options.
Then you should run:
to set the ‘java-6-sun’ as your default.
Finally, try GATE again.
2.7.4 How to use GATE on a 64 bit system?
32-bit vs. 64-bit is a matter of the JVM rather than the build of GATE.
For example, on Mac OS X, either use Applications/Utilities/Java Preferences and put one of the 64-bit options at the top of the list, or run GATE from the terminal using Java 1.6.0 (which is 64-bit only on Mac OS):
bin/ant run
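To check which kind of JVM is actually being used, a small Java program can report what the running JVM says about itself. This is a hedged sketch: the class name JvmBitness is hypothetical, os.arch is a standard system property, and sun.arch.data.model is specific to Sun-derived JVMs and may be absent elsewhere.

```java
// Hypothetical helper that reports the data model of the running JVM.
class JvmBitness {

    /** Describes the JVM using standard and Sun-specific system properties. */
    static String describe() {
        // Sun-specific property: typically "32" or "64"; null on other JVMs.
        String model = System.getProperty("sun.arch.data.model");
        String arch = System.getProperty("os.arch");
        String bits = (model != null) ? model + "-bit" : "unknown data model";
        return bits + " JVM on " + arch;
    }

    public static void main(String[] args) {
        System.out.println(describe());
        System.out.println("java.version = " + System.getProperty("java.version"));
    }
}
```

Run it with the same java executable that JAVA_HOME points to, since that is the one the Ant scripts use.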
2.7.5 I got the error: Could not reserve enough space for object heap
GATE doesn’t use the JAVA_OPTS variable. The default memory allocations are defined in the gate/build.xml file but you can override them by creating a file called build.properties in the same directory containing
runtime.max.memory=1048m
If you don’t use Ant to start GATE but run your own application directly with the ‘java’ executable, then you must set the maximum heap size yourself using the JVM’s -Xmx option.
2.7.6 From Eclipse, I got the error: java.lang.OutOfMemoryError: Java heap space
Configuring the -Xms and -Xmx parameters in the eclipse.ini file just adds memory to your Eclipse process. If you start a Java application from within Eclipse, it will run in a different process.
To give more memory to your application, as opposed to just to Eclipse, you need to add those values in the ‘VM Arguments’ section of the run application dialog: lower pane, in the second tab of ‘Run Configurations’ dialog.
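One way to confirm that the VM arguments reached your application, rather than Eclipse itself, is to print the maximum heap the JVM reports at startup. A minimal sketch: the class name HeapCheck is hypothetical, while Runtime.maxMemory() is the standard Java call.

```java
// Hypothetical helper that reports the heap limit of the running JVM.
class HeapCheck {

    /** Returns the maximum heap the JVM will attempt to use, in megabytes. */
    static long maxHeapMegabytes() {
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        // The reported figure is usually slightly below the -Xmx value,
        // because the JVM reserves part of the heap for its own use.
        System.out.println("Max heap for this JVM: " + maxHeapMegabytes() + " MB");
    }
}
```

If the figure does not change after editing the run configuration, the arguments were probably added to the wrong place.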
2.7.7 On MacOS, I got the error: java.lang.OutOfMemoryError: Java heap space
You can try setting the environment variable ANT_OPTS to allow for more memory (ANT_OPTS holds the JVM options that Ant runs with, so a larger -Xmx value goes there).
Another cause can be compiling with Java 6 on Mac. It builds OK using Java 5, and once built it runs fine with Java 6. Adding
memoryMaximumSize="500M"
or similar to the <javac> in build.xml might fix it but if you’re making changes that are to be committed to subversion you really ought to be building with Java 5 anyway :-)
2.7.8 I got the error: log4j:WARN No appenders could be found for logger...
You need to copy the ‘gate/bin/log4j.properties’ file to the directory from which you execute your project.
2.7.9 Text is incorrectly refreshed after scrolling and becomes unreadable
- Change the look and feel used in GATE via the ‘Options’ menu, then ‘Configuration’. Restart GATE and try again. We use mainly ‘Metal’ and ‘Nimbus’ without problem.
- Change the video driver you use.
- Update Java.
2.7.10 An error occurred when running the TreeTagger plugin
The TreeTagger plugin isn’t supported anymore. However, the TaggerFramework plugin provides support for TreeTagger. Try using that plugin instead. See section 17.4.
2.7.11 I got the error: HighlightData cannot be cast to ...HighlightInfo
This is a recurring problem when editing a document while annotation highlights are showing. It usually involves inserting or deleting text close to, or belonging to, an annotation.
The current workarounds are to hide the annotation highlights before editing the text, to use the document’s read-only mode so that you can only edit annotations, or to hide the document and show it again after the error.
Footnote 1: In this specific case, the alternative config file must already exist when GATE starts up, so you should copy your standard gate.xml file to the new location.