GATE Visual Tutorials
In a hurry? Start here.
See also GATE project and conference talks.
1. GATE Developer in Action
1.1. Videos for Version 5.1 (November 2009)
A new comprehensive GATE Developer introduction
from Matrixware for GATE versions 5.1 and above.
1.2. Screencams for pre-5.0 versions
The best place to start is
this movie tutorial.
Follow the links below for shorter movies:
1.2.1. Loading documents
To create a document follow the steps below, or watch the
movie:
- Right-click on 'Language Resources' and choose 'New', then 'GATE Document'.
- In the dialog box choose the file you want to open in GATE or type a URL.
- Change 'markupAware' to false, if you do not want GATE to analyse the
document format.
- Provide a document name or leave blank to use an automatically generated
name.
- Click OK.
- The document will appear under the list of Language Resources loaded in the
system.
- To view its content, double click on its name.
GATE supports a variety of formats: HTML, XML, SGML, plain text, email, etc.
By default, existing markup is converted automatically into GATE annotations
and put in the "Original markups" annotation set.
1.2.2. Corpora
To learn how to create a corpus and populate it with documents, follow the
steps below, or watch the movie:
- Choose Language Resources/New/GATE corpus.
- (Optional) Provide the corpus name.
- (Optional) Choose which of the currently loaded documents to add to the
corpus when it is created (click on the list editing button).
Populate a corpus from a directory:
- Right-click on a corpus and choose "Populate".
- Choose the directory where your files are located.
- (Optional) Specify the file extension, so only files with this extension are
loaded.
- (Optional) Specify the encoding to be used when the documents are loaded
into GATE. If left blank (the default), the platform's default encoding is
used (for more information see the User's Guide).
To view a corpus, double-click on its name. To view a document listed in the
corpus view, double-click on its name. From the corpus view you can also add
or remove documents using the buttons provided.
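The "Populate" options above (directory, optional file extension, optional encoding) can be illustrated outside GATE with a small standalone sketch. This is not GATE's API: GATE itself is a Java application, and the function name `populate_corpus` and the use of a plain dictionary as the "corpus" are assumptions made purely for illustration.

```python
from pathlib import Path


def populate_corpus(directory, extension=None, encoding=None):
    """Read files from `directory` into a dict mapping file name to text.

    extension: if given (e.g. "txt"), only files with that extension are loaded.
    encoding:  if None, Python's default text encoding is used, which is
               normally the platform default (mirroring the blank option).
    """
    wanted = None if extension is None else "." + extension.lower().lstrip(".")
    corpus = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # skip sub-directories; this sketch does not recurse
        if wanted is not None and path.suffix.lower() != wanted:
            continue  # extension filter, as in the optional step above
        corpus[path.name] = path.read_text(encoding=encoding)
    return corpus
```

Calling `populate_corpus("my_docs", extension="txt", encoding="utf-8")` on a hypothetical `my_docs` directory would load only its `.txt` files, decoded as UTF-8.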
1.2.3. Processing Resources
Learn how to load a processing resource by watching the movie, or follow these steps:
- Right-click on 'Processing Resources' and choose 'New', then choose the
desired resource from the list (e.g., ANNIE English Tokeniser).
- In the dialog box either provide a name for the resource or leave blank to
use an automatically generated one.
- Click OK.
- Load as many resources as you want by repeating the three steps above.
To unload a resource, right-click on the resource's name and choose "Close".
If you wish to unload more than one resource, select them all, right-click and
choose "Close all".
To create a pipeline:
- Choose Application/New/Pipeline.
- (Optional) Provide a name for your application.
- From the list of available PRs, add those you wish to the list of "Selected
processing resources" using the right arrow button. These will be the
modules that comprise your application.
- Double-click on each resource and select the document which you wish to
process from the list of available documents.
- Click Run.
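The idea behind a pipeline, an ordered list of processing resources applied one after another to a document, can be sketched in a few lines. The sketch below is a conceptual model only, not GATE code: the `Document` class, the two toy resources and `run_pipeline` are all hypothetical names invented for this example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Document:
    text: str
    # Each annotation is a (start, end, type) tuple over character offsets.
    annotations: list = field(default_factory=list)


def tokeniser(doc):
    """Add one Token annotation per run of non-whitespace characters."""
    for match in re.finditer(r"\S+", doc.text):
        doc.annotations.append((match.start(), match.end(), "Token"))


def capitalised_tagger(doc):
    """Add a Capitalised annotation for each Token that starts in upper case.

    It reads Token annotations, so it must run after the tokeniser.
    """
    for start, end, kind in list(doc.annotations):
        if kind == "Token" and doc.text[start].isupper():
            doc.annotations.append((start, end, "Capitalised"))


def run_pipeline(resources, doc):
    """Run each selected resource over the document, in list order."""
    for resource in resources:
        resource(doc)
    return doc
```

Running `run_pipeline([tokeniser, capitalised_tagger], Document("GATE runs in Sheffield"))` adds both kinds of annotation; with the two resources in the opposite order the tagger finds no tokens to work on, which is why the order of the selected resources matters.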
1.2.4. ANNIE
ANNIE is a ready-made information extraction system for English. Watch the
movie to learn how to use ANNIE, or how to create another application and run
it over a corpus. Alternatively, follow the steps below:
- From GATE's File menu select "Load ANNIE"/"With defaults".
- Wait until all processing resources are loaded and the application is
created. This process cannot be interrupted or cancelled.
- Double-click on the ANNIE application.
- Select the corpus that you wish to process from the list of available
corpora.
- Click Run.
To create another application with loaded resources:
- Load all processing resources you wish to include in your application, e.g.,
a tokeniser, sentence splitter and a part-of-speech tagger.
- Choose Application/New/Corpus Pipeline.
- (Optional) Provide a name for your application.
- From the list of available PRs, add those you wish to the list of "Selected
processing resources" using the right arrow button. These will be the
modules that comprise your application.
- Select the corpus that you wish to process from the list of available
corpora.
- Click Run.
The processing results can be inspected on a per-document basis: just
double-click on each document.
1.2.5. Annotations
The processing resources produce their results in the form of annotations
associated with the text. For example, ANNIE produces annotations for
different types of named entities, such as persons, organisations, dates,
money and percentages. It also has a module which detects orthographic
coreference between names, e.g. all mentions of UK in the text are identified
as referring to an entity of type Location, called UK.
Watch the movie or follow the steps below
to view the processing results, i.e., the annotations on each document:
- Double-click on the document's name to display the document editor.
- Press the "Annotation sets" button.
- From the list of annotation types, select the ones you'd like to see, e.g.
Person.
- The viewer will colour all parts of the text that were annotated with this
type. Note that annotations of more than one type can be associated with
the same string.
- (Optional) To view more details about each of these annotations, press the
"Annotations" button.
- (Optional) To view coreferring strings, press the Coreference button (it
will not be there unless the ANNIE Orthomatcher has been run).
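To make the annotation model above more concrete, here is a minimal standalone sketch of typed annotations stored over character offsets and grouped into named sets. It is not GATE's data model or API: the classes `Annotation` and `AnnotatedDocument`, and the set name "Default", are hypothetical and chosen only to mirror the viewer steps (pick a set, pick a type, see the matching text).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Annotation:
    start: int  # character offset where the annotation begins
    end: int    # character offset just past its last character
    type: str   # e.g. "Person" or "Location"


class AnnotatedDocument:
    def __init__(self, text):
        self.text = text
        self.sets = defaultdict(list)  # set name -> list of annotations

    def annotate(self, start, end, type_, set_name="Default"):
        self.sets[set_name].append(Annotation(start, end, type_))

    def types(self, set_name="Default"):
        """List the annotation types present in a set, sorted by name."""
        return sorted({a.type for a in self.sets[set_name]})

    def covered_text(self, type_, set_name="Default"):
        """Return the strings covered by annotations of the given type."""
        return [self.text[a.start:a.end]
                for a in self.sets[set_name] if a.type == type_]
```

Because annotations only point at offsets, the same string can carry several of them: in "John Smith visited the UK." the span 23 to 25 could hold both a Token and a Location annotation without either affecting the other.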
1.2.6. Save/restore of application
An application needs to be created only once; it can then be saved and
reloaded the next time GATE is started.
To learn how to save an application, watch the movie or follow these steps:
- Load the PRs you wish to have in your application, then create it. Do not
set its document/corpus, so that they do not get loaded automatically each
time you restore the application (unless this is what you need).
- Right-click on your application (e.g. ANNIE_001) and choose "Save
application state".
- Provide a file name for your application.
To learn how to restore an application from a file, watch the
movie or follow these steps:
- Right-click on "Applications", then select "Restore application from file"
or click on the File menu and choose "Restore application from file".
- Select the file where you have saved your application.
- Wait until it gets restored. This will load all its processing resources and
the application itself, which specifies their execution order.
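The save and restore steps above amount to persisting which resources an application contains and in what order, while leaving documents and corpora out. The sketch below illustrates that idea only. JSON is used purely for convenience and is not GATE's file format; the function names and the shape of the saved state are assumptions.

```python
import json


def save_application(path, name, resources):
    """Write the application name and its ordered resource list to `path`.

    `resources` is a list of (resource_type, parameters) pairs. No corpus or
    document is written, so restoring does not reload any data.
    """
    state = {
        "name": name,
        "resources": [{"type": t, "params": p} for t, p in resources],
    }
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle, indent=2)


def restore_application(path):
    """Read the file back and return (name, resources) in execution order."""
    with open(path, encoding="utf-8") as handle:
        state = json.load(handle)
    resources = [(r["type"], r["params"]) for r in state["resources"]]
    return state["name"], resources
```

A list is used for the resources because the execution order is part of what must survive the round trip.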
1.2.7. Data stores
Learn how to create a data store and save a corpus with its documents in it.
Watch the movie.
Watch the movie on loading a data store.
1.2.8. ANNIC
Watch the ANNIC movie.
Update for ANNIC 2008.
1.2.9. Ontologies
NOTE: This is VERY out of date!
Load an ontology, edit its classes and properties, and create new instances
with this GATE plug-in. Learn how.
2. GATE demos
ANNIE entity recognition web service demo.
The CLIE movie shows how to:
- load CLIE (Controlled Language Information Extraction),
- load a sample document in the controlled language,
- run the CLIE application to generate an ontology from the text,
- add more text and update the ontology,
- identify and correct errors in the text, and
- save and reload ontologies in various formats (.RDF, .XML, .OWL).
The OBIE movie shows how to:
- load an unannotated document and automatically annotate it with OBIE
(Ontology-Based Information Extraction),
- manually add, delete and change annotations using OCAT (Ontology Corpus
Annotation Tool), which displays an ontology tree of annotations, and
- send the document back to the trainer to improve the model incrementally.
This movie demonstrates that as the user keeps making corrections and sending
them back to the trainer, the quality of the automatic annotation improves.
The PrestoSpace demo shows GATE being used
for annotation and Information Extraction (IE) over multimedia content.
Watch the demo (turn the sound on!).
The CLOnE QL demo, developed as part of the TAO project, demonstrates a
Natural Language Processing tool for enhanced knowledge access, i.e. querying
the knowledge store using human-like language.
3. GATE training course
Visit this page for a complete list of videos.