GATE Visual Tutorials
New Video Tutorials Available Here!
We recommend the GATE Developer introduction for GATE versions 5.1 and above, or the demos page, or the documentation summary page. This page contains older videos on more advanced topics, often recorded using older GATE versions.
Contents
- 1. GATE Developer in Action
- 1.1. Videos for Version 5.1 (November 2009)
- 1.2. Screencams for pre-5.0 versions
- 1.2.1. Loading documents
- 1.2.2. Corpora
- 1.2.3. Processing Resources
- 1.2.4. ANNIE
- 1.2.5. Annotations
- 1.2.6. Save/restore of application
- 1.2.7. Data stores
- 1.2.8. ANNIC
- 1.2.9. Ontologies
- 2. GATE demos
1. GATE Developer in Action
1.1. Videos for Version 5.1 (November 2009)
A new, comprehensive GATE Developer introduction from Matrixware is available for GATE versions 5.1 and above.
1.2. Screencams for pre-5.0 versions
The best place to start is this movie tutorial.
Follow the links below for shorter movies:
- Creating documents
- Creating and populating corpora
- Loading Processing Resources (PRs) and running a pipeline over a document
- Loading and running ANNIE and creating a corpus pipeline
- Inspecting the Processing Results
- Saving and restoring applications
- Creating a data store
- Loading a corpus and documents from a data store
- ANNotation In Context
- Jena Ontology Tool
1.2.1. Loading documents
To create a document, follow the steps below or watch the movie:
- Right-click on 'Language Resources' and choose 'New', then 'GATE Document'.
- In the dialog box choose the file you want to open in GATE or type a URL.
- Change 'markupAware' to false if you do not want GATE to analyse the document format.
- Provide a document name or leave blank to use an automatically generated name.
- Click OK.
- The document will appear under the list of Language Resources loaded in the system.
- To view its content, double click on its name.
GATE supports a variety of formats: HTML, XML, SGML, plain text, email, etc.
By default, existing markup is converted automatically into GATE annotations and placed in the 'Original markups' annotation set.
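As a rough illustration of what converting markup into annotations means, the sketch below strips the tags from a small HTML string and records each element as a (type, start, end) tuple over the remaining plain text. It is not GATE's implementation or API: the class `MarkupToAnnotations`, the function `load_document` and the `markup_aware` parameter are hypothetical names chosen to mirror the 'markupAware' option, and only the Python standard library is used.

```python
from html.parser import HTMLParser


class MarkupToAnnotations(HTMLParser):
    """Strip tags from markup and record each element as an offset-based annotation."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.text_parts = []   # plain-text fragments seen so far
        self.length = 0        # current length of the plain text
        self.open_tags = []    # stack of (tag, start_offset) for unclosed elements
        self.annotations = []  # (type, start, end) tuples: a toy "Original markups" set

    def handle_starttag(self, tag, attrs):
        # Remember where in the plain text this element begins.
        self.open_tags.append((tag, self.length))

    def handle_data(self, data):
        self.text_parts.append(data)
        self.length += len(data)

    def handle_endtag(self, tag):
        # Close the most recent open element with the same name, if any.
        for i in range(len(self.open_tags) - 1, -1, -1):
            if self.open_tags[i][0] == tag:
                _, start = self.open_tags.pop(i)
                self.annotations.append((tag, start, self.length))
                break


def load_document(markup, markup_aware=True):
    """Return (text, annotations); with markup_aware=False the input is kept verbatim."""
    if not markup_aware:
        return markup, []
    parser = MarkupToAnnotations()
    parser.feed(markup)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.text_parts), parser.annotations


text, original_markups = load_document("<p>Hello <b>GATE</b></p>")
```

Each tuple points back into the tag-free text by character offsets, so the `b` element above covers exactly the word "GATE".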
1.2.2. Corpora
To learn how to create a corpus and populate it with documents, follow the steps below or watch the movie:
- Choose Language Resources/New/GATE corpus.
- (Optional) Provide the corpus name.
- (Optional) Choose which of the currently loaded documents to add to the corpus when it is created (click on the list editing button).
Populate a corpus from a directory:
- Right-click on a corpus and choose "Populate".
- Choose the directory where your files are located.
- (Optional) Specify the file extension, so only files with this extension are loaded.
- (Optional) Specify the encoding to be used when the documents are loaded in GATE. If left blank (default), then the default platform encoding will be used (for more information see the User's Guide).
To view a corpus, double-click on its name. To view a document listed in the corpus view, double-click on its name. From the corpus view you can add or remove documents using the buttons provided.
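The populate options above (an optional extension filter and an optional encoding that falls back to the platform default) can be sketched in a few lines. This is only an illustration of the behaviour described, not GATE code: `populate_corpus` and its parameters are hypothetical, and the "corpus" here is just a dictionary mapping file names to text.

```python
import locale
from pathlib import Path


def populate_corpus(directory, extension=None, encoding=None):
    """Return {file name: text} for the files found directly in a directory."""
    # A blank encoding falls back to the platform default, as in the steps above.
    effective_encoding = encoding or locale.getpreferredencoding(False)
    wanted_suffix = "." + extension.lower().lstrip(".") if extension else None

    corpus = {}
    for path in sorted(Path(directory).iterdir()):
        if not path.is_file():
            continue  # this sketch does not descend into subdirectories
        if wanted_suffix and path.suffix.lower() != wanted_suffix:
            continue  # skip files with a different extension
        corpus[path.name] = path.read_text(encoding=effective_encoding)
    return corpus
```

Specifying the encoding explicitly, for example `populate_corpus("my_docs", extension="txt", encoding="utf-8")` for a hypothetical `my_docs` directory, avoids surprises when the files were written on a machine with a different default.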
1.2.3. Processing Resources
To load a processing resource, watch the movie or follow these steps:
- Right-click on 'Processing Resources' and choose 'New', then choose the desired resource from the list (e.g., ANNIE English Tokeniser).
- In the dialog box either provide a name for the resource or leave blank to use an automatically generated one.
- Click OK.
- Load as many resources as you want by repeating the first three steps.
To unload a resource, right-click on the resource's name and choose "Close". If you wish to unload more than one resource, select them all, right-click and choose "Close all".
To create a pipeline:
- Choose Application/New/Pipeline.
- (Optional) Provide a name for your application.
- From the list of available PRs, add those you wish to the list of "Selected processing resources" using the right arrow button. These will be the modules that comprise your application.
- Double-click on each resource and select the document which you wish to process from the list of available documents.
- Click Run.
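Conceptually, a pipeline is an ordered list of processing resources applied one after another to the same document, so later resources can rely on what earlier ones produced. The sketch below models that idea with plain Python functions. It does not use GATE's API: `tokenise`, `count_tokens` and `run_pipeline` are hypothetical stand-ins, and the document is a simple dictionary.

```python
def tokenise(doc):
    """Toy tokeniser: add one Token annotation per whitespace-separated word."""
    offset = 0
    for word in doc["text"].split():
        start = doc["text"].index(word, offset)
        end = start + len(word)
        doc["annotations"].append(("Token", start, end))
        offset = end
    return doc


def count_tokens(doc):
    """Toy downstream step: relies on the Token annotations created earlier."""
    doc["features"]["tokenCount"] = sum(
        1 for annotation in doc["annotations"] if annotation[0] == "Token"
    )
    return doc


def run_pipeline(processing_resources, doc):
    """Apply each processing resource to the document, in list order."""
    for resource in processing_resources:
        doc = resource(doc)
    return doc


document = {"text": "GATE runs pipelines", "annotations": [], "features": {}}
run_pipeline([tokenise, count_tokens], document)
print(document["features"]["tokenCount"])  # 3
```

Swapping the two steps would give a count of 0, because the counting step would run before any Token annotations exist. This is why the order of the selected resources matters.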
1.2.4. ANNIE
ANNIE is a ready-made information extraction system for English. Watch the movie to learn how to use ANNIE or how to create another application and run it over a corpus, or follow the steps below:
- From GATE's File menu select "Load ANNIE"/"With defaults".
- Wait until all processing resources are loaded and the application is created. This process cannot be interrupted or cancelled.
- Double-click on the ANNIE application.
- Select the corpus that you wish to process from the list of available corpora.
- Click Run.
To create another application with loaded resources:
- Load all processing resources you wish to include in your application, e.g., a tokeniser, sentence splitter and a part-of-speech tagger.
- Choose Application/New/Corpus Pipeline.
- (Optional) Provide a name for your application.
- From the list of available PRs, add those you wish to the list of "Selected processing resources" using the right arrow button. These will be the modules that comprise your application.
- Select the corpus that you wish to process from the list of available corpora.
- Click Run.
The processing results can be inspected on a per-document basis: double-click on each document to view them.
1.2.5. Annotations
The processing resources produce their results in the form of annotations associated with the text. For example, ANNIE produces annotations for different types of named entities, such as persons, organisations, dates, money and percentages. It also has a module that detects orthographic coreference between names, e.g. all mentions of UK in the text are identified as referring to an entity of type Location, called UK.
Watch the movie or follow the steps below to view the processing results, i.e., the annotations on each document:
- Double-click on the document's name to display the document editor.
- Press the "Annotation sets" button.
- From the list of annotation types, select the ones you'd like to see, e.g. Person.
- The viewer will colour all parts of the text that were annotated with this type. Note that annotations of more than one type could be associated with the same string.
- (Optional) To view more details about each of these annotations, press the "Annotations" button.
- (Optional) To view coreferring strings, press the Coreference button (it will not be there unless the ANNIE Orthomatcher has been run).
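The idea behind the annotation viewer can be sketched as a list of typed, offset-based annotations that is filtered by the types you tick. The example below is illustrative only: the `Annotation` class and both helper functions are hypothetical, and the annotations are written by hand rather than produced by ANNIE.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    type: str   # e.g. "Person" or "Location"
    start: int  # offset of the first character covered
    end: int    # offset just past the last character covered


def select(annotations, wanted_types):
    """Keep only the annotations whose type was selected in the list."""
    return [a for a in annotations if a.type in wanted_types]


def types_covering(annotations, start, end):
    """Return the set of types attached to exactly this span of text."""
    return {a.type for a in annotations if (a.start, a.end) == (start, end)}


text = "Washington visited the UK."
annotations = [
    Annotation("Person", 0, 10),
    Annotation("Token", 0, 10),
    Annotation("Location", 23, 25),
    Annotation("Token", 23, 25),
]

# Highlight only the Person annotations, as when ticking "Person" in the list.
for a in select(annotations, {"Person"}):
    print(a.type, repr(text[a.start:a.end]))

# The same string can carry annotations of more than one type.
print(sorted(types_covering(annotations, 23, 25)))
```

Because annotations refer to the text by offsets rather than modifying it, any number of types can overlap on the same string without interfering with each other.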
1.2.6. Save/restore of application
Applications need to be created only once. They can then be saved and reloaded the next time GATE is started.
To save an application, watch the movie or follow these steps:
- Load the PRs you wish to have in your application, then create it. Do not set its document/corpus, so that they do not get loaded automatically each time you restore the application (unless this is what you need).
- Right-click on your application (e.g. ANNIE_001) and choose "Save application state".
- Provide a file name for your application.
To restore an application from a file, watch the movie or follow these steps:
- Right-click on "Applications", then select "Restore application from file" or click on the File menu and choose "Restore application from file".
- Select the file where you have saved your application.
- Wait until it gets restored. This will load all its processing resources and the application itself, which specifies their execution order.
1.2.7. Data stores
Learn how to create a data store and save a corpus and its documents in it: watch the movie.
To learn how to load a corpus and documents from a data store, watch the load data store movie.
1.2.8. ANNIC
Watch the ANNIC movie. Update for ANNIC 2008.
1.2.9. Ontologies
NOTE: This is VERY out of date!
Load an ontology, edit its classes and properties, and create new instances with this GATE plug-in. Learn how.
2. GATE demos
ANNIE entity recognition web service demo.
The CLIE movie shows how to:
- load CLIE (Controlled Language Information Extraction),
- load a sample document in the controlled language,
- run the CLIE application to generate an ontology from the text,
- add more text and update the ontology,
- identify and correct errors in the text, and
- save and reload ontologies in various formats (.RDF, .XML, .OWL).
The OBIE movie shows how to:
- load an unannotated document and automatically annotate it with OBIE (Ontology-Based Information Extraction),
- manually add, delete and change annotations using OCAT (Ontology Corpus Annotation Tool), which displays an ontology tree of annotations, and
- send the document back to the trainer to improve the model incrementally.
This movie demonstrates that as the user keeps making corrections and sending them back to the trainer, the quality of the automatic annotation improves.
The PrestoSpace demo demonstrates using GATE for annotation and Information Extraction (IE) over multimedia content. Watch demo (turn the sound on!).
The CLOnE QL demo, developed as part of the TAO project, demonstrates a natural language processing tool for enhanced knowledge access, i.e. querying the knowledge store using human-like language.