
This page lists some of the plugins that are currently available with GATE:

For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".

To submit a plugin, please contact us via the gate-users mailing list.

Plugins included in the GATE distribution

Compound Document  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentImpl
Compound Document From Xml  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentFromXml
Compound Document Editor  Editor for compound documents. (docs gate.compound.gui.CompoundDocumentEditor
GATE Composite document  GATE Composite document. (docs gate.composite.impl.CompositeDocumentImpl
Switch Member PR  Sets the focus of a compound document to a specified member document. (docs gate.compound.impl.SwitchMemberPR
Delete Member PR  Deletes one member document from a compound doc. (docs gate.compound.impl.DeleteMemberPR
Combine Members PR  Combines documents in a composite document. (docs gate.composite.impl.CombineMembersPR
Segment Processing PR  Processes individual segments as separate documents (docs gate.composite.impl.SegmentProcessingPR
ExportAlignmentPR  A PR to export alignment information in an xml file.  gate.alignment.ExportAlignmentPR
Annotation Schema  An annotation type and its features. (docs gate.creole.AnnotationSchema
GATE Unicode Tokeniser  A customisable Unicode tokeniser. (docs gate.creole.tokeniser.SimpleTokeniser
ANNIE English Tokeniser  A customisable English tokeniser. (docs gate.creole.tokeniser.DefaultTokeniser
ANNIE Gazetteer  A list lookup component. (docs gate.creole.gazetteer.DefaultGazetteer
Sharable Gazetteer  A list lookup component. (docs gate.creole.gazetteer.SharedDefaultGazetteer
Hash Gazetteer  A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE (docs com.ontotext.gate.gazetteer.HashGazetteer
JAPE Transducer  A module for executing Jape grammars. (docs gate.creole.Transducer
ANNIE NE Transducer  ANNIE named entity grammar. (docs gate.creole.ANNIETransducer
ANNIE Sentence Splitter  ANNIE sentence splitter. (docs gate.creole.splitter.SentenceSplitter
RegEx Sentence Splitter  A sentence splitter based on regular expressions. (docs gate.creole.splitter.RegexSentenceSplitter
ANNIE POS Tagger  Mark Hepple's Brill-style POS tagger (docs gate.creole.POSTagger
ANNIE OrthoMatcher  ANNIE orthographical coreference component. (docs gate.creole.orthomatcher.OrthoMatcher
ANNIE Pronominal Coreferencer  Pronominal Coreference resolution component. (docs gate.creole.coref.Coreferencer
ANNIE Nominal Coreferencer  Nominal Coreference resolution component (docs gate.creole.coref.NominalCoref
Document Reset PR  Remove named annotation sets or reset the default annotation set (docs gate.creole.annotdelete.AnnotationDeletePR
Jape Viewer  A JAPE grammar file viewer (docs gate.gui.jape.JapeViewer
Gazetteer Editor  Gazetteer viewer and editor. (docs gate.gui.GazetteerEditor
Annotation Merging PR  Merge Annotations from different annotators. (docs gate.merger.AnnotationMergingMain
Copy Anns to Another Doc PR  Copy the annotations from one document to another document. (docs gate.copyAS2AnoDoc.CopyAS2AnoDocMain
Legacy Coref Data Writer  A simple PR that converts co-reference data from the Relations-based model to the legacy format (based on 'matches' annotation and document features).  gate.creole.coref.LegacyCorefDataWriter
OrthoRef  An orthographic coreferencer  gate.creole.coref.OrthoRef
MediaWiki Document Format    gate.corpora.MediaWikiDocumentFormat
MediaWiki XML Document Format    gate.corpora.MediaWikiXMLDocumentFormat
GATE .cochrane.txt document format  Load this to allow the opening of Cochrane text documents, and choose the mime type "text/x-cochrane", or use the correct file extension.  gate.corpora.CochraneTextDocumentFormat
GATE .pubMed.txt document format  Load this to allow the opening of PubMed text documents, and choose the mime type "text/x-pubmed", or use the correct file extension.  gate.corpora.PubmedTextDocumentFormat
Large KB Gazetteer  KIM KB based alias-lookup component (docs com.ontotext.kim.gate.KimGazetteer
Semantic Enrichment PR  The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories. This is distributed in the same CREOLE plugin as the LKB Gazetteer. (docs com.ontotext.kim.gate.SesameEnrichment
Onto Root Gazetteer  An ontology lookup component (docs gate.clone.ql.OntoRootGaz
Fake Sentence Splitter  Used internally by Onto Root Gazetteer: it creates 'fake' annotations of type 'Sentence' without analysing the text with a proper Sentence Splitter. This lets the POS Tagger work properly, since the input text (e.g. an ontology resource name or label) is usually not a proper sentence. 'Faking' sentence splitting also optimises processing, as Onto Root Gazetteer does not usually process multi-sentence text internally.  gate.clone.ql.FakeSentenceSplitter
GENIA Sentence Splitter  A processing resource that takes document and corpus parameters (docs gate.creole.genia.splitter.GENIASentenceSplitter
Groovy support for GATE    gate.groovy.GroovySupport
Groovy scripting PR  Runs a Groovy script as a processing resource (docs gate.groovy.ScriptPR
Scriptable Controller  A controller whose execution strategy is controlled by a Groovy script (docs gate.groovy.ScriptableController
Control Script  Editor for the Groovy script controlling a scriptable controller  gate.groovy.gui.ControllerScriptEditor
Script Editor  Editor for the Groovy script behind this PR  gate.groovy.gui.ScriptPREditor
SearchPR  Provides IR functionality. (docs gate.creole.ir.SearchPR
Search Results  Viewer for IR search results  gate.gui.SearchPRViewer
IAA Computation PR  Compute inter-annotator agreement (IAA). (docs gate.iaaplugin.IaaMain
JAPE-Plus Viewer  A JAPE grammar file viewer (docs gate.gui.jape.plus.Viewer
JAPE-Plus Transducer  An optimised, JAPE-compatible transducer.  gate.jape.plus.Transducer
KEA Keyphrase Extractor  A Keyphrase Extractor by Eibe Frank. (docs gate.creole.kea.Kea
KEA Corpus Importer  Imports a KEA-style corpus into GATE  gate.creole.kea.CorpusImporter
TextCat Fingerprint Generator  Generate language fingerprints for use with the TextCat Language Identification PR (docs org.knallgrau.utils.textcat.FingerprintGenerator
TextCat Language Identification  Recognizes the document language using TextCat (docs org.knallgrau.utils.textcat.LanguageIdentifier
Arabic Gazetteer Collector    arabic.ArabicGazCollector
Arabic Gazetteer  A list lookup component. (docs arabic.ArabicGazetteer
Arabic IE System    arabic.ArabicIE
Arabic Infered Gazetteer  A list lookup component. (docs arabic.ArabicInferedGazetteer
Arabic OrthoMatcher  ANNIE orthographical coreference component. (docs arabic.ArabicOrthoMatcher
Arabic Tokeniser  A customisable English tokeniser. (docs arabic.ArabicTokeniser
Arabic Main Grammar  A module for executing Jape grammars. (docs arabic.ArabicTransducer
Cebuano POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Cebuano Gazetteer  A list lookup component. (docs cebuano.CebuanoGazetteer
Cebuano Gazetteer Tokeniser  A list lookup component. (docs cebuano.CebuanoGazetteerTokeniser
Cebuano IE System    cebuano.CebuanoIE
Cebuano Tokeniser  A customisable English tokeniser. (docs cebuano.CebuanoTokeniser
Cebuano Transducer  A module for executing Jape grammars. (docs cebuano.CebuanoTransducer
Cebuano Transducer Postprocessor  A module for executing Jape grammars. (docs cebuano.CebuanoTransducerPost
Chinese Segmenter PR  Segment the Chinese text into words, based on the PAUM learning algorithm. (docs gate.chineseSeg.ChineseSegMain
Chinese IE System    chinese.ChineseIE
French IE System    french.FrenchIE
German IE System    german.GermanIE
Hindi Tokeniser  A customisable Hindi tokeniser.  hindi.HindiTokeniser
Hindi Gazetteer  A list lookup component.  hindi.HindiGazetteer
Hindi Splitter  A Sentence Splitter.  hindi.HindiSplitter
Hindi Tokeniser Gazetteer  A list lookup component.  hindi.HindiTokeniserGazetteer
Hindi Main Grammar  A module for executing Jape grammars  hindi.HindiTransducer
Hindi Tokeniser Postprocessor  A module for executing Jape grammars  hindi.HindiTokeniserPostprocessor
Hindi OrthoMatcher  Hindi Orthomatcher  hindi.HindiOrthoMatcher
Hindi POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Romanian Tokeniser  A customisable Romanian tokeniser.  romanian.RomanianTokeniser
Romanian Gazetteer  A list lookup component.  romanian.RomanianGazetteer
Romanian Transducer  A module for executing Jape grammars  romanian.RomanianTransducer
Romanian IE System    romanian.RomanianIE
Batch Learning PR  Supports training, application and evaluation of machine learning models for NLP tasks (docs gate.learning.LearningAPIMain
LingPipe Tokenizer PR  Provides a LingPipe tokenizer. (docs gate.lingpipe.TokenizerPR
LingPipe NER PR  LingPipe Named Entity Recognizer (docs gate.lingpipe.NamedEntityRecognizerPR
LingPipe Language Identifier PR  GATE PR for language identification using LingPipe (docs gate.lingpipe.LanguageIdentifierPR
LingPipe POS Tagger PR  Provides a LingPipe part of speech tagger. (docs gate.lingpipe.POSTaggerPR
LingPipe Sentence Splitter PR  Provides an interface to LingPipe sentence splitter API. (docs gate.lingpipe.SentenceSplitterPR
Machine Learning PR  Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs gate.creole.ml.MachineLearningPR
ConnectSesameOntology  Connect to a repository containing an ontology (docs gate.creole.ontology.impl.sesame.ConnectSesameOntology
CreateSesameOntology  Create an ontology from a Sesame configuration file for a repository (docs gate.creole.ontology.impl.sesame.CreateSesameOntology
OWLIM Ontology  Ontology created as a temporary OWLIM3 in-memory repository (docs gate.creole.ontology.impl.sesame.OWLIMOntology
OWLIM Ontology DEPRECATED  Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs gate.creole.ontology.owlim.OWLIMOntologyLR
BDM Computation PR  Compute BDM score for each pair of concepts in the given ontology. (docs gate.bdmComp.BDMCompMain
OntoGazetteer  A list lookup component based on mapping between ontology classes and gazetteer lists. (docs gate.creole.gazetteer.OntoGazetteerImpl
GATE Ontology Editor  Ontology editing tool. (docs gate.gui.ontology.OntologyEditor
OAT  Ontology Annotation Tool. (docs gate.creole.ontology.ocat.OntologyViewer
RAT-C  Relation Annotation Tool Class view. (docs gate.gui.docview.OntologyClassView
RAT-I  Relation Annotation Tool Instance view. (docs gate.gui.docview.OntologyInstanceView
GAZE  Gazetteer viewer and editor (docs com.ontotext.gate.vr.Gaze
OpenNLP NER  NER PR using a set of OpenNLP maxent models (docs gate.opennlp.OpenNLPNameFin
OpenNLP Chunker  Chunker using an OpenNLP maxent model (docs gate.opennlp.OpenNlpChunker
OpenNLP POS Tagger  POS Tagger using an OpenNLP maxent model (docs gate.opennlp.OpenNlpPOS
OpenNLP Sentence Splitter  Sentence splitter using an OpenNLP maxent model (docs gate.opennlp.OpenNlpSentenceSplit
OpenNLP Tokenizer  Tokenizer using an OpenNLP maxent model (docs gate.opennlp.OpenNlpTokenizer
Minipar Wrapper  MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs minipar.Minipar
RASP2 Tokenizer  RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs com.digitalpebble.rasp2.token.RASPTokenizer
RASP POS Converter  Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs com.digitalpebble.rasp2.tagger.C2Transducer
RASP2 POS Tagger  RASP part-of-speech tagger, creating WordForm annotations (docs com.digitalpebble.rasp2.tagger.PosTagger
RASP2 Morphological Analyser  RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs com.digitalpebble.rasp2.morph.MorphoAnnotator
RASP2 Parser  RASP dependency parser (docs com.digitalpebble.rasp2.parser.ParserAnnotator
StanfordParser  Stanford parser wrapper (docs gate.stanford.Parser
English Dependency Parser    gate.stanford.apps.EnglishDependencies
English POS Tagger and Dependency Parser    gate.stanford.apps.EnglishPOSDependencies
SUPPLE Parser  SUPPLE bottom-up chart parser. (docs shef.nlp.supple.SUPPLE
Schema Annotations Editor  An annotation editor restricted by schemas. (docs gate.gui.annedit.SchemaAnnotationEditor
Annotation Schema  An annotation type and its features. (docs gate.creole.AnnotationSchema
Schema Enforcer  Produces an annotation set whose content is restricted by the specified set of schemas (docs gate.creole.schema.SchemaEnforcer
Simple Schema Viewer  A Simple Annotation Schema Viewer  gate.gui.schema.SimpleSchemaViewer
Stemmer PR  Wrapper for the Snowball stemmer. (docs stemmer.SnowballStemmer
ABNER Tagger  GATE wrapper over ABNER (docs gate.abner.AbnerTagger
Boilerpipe Content Detection  Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate (docs gate.creole.boilerpipe.BoilerPipe
Chemistry Tagger  A tagger for chemical names. (docs mark.chemistry.Tagger
Date Normalizer  provides normalized values for all known dates  gate.creole.dates.DateNormalizer
GenericTagger  The Generic Tagger is Generic! (docs gate.taggerframework.GenericTagger
Lupedia Service PR  Runs a lupedia annotation service on a GATE document  gate.lupedia.LupediaServicePR
ANNIE+Measurements    gate.creole.measurements.ANNIEMeasurements
Measurements    gate.creole.measurements.MeasurementsApplication
Measurement Tagger  A measurement tagger based upon GNU Units  gate.creole.measurements.MeasurementsTagger
MetaMap Annotator  This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port (docs gate.metamap.MetaMapPR
MutationFinder  GATE MutationFinder Wrapper (docs gate.creole.mutationfinder.MutationFinderPR
NormaGene Tagger  A processing resource that takes document and corpus parameters  gate.creole.normagene.NormaGene
Noun Phrase Chunker  Implementation of the Ramshaw and Marcus base noun phrase chunker (docs mark.chunking.GATEWrapper
Noun Phrase Chunker    mark.chunking.ChunkingApp
Numbers Tagger  Finds numbers (in both words and digits) and annotates them with their numeric value (docs gate.creole.numbers.NumbersTagger
Roman Numerals Tagger  Finds and annotates Roman numerals (docs gate.creole.numbers.RomanNumeralsTagger
OpenCalais Tagger  An OpenCalais based semantic annotator (docs gate.opencalais.OpenCalais
Penn BioTagger    gate.creole.pennbio.BioTagger
Penn BioTagger: Genes  A processing resource that takes document and corpus parameters (docs gate.creole.pennbio.GeneTagger
Penn BioTagger: Malignancy  A processing resource that takes document and corpus parameters (docs gate.creole.pennbio.MalignancyTagger
Penn BioTokenizer  A processing resource that takes document and corpus parameters (docs gate.creole.pennbio.Tokenizer
Penn BioTagger: Variation  A processing resource that takes document and corpus parameters (docs gate.creole.pennbio.VariationTagger
Zemanta Service PR  Runs a zemanta annotation service on a GATE document  gate.zemanta.ZemantaServicePR
QA Summariser for Teamware  The Quality Assurance PR for Teamware (docs gate.qa.QAForTeamwarePR
TermRaider English Term Extraction    gate.termraider.TermRaiderEnglish
Termbank Score Copier  Copy scores from Termbanks back to their source annotations  gate.termraider.apply.TermScoreCopier
AnnotationTermbank  TermRaider Termbank derived from document annotations  gate.termraider.bank.AnnotationTermbank
HyponymyTermbank  TermRaider Termbank derived from head/string hyponymy  gate.termraider.bank.HyponymyTermbank
PMI Bank  Pointwise Mutual Information from corpora  gate.termraider.bank.PMIBank
TfIdfTermbank  TermRaider Termbank derived from vectors in document features  gate.termraider.bank.TfIdfTermbank
Pairbank Viewer  viewer for the TermRaider Pairbank  gate.termraider.gui.PairbankViewer
Termbank Viewer  viewer for the TermRaider Termbank  gate.termraider.gui.TermbankViewer
Gazetteer List Collector  Gazetteer lists collector. (docs gate.creole.GazetteerListsCollector
ANNIE VP Chunker  ANNIE VP Chunker component. (docs gate.creole.VPChunker
Annotation Set Transfer  Annotation set transfer component. (docs gate.creole.annotransfer.AnnotationSetTransfer
Flexible Exporter  Exports a document with GATE annotations to its original format. (docs gate.creole.dumpingPR.DumpingPR
GATE Morphological analyser  Morphological Analyzer for the English Language. (docs gate.creole.morph.Morph
Flexible Gazetteer  A more flexible list lookup component. (docs gate.creole.gazetteer.FlexibleGazetteer
Syntax tree viewer  Viewer for syntax trees generated by a parser. (docs gate.gui.SyntaxTreeViewer
Configurable Exporter  Allows annotations to be exported according to a specified format.  gate.configurableexporter.ConfigurableExporter
Quality Assurance PR  The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer  gate.qa.QualityAssurancePR
UIMA Analysis Engine  Wrapper for a Text Analysis Engine from UIMA. (docs gate.uima.AnalysisEnginePR
Crawler PR  GATE implementation of the Websphinx crawling API (docs crawl.CrawlPR
WordNet 1.6  Princeton WordNet 1.6. (docs gate.wordnet.IndexFileWordNetImpl
WordNet  WordNet (docs gate.wordnet.JWNLWordNetImpl
WordNet Viewer  WordNet viewer  gate.gui.wordnet.WordNetViewer

Other contributed plugins

website download
OrganismTagger The OrganismTagger is a hybrid rule-based/machine-learning system that extracts organism mentions from the biomedical literature, normalizes them to their scientific name, and provides grounding to the NCBI Taxonomy database.  
Multi-lingual Noun Phrase Extractor (MuNPEx)
website download
Munpex MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). en-np_main.jape
Durm German lemmatizer
website download
Durm German lemmatizer The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns.  
XCES tools
website download
Tools to handle documents conforming to the XML Corpus Encoding Standard (XCES) format, used by the American National Corpus. XCES is a way of encoding texts with standoff markup in XML.
ANC Document An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. org.xces.gate.XCESDocument
ANC Load Standoff Loads standoff annotations into an existing document. org.xces.creole.LoadStandoff
ANC Save Content Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. org.xces.creole.SaveContent
ANC Save Standoff Saves annotations from a Document to an XCES-compliant standoff markup file. org.xces.creole.SaveStandoff
Sen wrapper (Japanese morphological analyser)
website (in Japanese)
Sen is a morphological analyser for Japanese. This wrapper allows it to be used from GATE; you must also install Sen itself. See the documentation (in Japanese) for details.
Sen Wrapper Morphological analyser for Japanese jp.co.ditlab.jgate.SenWrapper
Russian morph tagger
website download
Provides a GATE wrapper for the mystem Russian morphological parser. It executes the native analyzer, parses its output, and assigns morphological features as GATE annotations on the document.
Russian MorphTagger MorphTagger for the Russian language, based on Yandex's MyStem parser ru.itbrains.gate.morph.MorphTagger
String Annotation Plugin
website download
Processing resources for directly annotating the string content of a document.
Extended List Gazetteer Extended version of the GATE Default List Gazetteer. In addition to the features of the original, built-in version of the List Gazetteer, this version provides features for more powerful matching of partial words and annotating prefixes and suffixes as well as more versatile handling of word boundaries and whitespace. at.ofai.gate.extendedgazetteer.ExtendedGazetteer
Simple Regexp Annotator Use rules based on Java regular expressions to annotate the document content. at.ofai.gate.regexpannotator.SimpleRegexpAnnotator
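The idea of annotating document content from Java regular expressions can be sketched in plain Java. This is only an illustration, not the plugin's implementation or its rule syntax: the class RegexSpanSketch, the Span type and the annotate method are hypothetical names, and the sketch simply records the character offsets of each match, which is the kind of information an annotation would carry.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: none of these names come from the plugin itself.
final class RegexSpanSketch {

    // A labelled character span, standing in for an annotation.
    static final class Span {
        final String type;
        final int start; // offset of the first matched character
        final int end;   // offset just past the last matched character

        Span(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    // Collects one span of the given type for every match of the pattern.
    static List<Span> annotate(String text, String type, Pattern pattern) {
        List<Span> spans = new ArrayList<>();
        Matcher matcher = pattern.matcher(text);
        while (matcher.find()) {
            spans.add(new Span(type, matcher.start(), matcher.end()));
        }
        return spans;
    }
}
```

For example, calling annotate with the pattern \d{4}-\d{2}-\d{2} and the type "Date" yields one span per ISO-style date string in the text.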
AppDoc — GATE Application Documentation
website download
A plugin that generates documentation from your application/pipeline/gapp files in various formats. In addition it provides new visual resources in the GATE Developer GUI to add author, version and documentation comments to pipelines and processing resources.
AppDoc Visual resource for adding author/version/comment to pipelines and processing resources at.ofai.gate.appdoc.AppDoc
AppDocGen Visual resource for selecting a documentation template and generating documentation files at.ofai.gate.appdoc.AppDocGen
VirtualCorpus — Directory- and JDBC Corpus LRs
website download
A plugin that provides two new corpus language resources, DirectoryCorpus for directly using files in a directory through a corpus LR and JDBCCorpus for directly using documents stored in a field of a JDBC database table.
DirectoryCorpus Language resource for accessing GATE XML files in a directory directly via a corpus resource at.ofai.gate.virtualcorpus.DirectoryCorpus
JDBCCorpus Language resource for accessing GATE XML documents stored in a field of a JDBC database table directly via a corpus resource at.ofai.gate.virtualcorpus.JDBCCorpus
BWP Gazetteer
website download
This plugin provides an approximate gazetteer for GATE, based on Levenshtein's Edit Distance for strings. Its goal is to handle texts with noise and errors, in which GATE's default gazetteers may have difficulties.
BWP Gazetteer Extended version of the transducer-based List Gazetteer  
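As background for the edit-distance matching mentioned above, here is a minimal Java sketch of the standard Levenshtein distance and a threshold test built on it. It is not the BWP Gazetteer's code: the class EditDistanceSketch, the method names and the maxDistance parameter are assumptions made for illustration, and the plugin may compute or apply the distance differently.

```java
// Hypothetical sketch: none of these names come from the plugin itself.
final class EditDistanceSketch {

    // Levenshtein distance: the minimum number of single-character
    // insertions, deletions and substitutions that turn a into b.
    // Uses two rolling rows of the dynamic-programming table.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // building b[0..j) from "" takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // reducing a[0..i) to "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    // True if the token is within maxDistance edits of the list entry.
    static boolean approximatelyMatches(String token, String entry, int maxDistance) {
        return distance(token, entry) <= maxDistance;
    }
}
```

With this sketch, a noisy token such as "Shefield" is one edit away from the list entry "Sheffield", so it matches with maxDistance set to 1, whereas an exact-match lookup would miss it.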
website download
Annotates documents like a gazetteer, but takes the terms from OWL annotation properties in an ontology, rather than from a separate list of terms.
Apolda Ontology Annotator Ontology-based lookup taking terms from properties in the ontology. telin.Apolda
Reported Speech Tagger
website download
Automatically detects and tags reported speech constructs, in particular the source, reporting verb and content.
Reporting Verb marker JAPE transducer which tags reporting verbs  
Reported Speech finder JAPE transducer which tags reported speech  
Keyphrase Extraction Module
website download
Keyphrase extraction and language identification.
LanguageIdentification identifies the language of a document using character n-grams  
