
This page lists some of the plugins that are currently available with GATE:

For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".

To submit a plugin, please contact us via the gate-users mailing list.

Plugins included in the GATE distribution

AlchemyAPI: Entity Extraction  Runs the AlchemyAPI Entity Extraction service on a GATE document  gate.alchemyAPI.EntityExtraction
AlchemyAPI: Keyword Extraction  Runs the AlchemyAPI Keyword Extraction service on a GATE document  gate.alchemyAPI.KeywordExtraction
Compound Document  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentImpl
Compound Document From Xml  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentFromXml
Compound Document Editor  Editor for compound documents. (docs gate.compound.gui.CompoundDocumentEditor
GATE Composite document  GATE Composite document. (docs gate.composite.impl.CompositeDocumentImpl
Switch Member PR  Sets the focus of a compound document to a specified member document. (docs gate.compound.impl.SwitchMemberPR
Delete Member PR  Deletes one member document from a compound doc. (docs gate.compound.impl.DeleteMemberPR
Combine Members PR  Combines documents in a composite document. (docs gate.composite.impl.CombineMembersPR
Segment Processing PR  Processes individual segments as separate documents (docs gate.composite.impl.SegmentProcessingPR
ExportAlignmentPR  A PR to export alignment information in an xml file.  gate.alignment.ExportAlignmentPR
GATE Unicode Tokeniser  A customisable Unicode tokeniser. (docs gate.creole.tokeniser.SimpleTokeniser
ANNIE English Tokeniser  A customisable English tokeniser. (docs gate.creole.tokeniser.DefaultTokeniser
ANNIE Gazetteer  A list lookup component. (docs gate.creole.gazetteer.DefaultGazetteer
Sharable Gazetteer  A list lookup component. (docs gate.creole.gazetteer.SharedDefaultGazetteer
Hash Gazetteer  A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE (docs com.ontotext.gate.gazetteer.HashGazetteer
JAPE Transducer  A module for executing Jape grammars. (docs gate.creole.Transducer
ANNIE NE Transducer  ANNIE named entity grammar. (docs gate.creole.ANNIETransducer
ANNIE Sentence Splitter  ANNIE sentence splitter. (docs gate.creole.splitter.SentenceSplitter
RegEx Sentence Splitter  A sentence splitter based on regular expressions. (docs gate.creole.splitter.RegexSentenceSplitter
ANNIE POS Tagger  Mark Hepple's Brill-style POS tagger (docs gate.creole.POSTagger
ANNIE OrthoMatcher  ANNIE orthographical coreference component. (docs gate.creole.orthomatcher.OrthoMatcher
ANNIE Pronominal Coreferencer  Pronominal Coreference resolution component. (docs gate.creole.coref.Coreferencer
ANNIE Nominal Coreferencer  Nominal Coreference resolution component (docs gate.creole.coref.NominalCoref
Document Reset PR  Remove named annotation sets or reset the default annotation set (docs gate.creole.annotdelete.AnnotationDeletePR
Jape Viewer  A JAPE grammar file viewer (docs gate.gui.jape.JapeViewer
Gazetteer Editor  Gazetteer viewer and editor. (docs gate.gui.GazetteerEditor
Annotation Merging PR  Merge Annotations from different annotators. (docs gate.merger.AnnotationMergingMain
Copy Anns to Another Doc PR  Copy the annotations from one document to another document. (docs gate.copyAS2AnoDoc.CopyAS2AnoDocMain
Legacy Coref Data Writer  A simple PR that converts co-reference data from the Relations-based model to the legacy format (based on 'matches' annotation and document features).  gate.creole.coref.LegacyCorefDataWriter
OrthoRef  An orthographic coreferencer  gate.creole.coref.OrthoRef
Entity Classification Job Builder  Build a CrowdFlower job asking users to select the right label for entities (docs gate.crowdsource.classification.EntityClassificationJobBuilder
Entity Classification Results Importer  Import judgments from a CrowdFlower job created by the Entity Classification Job Builder as GATE annotations. (docs gate.crowdsource.classification.EntityClassificationResultsImporter
Majority-vote consensus builder (classification)  Process results of a crowd annotation task to find where annotators agree and disagree. (docs gate.crowdsource.classification.MajorityVoteClassificationConsensus
Entity Annotation Job Builder  Build a CrowdFlower job asking users to annotate entities within a snippet of text (docs gate.crowdsource.ne.EntityAnnotationJobBuilder
Entity Annotation Results Importer  Import judgments from a CrowdFlower job created by the Entity Annotation Job Builder as GATE annotations. (docs gate.crowdsource.ne.EntityAnnotationResultsImporter
Majority-vote consensus builder (annotation)  Process results of a crowd annotation task to find where annotators agree and disagree. (docs gate.crowdsource.ne.MajorityVoteAnnotationConsensus
EDT Monitor  Warns whenever an AWT component is updated from anywhere other than the event dispatch thread (docs gate.creole.EDTMonitor
Java Heap Dumper  Dumps the Java heap to the specified file (docs gate.creole.HeapDumper
Log4J Level: ALL  Allows the Log4J log level to be set to ALL from within the GUI (docs gate.creole.Log4JALL
Show/Hide Resources  Show resources that would otherwise be hidden, e.g. resources created for internal use by other resources (docs gate.creole.Reveal
The Duplicator  Duplicate any resource with a right click menu option (docs gate.creole.TheDuplicator
Unload Unused Plugins  Unloads all plugins for which we cannot find any loaded instances (docs gate.creole.UnusedPluginUnloader
Document normalizer  Normalize document content to remove "smart quotes" etc. (docs gate.creole.DocumentNormalizer
CSV Corpus Populater  Populate a corpus from CSV files (docs gate.corpora.CSVImporter
GATE DataSift JSON Document Format  Format parser for DataSift JSON files  gate.corpora.DataSiftFormat
Fast Infoset Document Format  Format parser for GATE XML stored in the binary Fast Infoset format (docs gate.corpora.FastInfosetDocumentFormat
Fast Infoset Exporter  Export GATE documents to GATE XML stored in the binary Fast Infoset format (docs gate.corpora.FastInfosetExporter
HTML5 Microdata Exporter  Exports Annotations as HTML5 Microdata  gate.creole.microdata.MicrodataExporter
MediaWiki Document Format  Document format for parsing MediaWiki markup (docs gate.corpora.MediaWikiDocumentFormat
MediaWiki Corpus Populater  Populate a corpus from a MediaWiki XML dump (docs gate.corpora.MediaWikiPopulater
MediaWiki XML Document Format  Deprecated MediaWiki importer  gate.corpora.MediaWikiXMLDocumentFormat
GATE .cochrane.txt document format  Load this to allow the opening of Cochrane text documents, and choose the mime type "text/x-cochrane", or use the correct file extension.  gate.corpora.CochraneTextDocumentFormat
GATE .pubMed.txt document format  Load this to allow the opening of PubMed text documents, and choose the mime type "text/x-pubmed", or use the correct file extension.  gate.corpora.PubmedTextDocumentFormat
Large KB Gazetteer  KIM KB based alias-lookup component (docs com.ontotext.kim.gate.KimGazetteer
Semantic Enrichment PR  The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories. (docs com.ontotext.kim.gate.SesameEnrichment
Onto Root Gazetteer  An ontology lookup component (docs gate.clone.ql.OntoRootGaz
GENIA Sentence Splitter  A processing resource that takes document and corpus parameters (docs gate.creole.genia.splitter.GENIASentenceSplitter
Groovy support for GATE    gate.groovy.GroovySupport
Groovy scripting PR  Runs a Groovy script as a processing resource (docs gate.groovy.ScriptPR
Scriptable Controller  A controller whose execution strategy is controlled by a Groovy script (docs gate.groovy.ScriptableController
Control Script  Editor for the Groovy script controlling a scriptable controller  gate.groovy.gui.ControllerScriptEditor
Script Editor  Editor for the Groovy script behind this PR  gate.groovy.gui.ScriptPREditor
SearchPR  Provides IR functionality. (docs gate.creole.ir.SearchPR
Search Results  Viewer for IR search results  gate.gui.SearchPRViewer
Corpus Indexing Support    gate.creole.ir.CorpusIndexingSupport
Lucene IR Engine    gate.creole.ir.lucene.LuceneIREngine
IAA Computation PR  Compute inter-annotator agreement (IAA). (docs gate.iaaplugin.IaaMain
JAPE-Plus Viewer  A JAPE grammar file viewer (docs gate.gui.jape.plus.Viewer
JAPE-Plus Transducer  An optimised, JAPE-compatible transducer.  gate.jape.plus.Transducer
KEA Keyphrase Extractor  A Keyphrase Extractor by Eibe Frank. (docs gate.creole.kea.Kea
KEA Corpus Importer  Imports a KEA-style corpus into GATE  gate.creole.kea.CorpusImporter
TextCat Fingerprint Generator  Generate language fingerprints for use with the TextCat Language Identification PR (docs org.knallgrau.utils.textcat.FingerprintGenerator
TextCat Language Identification  Recognizes the document language using TextCat (docs org.knallgrau.utils.textcat.LanguageIdentifier
Arabic Gazetteer Collector    arabic.ArabicGazCollector
Arabic Gazetteer  A list lookup component. (docs arabic.ArabicGazetteer
Arabic IE System  Ready-made Arabic IE application  arabic.ArabicIE
Arabic Infered Gazetteer  A list lookup component. (docs arabic.ArabicInferedGazetteer
Arabic OrthoMatcher  ANNIE orthographical coreference component. (docs arabic.ArabicOrthoMatcher
Arabic Tokeniser  A customisable English tokeniser. (docs arabic.ArabicTokeniser
Arabic Main Grammar  A module for executing Jape grammars. (docs arabic.ArabicTransducer
BulStem  This plugin is an implementation of the BulStem stemmer algorithm for Bulgarian developed by Preslav Nakov. (docs gate.bulstem.BulStemPR
Cebuano POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Cebuano Gazetteer  A list lookup component. (docs cebuano.CebuanoGazetteer
Cebuano Gazetteer Tokeniser  A list lookup component. (docs cebuano.CebuanoGazetteerTokeniser
Cebuano IE System  Ready-made Cebuano IE application  cebuano.CebuanoIE
Cebuano Tokeniser  A customisable English tokeniser. (docs cebuano.CebuanoTokeniser
Cebuano Transducer  A module for executing Jape grammars. (docs cebuano.CebuanoTransducer
Cebuano Transducer Postprocessor  A module for executing Jape grammars. (docs cebuano.CebuanoTransducerPost
Chinese Segmenter PR  Segment the Chinese text into words, based on the PAUM learning algorithm. (docs gate.chineseSeg.ChineseSegMain
Chinese IE System  Ready-made Chinese IE application  chinese.ChineseIE
French IE System  Ready-made French IE application  french.FrenchIE
German IE System  Ready-made German IE application  german.GermanIE
Hindi Tokeniser  A customisable Hindi tokeniser.  hindi.HindiTokeniser
Hindi Gazetteer  A list lookup component.  hindi.HindiGazetteer
Hindi Splitter  A Sentence Splitter.  hindi.HindiSplitter
Hindi Tokeniser Gazetteer  A list lookup component.  hindi.HindiTokeniserGazetteer
Hindi Main Grammar  A module for executing Jape grammars  hindi.HindiTransducer
Hindi Tokeniser Postprocessor  A module for executing Jape grammars  hindi.HindiTokeniserPostprocessor
Hindi OrthoMatcher  Hindi Orthomatcher  hindi.HindiOrthoMatcher
Hindi POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Romanian Tokeniser  A customisable Romanian tokeniser.  romanian.RomanianTokeniser
Romanian Gazetteer  A list lookup component.  romanian.RomanianGazetteer
Romanian Transducer  A module for executing Jape grammars  romanian.RomanianTransducer
Romanian IE System  Ready-made Romanian IE application  romanian.RomanianIE
RussIE  Basic version of the RussIE application (docs com.ontotext.russie.apps.RussIE
RussIE + Inflectional Gazetteer  RussIE application with inflexional gazetteer (docs com.ontotext.russie.apps.RussIEInflex
RussIE + OrthoMatcher  RussIE application with orthomatcher (docs com.ontotext.russie.apps.RussIEOrtho
RussIE + Inflectional Gazetteer & OrthoMatcher  RussIE application with orthomatcher and inflexional gazetteer (docs com.ontotext.russie.apps.RussIEOrthoInflex
Inflectional gazetteer  Gazetteer with support for inflectional morphology (docs com.ontotext.russie.gazetteer.InflectionalGazetteer
Russian Gazetteer  Customised version of the hash gazetteer (docs com.ontotext.russie.gazetteer.RussGazetteer
POS Mapper  Map complex Russian morphology tags into simpler POS categories (docs com.ontotext.russie.morph.POSMapper
Russian POS Tagger  Part-of-speech tagger for Russian (docs com.ontotext.russie.morph.POSTagger
Batch Learning PR  Supports training, application and evaluation of machine learning models for NLP tasks (docs gate.learning.LearningAPIMain
LingPipe Tokenizer PR  Provides a LingPipe tokenizer. (docs gate.lingpipe.TokenizerPR
LingPipe NER PR  LingPipe Named Entity Recognizer (docs gate.lingpipe.NamedEntityRecognizerPR
LingPipe Language Identifier PR  GATE PR for language identification using LingPipe (docs gate.lingpipe.LanguageIdentifierPR
LingPipe POS Tagger PR  Provides a LingPipe part of speech tagger. (docs gate.lingpipe.POSTaggerPR
LingPipe Sentence Splitter PR  Provides an interface to LingPipe sentence splitter API. (docs gate.lingpipe.SentenceSplitterPR
Simplified Text Exporter  Simplified text exporter (HTML output)  gate.corpora.export.ExportSimplifiedHTML
Simplified Text Exporter  Simplified text exporter (plain text output)  gate.corpora.export.ExportSimplifiedText
Linguistic Simplifier  A processing resource that takes document and corpus parameters  gate.creole.summarization.linguistic.Simplifier
Linguistic Simplifier  Example application for the linguistic simplifier  gate.creole.summarization.linguistic.SimplifierApplication
Machine Learning PR  Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs gate.creole.ml.MachineLearningPR
ConnectSesameOntology  Connect to a repository containing an ontology (docs gate.creole.ontology.impl.sesame.ConnectSesameOntology
CreateSesameOntology  Create an ontology from a Sesame configuration file for a repository (docs gate.creole.ontology.impl.sesame.CreateSesameOntology
OWLIM Ontology  Ontology created as a temporary OWLIM3 in-memory repository (docs gate.creole.ontology.impl.sesame.OWLIMOntology
OWLIM Ontology DEPRECATED  Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs gate.creole.ontology.owlim.OWLIMOntologyLR
BDM Computation PR  Compute BDM score for each pair of concepts in the given ontology. (docs gate.bdmComp.BDMCompMain
OntoGazetteer  A list lookup component based on mapping between ontology classes and gazetteer lists. (docs gate.creole.gazetteer.OntoGazetteerImpl
GATE Ontology Editor  Ontology editing tool. (docs gate.gui.ontology.OntologyEditor
OAT  Ontology Annotation Tool. (docs gate.creole.ontology.ocat.OntologyViewer
RAT-C  Relation Annotation Tool Class view. (docs gate.gui.docview.OntologyClassView
RAT-I  Relation Annotation Tool Instance view. (docs gate.gui.docview.OntologyInstanceView
GAZE  Gazetteer viewer and editor (docs com.ontotext.gate.vr.Gaze
OpenNLP NER  NER PR using a set of OpenNLP maxent models (docs gate.opennlp.OpenNLPNameFin
OpenNLP Chunker  Chunker using an OpenNLP maxent model (docs gate.opennlp.OpenNlpChunker
OpenNLP POS Tagger  POS Tagger using an OpenNLP maxent model (docs gate.opennlp.OpenNlpPOS
OpenNLP Parser  Syntactic parser from Apache OpenNLP (docs gate.opennlp.OpenNlpParser
OpenNLP Sentence Splitter  Sentence splitter using an OpenNLP maxent model (docs gate.opennlp.OpenNlpSentenceSplit
OpenNLP Tokenizer  Tokenizer using an OpenNLP maxent model (docs gate.opennlp.OpenNlpTokenizer
Minipar Wrapper  MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs minipar.Minipar
RASP2 Tokenizer  RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs com.digitalpebble.rasp2.token.RASPTokenizer
RASP POS Converter  Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs com.digitalpebble.rasp2.tagger.C2Transducer
RASP2 POS Tagger  RASP part-of-speech tagger, creating WordForm annotations (docs com.digitalpebble.rasp2.tagger.PosTagger
RASP2 Morphological Analyser  RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs com.digitalpebble.rasp2.morph.MorphoAnnotator
RASP2 Parser  RASP dependency parser (docs com.digitalpebble.rasp2.parser.ParserAnnotator
SUPPLE Parser  SUPPLE bottom-up chart parser. (docs shef.nlp.supple.SUPPLE
Schema Annotations Editor  An annotation editor restricted by schemas. (docs gate.gui.annedit.SchemaAnnotationEditor
Schema Enforcer  Produces an annotation set whose content is restricted by the specified set of schemas (docs gate.creole.schema.SchemaEnforcer
Simple Schema Viewer  A Simple Annotation Schema Viewer  gate.gui.schema.SimpleSchemaViewer
Stanford NER  Stanford Named Entity Recogniser (docs gate.stanford.NER
StanfordParser  Stanford parser wrapper (docs gate.stanford.Parser
Stanford POS Tagger  Stanford Part-of-Speech Tagger (docs gate.stanford.Tagger
Stanford PTB Tokenizer  Stanford Penn Treebank v3 Tokenizer, for English (docs gate.stanford.Tokenizer
English Dependency Parser  Ready-made application for Stanford English parser  gate.stanford.apps.EnglishDependencies
English POS Tagger and Dependency Parser  Ready-made application for Stanford English POS tagger and parser  gate.stanford.apps.EnglishPOSDependencies
Stemmer PR  Wrapper for the Snowball stemmer. (docs stemmer.SnowballStemmer
ABNER Tagger  GATE wrapper over ABNER (docs gate.abner.AbnerTagger
Boilerpipe Content Detection  Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate (docs gate.creole.boilerpipe.BoilerPipe
Chemistry Tagger  A tagger for chemical names. (docs mark.chemistry.Tagger
Date Annotation Normalizer  provides normalized values for all existing date annotations (docs gate.creole.dates.DateAnnotationNormalizer
Date Normalizer  provides normalized values for all known dates (docs gate.creole.dates.DateNormalizer
GenericTagger  The Generic Tagger is Generic! (docs gate.taggerframework.GenericTagger
Lupedia Service PR  Runs a lupedia annotation service on a GATE document (docs gate.lupedia.LupediaServicePR
ANNIE+Measurements  Ready-made application for ANNIE plus the measurement tagger  gate.creole.measurements.ANNIEMeasurements
Measurements  Ready-made application for measurement annotator  gate.creole.measurements.MeasurementsApplication
Measurement Tagger  A measurement tagger based upon GNU Units  gate.creole.measurements.MeasurementsTagger
MetaMap Annotator  This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port (docs gate.metamap.MetaMapPR
MutationFinder  GATE MutationFinder Wrapper (docs gate.creole.mutationfinder.MutationFinderPR
NormaGene Tagger  A processing resource that takes document and corpus parameters  gate.creole.normagene.NormaGene
Noun Phrase Chunker  Implementation of the Ramshaw and Marcus base noun phrase chunker (docs mark.chunking.GATEWrapper
Noun Phrase Chunker  Ready-made NP chunking application  mark.chunking.ChunkingApp
Numbers Tagger  Finds numbers (in both words and digits) and annotates them with their numeric value (docs gate.creole.numbers.NumbersTagger
Roman Numerals Tagger  Finds and annotates Roman numerals (docs gate.creole.numbers.RomanNumeralsTagger
OpenCalais Tagger  An OpenCalais based semantic annotator (docs gate.opencalais.OpenCalais
Penn BioTagger  Ready-made application for the Penn BioTagger  gate.creole.pennbio.BioTagger
Penn BioTagger: Genes  Penn BioTagger for Genes (docs gate.creole.pennbio.GeneTagger
Penn BioTagger: Malignancy  Penn BioTagger for malignancy types (docs gate.creole.pennbio.MalignancyTagger
Penn BioTokenizer  Tokenizer for biomedical text (docs gate.creole.pennbio.Tokenizer
Penn BioTagger: Variation  Penn BioTagger for variations (docs gate.creole.pennbio.VariationTagger
TextRazor Service PR  Runs the TextRazor annotation service (http://textrazor.com) on a GATE document (docs gate.textrazor.TextRazorServicePR
Zemanta Service PR  Runs a zemanta annotation service on a GATE document (docs gate.zemanta.ZemantaServicePR
QA Summariser for Teamware  The Quality Assurance PR for teamware (docs gate.qa.QAForTeamwarePR
PMI Example (English)  Example application for the PMI (pointwise mutual information) tool  gate.termraider.PMIExample
TermRaider English Term Extraction  Example application showing typical set-up for the TermRaider tools  gate.termraider.TermRaiderEnglish
Termbank Score Copier  Copy scores from Termbanks back to their source annotations (docs gate.termraider.apply.TermScoreCopier
AnnotationTermbank  TermRaider Termbank derived from document annotations (docs gate.termraider.bank.AnnotationTermbank
DocumentFrequencyBank  Document frequency counter derived from corpora and other DFBs (docs gate.termraider.bank.DocumentFrequencyBank
HyponymyTermbank  TermRaider Termbank derived from head/string hyponymy (docs gate.termraider.bank.HyponymyTermbank
PMI Bank  Pointwise Mutual Information from corpora (docs gate.termraider.bank.PMIBank
TfIdfTermbank  TermRaider Termbank derived from vectors in document features (docs gate.termraider.bank.TfIdfTermbank
Pairbank Viewer  viewer for the TermRaider Pairbank (docs gate.termraider.gui.PairbankViewer
Termbank Viewer  viewer for the TermRaider Termbank (docs gate.termraider.gui.TermbankViewer
TextCategorizationPR  A processing resource that takes document and corpus parameters  gate.ml.categorization.TextCategorizationPR
Gazetteer List Collector  Gazetteer lists collector. (docs gate.creole.GazetteerListsCollector
ANNIE VP Chunker  ANNIE VP Chunker component. (docs gate.creole.VPChunker
Annotation Set Transfer  Annotation set transfer component. (docs gate.creole.annotransfer.AnnotationSetTransfer
Flexible Exporter  Exports a document with GATE annotations to its original format. (docs gate.creole.dumpingPR.DumpingPR
GATE Morphological analyser  Morphological Analyzer for the English Language. (docs gate.creole.morph.Morph
Flexible Gazetteer  A more flexible list lookup component. (docs gate.creole.gazetteer.FlexibleGazetteer
Syntax tree viewer  Viewer for syntax trees generated by a parser. (docs gate.gui.SyntaxTreeViewer
Configurable Exporter  Allows annotations to be exported according to a specified format.  gate.configurableexporter.ConfigurableExporter
Quality Assurance PR  The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer  gate.qa.QualityAssurancePR
GATE JSON Tweet Document Format  Format parser for Twitter JSON files (docs gate.corpora.JSONTweetFormat
GATE JSON Exporter  Export documents and corpora in JSON format  gate.corpora.export.GATEJsonExporter
Twitter Corpus Populator  Populate a corpus from Twitter JSON containing multiple Tweets (docs gate.corpora.twitter.Population
Hashtag Tokenizer  Tokenizes Multi-Word Hashtags (docs gate.twitter.HashtagTokenizer
Tweet Normaliser  Normalise text in tweets (convert spelling mistakes, colloquialisms, typing variations and so on into standard English) (docs gate.twitter.Normaliser
TwitIE (EN)  English TwitIE application  gate.twitter.apps.TwitIEEN
Twitter POS Tagger (EN)  Stanford POS tagger trained on Tweets (docs gate.twitter.pos.POSTaggerEN
Twitter Tokenizer (EN)  Tokenizer tuned for Tweets (docs gate.twitter.tokenizer.TokenizerEN
UIMA Analysis Engine  Wrapper for a Text Analysis Engine from UIMA. (docs gate.uima.AnalysisEnginePR
Crawler PR  GATE implementation of the Websphinx crawling API (docs crawl.CrawlPR
WordNet 1.6  Princeton WordNet 1.6. (docs gate.wordnet.IndexFileWordNetImpl
WordNet  WordNet (docs gate.wordnet.JWNLWordNetImpl
WordNet Viewer  WordNet viewer  gate.gui.wordnet.WordNetViewer

Other contributed plugins

website download
OrganismTagger The OrganismTagger is a hybrid rule-based/machine-learning system that extracts organism mentions from the biomedical literature, normalizes them to their scientific name, and provides grounding to the NCBI Taxonomy database.  
Multi-lingual Noun Phrase Extractor (MuNPEx)
website download
Munpex MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). en-np_main.jape
Durm German lemmatizer
website download
Durm German lemmatizer The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns.  
GATE Plugin for S4
website (download via the update site)
S4 Annotator Provides access to the text analytics services of Ontotext's Self Service Semantic Suite (S4) directly from the GATE platform, via their RESTful APIs. The PR can be integrated in any GATE processing pipeline regardless of the context and it does not have any requirements or assumptions about the type of pre-processing or post-processing of the textual data being annotated. com.ontotext.s4.api.S4Plugin
XCES tools
website download
Tools to handle documents conforming to the XML Corpus Encoding Standard (XCES) format, used by the American National Corpus. XCES is a way of encoding texts with standoff markup in XML.
ANC Document An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. org.xces.gate.XCESDocument
ANC Load Standoff Loads standoff annotations into an existing document. org.xces.creole.LoadStandoff
ANC Save Content Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. org.xces.creole.SaveContent
ANC Save Standoff Saves annotations from a Document to an XCES-compliant standoff markup file. org.xces.creole.SaveStandoff
Sen wrapper (Japanese morphological analyser)
website (in Japanese)
Sen is a morphological analyser for Japanese. This wrapper allows it to be used from GATE; you must also install Sen itself. See the documentation (in Japanese) for details.
Sen Wrapper Morphological analyser for Japanese jp.co.ditlab.jgate.SenWrapper
Russian morph tagger
website download
Provides a GATE wrapper for the MyStem Russian morphological parser. It runs the native analyser, parses its output, and assigns the morphological features as GATE annotations on the document.
Russian MorphTagger MorphTagger for the Russian language, based on Yandex's MyStem parser ru.itbrains.gate.morph.MorphTagger
String Annotation Plugin
website download
Processing resources for directly annotating the string content of a document.
Extended List Gazetteer Extended version of the GATE Default List Gazetteer. In addition to the features of the original, built-in version of the List Gazetteer, this version provides features for more powerful matching of partial words and annotating prefixes and suffixes as well as more versatile handling of word boundaries and whitespace. at.ofai.gate.extendedgazetteer.ExtendedGazetteer
Simple Regexp Annotator Use rules based on Java regular expressions to annotate the document content. at.ofai.gate.regexpannotator.SimpleRegexpAnnotator
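The idea behind regular-expression annotation can be sketched in a few lines. This is only an illustration in Python, not the plugin's own rule syntax or API: the `RULES` table, the annotation type names and the `annotate` function are hypothetical, and the plugin itself uses Java regular expressions.

```python
import re

# Hypothetical rule table: annotation type -> compiled pattern.
# The real plugin has its own rule file format; this only shows the principle.
RULES = {
    "Year": re.compile(r"\b(?:19|20)\d{2}\b"),
    "Email": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def annotate(text):
    """Return (start, end, type, matched string) tuples sorted by offset."""
    annotations = []
    for ann_type, pattern in RULES.items():
        for match in pattern.finditer(text):
            annotations.append(
                (match.start(), match.end(), ann_type, match.group())
            )
    return sorted(annotations)


text = "Contact gate-users@example.org before 2015."
annotations = annotate(text)
for start, end, ann_type, matched in annotations:
    print(start, end, ann_type, matched)
```

Each result carries character offsets into the document content, which is the same standoff style that GATE annotations use.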
AppDoc — GATE Application Documentation
website download
A plugin that generates documentation from your application/pipeline/gapp files in various formats. In addition, it provides new visual resources in the GATE Developer GUI to add author, version and documentation comments to pipelines and processing resources.
AppDoc Visual resource for adding author/version/comment to pipelines and processing resources at.ofai.gate.appdoc.AppDoc
AppDocGen Visual resource for selecting a documentation template and generating documentation files at.ofai.gate.appdoc.AppDocGen
VirtualCorpus — Directory- and JDBC Corpus LRs
website download
A plugin that provides two new corpus language resources, DirectoryCorpus for directly using files in a directory through a corpus LR and JDBCCorpus for directly using documents stored in a field of a JDBC database table.
DirectoryCorpus Language resource for accessing GATE XML files in a directory directly via a corpus resource at.ofai.gate.virtualcorpus.DirectoryCorpus
JDBCCorpus Language resource for accessing GATE XML documents stored in a field of a JDBC database table directly via a corpus resource at.ofai.gate.virtualcorpus.JDBCCorpus
BWP Gazetteer
website download
This plugin provides an approximate gazetteer for GATE, based on Levenshtein's Edit Distance for strings. Its goal is to handle texts with noise and errors, in which GATE's default gazetteers may have difficulties.
BWP Gazetteer Extended version of the transducer-based List Gazetteer  
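As a rough illustration of approximate lookup by edit distance, the following Python sketch computes the Levenshtein distance and returns the list entries within a given number of edits of a token. It is not the BWP Gazetteer's implementation: the function names, the `max_distance` threshold, the case folding and the sample entries are all assumptions made for the example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed with two rows of the DP table."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ch_a
                    current[j - 1] + 1,      # insert ch_b
                    previous[j - 1] + cost,  # substitute (or keep)
                )
            )
        previous = current
    return previous[-1]


def approximate_lookup(token, entries, max_distance=1):
    """Return the entries within max_distance edits of token (case-insensitive)."""
    return [
        entry
        for entry in entries
        if edit_distance(token.lower(), entry.lower()) <= max_distance
    ]


# Hypothetical gazetteer list and a misspelt token.
entries = ["London", "Sheffield", "Paris"]
matches = approximate_lookup("Shefield", entries)
print(matches)
```

A linear scan like this is quadratic per comparison and would be slow for large lists; it is meant only to show why a noisy token can still match an entry that an exact-match gazetteer would miss.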
website download
Annotates documents like a gazetteer, but takes the terms from OWL annotation properties in an ontology, rather than from a separate list of terms.
Apolda Ontology Annotator Ontology-based lookup taking terms from properties in the ontology. telin.Apolda
Reported Speech Tagger
website download
Automatically detects and tags reported speech constructs, in particular the source, reporting verb and content.
Reporting Verb marker JAPE transducer which tags reporting verbs  
Reported Speech finder JAPE transducer which tags reported speech  
Keyphrase Extraction Module
website download
Keyphrase extraction and language identification.
LanguageIdentification identifies the language of a document using character n-grams  
