GATE

This page lists some of the plugins that are currently available with GATE.

For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".

To submit a plugin, please contact us via the gate-users mailing list.


Plugins included in the GATE distribution

AlchemyAPI
AlchemyAPI: Entity Extraction  Runs the AlchemyAPI Entity Extraction service on a GATE document  gate.alchemyAPI.EntityExtraction
AlchemyAPI: Keyword Extraction  Runs the AlchemyAPI Keyword Extraction service on a GATE document  gate.alchemyAPI.KeywordExtraction
Alignment
Compound Document  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentImpl
Compound Document From Xml  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentFromXml
Compound Document Editor  Editor for compound documents. (docs gate.compound.gui.CompoundDocumentEditor
GATE Composite document  GATE Composite document. (docs gate.composite.impl.CompositeDocumentImpl
Switch Member PR  Sets the focus of a compound document to a specified member document. (docs gate.compound.impl.SwitchMemberPR
Delete Member PR  Deletes one member document from a compound doc. (docs gate.compound.impl.DeleteMemberPR
Combine Members PR  Combines documents in a composite document. (docs gate.composite.impl.CombineMembersPR
Segment Processing PR  Processes individual segments as separate documents (docs gate.composite.impl.SegmentProcessingPR
ExportAlignmentPR  A PR to export alignment information in an xml file.  gate.alignment.ExportAlignmentPR
ANNIE
GATE Unicode Tokeniser  A customisable Unicode tokeniser. (docs gate.creole.tokeniser.SimpleTokeniser
ANNIE English Tokeniser  A customisable English tokeniser. (docs gate.creole.tokeniser.DefaultTokeniser
ANNIE Gazetteer  A list lookup component. (docs gate.creole.gazetteer.DefaultGazetteer
Sharable Gazetteer  A list lookup component. (docs gate.creole.gazetteer.SharedDefaultGazetteer
Hash Gazetteer  A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE (docs com.ontotext.gate.gazetteer.HashGazetteer
JAPE Transducer  A module for executing Jape grammars. (docs gate.creole.Transducer
ANNIE NE Transducer  ANNIE named entity grammar. (docs gate.creole.ANNIETransducer
ANNIE Sentence Splitter  ANNIE sentence splitter. (docs gate.creole.splitter.SentenceSplitter
RegEx Sentence Splitter  A sentence splitter based on regular expressions. (docs gate.creole.splitter.RegexSentenceSplitter
ANNIE POS Tagger  Mark Hepple's Brill-style POS tagger (docs gate.creole.POSTagger
ANNIE OrthoMatcher  ANNIE orthographical coreference component. (docs gate.creole.orthomatcher.OrthoMatcher
ANNIE Pronominal Coreferencer  Pronominal Coreference resolution component. (docs gate.creole.coref.Coreferencer
ANNIE Nominal Coreferencer  Nominal Coreference resolution component (docs gate.creole.coref.NominalCoref
Document Reset PR  Remove named annotation sets or reset the default annotation set (docs gate.creole.annotdelete.AnnotationDeletePR
Jape Viewer  A JAPE grammar file viewer (docs gate.gui.jape.JapeViewer
Gazetteer Editor  Gazetteer viewer and editor. (docs gate.gui.GazetteerEditor
Annotation_Merging
Annotation Merging PR  Merge Annotations from different annotators. (docs gate.merger.AnnotationMergingMain
Copy_Annots_Between_Docs
Copy Anns to Another Doc PR  Copy the annotations from one document to another document. (docs gate.copyAS2AnoDoc.CopyAS2AnoDocMain
Coref_Tools
Legacy Coref Data Writer  A simple PR that converts co-reference data from the Relations-based model to the legacy format (based on 'matches' annotation and document features).  gate.creole.coref.LegacyCorefDataWriter
OrthoRef  An orthographic coreferencer  gate.creole.coref.OrthoRef
Crowd_Sourcing
Entity Classification Job Builder  Build a CrowdFlower job asking users to select the right label for entities (docs gate.crowdsource.classification.EntityClassificationJobBuilder
Entity Classification Results Importer  Import judgments from a CrowdFlower job created by the Entity Classification Job Builder as GATE annotations. (docs gate.crowdsource.classification.EntityClassificationResultsImporter
Entity Annotation Job Builder  Build a CrowdFlower job asking users to annotate entities within a snippet of text (docs gate.crowdsource.ne.EntityAnnotationJobBuilder
Entity Annotation Results Importer  Import judgments from a CrowdFlower job created by the Entity Annotation Job Builder as GATE annotations. (docs gate.crowdsource.ne.EntityAnnotationResultsImporter
Developer_Tools
EDT Monitor  Warns whenever an AWT component is updated from anywhere other than the event dispatch thread (docs gate.creole.EDTMonitor
Show/Hide Resources  Show resources that would otherwise be hidden, e.g. resources created for internal use by other resources (docs gate.creole.Reveal
The Duplicator  Duplicate any resource with a right click menu option (docs gate.creole.TheDuplicator
DocumentNormalizer
Document normalizer  Normalize document content to remove "smart quotes" etc. (docs gate.creole.DocumentNormalizer
Format_CSV
CSV Corpus Populater  Populate a corpus from CSV files (docs gate.corpora.CSVImporter
Format_FastInfoset
Fast Infoset Document Format  Format parser for GATE XML stored in the binary FastInfoset format (docs gate.corpora.FastInfosetDocumentFormat
Fast Infoset Exporter  Export GATE documents to GATE XML stored in the binary FastInfoset format (docs gate.corpora.FastInfosetExporter
Format_MediaWiki
MediaWiki Document Format  Document format for parsing MediaWiki markup (docs gate.corpora.MediaWikiDocumentFormat
MediaWiki Corpus Populater  Populate a corpus from a MediaWiki XML dump (docs gate.corpora.MediaWikiPopulater
MediaWiki XML Document Format  Deprecated MediaWiki importer  gate.corpora.MediaWikiXMLDocumentFormat
Format_PubMed
GATE .cochrane.txt document format  Load this to allow the opening of Cochrane text documents, and choose the mime type "text/x-cochrane", or use the correct file extension.  gate.corpora.CochraneTextDocumentFormat
GATE .pubMed.txt document format  Load this to allow the opening of PubMed text documents, and choose the mime type "text/x-pubmed", or use the correct file extension.  gate.corpora.PubmedTextDocumentFormat
Gazetteer_LKB
Large KB Gazetteer  KIM KB based alias-lookup component (docs com.ontotext.kim.gate.KimGazetteer
Semantic Enrichment PR  The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories. (docs com.ontotext.kim.gate.SesameEnrichment
Gazetteer_Ontology_Based
Onto Root Gazetteer  An ontology lookup component (docs gate.clone.ql.OntoRootGaz
GENIA
GENIA Sentence Splitter  A processing resource that takes document and corpus parameters (docs gate.creole.genia.splitter.GENIASentenceSplitter
Groovy
Groovy support for GATE    gate.groovy.GroovySupport
Groovy scripting PR  Runs a Groovy script as a processing resource (docs gate.groovy.ScriptPR
Scriptable Controller  A controller whose execution strategy is controlled by a Groovy script (docs gate.groovy.ScriptableController
Control Script  Editor for the Groovy script controlling a scriptable controller  gate.groovy.gui.ControllerScriptEditor
Script Editor  Editor for the Groovy script behind this PR  gate.groovy.gui.ScriptPREditor
Information_Retrieval
SearchPR  Provides IR functionality. (docs gate.creole.ir.SearchPR
Search Results  Viewer for IR search results  gate.gui.SearchPRViewer
Corpus Indexing Support    gate.creole.ir.CorpusIndexingSupport
Lucene IR Engine    gate.creole.ir.lucene.LuceneIREngine
Inter_Annotator_Agreement
IAA Computation PR  Compute inter-annotator agreement (IAA). (docs gate.iaaplugin.IaaMain
JAPE_Plus
JAPE-Plus Viewer  A JAPE grammar file viewer (docs gate.gui.jape.plus.Viewer
JAPE-Plus Transducer  An optimised, JAPE-compatible transducer.  gate.jape.plus.Transducer
Keyphrase_Extraction_Algorithm
KEA Keyphrase Extractor  A Keyphrase Extractor by Eibe Frank. (docs gate.creole.kea.Kea
KEA Corpus Importer  Imports a KEA-style corpus into GATE  gate.creole.kea.CorpusImporter
Language_Identification
TextCat Fingerprint Generator  Generate language fingerprints for use with the TextCat Language Identification PR (docs org.knallgrau.utils.textcat.FingerprintGenerator
TextCat Language Identification  Recognizes the document language using TextCat (docs org.knallgrau.utils.textcat.LanguageIdentifier
Lang_Arabic
Arabic Gazetteer Collector    arabic.ArabicGazCollector
Arabic Gazetteer  A list lookup component. (docs arabic.ArabicGazetteer
Arabic IE System    arabic.ArabicIE
Arabic Infered Gazetteer  A list lookup component. (docs arabic.ArabicInferedGazetteer
Arabic OrthoMatcher  ANNIE orthographical coreference component. (docs arabic.ArabicOrthoMatcher
Arabic Tokeniser  A customisable English tokeniser. (docs arabic.ArabicTokeniser
Arabic Main Grammar  A module for executing Jape grammars. (docs arabic.ArabicTransducer
Lang_Bulgarian
BulStem  This plugin is an implementation of the BulStem stemmer algorithm for Bulgarian developed by Preslav Nakov. (docs gate.bulstem.BulStemPR
Lang_Cebuano
Cebuano POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Cebuano Gazetteer  A list lookup component. (docs cebuano.CebuanoGazetteer
Cebuano Gazetteer Tokeniser  A list lookup component. (docs cebuano.CebuanoGazetteerTokeniser
Cebuano IE System    cebuano.CebuanoIE
Cebuano Tokeniser  A customisable English tokeniser. (docs cebuano.CebuanoTokeniser
Cebuano Transducer  A module for executing Jape grammars. (docs cebuano.CebuanoTransducer
Cebuano Transducer Postprocessor  A module for executing Jape grammars. (docs cebuano.CebuanoTransducerPost
Lang_Chinese
Chinese Segmenter PR  Segment the Chinese text into words, based on the PAUM learning algorithm. (docs gate.chineseSeg.ChineseSegMain
Chinese IE System    chinese.ChineseIE
Lang_French
French IE System    french.FrenchIE
Lang_German
German IE System    german.GermanIE
Lang_Hindi
Hindi Tokeniser  A customisable Hindi tokeniser.  hindi.HindiTokeniser
Hindi Gazetteer  A list lookup component.  hindi.HindiGazetteer
Hindi Splitter  A Sentence Splitter.  hindi.HindiSplitter
Hindi Tokeniser Gazetteer  A list lookup component.  hindi.HindiTokeniserGazetteer
Hindi Main Grammar  A module for executing Jape grammars  hindi.HindiTransducer
Hindi Tokeniser Postprocessor  A module for executing Jape grammars  hindi.HindiTokeniserPostprocessor
Hindi OrthoMatcher  Hindi Orthomatcher  hindi.HindiOrthoMatcher
Hindi POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Lang_Romanian
Romanian Tokeniser  A customisable Romanian tokeniser.  romanian.RomanianTokeniser
Romanian Gazetteer  A list lookup component.  romanian.RomanianGazetteer
Romanian Transducer  A module for executing Jape grammars  romanian.RomanianTransducer
Romanian IE System    romanian.RomanianIE
Lang_Russian
RussIE  Basic version of the RussIE application (docs com.ontotext.russie.apps.RussIE
RussIE + Inflectional Gazetteer  RussIE application with inflexional gazetteer (docs com.ontotext.russie.apps.RussIEInflex
RussIE + OrthoMatcher  RussIE application with orthomatcher (docs com.ontotext.russie.apps.RussIEOrtho
RussIE + Inflectional Gazetteer & OrthoMatcher  RussIE application with orthomatcher and inflexional gazetteer (docs com.ontotext.russie.apps.RussIEOrthoInflex
Inflectional gazetteer  Gazetteer with support for inflectional morphology (docs com.ontotext.russie.gazetteer.InflectionalGazetteer
Russian Gazetteer  Customised version of the hash gazetteer (docs com.ontotext.russie.gazetteer.RussGazetteer
POS Mapper  Map complex Russian morphology tags into simpler POS categories (docs com.ontotext.russie.morph.POSMapper
Russian POS Tagger  Part-of-speech tagger for Russian (docs com.ontotext.russie.morph.POSTagger
Learning
Batch Learning PR  Supports training, application and evaluation of machine learning models for NLP tasks (docs gate.learning.LearningAPIMain
LingPipe
LingPipe Tokenizer PR  Provides a LingPipe tokenizer. (docs gate.lingpipe.TokenizerPR
LingPipe NER PR  LingPipe Named Entity Recognizer (docs gate.lingpipe.NamedEntityRecognizerPR
LingPipe Language Identifier PR  GATE PR for language identification using LingPipe (docs gate.lingpipe.LanguageIdentifierPR
LingPipe POS Tagger PR  Provides a LingPipe part of speech tagger. (docs gate.lingpipe.POSTaggerPR
LingPipe Sentence Splitter PR  Provides an interface to LingPipe sentence splitter API. (docs gate.lingpipe.SentenceSplitterPR
Machine_Learning
Machine Learning PR  Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs gate.creole.ml.MachineLearningPR
Ontology
ConnectSesameOntology  Connect to a repository containing an ontology (docs gate.creole.ontology.impl.sesame.ConnectSesameOntology
CreateSesameOntology  Create an ontology from a Sesame configuration file for a repository (docs gate.creole.ontology.impl.sesame.CreateSesameOntology
OWLIM Ontology  Ontology created as a temporary OWLIM3 in-memory repository (docs gate.creole.ontology.impl.sesame.OWLIMOntology
OWLIM Ontology DEPRECATED  Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs gate.creole.ontology.owlim.OWLIMOntologyLR
Ontology_BDM_Computation
BDM Computation PR  Compute BDM score for each pair of concepts in the given ontology. (docs gate.bdmComp.BDMCompMain
Ontology_Tools
OntoGazetteer  A list lookup component based on mapping between ontology classes and gazetteer lists. (docs gate.creole.gazetteer.OntoGazetteerImpl
GATE Ontology Editor  Ontology editing tool. (docs gate.gui.ontology.OntologyEditor
OAT  Ontology Annotation Tool. (docs gate.creole.ontology.ocat.OntologyViewer
RAT-C  Relation Annotation Tool Class view. (docs gate.gui.docview.OntologyClassView
RAT-I  Relation Annotation Tool Instance view. (docs gate.gui.docview.OntologyInstanceView
GAZE  Gazetteer viewer and editor (docs com.ontotext.gate.vr.Gaze
OpenNLP
OpenNLP NER  NER PR using a set of OpenNLP maxent models (docs gate.opennlp.OpenNLPNameFin
OpenNLP Chunker  Chunker using an OpenNLP maxent model (docs gate.opennlp.OpenNlpChunker
OpenNLP POS Tagger  POS Tagger using an OpenNLP maxent model (docs gate.opennlp.OpenNlpPOS
OpenNLP Parser  Syntactic parser from Apache OpenNLP (docs gate.opennlp.OpenNlpParser
OpenNLP Sentence Splitter  Sentence splitter using an OpenNLP maxent model (docs gate.opennlp.OpenNlpSentenceSplit
OpenNLP Tokenizer  Tokenizer using an OpenNLP maxent model (docs gate.opennlp.OpenNlpTokenizer
Parser_Minipar
Minipar Wrapper  MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs minipar.Minipar
Parser_RASP
RASP2 Tokenizer  RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs com.digitalpebble.rasp2.token.RASPTokenizer
RASP POS Converter  Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs com.digitalpebble.rasp2.tagger.C2Transducer
RASP2 POS Tagger  RASP part-of-speech tagger, creating WordForm annotations (docs com.digitalpebble.rasp2.tagger.PosTagger
RASP2 Morphological Analyser  RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs com.digitalpebble.rasp2.morph.MorphoAnnotator
RASP2 Parser  RASP dependency parser (docs com.digitalpebble.rasp2.parser.ParserAnnotator
Parser_Stanford
StanfordParser  Stanford parser wrapper (docs gate.stanford.Parser
English Dependency Parser    gate.stanford.apps.EnglishDependencies
English POS Tagger and Dependency Parser    gate.stanford.apps.EnglishPOSDependencies
Parser_SUPPLE
SUPPLE Parser  SUPPLE bottom-up chart parser. (docs shef.nlp.supple.SUPPLE
Schema_Annotation_Editor
Schema Annotations Editor  An annotation editor restricted by schemas. (docs gate.gui.annedit.SchemaAnnotationEditor
Schema_Tools
Schema Enforcer  Produces an annotation set whose content is restricted by the specified set of schemas (docs gate.creole.schema.SchemaEnforcer
Simple Schema Viewer  A Simple Annotation Schema Viewer  gate.gui.schema.SimpleSchemaViewer
Stemmer_Snowball
Stemmer PR  Wrapper for the Snowball stemmer. (docs stemmer.SnowballStemmer
Tagger_Abner
ABNER Tagger  GATE wrapper over ABNER (docs gate.abner.AbnerTagger
Tagger_Boilerpipe
Boilerpipe Content Detection  Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate (docs gate.creole.boilerpipe.BoilerPipe
Tagger_Chemistry
Chemistry Tagger  A tagger for chemical names. (docs mark.chemistry.Tagger
Tagger_DateNormalizer
Date Annotation Normalizer  Provides normalized values for all existing date annotations (docs gate.creole.dates.DateAnnotationNormalizer
Date Normalizer  Provides normalized values for all known dates (docs gate.creole.dates.DateNormalizer
Tagger_Framework
GenericTagger  The Generic Tagger is Generic! (docs gate.taggerframework.GenericTagger
Tagger_Lupedia
Lupedia Service PR  Runs a Lupedia annotation service on a GATE document (docs gate.lupedia.LupediaServicePR
Tagger_Measurements
ANNIE+Measurements    gate.creole.measurements.ANNIEMeasurements
Measurements    gate.creole.measurements.MeasurementsApplication
Measurement Tagger  A measurement tagger based upon GNU Units  gate.creole.measurements.MeasurementsTagger
Tagger_MetaMap
MetaMap Annotator  This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port (docs gate.metamap.MetaMapPR
Tagger_MutationFinder
MutationFinder  GATE MutationFinder Wrapper (docs gate.creole.mutationfinder.MutationFinderPR
Tagger_NormaGene
NormaGene Tagger  A processing resource that takes document and corpus parameters  gate.creole.normagene.NormaGene
Tagger_NP_Chunking
Noun Phrase Chunker  Implementation of the Ramshaw and Marcus base noun phrase chunker (docs mark.chunking.GATEWrapper
Noun Phrase Chunker    mark.chunking.ChunkingApp
Tagger_Numbers
Numbers Tagger  Finds numbers (in both words and digits) and annotates them with their numeric value (docs gate.creole.numbers.NumbersTagger
Roman Numerals Tagger  Finds and annotates Roman numerals (docs gate.creole.numbers.RomanNumeralsTagger
Tagger_OpenCalais
OpenCalais Tagger  An OpenCalais based semantic annotator (docs gate.opencalais.OpenCalais
Tagger_PennBio
Penn BioTagger    gate.creole.pennbio.BioTagger
Penn BioTagger: Genes  Penn BioTagger for Genes (docs gate.creole.pennbio.GeneTagger
Penn BioTagger: Malignancy  Penn BioTagger for malignancy types (docs gate.creole.pennbio.MalignancyTagger
Penn BioTokenizer  Tokenizer for biomedical text (docs gate.creole.pennbio.Tokenizer
Penn BioTagger: Variation  Penn BioTagger for variations (docs gate.creole.pennbio.VariationTagger
Tagger_Stanford
Stanford POS Tagger  Stanford Part-of-Speech Tagger (docs gate.stanford.Tagger
Tagger_TextRazor
TextRazor Service PR  Runs the TextRazor annotation service (http://textrazor.com) on a GATE document (docs gate.textrazor.TextRazorServicePR
Tagger_Zemanta
Zemanta Service PR  Runs a Zemanta annotation service on a GATE document (docs gate.zemanta.ZemantaServicePR
Teamware_Tools
QA Summariser for Teamware  The Quality Assurance PR for Teamware (docs gate.qa.QAForTeamwarePR
TermRaider
PMI Example (English)    gate.termraider.PMIExample
TermRaider English Term Extraction    gate.termraider.TermRaiderEnglish
Termbank Score Copier  Copy scores from Termbanks back to their source annotations (docs gate.termraider.apply.TermScoreCopier
AnnotationTermbank  TermRaider Termbank derived from document annotations (docs gate.termraider.bank.AnnotationTermbank
DocumentFrequencyBank  Document frequency counter derived from corpora and other DFBs (docs gate.termraider.bank.DocumentFrequencyBank
HyponymyTermbank  TermRaider Termbank derived from head/string hyponymy (docs gate.termraider.bank.HyponymyTermbank
PMI Bank  Pointwise Mutual Information from corpora (docs gate.termraider.bank.PMIBank
TfIdfTermbank  TermRaider Termbank derived from vectors in document features (docs gate.termraider.bank.TfIdfTermbank
Pairbank Viewer  Viewer for the TermRaider Pairbank (docs gate.termraider.gui.PairbankViewer
Termbank Viewer  Viewer for the TermRaider Termbank (docs gate.termraider.gui.TermbankViewer
Text_Categorization
TextCategorizationPR  A processing resource that takes document and corpus parameters  gate.ml.categorization.TextCategorizationPR
Tools
Gazetteer List Collector  Gazetteer lists collector. (docs gate.creole.GazetteerListsCollector
ANNIE VP Chunker  ANNIE VP Chunker component. (docs gate.creole.VPChunker
Annotation Set Transfer  Annotation set transfer component. (docs gate.creole.annotransfer.AnnotationSetTransfer
Flexible Exporter  Exports a document with GATE annotations to its original format. (docs gate.creole.dumpingPR.DumpingPR
GATE Morphological analyser  Morphological Analyzer for the English Language. (docs gate.creole.morph.Morph
Flexible Gazetteer  A more flexible list lookup component. (docs gate.creole.gazetteer.FlexibleGazetteer
Syntax tree viewer  Viewer for syntax trees generated by a parser. (docs gate.gui.SyntaxTreeViewer
Configurable Exporter  Allows annotations to be exported according to a specified format.  gate.configurableexporter.ConfigurableExporter
Quality Assurance PR  The Quality Assurance PR provides functionality from the Corpus QA Tool in GATE Developer  gate.qa.QualityAssurancePR
Twitter
GATE JSON Tweet Document Format  Format parser for Twitter JSON files (docs gate.corpora.JSONTweetFormat
Twitter Corpus Populator  Populate a corpus from Twitter JSON containing multiple Tweets (docs gate.corpora.twitter.Population
Hashtag Tokenizer  Tokenizes Multi-Word Hashtags (docs gate.twitter.HashtagTokenizer
Tweet Normaliser  Normalise texts in tweets (convert spelling mistakes, colloquialisms, typing variations and so on into standard English) (docs gate.twitter.Normaliser
TwitIE (EN)    gate.twitter.apps.TwitIEEN
Twitter POS Tagger (EN)  Stanford POS tagger trained on Tweets (docs gate.twitter.pos.POSTaggerEN
Twitter Tokenizer (EN)  Tokenizer tuned for Tweets (docs gate.twitter.tokenizer.TokenizerEN
UIMA
UIMA Analysis Engine  Wrapper for a Text Analysis Engine from UIMA. (docs gate.uima.AnalysisEnginePR
Web_Crawler_Websphinx
Crawler PR  GATE implementation of the Websphinx crawling API (docs crawl.CrawlPR
WordNet
WordNet 1.6  Princeton WordNet 1.6. (docs gate.wordnet.IndexFileWordNetImpl
WordNet  WordNet (docs gate.wordnet.JWNLWordNetImpl
WordNet Viewer  WordNet viewer  gate.gui.wordnet.WordNetViewer

Other contributed plugins

OrganismTagger
website download
OrganismTagger The OrganismTagger is a hybrid rule-based/machine-learning system that extracts organism mentions from the biomedical literature, normalizes them to their scientific name, and provides grounding to the NCBI Taxonomy database.  
Multi-lingual Noun Phrase Extractor (MuNPEx)
website download
Munpex MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). en-np_main.jape
Durm German lemmatizer
website download
Durm German lemmatizer The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns.  
XCES tools
website download
Tools to handle documents conforming to the XML Corpus Encoding Standard (XCES) format, used by the American National Corpus. XCES is a way of encoding texts with standoff markup in XML.
ANC Document An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. org.xces.gate.XCESDocument
ANC Load Standoff Loads standoff annotations into an existing document. org.xces.creole.LoadStandoff
ANC Save Content Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. org.xces.creole.SaveContent
ANC Save Standoff Saves annotations from a Document to an XCES-compliant standoff markup file. org.xces.creole.SaveStandoff
Sen wrapper (Japanese morphological analyser)
website (in Japanese)
Sen is a morphological analyser for Japanese. This is a wrapper to allow it to be used from GATE; you must also install Sen itself. See the documentation (in Japanese) for details.
Sen Wrapper Morphological analyser for Japanese jp.co.ditlab.jgate.SenWrapper
Russian morph tagger
website download
Provides a GATE wrapper for the MyStem Russian morphological parser. It runs the native analyzer, parses its output, and assigns morphological features as GATE annotations on the document.
Russian MorphTagger MorphTagger for the Russian language, based on Yandex's MyStem parser ru.itbrains.gate.morph.MorphTagger
String Annotation Plugin
website download
Processing resources for directly annotating the string content of a document.
Extended List Gazetteer Extended version of the GATE Default List Gazetteer. In addition to the features of the original, built-in version of the List Gazetteer, this version provides features for more powerful matching of partial words and annotating prefixes and suffixes as well as more versatile handling of word boundaries and whitespace. at.ofai.gate.extendedgazetteer.ExtendedGazetteer
Simple Regexp Annotator Use rules based on Java regular expressions to annotate the document content. at.ofai.gate.regexpannotator.SimpleRegexpAnnotator
AppDoc — GATE Application Documentation
website download
A plugin that generates documentation from your application/pipeline/gapp files in various formats. In addition it provides new visual resources in the GATE Developer GUI to add author, version and documentation comments to pipelines and processing resources.
AppDoc Visual resource for adding author/version/comment to pipelines and processing resources at.ofai.gate.appdoc.AppDoc
AppDocGen Visual resource for selecting a documentation template and generating documentation files at.ofai.gate.appdoc.AppDocGen
VirtualCorpus — Directory- and JDBC Corpus LRs
website download
A plugin that provides two new corpus language resources, DirectoryCorpus for directly using files in a directory through a corpus LR and JDBCCorpus for directly using documents stored in a field of a JDBC database table.
DirectoryCorpus Language resource for accessing GATE XML files in a directory directly via a corpus resource at.ofai.gate.virtualcorpus.DirectoryCorpus
JDBCCorpus Language resource for accessing GATE XML documents stored in a field of a JDBC database table directly via a corpus resource at.ofai.gate.virtualcorpus.JDBCCorpus
BWP Gazetteer
website download
This plugin provides an approximate gazetteer for GATE, based on Levenshtein's Edit Distance for strings. Its goal is to handle texts with noise and errors, in which GATE's default gazetteers may have difficulties.
BWP Gazetteer Extended version of the transducer-based List Gazetteer  
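
To illustrate the idea behind an approximate gazetteer based on Levenshtein's edit distance, here is a minimal Java sketch. It is not the BWP Gazetteer's actual code: the class name `EditDistance`, the helper `approxMatch`, and the `maxEdits` threshold are hypothetical names chosen for this example, and the plugin's real matching strategy may differ.

```java
// Illustrative sketch only; not taken from the BWP Gazetteer plugin.
final class EditDistance {
    private EditDistance() {}

    /** Levenshtein distance computed with the standard two-row dynamic programme. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows so that prev always holds the last completed row.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Hypothetical helper: does a noisy token match a list entry within maxEdits edits? */
    static boolean approxMatch(String token, String entry, int maxEdits) {
        return levenshtein(token.toLowerCase(), entry.toLowerCase()) <= maxEdits;
    }

    public static void main(String[] args) {
        System.out.println(levenshtein("kitten", "sitting"));      // prints 3
        System.out.println(approxMatch("Shefield", "Sheffield", 1)); // prints true
    }
}
```

With a threshold of one edit, a misspelt token such as "Shefield" would still match the list entry "Sheffield", which an exact-match gazetteer would miss.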
Apolda
website download
Annotates documents like a gazetteer, but takes the terms from OWL annotation properties in an ontology, rather than from a separate list of terms.
Apolda Ontology Annotator Ontology-based lookup taking terms from properties in the ontology. telin.Apolda
Reported Speech Tagger
website download
Automatically detects and tags reported speech constructs, in particular the source, reporting verb and content.
Reporting Verb marker JAPE transducer which tags reporting verbs  
Reported Speech finder JAPE transducer which tags reported speech  
Keyphrase Extraction Module
website download
Keyphrase extraction and language identification.
FrequencyAnalyser    
KeywordAnalyser    
LanguageIdentification identifies the language of a document using character n-grams  
POSTagMapper    
SimpleNounChunking    
StopwordMarker    
