This page lists some of the plugins that are currently available with GATE.
For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".
To submit a plugin, please contact us via the gate-users mailing list.
Plugins included in the GATE distribution
- Alignment
- ANNIE
- Annotation_Merging
- Copy_Annots_Between_Docs
- Gazetteer_LKB
- Gazetteer_Ontology_Based
- Groovy
- Information_Retrieval
- Inter_Annotator_Agreement
- Jape_Compiler
- Keyphrase_Extraction_Algorithm
- Language_Identification
- Lang_Arabic
- Lang_Cebuano
- Lang_Chinese
- Lang_Hindi
- Lang_Romanian
- Learning
- LingPipe
- Machine_Learning
- Obsolete/apf-exporter
- Obsolete/document-editor
- Obsolete/html-documentformat
- Obsolete/Montreal_Transducer
- Obsolete/rasp
- Ontology
- Ontology_BDM_Computation
- Ontology_OWLIM2
- Ontology_Tools
- OpenNLP
- Parser_Minipar
- Parser_RASP
- Parser_Stanford
- Parser_SUPPLE
- Schema_Annotation_Editor
- Stemmer_Snowball
- Tagger_Abner
- Tagger_Chemistry
- Tagger_Framework
- Tagger_NP_Chunking
- Tagger_OpenCalais
- Tagger_TreeTagger
- Tools
- UIMA
- Web_Crawler_Websphinx
- Web_Search_Google
- Web_Search_Yahoo
- WordNet
| Alignment | ||
|---|---|---|
| Compound Document | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentImpl |
| Compound Document From Xml | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentFromXml |
| Compound Document Editor | Editor for compound documents. (docs) | gate.compound.gui.CompoundDocumentEditor |
| GATE Composite document | GATE Composite document. (docs) | gate.composite.impl.CompositeDocumentImpl |
| Alignment Editor | Alignment editor. (docs) | gate.alignment.gui.AlignmentEditor |
| Switch Member PR | Sets the focus of a compound document to a specified member document. (docs) | gate.compound.impl.SwitchMemberPR |
| Delete Member PR | Deletes one member document from a compound doc. (docs) | gate.compound.impl.DeleteMemberPR |
| Combine Members PR | Combines documents in a composite document. (docs) | gate.composite.impl.CombineMembersPR |
| Segment Processing PR | Processes individual segments as separate documents (docs) | gate.composite.impl.SegmentProcessingPR |
| ANNIE | ||
| Annotation Schema | An annotation type and its features. (docs) | gate.creole.AnnotationSchema |
| GATE Unicode Tokeniser | A customisable Unicode tokeniser. (docs) | gate.creole.tokeniser.SimpleTokeniser |
| ANNIE English Tokeniser | A customisable English tokeniser. (docs) | gate.creole.tokeniser.DefaultTokeniser |
| ANNIE Gazetteer | A list lookup component. (docs) | gate.creole.gazetteer.DefaultGazetteer |
| Sharable Gazetteer | A sharable list lookup component. (docs) | gate.creole.gazetteer.SharedDefaultGazetteer |
| Hash Gazetteer | A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE (docs) | com.ontotext.gate.gazetteer.HashGazetteer |
| Jape Transducer | A module for executing Jape grammars. (docs) | gate.creole.Transducer |
| ANNIE NE Transducer | ANNIE named entity grammar. (docs) | gate.creole.ANNIETransducer |
| ANNIE Sentence Splitter | ANNIE sentence splitter. (docs) | gate.creole.splitter.SentenceSplitter |
| RegEx Sentence Splitter | A sentence splitter based on regular expressions. (docs) | gate.creole.splitter.RegexSentenceSplitter |
| ANNIE POS Tagger | Mark Hepple's Brill-style POS tagger. (docs) | gate.creole.POSTagger |
| ANNIE OrthoMatcher | ANNIE orthographical coreference component. (docs) | gate.creole.orthomatcher.OrthoMatcher |
| ANNIE Pronominal Coreferencer | Pronominal Coreference resolution component. (docs) | gate.creole.coref.Coreferencer |
| ANNIE Nominal Coreferencer | Nominal Coreference resolution component (docs) | gate.creole.coref.NominalCoref |
| Document Reset PR | Document cleaner. (docs) | gate.creole.annotdelete.AnnotationDeletePR |
| Jape Viewer | A JAPE grammar file viewer. (docs) | gate.gui.jape.JapeViewer |
| Gazetteer Editor | Gazetteer viewer and editor. (docs) | gate.gui.GazetteerEditor |
| Gaze | HashGazetteer viewer and editor. (docs) | com.ontotext.gate.vr.Gaze |
| Annotation_Merging | ||
| Annotation Merging PR | Merge Annotations from different annotators. (docs) | gate.merger.AnnotationMergingMain |
| Copy_Annots_Between_Docs | ||
| Copy Anns to Another Doc PR | Copy the annotations from one document to another document. (docs) | gate.copyAS2AnoDoc.CopyAS2AnoDocMain |
| Gazetteer_LKB | ||
| Large KB Gazetteer | | com.ontotext.kim.gate.KimGazetteer |
| Semantic Annotation Enrichment | | com.ontotext.kim.gate.SesameEnrichment |
| Gazetteer_Ontology_Based | ||
| Onto Root Gazetteer | An ontology lookup component (docs) | gate.clone.ql.OntoRootGaz |
| Fake Sentence Splitter | Used internally by Onto Root Gazetteer. It creates a 'fake' annotation of type 'Sentence' without analysing the text with a proper Sentence Splitter. This lets the POS Tagger work properly, as the input text is usually not a proper sentence (i.e. ontology resource name or label). 'Faking' sentence splitting also optimises processing, as Onto Root Gazetteer does not usually process multi-sentence text internally. | gate.clone.ql.FakeSentenceSplitter |
| Groovy | ||
| Groovy support for GATE | | gate.groovy.GroovySupport |
| Groovy scripting PR | Runs a Groovy script as a processing resource (docs) | gate.groovy.ScriptPR |
| Information_Retrieval | ||
| SearchPR | Provides IR functionality. (docs) | gate.creole.ir.SearchPR |
| Search Results | Viewer for IR search results | gate.gui.SearchPRViewer |
| Inter_Annotator_Agreement | ||
| IAA Computation PR | Compute inter-annotator agreement (IAA). (docs) | gate.iaaplugin.IaaMain |
| Jape_Compiler | ||
| Ontotext Japec Transducer | JAPE compiler. (docs) | com.ontotext.gate.japec.JapecTransducer |
| Keyphrase_Extraction_Algorithm | ||
| KEA Keyphrase Extractor | A Keyphrase Extractor by Eibe Frank. (docs) | gate.creole.kea.Kea |
| KEA Corpus Importer | Imports a KEA-style corpus into GATE | gate.creole.kea.CorpusImporter |
| Language_Identification | ||
| TextCat PR | Recognizes the document language using TextCat. Possible languages: german, english, french, spanish, italian, swedish, polish, dutch, norwegian, finnish, albanian, slovakian, slovenian, danish, hungarian. | org.knallgrau.utils.textcat.LanguageIdentifier |
| Lang_Arabic | ||
| Arabic Tokeniser | A customisable Arabic tokeniser. | arabic.ArabicTokeniser |
| Arabic Gazetteer | A list lookup component. | arabic.ArabicGazetteer |
| Arabic Infered Gazetteer | A list lookup component. | arabic.ArabicInferedGazetteer |
| Arabic Main Grammar | A module for executing Jape grammars | arabic.ArabicTransducer |
| Arabic OrthoMatcher | Arabic Orthomatcher | arabic.ArabicOrthoMatcher |
| Lang_Cebuano | ||
| Cebuano Tokeniser | A customisable Cebuano tokeniser. | cebuano.CebuanoTokeniser |
| Cebuano Gazetteer | A list lookup component. | cebuano.CebuanoGazetteer |
| Cebuano Gazetteer Tokeniser | A list lookup component. | cebuano.CebuanoGazetteerTokeniser |
| Cebuano Transducer | A module for executing Jape grammars | cebuano.CebuanoTransducer |
| Cebuano Transducer Postprocessor | A module for executing Jape grammars | cebuano.CebuanoTransducerPost |
| Cebuano POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
| Lang_Chinese | ||
| Chinese Segmenter PR | Segment the Chinese text into words, based on the PAUM learning algorithm. (docs) | gate.chineseSeg.ChineseSegMain |
| Lang_Hindi | ||
| Hindi Tokeniser | A customisable Hindi tokeniser. | hindi.HindiTokeniser |
| Hindi Gazetteer | A list lookup component. | hindi.HindiGazetteer |
| Hindi Splitter | A Sentence Splitter. | hindi.HindiSplitter |
| Hindi Tokeniser Gazetteer | A list lookup component. | hindi.HindiTokeniserGazetteer |
| Hindi Main Grammar | A module for executing Jape grammars | hindi.HindiTransducer |
| Hindi Tokeniser Postprocessor | A module for executing Jape grammars | hindi.HindiTokeniserPostprocessor |
| Hindi OrthoMatcher | Hindi Orthomatcher | hindi.HindiOrthoMatcher |
| Hindi POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
| Lang_Romanian | ||
| Romanian Tokeniser | A customisable Romanian tokeniser. | romanian.RomanianTokeniser |
| Romanian Gazetteer | A list lookup component. | romanian.RomanianGazetteer |
| Romanian Transducer | A module for executing Jape grammars | romanian.RomanianTransducer |
| Learning | ||
| Batch Learning PR | Supports training, application and evaluation of machine learning models for NLP tasks (docs) | gate.learning.LearningAPIMain |
| LingPipe | ||
| LingPipe Tokenizer PR | Provides a LingPipe tokenizer. (docs) | gate.lingpipe.TokenizerPR |
| LingPipe NER PR | LingPipe Named Entity Recognizer (docs) | gate.lingpipe.NamedEntityRecognizerPR |
| LingPipe Language Identifier PR | LingPipe Identifier PR (docs) | gate.lingpipe.LanguageIdentifierPR |
| LingPipe POS Tagger PR | Provides a LingPipe part of speech tagger. (docs) | gate.lingpipe.POSTaggerPR |
| LingPipe Sentence Splitter PR | Provides an interface to the LingPipe sentence splitter API. (docs) | gate.lingpipe.SentenceSplitterPR |
| Machine_Learning | ||
| Machine Learning PR | Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs) | gate.creole.ml.MachineLearningPR |
| Obsolete/apf-exporter | ||
| GATE APF exporter | An APF exporter. | gate.creole.APFormatExporter |
| Obsolete/document-editor | ||
| OLD Document Editor | Old editor for documents, superseded by gate.gui.docview.* | gate.gui.DocumentEditor |
| Unrestricted annotation editor | | gate.gui.UnrestrictedAnnotationEditor |
| Schema annotation editor | | gate.gui.SchemaAnnotationEditor |
| Features Editor | Old editor for feature values of any resource. Superseded by the small feature editor in the bottom-left corner of the GUI. | gate.gui.FeaturesEditor |
| Obsolete/html-documentformat | ||
| Old GATE HTML Document Format | Old HTML document parser, based on the Swing parser that drives JEditorPane. | gate.corpora.HtmlDocumentFormat |
| Obsolete/Montreal_Transducer | ||
| Montreal Transducer | A module for executing augmented Jape grammars. Many of its features have now been subsumed into the standard JAPE implementation. | ca.umontreal.iro.rali.gate.creole.MtlTransducer |
| Obsolete/rasp | ||
| RASP Parser | RASP (Robust Accurate Statistical Parsing) is a robust parsing system for English. | gate.rasp.rasp |
| Ontology | ||
| ConnectSesameOntology | Connect to a repository containing an ontology (docs) | gate.creole.ontology.impl.sesame.ConnectSesameOntology |
| CreateSesameOntology | Create an ontology from a Sesame configuration file for a repository (docs) | gate.creole.ontology.impl.sesame.CreateSesameOntology |
| OWLIM Ontology | Ontology created as a temporary OWLIM3 in-memory repository (docs) | gate.creole.ontology.impl.sesame.OWLIMOntology |
| OWLIM Ontology DEPRECATED | Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs) | gate.creole.ontology.owlim.OWLIMOntologyLR |
| Ontology_BDM_Computation | ||
| BDM Computation PR | Compute BDM score for each pair of concepts in the given ontology. (docs) | gate.bdmComp.BDMCompMain |
| Ontology_OWLIM2 | ||
| OWLIM2 Ontology LR | Ontology based on Sesame1/OWLIM2. Deprecated but kept for backwards compatibility with the pre-GATE 5.1 ontology implementation. (docs) | gate.creole.ontology.owlim.OWLIMOntologyLR |
| Ontology_Tools | ||
| OntoGazetteer | A list lookup component based on mapping between ontology classes and gazetteer lists. (docs) | gate.creole.gazetteer.OntoGazetteerImpl |
| GATE Ontology Editor | Ontology editing tool. (docs) | gate.gui.ontology.OntologyEditor |
| OAT | Ontology Annotation Tool. (docs) | gate.creole.ontology.ocat.OntologyViewer |
| OpenNLP | ||
| OpenNlpSentenceSplit | GATE wrapper of the OpenNlp Sentence Splitter. (docs) | gate.opennlp.OpenNlpSentenceSplit |
| OpenNlpTokenizer | Implementation of the OpenNlp Token Splitter. (docs) | gate.opennlp.OpenNlpTokenizer |
| OpenNlpPOS | Implementation of the OpenNlp POS Tagger. (docs) | gate.opennlp.OpenNlpPOS |
| OpenNlpChunker | Implementation of the OpenNlp Chunker. (docs) | gate.opennlp.OpenNlpChunker |
| OpenNlpNameFinder | Implementation of the OpenNlp Name Finder. (docs) | gate.opennlp.OpenNLPNameFin |
| Parser_Minipar | ||
| Minipar Wrapper | MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs) | minipar.Minipar |
| Parser_RASP | ||
| RASP2 Tokenizer | RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs) | com.digitalpebble.rasp2.token.RASPTokenizer |
| RASP POS Converter | Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs) | com.digitalpebble.rasp2.tagger.C2Transducer |
| RASP2 POS Tagger | RASP part-of-speech tagger, creating WordForm annotations (docs) | com.digitalpebble.rasp2.tagger.PosTagger |
| RASP2 Morphological Analyser | RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs) | com.digitalpebble.rasp2.morph.MorphoAnnotator |
| RASP2 Parser | RASP dependency parser (docs) | com.digitalpebble.rasp2.parser.ParserAnnotator |
| Parser_Stanford | ||
| StanfordParser | Stanford parser wrapper (docs) | gate.stanford.Parser |
| Parser_SUPPLE | ||
| SUPPLE Parser | SUPPLE bottom-up chart parser. (docs) | shef.nlp.supple.SUPPLE |
| Schema_Annotation_Editor | ||
| Schema Annotations Editor | An annotation editor restricted by schemas. (docs) | gate.gui.annedit.SchemaAnnotationEditor |
| Stemmer_Snowball | ||
| Stemmer PR | Wrapper for the Snowball stemmer. (docs) | stemmer.SnowballStemmer |
| Tagger_Abner | ||
| AbnerTagger | GATE wrapper over Abner. (docs) | gate.abner.AbnerTagger |
| Tagger_Chemistry | ||
| Chemistry Tagger | A tagger for chemical names. (docs) | mark.chemistry.Tagger |
| Tagger_Framework | ||
| GenericTagger | The Generic Tagger is Generic! (docs) | gate.taggerframework.GenericTagger |
| Tagger_NP_Chunking | ||
| Noun Phrase Chunker | Implementation of the Ramshaw and Marcus base noun phrase chunker (docs) | mark.chunking.GATEWrapper |
| Tagger_OpenCalais | ||
| OpenCalais Tagger | An OpenCalais based semantic annotator (docs) | gate.opencalais.OpenCalais |
| Tagger_TreeTagger | ||
| TreeTagger | The TreeTagger is a language-independent part-of-speech tagger, which currently supports English, French, German, and Spanish. (docs) | gate.treetagger.TreeTagger |
| Tools | ||
| Gazetteer List Collector | Gazetteer lists collector. (docs) | gate.creole.GazetteerListsCollector |
| ANNIE VP Chunker | ANNIE VP Chunker component. (docs) | gate.creole.VPChunker |
| Annotation Set Transfer | Annotation set transfer component. (docs) | gate.creole.annotransfer.AnnotationSetTransfer |
| Flexible Exporter | Exports a document with GATE annotations to its original format. (docs) | gate.creole.dumpingPR.DumpingPR |
| GATE Morphological analyser | Morphological Analyzer for the English Language. (docs) | gate.creole.morph.Morph |
| Flexible Gazetteer | A more flexible list lookup component. (docs) | gate.creole.gazetteer.FlexibleGazetteer |
| Syntax tree viewer | Viewer for syntax trees generated by a parser. (docs) | gate.gui.SyntaxTreeViewer |
| UIMA | ||
| UIMA Analysis Engine | Wrapper for a Text Analysis Engine from UIMA. (docs) | gate.uima.AnalysisEnginePR |
| Web_Crawler_Websphinx | ||
| CrawlerPR | Provides an interface to the WebSPHINX API. (docs) | crawl.CrawlPR |
| Web_Search_Google | ||
| GooglePR | Provides an interface to the Google API. (docs) | google.GooglePR |
| Web_Search_Yahoo | ||
| YahooPR | Provides an interface to the Yahoo API. (docs) | gate.yahoo.YahooPR |
| WordNet | ||
| WordNet 1.6 | Princeton WordNet 1.6. (docs) | gate.wordnet.IndexFileWordNetImpl |
| WordNet 1.6 Viewer | WordNet viewer | gate.gui.wordnet.WordNetViewer |
Other contributed plugins
- Multi-lingual Noun Phrase Extractor (MuNPEx)
- Durm German lemmatizer
- XCES tools
- Sen wrapper (Japanese morphological analyser)
- Russian morph tagger
- OFAI List Gazetteer
- BWP Gazetteer
- Apolda
- Reported Speech Tagger
- Keyphrase extraction module from SmILE
| Multi-lingual Noun Phrase Extractor (MuNPEx) (website, download) | ||
|---|---|---|
| Munpex | MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). | en-np_main.jape |
| Durm German lemmatizer (website, download) | ||
| Durm German lemmatizer | The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns. | |
| XCES tools (website, download) | ||
| ANC Document | An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. | org.xces.gate.XCESDocument |
| ANC Load Standoff | Loads standoff annotations into an existing document. | org.xces.creole.LoadStandoff |
| ANC Save Content | Saves just the text content of a document to a file. This works for any document; it is not specific to ANC/XCES documents. | org.xces.creole.SaveContent |
| ANC Save Standoff | Saves annotations from a Document to an XCES-compliant standoff markup file. | org.xces.creole.SaveStandoff |
| Sen wrapper (Japanese morphological analyser) (website, in Japanese) | ||
| Sen Wrapper | Morphological analyser for Japanese | jp.co.ditlab.jgate.SenWrapper |
| Russian morph tagger (website, download) | ||
| Russian MorphTagger | MorphTagger for the Russian language, based on Yandex's MyStem parser | ru.itbrains.gate.morph.MorphTagger |
| OFAI List Gazetteer (website, download) | ||
| OFAI List Gazetteer | Extended version of the transducer-based List Gazetteer | at.ofai.gate.ListGazetteer |
| BWP Gazetteer (website, download) | ||
| BWP Gazetteer | Extended version of the transducer-based List Gazetteer | |
| Apolda (website, download) | ||
| Apolda Ontology Annotator | Ontology-based lookup taking terms from properties in the ontology. | telin.Apolda |
| Reported Speech Tagger (website, download) | ||
| Reporting Verb marker | JAPE transducer which tags reporting verbs | |
| Reported Speech finder | JAPE transducer which tags reported speech | |
| Keyphrase Extraction Module (website, download) | ||
| FrequencyAnalyser | | |
| KeywordAnalyser | | |
| LanguageIdentification | Identifies the language of a document using character n-grams | |
| POSTagMapper | | |
| SimpleNounChunking | | |
| StopwordMarker | | |