This page lists some of the plugins that are currently available with GATE.
For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".
To submit a plugin, please contact us via the gate-users mailing list.
Plugins included in the GATE distribution
- Alignment
- ANNIE
- Annotation_Merging
- Copy_Annots_Between_Docs
- Gazetteer_LKB
- Gazetteer_Ontology_Based
- Groovy
- Information_Retrieval
- Inter_Annotator_Agreement
- Jape_Compiler
- Keyphrase_Extraction_Algorithm
- Language_Identification
- Lang_Arabic
- Lang_Cebuano
- Lang_Chinese
- Lang_Hindi
- Lang_Romanian
- Learning
- LingPipe
- Machine_Learning
- Obsolete/apf-exporter
- Obsolete/document-editor
- Obsolete/html-documentformat
- Obsolete/Montreal_Transducer
- Obsolete/rasp
- Ontology
- Ontology_BDM_Computation
- Ontology_OWLIM2
- Ontology_Tools
- OpenNLP
- Parser_Minipar
- Parser_RASP
- Parser_Stanford
- Parser_SUPPLE
- Schema_Annotation_Editor
- Stemmer_Snowball
- Tagger_Abner
- Tagger_Chemistry
- Tagger_Framework
- Tagger_MetaMap
- Tagger_NP_Chunking
- Tagger_OpenCalais
- Tools
- UIMA
- Web_Crawler_Websphinx
- Web_Search_Google
- Web_Search_Yahoo
- Web_Translate_Google
- WordNet
Resource | Description | Java class |
---|---|---|
Alignment | ||
Compound Document | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentImpl |
Compound Document From Xml | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentFromXml |
Compound Document Editor | Editor for compound documents. (docs) | gate.compound.gui.CompoundDocumentEditor |
GATE Composite document | GATE Composite document. (docs) | gate.composite.impl.CompositeDocumentImpl |
Switch Member PR | Sets the focus of a compound document to a specified member document. (docs) | gate.compound.impl.SwitchMemberPR |
Delete Member PR | Deletes one member document from a compound doc. (docs) | gate.compound.impl.DeleteMemberPR |
Combine Members PR | Combines documents in a composite document. (docs) | gate.composite.impl.CombineMembersPR |
Segment Processing PR | Processes individual segments as separate documents (docs) | gate.composite.impl.SegmentProcessingPR |
ExportAlignmentPR | A PR to export alignment information in an xml file. | gate.alignment.ExportAlignmentPR |
ANNIE | ||
Annotation Schema | An annotation type and its features. (docs) | gate.creole.AnnotationSchema |
GATE Unicode Tokeniser | A customisable Unicode tokeniser. (docs) | gate.creole.tokeniser.SimpleTokeniser |
ANNIE English Tokeniser | A customisable English tokeniser. (docs) | gate.creole.tokeniser.DefaultTokeniser |
ANNIE Gazetteer | A list lookup component. (docs) | gate.creole.gazetteer.DefaultGazetteer |
Sharable Gazetteer | A sharable list lookup component. (docs) | gate.creole.gazetteer.SharedDefaultGazetteer |
Hash Gazetteer | A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE (docs) | com.ontotext.gate.gazetteer.HashGazetteer |
Jape Transducer | A module for executing Jape grammars. (docs) | gate.creole.Transducer |
ANNIE NE Transducer | ANNIE named entity grammar. (docs) | gate.creole.ANNIETransducer |
ANNIE Sentence Splitter | ANNIE sentence splitter. (docs) | gate.creole.splitter.SentenceSplitter |
RegEx Sentence Splitter | A sentence splitter based on regular expressions. (docs) | gate.creole.splitter.RegexSentenceSplitter |
ANNIE POS Tagger | Mark Hepple's Brill-style POS tagger. (docs) | gate.creole.POSTagger |
ANNIE OrthoMatcher | ANNIE orthographical coreference component. (docs) | gate.creole.orthomatcher.OrthoMatcher |
ANNIE Pronominal Coreferencer | Pronominal Coreference resolution component. (docs) | gate.creole.coref.Coreferencer |
ANNIE Nominal Coreferencer | Nominal Coreference resolution component (docs) | gate.creole.coref.NominalCoref |
Document Reset PR | Document cleaner. (docs) | gate.creole.annotdelete.AnnotationDeletePR |
Jape Viewer | A JAPE grammar file viewer. (docs) | gate.gui.jape.JapeViewer |
Gaze | HashGazetteer viewer and editor. (docs) | com.ontotext.gate.vr.Gaze |
Gazetteer Editor | Gazetteer viewer and editor. (docs) | gate.gui.GazetteerEditor |
Annotation_Merging | ||
Annotation Merging PR | Merge Annotations from different annotators. (docs) | gate.merger.AnnotationMergingMain |
Copy_Annots_Between_Docs | ||
Copy Anns to Another Doc PR | Copy the annotations from one document to another document. (docs) | gate.copyAS2AnoDoc.CopyAS2AnoDocMain |
Gazetteer_LKB | ||
Large KB Gazetteer | | com.ontotext.kim.gate.KimGazetteer |
Semantic Enrichment PR | The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories. (docs) | com.ontotext.kim.gate.SesameEnrichment |
Gazetteer_Ontology_Based | ||
Onto Root Gazetteer | An ontology lookup component (docs) | gate.clone.ql.OntoRootGaz |
Fake Sentence Splitter | Used internally by Onto Root Gazetteer. It creates 'fake' annotations of type 'Sentence' without analysing the text with a proper sentence splitter. This lets the POS Tagger work properly, as the input text is usually not a proper sentence (i.e. ontology resource name or label). 'Faking' sentence splitting optimises the processing, as Onto Root Gazetteer usually does not process multi-sentence text internally. | gate.clone.ql.FakeSentenceSplitter |
Groovy | ||
Groovy support for GATE | | gate.groovy.GroovySupport |
Groovy scripting PR | Runs a Groovy script as a processing resource (docs) | gate.groovy.ScriptPR |
Scriptable Controller | A controller whose execution strategy is controlled by a Groovy script (docs) | gate.groovy.ScriptableController |
Control Script | Editor for the Groovy script controlling a scriptable controller | gate.groovy.gui.ControllerScriptEditor |
Information_Retrieval | ||
SearchPR | Provides IR functionality. (docs) | gate.creole.ir.SearchPR |
Search Results | Viewer for IR search results | gate.gui.SearchPRViewer |
Inter_Annotator_Agreement | ||
IAA Computation PR | Compute inter-annotator agreement (IAA). (docs) | gate.iaaplugin.IaaMain |
Jape_Compiler | ||
Ontotext Japec Transducer | JAPE compiler. (docs) | com.ontotext.gate.japec.JapecTransducer |
Keyphrase_Extraction_Algorithm | ||
KEA Keyphrase Extractor | A Keyphrase Extractor by Eibe Frank. (docs) | gate.creole.kea.Kea |
KEA Corpus Importer | Imports a KEA-style corpus into GATE | gate.creole.kea.CorpusImporter |
Language_Identification | ||
TextCat PR | Recognizes the document language using TextCat. Possible languages: german, english, french, spanish, italian, swedish, polish, dutch, norwegian, finnish, albanian, slovakian, slovenian, danish, hungarian. | org.knallgrau.utils.textcat.LanguageIdentifier |
Lang_Arabic | ||
Arabic Tokeniser | A customisable Arabic tokeniser. | arabic.ArabicTokeniser |
Arabic Gazetteer | A list lookup component. | arabic.ArabicGazetteer |
Arabic Infered Gazetteer | A list lookup component. | arabic.ArabicInferedGazetteer |
Arabic Main Grammar | A module for executing Jape grammars | arabic.ArabicTransducer |
Arabic OrthoMatcher | Arabic Orthomatcher | arabic.ArabicOrthoMatcher |
Lang_Cebuano | ||
Cebuano Tokeniser | A customisable Cebuano tokeniser. | cebuano.CebuanoTokeniser |
Cebuano Gazetteer | A list lookup component. | cebuano.CebuanoGazetteer |
Cebuano Gazetteer Tokeniser | A list lookup component. | cebuano.CebuanoGazetteerTokeniser |
Cebuano Transducer | A module for executing Jape grammars | cebuano.CebuanoTransducer |
Cebuano Transducer Postprocessor | A module for executing Jape grammars | cebuano.CebuanoTransducerPost |
Cebuano POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
Lang_Chinese | ||
Chinese Segmenter PR | Segment the Chinese text into words, based on the PAUM learning algorithm. (docs) | gate.chineseSeg.ChineseSegMain |
Lang_Hindi | ||
Hindi Tokeniser | A customisable Hindi tokeniser. | hindi.HindiTokeniser |
Hindi Gazetteer | A list lookup component. | hindi.HindiGazetteer |
Hindi Splitter | A Sentence Splitter. | hindi.HindiSplitter |
Hindi Tokeniser Gazetteer | A list lookup component. | hindi.HindiTokeniserGazetteer |
Hindi Main Grammar | A module for executing Jape grammars | hindi.HindiTransducer |
Hindi Tokeniser Postprocessor | A module for executing Jape grammars | hindi.HindiTokeniserPostprocessor |
Hindi OrthoMatcher | Hindi Orthomatcher | hindi.HindiOrthoMatcher |
Hindi POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
Lang_Romanian | ||
Romanian Tokeniser | A customisable Romanian tokeniser. | romanian.RomanianTokeniser |
Romanian Gazetteer | A list lookup component. | romanian.RomanianGazetteer |
Romanian Transducer | A module for executing Jape grammars | romanian.RomanianTransducer |
Learning | ||
Batch Learning PR | Supports training, application and evaluation of machine learning models for NLP tasks (docs) | gate.learning.LearningAPIMain |
LingPipe | ||
LingPipe Tokenizer PR | Provides a LingPipe tokenizer. (docs) | gate.lingpipe.TokenizerPR |
LingPipe NER PR | LingPipe Named Entity Recognizer (docs) | gate.lingpipe.NamedEntityRecognizerPR |
LingPipe Language Identifier PR | GATE PR for language identification using LingPipe (docs) | gate.lingpipe.LanguageIdentifierPR |
LingPipe POS Tagger PR | Provides a LingPipe part of speech tagger. (docs) | gate.lingpipe.POSTaggerPR |
LingPipe Sentence Splitter PR | Provides an interface to LingPipe sentence splitter API. (docs) | gate.lingpipe.SentenceSplitterPR |
Machine_Learning | ||
Machine Learning PR | Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs) | gate.creole.ml.MachineLearningPR |
Obsolete/apf-exporter | ||
GATE APF exporter | An APF exporter. | gate.creole.APFormatExporter |
Obsolete/document-editor | ||
OLD Document Editor | Old editor for documents, superseded by gate.gui.docview.* | gate.gui.DocumentEditor |
Unrestricted annotation editor | | gate.gui.UnrestrictedAnnotationEditor |
Schema annotation editor | | gate.gui.SchemaAnnotationEditor |
Features Editor | Old editor for feature values of any resource. Superseded by the small feature editor in the bottom-left corner of the GUI. | gate.gui.FeaturesEditor |
Obsolete/html-documentformat | ||
Old GATE HTML Document Format | Old HTML document parser, based on the Swing parser that drives JEditorPane. | gate.corpora.HtmlDocumentFormat |
Obsolete/Montreal_Transducer | ||
Montreal Transducer | A module for executing augmented Jape grammars. Many of its features have now been subsumed into the standard JAPE implementation. | ca.umontreal.iro.rali.gate.creole.MtlTransducer |
Obsolete/rasp | ||
RASP Parser | RASP (Robust Accurate Statistical Parsing) is a robust parsing system for English. | gate.rasp.rasp |
Ontology | ||
ConnectSesameOntology | Connect to a repository containing an ontology (docs) | gate.creole.ontology.impl.sesame.ConnectSesameOntology |
CreateSesameOntology | Create an ontology from a Sesame configuration file for a repository (docs) | gate.creole.ontology.impl.sesame.CreateSesameOntology |
OWLIM Ontology | Ontology created as a temporary OWLIM3 in-memory repository (docs) | gate.creole.ontology.impl.sesame.OWLIMOntology |
OWLIM Ontology DEPRECATED | Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs) | gate.creole.ontology.owlim.OWLIMOntologyLR |
Ontology_BDM_Computation | ||
BDM Computation PR | Compute BDM score for each pair of concepts in the given ontology. (docs) | gate.bdmComp.BDMCompMain |
Ontology_OWLIM2 | ||
OWLIM2 Ontology LR | Ontology based on Sesame1/OWLIM2. Deprecated but kept for backwards compatibility with the pre-GATE 5.1 ontology implementation. (docs) | gate.creole.ontology.owlim.OWLIMOntologyLR |
Ontology_Tools | ||
OntoGazetteer | A list lookup component based on mapping between ontology classes and gazetteer lists. (docs) | gate.creole.gazetteer.OntoGazetteerImpl |
GATE Ontology Editor | Ontology editing tool. (docs) | gate.gui.ontology.OntologyEditor |
OAT | Ontology Annotation Tool. (docs) | gate.creole.ontology.ocat.OntologyViewer |
RAT-C | Relation Annotation Tool Class view. (docs) | gate.gui.docview.OntologyClassView |
RAT-I | Relation Annotation Tool Instance view. (docs) | gate.gui.docview.OntologyInstanceView |
OpenNLP | ||
OpenNlpSentenceSplit | GATE wrapper of the OpenNlp Sentence Splitter. (docs) | gate.opennlp.OpenNlpSentenceSplit |
OpenNlpTokenizer | Implementation of the OpenNlp Token Splitter. (docs) | gate.opennlp.OpenNlpTokenizer |
OpenNlpPOS | Implementation of the OpenNlp POS Tagger. (docs) | gate.opennlp.OpenNlpPOS |
OpenNlpChunker | Implementation of the OpenNlp Chunker. (docs) | gate.opennlp.OpenNlpChunker |
OpenNlpNameFinder | Implementation of the OpenNlp Name Finder. (docs) | gate.opennlp.OpenNLPNameFin |
Parser_Minipar | ||
Minipar Wrapper | MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs) | minipar.Minipar |
Parser_RASP | ||
RASP2 Tokenizer | RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs) | com.digitalpebble.rasp2.token.RASPTokenizer |
RASP POS Converter | Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs) | com.digitalpebble.rasp2.tagger.C2Transducer |
RASP2 POS Tagger | RASP part-of-speech tagger, creating WordForm annotations (docs) | com.digitalpebble.rasp2.tagger.PosTagger |
RASP2 Morphological Analyser | RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs) | com.digitalpebble.rasp2.morph.MorphoAnnotator |
RASP2 Parser | RASP dependency parser (docs) | com.digitalpebble.rasp2.parser.ParserAnnotator |
Parser_Stanford | ||
StanfordParser | Stanford parser wrapper (docs) | gate.stanford.Parser |
Parser_SUPPLE | ||
SUPPLE Parser | SUPPLE bottom-up chart parser. (docs) | shef.nlp.supple.SUPPLE |
Schema_Annotation_Editor | ||
Schema Annotations Editor | An annotation editor restricted by schemas. (docs) | gate.gui.annedit.SchemaAnnotationEditor |
Stemmer_Snowball | ||
Stemmer PR | Wrapper for the Snowball stemmer. (docs) | stemmer.SnowballStemmer |
Tagger_Abner | ||
AbnerTagger | GATE wrapper over Abner. (docs) | gate.abner.AbnerTagger |
Tagger_Chemistry | ||
Chemistry Tagger | A tagger for chemical names. (docs) | mark.chemistry.Tagger |
Tagger_Framework | ||
GenericTagger | The Generic Tagger is Generic! (docs) | gate.taggerframework.GenericTagger |
Tagger_MetaMap | ||
MetaMap Annotator | This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port (docs) | gate.metamap.MetaMapPR |
Tagger_NP_Chunking | ||
Noun Phrase Chunker | Implementation of the Ramshaw and Marcus base noun phrase chunker (docs) | mark.chunking.GATEWrapper |
Tagger_OpenCalais | ||
OpenCalais Tagger | An OpenCalais based semantic annotator (docs) | gate.opencalais.OpenCalais |
Tools | ||
Gazetteer List Collector | Gazetteer lists collector. (docs) | gate.creole.GazetteerListsCollector |
ANNIE VP Chunker | ANNIE VP Chunker component. (docs) | gate.creole.VPChunker |
Annotation Set Transfer | Annotation set transfer component. (docs) | gate.creole.annotransfer.AnnotationSetTransfer |
Flexible Exporter | Exports a document with GATE annotations to its original format. (docs) | gate.creole.dumpingPR.DumpingPR |
GATE Morphological analyser | Morphological Analyzer for the English Language. (docs) | gate.creole.morph.Morph |
Flexible Gazetteer | A more flexible list lookup component. (docs) | gate.creole.gazetteer.FlexibleGazetteer |
Syntax tree viewer | Viewer for syntax trees generated by a parser. (docs) | gate.gui.SyntaxTreeViewer |
Quality Assurance PR | A quality assurance PR that runs when the pipeline is processing the last document. | gate.qa.QualityAssurancePR |
UIMA | ||
UIMA Analysis Engine | Wrapper for a Text Analysis Engine from UIMA. (docs) | gate.uima.AnalysisEnginePR |
Web_Crawler_Websphinx | ||
CrawlerPR | GATE implementation of the Websphinx crawling API (docs) | crawl.CrawlPR |
Web_Search_Google | ||
GooglePR | Provides an interface to Google API. (docs) | google.GooglePR |
Web_Search_Yahoo | ||
YahooPR | Provides an interface to Yahoo API. (docs) | gate.yahoo.YahooPR |
Web_Translate_Google | ||
Google Translator PR | Runs a Google translator over the source member document and produces the translated document. The user can also specify whether to align unitOfTranslation in the source and the target documents. | gate.translate.google.GoogleTranslatorPR |
WordNet | ||
WordNet 1.6 | Princeton WordNet 1.6. (docs) | gate.wordnet.IndexFileWordNetImpl |
WordNet | WordNet (docs) | gate.wordnet.JWNLWordNetImpl |
WordNet Viewer | WordNet viewer | gate.gui.wordnet.WordNetViewer |
Other contributed plugins
- Multi-lingual Noun Phrase Extractor (MuNPEx)
- Durm German lemmatizer
- XCES tools
- Sen wrapper (Japanese morphological analyser)
- Russian morph tagger
- OFAI List Gazetteer
- BWP Gazetteer
- Apolda
- Reported Speech Tagger
- Keyphrase extraction module from SmILE
Resource | Description | Java class |
---|---|---|
Multi-lingual Noun Phrase Extractor (MuNPEx) (website, download) | ||
Munpex | MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). | en-np_main.jape |
Durm German lemmatizer (website, download) | ||
Durm German lemmatizer | The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns. | |
XCES tools (website, download) | ||
ANC Document | An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. | org.xces.gate.XCESDocument |
ANC Load Standoff | Loads standoff annotations into an existing document. | org.xces.creole.LoadStandoff |
ANC Save Content | Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. | org.xces.creole.SaveContent |
ANC Save Standoff | Saves annotations from a Document to an XCES-compliant standoff markup file. | org.xces.creole.SaveStandoff |
Sen wrapper (Japanese morphological analyser) (website, in Japanese) | ||
Sen Wrapper | Morphological analyser for Japanese | jp.co.ditlab.jgate.SenWrapper |
Russian morph tagger (website, download) | ||
Russian MorphTagger | MorphTagger for the Russian language, based on Yandex's MyStem parser | ru.itbrains.gate.morph.MorphTagger |
OFAI List Gazetteer (website, download) | ||
OFAI List Gazetteer | Extended version of the transducer-based List Gazetteer | at.ofai.gate.ListGazetteer |
BWP Gazetteer (website, download) | ||
BWP Gazetteer | Extended version of the transducer-based List Gazetteer | |
Apolda (website, download) | ||
Apolda Ontology Annotator | Ontology-based lookup taking terms from properties in the ontology. | telin.Apolda |
Reported Speech Tagger (website, download) | ||
Reporting Verb marker | JAPE transducer which tags reporting verbs | |
Reported Speech finder | JAPE transducer which tags reported speech | |
Keyphrase Extraction Module (website, download) | ||
FrequencyAnalyser | ||
KeywordAnalyser | ||
LanguageIdentification | Identifies the language of a document using character n-grams | |
POSTagMapper | ||
SimpleNounChunking | ||
StopwordMarker | ||