GATE

This page lists some of the plugins that are currently available with GATE.

For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".

To submit a plugin, please contact us via the gate-users mailing list.


Plugins included in the GATE distribution

Alignment
Compound Document  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentImpl
Compound Document From Xml  GATE Compound Document. (docs gate.compound.impl.CompoundDocumentFromXml
Compound Document Editor  Editor for compound documents. (docs gate.compound.gui.CompoundDocumentEditor
GATE Composite document  GATE Composite document. (docs gate.composite.impl.CompositeDocumentImpl
Alignment Editor  Alignment editor. (docs gate.alignment.gui.AlignmentEditor
Switch Member PR  Sets the focus of a compound document to a specified member document. (docs gate.compound.impl.SwitchMemberPR
Delete Member PR  Deletes one member document from a compound document. (docs gate.compound.impl.DeleteMemberPR
Combine Members PR  Combines documents in a composite document. (docs gate.composite.impl.CombineMembersPR
Segment Processing PR  Processes individual segments as separate documents (docs gate.composite.impl.SegmentProcessingPR
ANNIE
Annotation schema  An annotation type and its features. (docs gate.creole.AnnotationSchema
GATE Unicode Tokeniser  A customisable Unicode tokeniser. (docs gate.creole.tokeniser.SimpleTokeniser
ANNIE English Tokeniser  A customisable English tokeniser. (docs gate.creole.tokeniser.DefaultTokeniser
ANNIE Gazetteer  A list lookup component. (docs gate.creole.gazetteer.DefaultGazetteer
Hash Gazetteer  A list lookup component implemented by Ontotext Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE. (docs com.ontotext.gate.gazetteer.HashGazetteer
Jape Transducer  A module for executing Jape grammars. (docs gate.creole.Transducer
ANNIE NE Transducer  ANNIE named entity grammar. (docs gate.creole.ANNIETransducer
ANNIE Sentence Splitter  ANNIE sentence splitter. (docs gate.creole.splitter.SentenceSplitter
RegEx Sentence Splitter  A sentence splitter based on regular expressions. (docs gate.creole.splitter.RegexSentenceSplitter
ANNIE POS Tagger  Mark Hepple's Brill-style POS tagger. (docs gate.creole.POSTagger
ANNIE OrthoMatcher  ANNIE orthographical coreference component. (docs gate.creole.orthomatcher.OrthoMatcher
ANNIE Pronominal Coreferencer  Pronominal Coreference resolution component. (docs gate.creole.coref.Coreferencer
ANNIE Nominal Coreferencer  Nominal Coreference resolution component (docs gate.creole.coref.NominalCoref
Document Reset PR  Document cleaner. (docs gate.creole.annotdelete.AnnotationDeletePR
Jape Viewer  A JAPE grammar file viewer. (docs gate.gui.jape.JapeViewer
Gaze  Gazetteer viewer and editor. (docs com.ontotext.gate.vr.Gaze
Annotation_Merging
Annotation Merging PR  Merge Annotations from different annotators. (docs gate.merger.AnnotationMergingMain
Copy_Annots_Between_Docs
Copy Anns to Another Doc PR  Copy the annotations from one document to another document. (docs gate.copyAS2AnoDoc.CopyAS2AnoDocMain
Gazetteer_LKB
Large KB Gazetteer    com.ontotext.kim.gate.KimGazetteer
Semantic Annotation Enrichment    com.ontotext.kim.gate.SesameEnrichment
Gazetteer_Ontology_Based
Onto Root Gazetteer  An ontology lookup component. (docs gate.clone.ql.OntoRootGaz
Fake Sentence Splitter  Used internally by Onto Root Gazetteer. It creates a 'fake' annotation of type 'Sentence' without analysing the text with a proper sentence splitter. This lets the POS Tagger work properly, because the input text is usually not a proper sentence (i.e. it is an ontology resource name or label). 'Faking' sentence splitting also speeds up processing, as Onto Root Gazetteer does not usually process multi-sentence text internally.   gate.clone.ql.FakeSentenceSplitter
Information_Retrieval
SearchPR  Provides IR functionality. (docs gate.creole.ir.SearchPR
Search Results  Viewer for IR search results  gate.gui.SearchPRViewer
Inter_Annotator_Agreement
IAA Computation PR  Compute inter-annotator agreement (IAA). (docs gate.iaaplugin.IaaMain
Jape_Compiler
Ontotext Japec Transducer  JAPE compiler. (docs com.ontotext.gate.japec.JapecTransducer
Keyphrase_Extraction_Algorithm
KEA Keyphrase Extractor  A Keyphrase Extractor by Eibe Frank. (docs gate.creole.kea.Kea
KEA Corpus Importer  Imports a KEA-style corpus into GATE  gate.creole.kea.CorpusImporter
Lang_Arabic
Arabic Tokeniser  A customisable Arabic tokeniser.  arabic.ArabicTokeniser
Arabic Gazetteer  A list lookup component.  arabic.ArabicGazetteer
Arabic Infered Gazetteer  A list lookup component.  arabic.ArabicInferedGazetteer
Arabic Main Grammar  A module for executing Jape grammars  arabic.ArabicTransducer
Arabic OrthoMatcher  Arabic Orthomatcher  arabic.ArabicOrthoMatcher
Lang_Cebuano
Cebuano Tokeniser  A customisable Cebuano tokeniser.  cebuano.CebuanoTokeniser
Cebuano Gazetteer  A list lookup component.  cebuano.CebuanoGazetteer
Cebuano Gazetteer Tokeniser  A list lookup component.  cebuano.CebuanoGazetteerTokeniser
Cebuano Transducer  A module for executing Jape grammars  cebuano.CebuanoTransducer
Cebuano Transducer Postprocessor  A module for executing Jape grammars  cebuano.CebuanoTransducerPost
Cebuano POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Lang_Hindi
Hindi Tokeniser  A customisable Hindi tokeniser.  hindi.HindiTokeniser
Hindi Gazetteer  A list lookup component.  hindi.HindiGazetteer
Hindi Splitter  A Sentence Splitter.  hindi.HindiSplitter
Hindi Tokeniser Gazetteer  A list lookup component.  hindi.HindiTokeniserGazetteer
Hindi Main Grammar  A module for executing Jape grammars  hindi.HindiTransducer
Hindi Tokeniser Postprocessor  A module for executing Jape grammars  hindi.HindiTokeniserPostprocessor
Hindi OrthoMatcher  Hindi Orthomatcher  hindi.HindiOrthoMatcher
Hindi POS Tagger  Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword  cebtag.postag.CebuanoPOSTagger
Lang_Romanian
Romanian Tokeniser  A customisable Romanian tokeniser.  romanian.RomanianTokeniser
Romanian Gazetteer  A list lookup component.  romanian.RomanianGazetteer
Romanian Transducer  A module for executing Jape grammars  romanian.RomanianTransducer
Learning
Batch Learning PR  Supports training, application and evaluation of machine learning models for NLP tasks (docs gate.learning.LearningAPIMain
LingPipe
LingPipe Tokenizer PR  Provides a LingPipe tokenizer. (docs gate.lingpipe.TokenizerPR
LingPipe NER PR  LingPipe Named Entity Recognizer (docs gate.lingpipe.NamedEntityRecognizerPR
LingPipe Language Identifier PR  LingPipe Identifier PR (docs gate.lingpipe.LanguageIdentifierPR
LingPipe POS Tagger PR  Provides a LingPipe part of speech tagger. (docs gate.lingpipe.POSTaggerPR
LingPipe Sentence Splitter PR  Provides an interface to LingPipe sentence splitter API. (docs gate.lingpipe.SentenceSplitterPR
Machine_Learning
Machine Learning PR  Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs gate.creole.ml.MachineLearningPR
Obsolete/apf-exporter
GATE APF exporter  An APF exporter.  gate.creole.APFormatExporter
Obsolete/document-editor
OLD Document Editor  Old editor for documents, superseded by gate.gui.docview.*  gate.gui.DocumentEditor
Unrestricted annotation editor    gate.gui.UnrestrictedAnnotationEditor
Schema annotation editor    gate.gui.SchemaAnnotationEditor
Features Editor  Old editor for feature values of any resource. Superseded by the small feature editor in the bottom-left corner of the GUI.  gate.gui.FeaturesEditor
Obsolete/html-documentformat
Old GATE HTML Document Format  Old HTML document parser, based on the Swing parser that drives JEditorPane.  gate.corpora.HtmlDocumentFormat
Obsolete/Montreal_Transducer
Montreal Transducer  A module for executing augmented Jape grammars. Many of its features have now been subsumed into the standard JAPE implementation.   ca.umontreal.iro.rali.gate.creole.MtlTransducer
Obsolete/rasp
RASP Parser  RASP (Robust Accurate Statistical Parsing) is a robust parsing system for English.  gate.rasp.rasp
Ontology
ConnectSesameOntology  Connect to a repository containing an ontology (docs gate.creole.ontology.impl.sesame.ConnectSesameOntology
CreateSesameOntology  Create an ontology from a Sesame configuration file for a repository (docs gate.creole.ontology.impl.sesame.CreateSesameOntology
OWLIM Ontology  Ontology created as a temporary OWLIM3 in-memory repository (docs gate.creole.ontology.impl.sesame.OWLIMOntology
OWLIM Ontology DEPRECATED  Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs gate.creole.ontology.owlim.OWLIMOntologyLR
Ontology_BDM_Computation
BDM Computation PR  Compute BDM score for each pair of concepts in the given ontology. (docs gate.bdmComp.BDMCompMain
Ontology_OWLIM2
OWLIM2 Ontology LR  Ontology based on Sesame1/OWLIM2. Deprecated but kept for backwards compatibility with the pre-GATE 5.1 ontology implementation. (docs gate.creole.ontology.owlim.OWLIMOntologyLR
Ontology_Tools
OntoGazetteer  A list lookup component based on mapping between ontology classes and gazetteer lists. (docs gate.creole.gazetteer.OntoGazetteerImpl
GATE Ontology Editor  Ontology editing tool. (docs gate.gui.ontology.OntologyEditor
OAT  Ontology Annotation Tool. (docs gate.creole.ontology.ocat.OntologyViewer
OpenNLP
OpenNlpSentenceSplit  GATE wrapper for the OpenNlp Sentence Splitter. (docs gate.opennlp.OpenNlpSentenceSplit
OpenNlpTokenizer  Implementation of the OpenNlp Token Splitter. (docs gate.opennlp.OpenNlpTokenizer
OpenNlpPOS  Implementation of the OpenNlp POS Tagger. (docs gate.opennlp.OpenNlpPOS
OpenNlpChunker  Implementation of the OpenNlp Chunker. (docs gate.opennlp.OpenNlpChunker
OpenNlpNameFinder  Implementation of the OpenNlp Name Finder. (docs gate.opennlp.OpenNLPNameFin
Parser_Minipar
Minipar Wrapper  MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs minipar.Minipar
Parser_RASP
RASP2 Tokenizer  RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs com.digitalpebble.rasp2.token.RASPTokenizer
RASP POS Converter  Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs com.digitalpebble.rasp2.tagger.C2Transducer
RASP2 POS Tagger  RASP part-of-speech tagger, creating WordForm annotations (docs com.digitalpebble.rasp2.tagger.PosTagger
RASP2 Morphological Analyser  RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs com.digitalpebble.rasp2.morph.MorphoAnnotator
RASP2 Parser  RASP dependency parser (docs com.digitalpebble.rasp2.parser.ParserAnnotator
Parser_Stanford
StanfordParser  Stanford parser wrapper (docs gate.stanford.Parser
Parser_SUPPLE
SUPPLE Parser  SUPPLE bottom-up chart parser. (docs shef.nlp.supple.SUPPLE
Schema_Annotation_Editor
Schema Annotations Editor  An annotation editor restricted by schemas. (docs gate.gui.annedit.SchemaAnnotationEditor
Segmenter_Chinese
Chinese Segmenter PR  Segments Chinese text into words, based on the PAUM learning algorithm. (docs gate.chineseSeg.ChineseSegMain
Stemmer_Snowball
Stemmer PR  Wrapper for the Snowball stemmer. (docs stemmer.SnowballStemmer
Tagger_Abner
AbnerTagger  GATE wrapper for Abner. (docs gate.abner.AbnerTagger
Tagger_Chemistry
Chemistry Tagger  A tagger for chemical names. (docs mark.chemistry.Tagger
Tagger_Framework
GenericTagger  The Generic Tagger is Generic! (docs gate.taggerframework.GenericTagger
Tagger_NP_Chunking
Noun Phrase Chunker  Implementation of the Ramshaw and Marcus base noun phrase chunker (docs mark.chunking.GATEWrapper
Tagger_OpenCalais
OpenCalais Tagger  An OpenCalais based semantic annotator (docs gate.opencalais.OpenCalais
Tagger_TreeTagger
TreeTagger  The TreeTagger is a language-independent part-of-speech tagger, which currently supports English, French, German, and Spanish. (docs gate.treetagger.TreeTagger
Tools
Gazetteer List Collector  Gazetteer lists collector. (docs gate.creole.GazetteerListsCollector
ANNIE VP Chunker  ANNIE VP Chunker component. (docs gate.creole.VPChunker
Annotation Set Transfer  Annotation set transfer component. (docs gate.creole.annotransfer.AnnotationSetTransfer
Flexible Exporter  Exports a document with GATE annotations to its original format. (docs gate.creole.dumpingPR.DumpingPR
GATE Morphological analyser  Morphological Analyzer for the English Language. (docs gate.creole.morph.Morph
Flexible Gazetteer  A more flexible list lookup component. (docs gate.creole.gazetteer.FlexibleGazetteer
Syntax tree viewer  Viewer for syntax trees generated by a parser. (docs gate.gui.SyntaxTreeViewer
UIMA
UIMA Analysis Engine  Wrapper for a Text Analysis Engine from UIMA. (docs gate.uima.AnalysisEnginePR
Web_Crawler_Websphinx
CrawlerPR  Provides an interface to the WebSPHINX API. (docs crawl.CrawlPR
Web_Search_Google
GooglePR  Provides an interface to the Google API. (docs google.GooglePR
Web_Search_Yahoo
YahooPR  Provides an interface to the Yahoo API. (docs gate.yahoo.YahooPR
WordNet
WordNet 1.6  Princeton WordNet 1.6. (docs gate.wordnet.IndexFileWordNetImpl
WordNet 1.6 Viewer  WordNet viewer  gate.gui.wordnet.WordNetViewer

Other contributed plugins

Multi-lingual Noun Phrase Extractor (MuNPEx)
website download
Munpex MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). en-np_main.jape
Durm German lemmatizer
website download
Durm German lemmatizer The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns.
XCES tools
website download
Tools to handle documents conforming to the XML Corpus Encoding Standard (XCES) format, used by the American National Corpus. XCES is a way of encoding texts with standoff markup in XML.
ANC Document An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. org.xces.gate.XCESDocument
ANC Load Standoff Loads standoff annotations into an existing document. org.xces.creole.LoadStandoff
ANC Save Content Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. org.xces.creole.SaveContent
ANC Save Standoff Saves annotations from a Document to an XCES-compliant standoff markup file. org.xces.creole.SaveStandoff
Sen wrapper (Japanese morphological analyser)
website (in Japanese)
Sen is a morphological analyser for Japanese. This wrapper allows it to be used from GATE; you must also install Sen itself. See the documentation (in Japanese) for details.
Sen Wrapper Morphological analyser for Japanese jp.co.ditlab.jgate.SenWrapper
Russian morph tagger
website download
Provides a GATE wrapper for the mystem Russian morphological parser. It executes the native analyser, parses its output, and assigns the morphological features as GATE annotations on the document.
Russian MorphTagger MorphTagger for the Russian language, based on Yandex's MyStem parser ru.itbrains.gate.morph.MorphTagger
OFAI List Gazetteer
website download
This plugin provides an extended version of the original GATE ListGazetteer. In addition to the features of the original, built-in version of the List Gazetteer, this version provides features for more powerful matching of partial words.
OFAI List Gazetteer Extended version of the transducer-based List Gazetteer at.ofai.gate.ListGazetteer
BWP Gazetteer
website download
This plugin provides an approximate gazetteer for GATE, based on Levenshtein's Edit Distance for strings. Its goal is to handle texts with noise and errors, in which GATE's default gazetteers may have difficulties.
BWP Gazetteer Extended version of the transducer-based List Gazetteer
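The idea of approximate lookup by edit distance can be sketched as follows. This is only an illustration of Levenshtein-based matching, not the BWP Gazetteer's own code: the class name `ApproximateLookup`, the method names, and the `maxDistance` threshold are all hypothetical, and a real gazetteer would use a far more efficient search than comparing a token against every entry.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; names and structure are assumptions, not the plugin's API. */
final class ApproximateLookup {

    /** Levenshtein distance computed with two rolling rows of the DP table. */
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b takes j insertions
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" takes i deletions
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertOrDelete = Math.min(curr[j - 1] + 1, prev[j] + 1);
                curr[j] = Math.min(insertOrDelete, prev[j - 1] + cost);
            }
            int[] tmp = prev; // swap rows for the next iteration
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /** Returns the list entries that lie within maxDistance edits of the token. */
    static List<String> lookup(String token, List<String> entries, int maxDistance) {
        List<String> matches = new ArrayList<>();
        for (String entry : entries) {
            if (editDistance(token, entry) <= maxDistance) {
                matches.add(entry);
            }
        }
        return matches;
    }
}
```

With a threshold of 2, the misspelt token "Sheffeild" still matches the list entry "Sheffield", which an exact-match gazetteer would miss.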
Apolda
website download
Annotates documents like a gazetteer, but takes the terms from OWL annotation properties in an ontology, rather than from a separate list of terms.
Apolda Ontology Annotator Ontology-based lookup taking terms from properties in the ontology. telin.Apolda
Reported Speech Tagger
website download
Automatically detects and tags reported speech constructs, in particular the source, reporting verb and content.
Reporting Verb marker JAPE transducer which tags reporting verbs
Reported Speech finder JAPE transducer which tags reported speech
Keyphrase Extraction Module
website download
Keyphrase extraction and language identification.
FrequencyAnalyser
KeywordAnalyser
LanguageIdentification identifies the language of a document using character n-grams
POSTagMapper
SimpleNounChunking
StopwordMarker
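
As a rough illustration of language identification with character n-grams (the approach named for LanguageIdentification above), the sketch below compares a document's character trigram counts with those of one sample text per language, using cosine similarity. It is not the module's implementation: the class `NGramLanguageGuesser`, its methods, the choice of trigrams, and the similarity measure are assumptions made for the example.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative sketch only; not the Keyphrase Extraction Module's code. */
final class NGramLanguageGuesser {

    /** Counts every overlapping character trigram in the lower-cased text. */
    static Map<String, Integer> trigrams(String text) {
        String s = text.toLowerCase();
        Map<String, Integer> counts = new HashMap<>();
        for (int i = 0; i + 3 <= s.length(); i++) {
            counts.merge(s.substring(i, i + 3), 1, Integer::sum);
        }
        return counts;
    }

    /** Cosine similarity between two trigram count vectors (0 if either is empty). */
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        return (normA == 0 || normB == 0) ? 0 : dot / Math.sqrt(normA * normB);
    }

    /** Returns the key of the sample whose trigram profile is closest to the text. */
    static String guess(String text, Map<String, String> samplesByLanguage) {
        Map<String, Integer> doc = trigrams(text);
        String best = null;
        double bestScore = -1;
        for (Map.Entry<String, String> e : samplesByLanguage.entrySet()) {
            double score = cosine(doc, trigrams(e.getValue()));
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return best;
    }
}
```

In practice the per-language profiles would be built from much larger training text than a single sample sentence.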
