This page lists some of the plugins that are currently available with GATE.
For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".
To submit a plugin, please contact us via the gate-users mailing list.
Plugins included in the GATE distribution
- Alignment
- ANNIE
- Annotation_Merging
- Copy_Annots_Between_Docs
- Gazetteer_LKB
- Gazetteer_Ontology_Based
- GENIA
- Groovy
- Information_Retrieval
- Inter_Annotator_Agreement
- JAPE_Plus
- Keyphrase_Extraction_Algorithm
- Language_Identification
- Lang_Arabic
- Lang_Cebuano
- Lang_Chinese
- Lang_Hindi
- Lang_Romanian
- Learning
- LingPipe
- Machine_Learning
- Ontology
- Ontology_BDM_Computation
- Ontology_Tools
- OpenNLP
- Parser_Minipar
- Parser_RASP
- Parser_Stanford
- Parser_SUPPLE
- Schema_Annotation_Editor
- Schema_Tools
- Stemmer_Snowball
- Tagger_Abner
- Tagger_Boilerpipe
- Tagger_Chemistry
- Tagger_DateNormalizer
- Tagger_Framework
- Tagger_Measurements
- Tagger_MetaMap
- Tagger_MutationFinder
- Tagger_NormaGene
- Tagger_NP_Chunking
- Tagger_Numbers
- Tagger_OpenCalais
- Tagger_PennBio
- Teamware_Tools
- Tools
- UIMA
- Web_Crawler_Websphinx
- WordNet
Alignment | ||
---|---|---|
Compound Document | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentImpl |
Compound Document From Xml | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentFromXml |
Compound Document Editor | Editor for compound documents. (docs) | gate.compound.gui.CompoundDocumentEditor |
GATE Composite document | GATE Composite document. (docs) | gate.composite.impl.CompositeDocumentImpl |
Switch Member PR | Sets the focus of a compound document to a specified member document. (docs) | gate.compound.impl.SwitchMemberPR |
Delete Member PR | Deletes one member document from a compound doc. (docs) | gate.compound.impl.DeleteMemberPR |
Combine Members PR | Combines documents in a composite document. (docs) | gate.composite.impl.CombineMembersPR |
Segment Processing PR | Processes individual segments as separate documents (docs) | gate.composite.impl.SegmentProcessingPR |
ExportAlignmentPR | A PR to export alignment information in an xml file. | gate.alignment.ExportAlignmentPR |
ANNIE | ||
Annotation Schema | An annotation type and its features. (docs) | gate.creole.AnnotationSchema |
GATE Unicode Tokeniser | A customisable Unicode tokeniser. (docs) | gate.creole.tokeniser.SimpleTokeniser |
ANNIE English Tokeniser | A customisable English tokeniser. (docs) | gate.creole.tokeniser.DefaultTokeniser |
ANNIE Gazetteer | A list lookup component. (docs) | gate.creole.gazetteer.DefaultGazetteer |
Sharable Gazetteer | A sharable list lookup component. (docs) | gate.creole.gazetteer.SharedDefaultGazetteer |
Hash Gazetteer | A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE. (docs) | com.ontotext.gate.gazetteer.HashGazetteer |
JAPE Transducer | A module for executing Jape grammars. (docs) | gate.creole.Transducer |
ANNIE NE Transducer | ANNIE named entity grammar. (docs) | gate.creole.ANNIETransducer |
ANNIE Sentence Splitter | ANNIE sentence splitter. (docs) | gate.creole.splitter.SentenceSplitter |
RegEx Sentence Splitter | A sentence splitter based on regular expressions. (docs) | gate.creole.splitter.RegexSentenceSplitter |
ANNIE POS Tagger | Mark Hepple's Brill-style POS tagger. (docs) | gate.creole.POSTagger |
ANNIE OrthoMatcher | ANNIE orthographical coreference component. (docs) | gate.creole.orthomatcher.OrthoMatcher |
ANNIE Pronominal Coreferencer | Pronominal Coreference resolution component. (docs) | gate.creole.coref.Coreferencer |
ANNIE Nominal Coreferencer | Nominal Coreference resolution component (docs) | gate.creole.coref.NominalCoref |
Document Reset PR | Document cleaner. (docs) | gate.creole.annotdelete.AnnotationDeletePR |
Jape Viewer | A JAPE grammar file viewer. (docs) | gate.gui.jape.JapeViewer |
Gazetteer Editor | Gazetteer viewer and editor. (docs) | gate.gui.GazetteerEditor |
Annotation_Merging | ||
Annotation Merging PR | Merge Annotations from different annotators. (docs) | gate.merger.AnnotationMergingMain |
Copy_Annots_Between_Docs | ||
Copy Anns to Another Doc PR | Copy the annotations from one document to another document. (docs) | gate.copyAS2AnoDoc.CopyAS2AnoDocMain |
Gazetteer_LKB | ||
Large KB Gazetteer | | com.ontotext.kim.gate.KimGazetteer |
Semantic Enrichment PR | The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories. (docs) | com.ontotext.kim.gate.SesameEnrichment |
Gazetteer_Ontology_Based | ||
Onto Root Gazetteer | An ontology lookup component (docs) | gate.clone.ql.OntoRootGaz |
Fake Sentence Splitter | Fake Sentence Splitter is used by Onto Root Gazetteer internally as it creates 'fake' annotation type 'Sentence' without analysing the text by a proper Sentence Splitter. The reason for doing this is enabling the POS Tagger to work properly, as the input text is usually not a proper sentence (i.e. ontology resource name or label). 'Faking' sentence splitting optimises the processing as Onto Root Gazetteer usually does not process internally any multisentence text. | gate.clone.ql.FakeSentenceSplitter |
GENIA | ||
GENIA Sentence Splitter | A processing resource that takes document and corpus parameters (docs) | gate.creole.genia.splitter.GENIASentenceSplitter |
Groovy | ||
Groovy support for GATE | | gate.groovy.GroovySupport |
Groovy scripting PR | Runs a Groovy script as a processing resource (docs) | gate.groovy.ScriptPR |
Scriptable Controller | A controller whose execution strategy is controlled by a Groovy script (docs) | gate.groovy.ScriptableController |
Control Script | Editor for the Groovy script controlling a scriptable controller | gate.groovy.gui.ControllerScriptEditor |
Script Editor | Editor for the Groovy script behind this PR | gate.groovy.gui.ScriptPREditor |
Information_Retrieval | ||
SearchPR | Provides IR functionality. (docs) | gate.creole.ir.SearchPR |
Search Results | Viewer for IR search results | gate.gui.SearchPRViewer |
Inter_Annotator_Agreement | ||
IAA Computation PR | Compute inter-annotator agreement (IAA). (docs) | gate.iaaplugin.IaaMain |
JAPE_Plus | ||
JAPE-Plus Viewer | A JAPE grammar file viewer (docs) | gate.gui.jape.plus.Viewer |
JAPE-Plus Transducer | An optimised, JAPE-compatible transducer. | gate.jape.plus.Transducer |
Keyphrase_Extraction_Algorithm | ||
KEA Keyphrase Extractor | A Keyphrase Extractor by Eibe Frank. (docs) | gate.creole.kea.Kea |
KEA Corpus Importer | Imports a KEA-style corpus into GATE | gate.creole.kea.CorpusImporter |
Language_Identification | ||
TextCat Fingerprint Generator | Generate language fingerprints for use with the TextCat Language Identification PR (docs) | org.knallgrau.utils.textcat.FingerprintGenerator |
TextCat Language Identification | Recognizes the document language using TextCat (docs) | org.knallgrau.utils.textcat.LanguageIdentifier |
Lang_Arabic | ||
Arabic Tokeniser | A customisable Arabic tokeniser. | arabic.ArabicTokeniser |
Arabic Gazetteer | A list lookup component. | arabic.ArabicGazetteer |
Arabic Infered Gazetteer | A list lookup component. | arabic.ArabicInferedGazetteer |
Arabic Main Grammar | A module for executing Jape grammars | arabic.ArabicTransducer |
Arabic OrthoMatcher | Arabic Orthomatcher | arabic.ArabicOrthoMatcher |
Lang_Cebuano | ||
Cebuano Tokeniser | A customisable Cebuano tokeniser. | cebuano.CebuanoTokeniser |
Cebuano Gazetteer | A list lookup component. | cebuano.CebuanoGazetteer |
Cebuano Gazetteer Tokeniser | A list lookup component. | cebuano.CebuanoGazetteerTokeniser |
Cebuano Transducer | A module for executing Jape grammars | cebuano.CebuanoTransducer |
Cebuano Transducer Postprocessor | A module for executing Jape grammars | cebuano.CebuanoTransducerPost |
Cebuano POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
Lang_Chinese | ||
Chinese Segmenter PR | Segment the Chinese text into words, based on the PAUM learning algorithm. (docs) | gate.chineseSeg.ChineseSegMain |
Chinese IE System | | chinese.ChineseIE |
Lang_Hindi | ||
Hindi Tokeniser | A customisable Hindi tokeniser. | hindi.HindiTokeniser |
Hindi Gazetteer | A list lookup component. | hindi.HindiGazetteer |
Hindi Splitter | A Sentence Splitter. | hindi.HindiSplitter |
Hindi Tokeniser Gazetteer | A list lookup component. | hindi.HindiTokeniserGazetteer |
Hindi Main Grammar | A module for executing Jape grammars | hindi.HindiTransducer |
Hindi Tokeniser Postprocessor | A module for executing Jape grammars | hindi.HindiTokeniserPostprocessor |
Hindi OrthoMatcher | Hindi Orthomatcher | hindi.HindiOrthoMatcher |
Hindi POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
Lang_Romanian | ||
Romanian Tokeniser | A customisable Romanian tokeniser. | romanian.RomanianTokeniser |
Romanian Gazetteer | A list lookup component. | romanian.RomanianGazetteer |
Romanian Transducer | A module for executing Jape grammars | romanian.RomanianTransducer |
Learning | ||
Batch Learning PR | Supports training, application and evaluation of machine learning models for NLP tasks (docs) | gate.learning.LearningAPIMain |
LingPipe | ||
LingPipe Tokenizer PR | Provides a LingPipe tokenizer. (docs) | gate.lingpipe.TokenizerPR |
LingPipe NER PR | LingPipe Named Entity Recognizer (docs) | gate.lingpipe.NamedEntityRecognizerPR |
LingPipe Language Identifier PR | GATE PR for language identification using LingPipe (docs) | gate.lingpipe.LanguageIdentifierPR |
LingPipe POS Tagger PR | Provides a LingPipe part of speech tagger. (docs) | gate.lingpipe.POSTaggerPR |
LingPipe Sentence Splitter PR | Provides an interface to LingPipe sentence splitter API. (docs) | gate.lingpipe.SentenceSplitterPR |
Machine_Learning | ||
Machine Learning PR | Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs) | gate.creole.ml.MachineLearningPR |
Ontology | ||
ConnectSesameOntology | Connect to a repository containing an ontology (docs) | gate.creole.ontology.impl.sesame.ConnectSesameOntology |
CreateSesameOntology | Create an ontology from a Sesame configuration file for a repository (docs) | gate.creole.ontology.impl.sesame.CreateSesameOntology |
OWLIM Ontology | Ontology created as a temporary OWLIM3 in-memory repository (docs) | gate.creole.ontology.impl.sesame.OWLIMOntology |
OWLIM Ontology DEPRECATED | Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs) | gate.creole.ontology.owlim.OWLIMOntologyLR |
Ontology_BDM_Computation | ||
BDM Computation PR | Compute BDM score for each pair of concepts in the given ontology. (docs) | gate.bdmComp.BDMCompMain |
Ontology_Tools | ||
OntoGazetteer | A list lookup component based on mapping between ontology classes and gazetteer lists. (docs) | gate.creole.gazetteer.OntoGazetteerImpl |
GATE Ontology Editor | Ontology editing tool. (docs) | gate.gui.ontology.OntologyEditor |
OAT | Ontology Annotation Tool. (docs) | gate.creole.ontology.ocat.OntologyViewer |
RAT-C | Relation Annotation Tool Class view. (docs) | gate.gui.docview.OntologyClassView |
RAT-I | Relation Annotation Tool Instance view. (docs) | gate.gui.docview.OntologyInstanceView |
GAZE | Gazetteer viewer and editor (docs) | com.ontotext.gate.vr.Gaze |
OpenNLP | ||
OpenNlpSentenceSplit | Gate wrapper of the OpenNlp Sentence Splitter. (docs) | gate.opennlp.OpenNlpSentenceSplit |
OpenNlpTokenizer | Implementation of the OpenNlp Token Splitter. (docs) | gate.opennlp.OpenNlpTokenizer |
OpenNlpPOS | Implementation of the OpenNlp POS Tagger. (docs) | gate.opennlp.OpenNlpPOS |
OpenNlpChunker | Implementation of the OpenNlp Chunker. (docs) | gate.opennlp.OpenNlpChunker |
OpenNlpNameFinder | Implementation of the OpenNlp Name Finder. (docs) | gate.opennlp.OpenNLPNameFin |
Parser_Minipar | ||
Minipar Wrapper | MiniPar is a shallow parser. It determines the dependency relationships between the words of a sentence. (docs) | minipar.Minipar |
Parser_RASP | ||
RASP2 Tokenizer | RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs) | com.digitalpebble.rasp2.token.RASPTokenizer |
RASP POS Converter | Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs) | com.digitalpebble.rasp2.tagger.C2Transducer |
RASP2 POS Tagger | RASP part-of-speech tagger, creating WordForm annotations (docs) | com.digitalpebble.rasp2.tagger.PosTagger |
RASP2 Morphological Analyser | RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs) | com.digitalpebble.rasp2.morph.MorphoAnnotator |
RASP2 Parser | RASP dependency parser (docs) | com.digitalpebble.rasp2.parser.ParserAnnotator |
Parser_Stanford | ||
StanfordParser | Stanford parser wrapper (docs) | gate.stanford.Parser |
English Dependency Parser | | gate.stanford.apps.EnglishDependencies |
English POS Tagger and Dependency Parser | | gate.stanford.apps.EnglishPOSDependencies |
Parser_SUPPLE | ||
SUPPLE Parser | SUPPLE bottom-up chart parser. (docs) | shef.nlp.supple.SUPPLE |
Schema_Annotation_Editor | ||
Schema Annotations Editor | An annotation editor restricted by schemas. (docs) | gate.gui.annedit.SchemaAnnotationEditor |
Schema_Tools | ||
Annotation Schema | An annotation type and its features. (docs) | gate.creole.AnnotationSchema |
Schema Enforcer | Produces an annotation set whose content is restricted by the specified set of schemas (docs) | gate.creole.schema.SchemaEnforcer |
Simple Schema Viewer | A Simple Annotation Schema Viewer | gate.gui.schema.SimpleSchemaViewer |
Stemmer_Snowball | ||
Stemmer PR | Wrapper for the Snowball stemmer. (docs) | stemmer.SnowballStemmer |
Tagger_Abner | ||
ABNER Tagger | GATE wrapper over ABNER (docs) | gate.abner.AbnerTagger |
Tagger_Boilerpipe | ||
Boilerpipe Content Detection | Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate (docs) | gate.creole.boilerpipe.BoilerPipe |
Tagger_Chemistry | ||
Chemistry Tagger | A tagger for chemical names. (docs) | mark.chemistry.Tagger |
Tagger_DateNormalizer | ||
Date Normalizer | Provides normalized values for all known dates | gate.creole.dates.DateNormalizer |
Tagger_Framework | ||
GenericTagger | The Generic Tagger is Generic! (docs) | gate.taggerframework.GenericTagger |
Tagger_Measurements | ||
ANNIE+Measurements | | gate.creole.measurements.ANNIEMeasurements |
Measurement Tagger | A measurement tagger based upon GNU Units | gate.creole.measurements.MeasurementsTagger |
Tagger_MetaMap | ||
MetaMap Annotator | This plugin uses the MetaMap Java API to send GATE document content to MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port (docs) | gate.metamap.MetaMapPR |
Tagger_MutationFinder | ||
MutationFinder | GATE MutationFinder Wrapper (docs) | gate.creole.mutationfinder.MutationFinderPR |
Tagger_NormaGene | ||
NormaGene Tagger | A processing resource that takes document and corpus parameters | gate.creole.normagene.NormaGene |
Tagger_NP_Chunking | ||
Noun Phrase Chunker | Implementation of the Ramshaw and Marcus base noun phrase chunker (docs) | mark.chunking.GATEWrapper |
Tagger_Numbers | ||
Numbers Tagger | Finds numbers (in both words and digits) and annotates them with their numeric value (docs) | gate.creole.numbers.NumbersTagger |
Roman Numerals Tagger | Finds and annotates Roman numerals (docs) | gate.creole.numbers.RomanNumeralsTagger |
Tagger_OpenCalais | ||
OpenCalais Tagger | An OpenCalais based semantic annotator (docs) | gate.opencalais.OpenCalais |
Tagger_PennBio | ||
Penn BioTagger: Genes | A processing resource that takes document and corpus parameters (docs) | gate.creole.pennbio.GeneTagger |
Penn BioTagger: Malignancy | A processing resource that takes document and corpus parameters (docs) | gate.creole.pennbio.MalignancyTagger |
Penn BioTokenizer | A processing resource that takes document and corpus parameters (docs) | gate.creole.pennbio.Tokenizer |
Penn BioTagger: Variation | A processing resource that takes document and corpus parameters (docs) | gate.creole.pennbio.VariationTagger |
Teamware_Tools | ||
QA Summariser for Teamware | The Quality Assurance PR for teamware (docs) | gate.qa.QAForTeamwarePR |
Tools | ||
Gazetteer List Collector | Gazetteer lists collector. (docs) | gate.creole.GazetteerListsCollector |
ANNIE VP Chunker | ANNIE VP Chunker component. (docs) | gate.creole.VPChunker |
Annotation Set Transfer | Annotation set transfer component. (docs) | gate.creole.annotransfer.AnnotationSetTransfer |
Flexible Exporter | Exports a document with GATE annotations to its original format. (docs) | gate.creole.dumpingPR.DumpingPR |
GATE Morphological analyser | Morphological Analyzer for the English Language. (docs) | gate.creole.morph.Morph |
Flexible Gazetteer | A more flexible list lookup component. (docs) | gate.creole.gazetteer.FlexibleGazetteer |
Syntax tree viewer | Viewer for syntax trees generated by a parser. (docs) | gate.gui.SyntaxTreeViewer |
Quality Assurance PR | The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer | gate.qa.QualityAssurancePR |
UIMA | ||
UIMA Analysis Engine | Wrapper for a Text Analysis Engine from UIMA. (docs) | gate.uima.AnalysisEnginePR |
Web_Crawler_Websphinx | ||
Crawler PR | GATE implementation of the Websphinx crawling API (docs) | crawl.CrawlPR |
WordNet | ||
WordNet 1.6 | Princeton WordNet 1.6. (docs) | gate.wordnet.IndexFileWordNetImpl |
WordNet | WordNet (docs) | gate.wordnet.JWNLWordNetImpl |
WordNet Viewer | WordNet viewer | gate.gui.wordnet.WordNetViewer |
Other contributed plugins
- OrganismTagger
- Multi-lingual Noun Phrase Extractor (MuNPEx)
- Durm German lemmatizer
- XCES tools
- Sen wrapper (Japanese morphological analyser)
- Russian morph tagger
- String Annotation
- GATE Application Documentation
- VirtualCorpus - Directory- and JDBC Corpus LR
- BWP Gazetteer
- Apolda
- Reported Speech Tagger
- Keyphrase extraction module from SmILE
OrganismTagger website download | ||
---|---|---|
OrganismTagger | The OrganismTagger is a hybrid rule-based/machine-learning system that extracts organism mentions from the biomedical literature, normalizes them to their scientific name, and provides grounding to the NCBI Taxonomy database. | |
Multi-lingual Noun Phrase Extractor (MuNPEx) website download | ||
Munpex | MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). | en-np_main.jape |
Durm German lemmatizer website download | ||
Durm German lemmatizer | The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns. | |
XCES tools website download | ||
ANC Document | An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. | org.xces.gate.XCESDocument |
ANC Load Standoff | Loads standoff annotations into an existing document. | org.xces.creole.LoadStandoff |
ANC Save Content | Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. | org.xces.creole.SaveContent |
ANC Save Standoff | Saves annotations from a Document to an XCES-compliant standoff markup file. | org.xces.creole.SaveStandoff |
Sen wrapper (Japanese morphological analyser) website (in Japanese) | ||
Sen Wrapper | Morphological analyser for Japanese | jp.co.ditlab.jgate.SenWrapper |
Russian morph tagger website download | ||
Russian MorphTagger | MorphTagger for the Russian language, based on Yandex's MyStem parser | ru.itbrains.gate.morph.MorphTagger |
String Annotation Plugin website download | ||
Extended List Gazetteer | Extended version of the GATE Default List Gazetteer. In addition to the features of the original, built-in version of the List Gazetteer, this version provides features for more powerful matching of partial words and annotating prefixes and suffixes as well as more versatile handling of word boundaries and whitespace. | at.ofai.gate.extendedgazetteer.ExtendedGazetteer |
Simple Regexp Annotator | Use rules based on Java regular expressions to annotate the document content. | at.ofai.gate.regexpannotator.SimpleRegexpAnnotator |
AppDoc - GATE Application Documentation website download | ||
AppDoc | Visual resource for adding author/version/comment to pipelines and processing resources | at.ofai.gate.appdoc.AppDoc |
AppDocGen | Visual resource for selecting a documentation template and generating documentation files | at.ofai.gate.appdoc.AppDocGen |
VirtualCorpus - Directory- and JDBC Corpus LRs website download | ||
DirectoryCorpus | Language resource for accessing GATE XML files in a directory directly via a corpus resource | at.ofai.gate.virtualcorpus.DirectoryCorpus |
JDBCCorpus | Language resource for accessing GATE XML documents stored in a field of a JDBC database table directly via a corpus resource | at.ofai.gate.virtualcorpus.JDBCCorpus |
BWP Gazetteer website download | ||
BWP Gazetteer | Extended version of the transducer-based List Gazetteer | |
Apolda website download | ||
Apolda Ontology Annotator | Ontology-based lookup taking terms from properties in the ontology. | telin.Apolda |
Reported Speech Tagger website download | ||
Reporting Verb marker | JAPE transducer which tags reporting verbs | |
Reported Speech finder | JAPE transducer which tags reported speech | |
Keyphrase Extraction Module website download | ||
FrequencyAnalyser | ||
KeywordAnalyser | ||
LanguageIdentification | Identifies the language of a document using character n-grams | |
POSTagMapper | ||
SimpleNounChunking | ||
StopwordMarker | ||