This page lists some of the plugins that are currently available with GATE.
For more information on how the plugins work, see the online user guide "Developing Language Processing Components with GATE".
To submit a plugin, please contact us via the gate-users mailing list.
Plugins included in the GATE distribution
- AlchemyAPI
- Alignment
- ANNIE
- Annotation_Merging
- Copy_Annots_Between_Docs
- Coref_Tools
- Crowd_Sourcing
- Developer_Tools
- DocumentNormalizer
- Format_CSV
- Format_DataSift
- Format_FastInfoset
- Format_HTML5Microdata
- Format_MediaWiki
- Format_PubMed
- Format_Twitter
- Gazetteer_LKB
- Gazetteer_Ontology_Based
- GENIA
- Groovy
- Information_Retrieval
- Inter_Annotator_Agreement
- JAPE_Plus
- Keyphrase_Extraction_Algorithm
- Language_Identification
- Lang_Arabic
- Lang_Bulgarian
- Lang_Cebuano
- Lang_Chinese
- Lang_Danish
- Lang_French
- Lang_German
- Lang_Hindi
- Lang_Romanian
- Lang_Russian
- Lang_Welsh
- Learning
- LingPipe
- Linguistic_Simplifier
- Machine_Learning
- Ontology
- Ontology_BDM_Computation
- Ontology_Tools
- OpenNLP
- Parser_RASP
- Parser_SUPPLE
- Schema_Annotation_Editor
- Schema_Tools
- Stanford_CoreNLP
- Stemmer_Snowball
- Tagger_Abner
- Tagger_Boilerpipe
- Tagger_Chemistry
- Tagger_DateNormalizer
- Tagger_Framework
- Tagger_GATE-Time
- Tagger_Lupedia
- Tagger_Measurements
- Tagger_MetaMap
- Tagger_MutationFinder
- Tagger_NP_Chunking
- Tagger_Numbers
- Tagger_PennBio
- Tagger_TextRazor
- Teamware_Tools
- TermRaider
- Text_Categorization
- Tools
- UIMA
- Web_Crawler_Websphinx
- WordNet
AlchemyAPI | ||
---|---|---|
AlchemyAPI: Entity Extraction | Runs the AlchemyAPI Entity Extraction service on a GATE document | gate.alchemyAPI.EntityExtraction |
AlchemyAPI: Keyword Extraction | Runs the AlchemyAPI Keyword Extraction service on a GATE document | gate.alchemyAPI.KeywordExtraction |
Alignment | ||
Compound Document | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentImpl |
Compound Document From Xml | GATE Compound Document. (docs) | gate.compound.impl.CompoundDocumentFromXml |
Compound Document Editor | Editor for compound documents. (docs) | gate.compound.gui.CompoundDocumentEditor |
GATE Composite document | GATE Composite document. (docs) | gate.composite.impl.CompositeDocumentImpl |
Switch Member PR | Sets the focus of a compound document to a specified member document. (docs) | gate.compound.impl.SwitchMemberPR |
Delete Member PR | Deletes one member document from a compound doc. (docs) | gate.compound.impl.DeleteMemberPR |
Combine Members PR | Combines documents in a composite document. (docs) | gate.composite.impl.CombineMembersPR |
Segment Processing PR | Processes individual segments as separate documents (docs) | gate.composite.impl.SegmentProcessingPR |
ExportAlignmentPR | A PR to export alignment information in an xml file. | gate.alignment.ExportAlignmentPR |
ANNIE | ||
GATE Unicode Tokeniser | A customisable Unicode tokeniser. (docs) | gate.creole.tokeniser.SimpleTokeniser |
ANNIE English Tokeniser | A customisable English tokeniser. (docs) | gate.creole.tokeniser.DefaultTokeniser |
ANNIE Gazetteer | A list lookup component. (docs) | gate.creole.gazetteer.DefaultGazetteer |
Sharable Gazetteer | A list lookup component. (docs) | gate.creole.gazetteer.SharedDefaultGazetteer |
Hash Gazetteer | A list lookup component implemented by OntoText Lab. The licence information is also available in licence.ontotext.html in the lib folder of GATE (docs) | com.ontotext.gate.gazetteer.HashGazetteer |
JAPE Transducer | A module for executing Jape grammars. (docs) | gate.creole.Transducer |
ANNIE NE Transducer | ANNIE named entity grammar. (docs) | gate.creole.ANNIETransducer |
ANNIE Sentence Splitter | ANNIE sentence splitter. (docs) | gate.creole.splitter.SentenceSplitter |
RegEx Sentence Splitter | A sentence splitter based on regular expressions. (docs) | gate.creole.splitter.RegexSentenceSplitter |
ANNIE POS Tagger | Mark Hepple's Brill-style POS tagger (docs) | gate.creole.POSTagger |
ANNIE OrthoMatcher | ANNIE orthographical coreference component. (docs) | gate.creole.orthomatcher.OrthoMatcher |
ANNIE Pronominal Coreferencer | Pronominal Coreference resolution component. (docs) | gate.creole.coref.Coreferencer |
ANNIE Nominal Coreferencer | Nominal Coreference resolution component (docs) | gate.creole.coref.NominalCoref |
Document Reset PR | Remove named annotation sets or reset the default annotation set (docs) | gate.creole.annotdelete.AnnotationDeletePR |
Jape Viewer | A JAPE grammar file viewer (docs) | gate.gui.jape.JapeViewer |
Gazetteer Editor | Gazetteer viewer and editor. (docs) | gate.gui.GazetteerEditor |
Annotation_Merging | ||
Annotation Merging PR | Merge Annotations from different annotators. (docs) | gate.merger.AnnotationMergingMain |
Copy_Annots_Between_Docs | ||
Copy Anns to Another Doc PR | Copy the annotations from one document to another document. (docs) | gate.copyAS2AnoDoc.CopyAS2AnoDocMain |
Coref_Tools | ||
Legacy Coref Data Writer | A simple PR that converts co-reference data from the Relations-based model to the legacy format (based on 'matches' annotation and document features). | gate.creole.coref.LegacyCorefDataWriter |
OrthoRef | An orthographic coreferencer | gate.creole.coref.OrthoRef |
Crowd_Sourcing | ||
Entity Classification Job Builder | Build a CrowdFlower job asking users to select the right label for entities (docs) | gate.crowdsource.classification.EntityClassificationJobBuilder |
Entity Classification Results Importer | Import judgments from a CrowdFlower job created by the Entity Classification Job Builder as GATE annotations. (docs) | gate.crowdsource.classification.EntityClassificationResultsImporter |
Majority-vote consensus builder (classification) | Process results of a crowd annotation task to find where annotators agree and disagree. (docs) | gate.crowdsource.classification.MajorityVoteClassificationConsensus |
Entity Annotation Job Builder | Build a CrowdFlower job asking users to annotate entities within a snippet of text (docs) | gate.crowdsource.ne.EntityAnnotationJobBuilder |
Entity Annotation Results Importer | Import judgments from a CrowdFlower job created by the Entity Annotation Job Builder as GATE annotations. (docs) | gate.crowdsource.ne.EntityAnnotationResultsImporter |
Majority-vote consensus builder (annotation) | Process results of a crowd annotation task to find where annotators agree and disagree. (docs) | gate.crowdsource.ne.MajorityVoteAnnotationConsensus |
Developer_Tools | ||
EDT Monitor | Warns whenever an AWT component is updated from anywhere other than the event dispatch thread (docs) | gate.creole.EDTMonitor |
Java Heap Dumper | Dumps the Java heap to the specified file (docs) | gate.creole.HeapDumper |
Log4J Level: ALL | Allows the Log4J log level to be set to ALL from within the GUI (docs) | gate.creole.Log4JALL |
Show/Hide Resources | Show resources that would otherwise be hidden, e.g. resources created for internal use by other resources (docs) | gate.creole.Reveal |
The Duplicator | Duplicate any resource with a right click menu option (docs) | gate.creole.TheDuplicator |
Unload Unused Plugins | Unloads all plugins for which we cannot find any loaded instances (docs) | gate.creole.UnusedPluginUnloader |
DocumentNormalizer | ||
Document normalizer | Normalize document content to remove "smart quotes" etc. (docs) | gate.creole.DocumentNormalizer |
Format_CSV | ||
CSV Corpus Populater | Populate a corpus from CSV files (docs) | gate.corpora.CSVImporter |
Format_DataSift | ||
GATE DataSift JSON Document Format | Format parser for DataSift JSON files | gate.corpora.DataSiftFormat |
Format_FastInfoset | ||
Fast Infoset Document Format | Format parser for GATE XML stored in the binary Fast Infoset format (docs) | gate.corpora.FastInfosetDocumentFormat |
Fast Infoset Exporter | Export GATE documents to GATE XML stored in the binary Fast Infoset format (docs) | gate.corpora.FastInfosetExporter |
Format_HTML5Microdata | ||
HTML5 Microdata Exporter | Exports Annotations as HTML5 Microdata | gate.creole.microdata.MicrodataExporter |
Format_MediaWiki | ||
MediaWiki Document Format | Document format for parsing MediaWiki markup (docs) | gate.corpora.MediaWikiDocumentFormat |
MediaWiki Corpus Populater | Populate a corpus from a MediaWiki XML dump (docs) | gate.corpora.MediaWikiPopulater |
MediaWiki XML Document Format | Deprecated MediaWiki importer | gate.corpora.MediaWikiXMLDocumentFormat |
Format_PubMed | ||
GATE .cochrane.txt document format | Load this to allow the opening of Cochrane text documents, and choose the mime type "text/x-cochrane", or use the correct file extension. | gate.corpora.CochraneTextDocumentFormat |
GATE .pubMed.txt document format | Load this to allow the opening of PubMed text documents, and choose the mime type "text/x-pubmed", or use the correct file extension. | gate.corpora.PubmedTextDocumentFormat |
Format_Twitter | ||
GATE JSON Tweet Document Format | Format parser for Twitter JSON files (docs) | gate.corpora.JSONTweetFormat |
GATE JSON Exporter | Export documents and corpora in JSON format | gate.corpora.export.GATEJsonExporter |
Twitter Corpus Populator | Populate a corpus from Twitter JSON containing multiple Tweets (docs) | gate.corpora.twitter.Population |
Gazetteer_LKB | ||
Large KB Gazetteer | KIM KB based alias-lookup component (docs) | com.ontotext.kim.gate.KimGazetteer |
Semantic Enrichment PR | The Semantic Enrichment PR allows adding new data to semantic annotations by querying external RDF (Linked Data) repositories. (docs) | com.ontotext.kim.gate.SesameEnrichment |
Gazetteer_Ontology_Based | ||
Onto Root Gazetteer | An ontology lookup component (docs) | gate.clone.ql.OntoRootGaz |
GENIA | ||
GENIA Sentence Splitter | A processing resource that takes document and corpus parameters (docs) | gate.creole.genia.splitter.GENIASentenceSplitter |
Groovy | ||
Groovy support for GATE | | gate.groovy.GroovySupport |
Groovy scripting PR | Runs a Groovy script as a processing resource (docs) | gate.groovy.ScriptPR |
Scriptable Controller | A controller whose execution strategy is controlled by a Groovy script (docs) | gate.groovy.ScriptableController |
Control Script | Editor for the Groovy script controlling a scriptable controller | gate.groovy.gui.ControllerScriptEditor |
Script Editor | Editor for the Groovy script behind this PR | gate.groovy.gui.ScriptPREditor |
Information_Retrieval | ||
SearchPR | Provides IR functionality. (docs) | gate.creole.ir.SearchPR |
Search Results | Viewer for IR search results | gate.gui.SearchPRViewer |
Corpus Indexing Support | | gate.creole.ir.CorpusIndexingSupport |
Lucene IR Engine | | gate.creole.ir.lucene.LuceneIREngine |
Inter_Annotator_Agreement | ||
IAA Computation PR | Compute inter-annotator agreement (IAA). (docs) | gate.iaaplugin.IaaMain |
JAPE_Plus | ||
JAPE-Plus Viewer | A JAPE grammar file viewer (docs) | gate.gui.jape.plus.Viewer |
JAPE-Plus Transducer | An optimised, JAPE-compatible transducer. | gate.jape.plus.Transducer |
Keyphrase_Extraction_Algorithm | ||
KEA Keyphrase Extractor | A Keyphrase Extractor by Eibe Frank. (docs) | gate.creole.kea.Kea |
KEA Corpus Importer | Imports a KEA-style corpus into GATE | gate.creole.kea.CorpusImporter |
Language_Identification | ||
TextCat Fingerprint Generator | Generate language fingerprints for use with the TextCat Language Identification PR (docs) | org.knallgrau.utils.textcat.FingerprintGenerator |
TextCat Language Identification | Recognizes the document language using TextCat (docs) | org.knallgrau.utils.textcat.LanguageIdentifier |
Lang_Arabic | ||
Arabic Gazetteer Collector | | arabic.ArabicGazCollector |
Arabic Gazetteer | A list lookup component. (docs) | arabic.ArabicGazetteer |
Arabic IE System | Ready-made Arabic IE application | arabic.ArabicIE |
Arabic Infered Gazetteer | A list lookup component. (docs) | arabic.ArabicInferedGazetteer |
Arabic OrthoMatcher | ANNIE orthographical coreference component. (docs) | arabic.ArabicOrthoMatcher |
Arabic Tokeniser | A customisable English tokeniser. (docs) | arabic.ArabicTokeniser |
Arabic Main Grammar | A module for executing Jape grammars. (docs) | arabic.ArabicTransducer |
Lang_Bulgarian | ||
BulStem | This plugin is an implementation of the BulStem stemmer algorithm for Bulgarian developed by Preslav Nakov. (docs) | gate.bulstem.BulStemPR |
Lang_Cebuano | ||
Cebuano POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
Cebuano Gazetteer | A list lookup component. (docs) | cebuano.CebuanoGazetteer |
Cebuano Gazetteer Tokeniser | A list lookup component. (docs) | cebuano.CebuanoGazetteerTokeniser |
Cebuano IE System | Ready-made Cebuano IE application | cebuano.CebuanoIE |
Cebuano Tokeniser | A customisable English tokeniser. (docs) | cebuano.CebuanoTokeniser |
Cebuano Transducer | A module for executing Jape grammars. (docs) | cebuano.CebuanoTransducer |
Cebuano Transducer Postprocessor | A module for executing Jape grammars. (docs) | cebuano.CebuanoTransducerPost |
Lang_Chinese | ||
Chinese Segmenter PR | Segment the Chinese text into words, based on the PAUM learning algorithm. (docs) | gate.chineseSeg.ChineseSegMain |
Chinese IE System | Ready-made Chinese IE application | chinese.ChineseIE |
Lang_Danish | ||
Danish Gazetteer | Person, Location and Organisation gazetteers for Danish (docs) | danish.DanishGazetteer |
Danish IE System | Ready-made Danish IE application | danish.DanishIE |
Danish Sentence Splitter | ANNIE sentence splitter. (docs) | danish.DanishSentenceSplitter |
Danish Tokeniser | PAROLE tokeniser, for Danish (docs) | danish.DanishTokeniser |
Lang_French | ||
French IE System | Ready-made French IE application | french.FrenchIE |
Lang_German | ||
German IE System | Ready-made German IE application | german.GermanIE |
Lang_Hindi | ||
Hindi Tokeniser | A customisable Hindi tokeniser. | hindi.HindiTokeniser |
Hindi Gazetteer | A list lookup component. | hindi.HindiGazetteer |
Hindi Splitter | A Sentence Splitter. | hindi.HindiSplitter |
Hindi Tokeniser Gazetteer | A list lookup component. | hindi.HindiTokeniserGazetteer |
Hindi Main Grammar | A module for executing Jape grammars | hindi.HindiTransducer |
Hindi Tokeniser Postprocessor | A module for executing Jape grammars | hindi.HindiTokeniserPostprocessor |
Hindi OrthoMatcher | Hindi Orthomatcher | hindi.HindiOrthoMatcher |
Hindi POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for languages where entries are multiword | cebtag.postag.CebuanoPOSTagger |
Lang_Romanian | ||
Romanian Tokeniser | A customisable Romanian tokeniser. | romanian.RomanianTokeniser |
Romanian Gazetteer | A list lookup component. | romanian.RomanianGazetteer |
Romanian Transducer | A module for executing Jape grammars | romanian.RomanianTransducer |
Romanian IE System | Ready-made Romanian IE application | romanian.RomanianIE |
Lang_Russian | ||
RussIE | Basic version of the RussIE application (docs) | com.ontotext.russie.apps.RussIE |
RussIE + Inflectional Gazetteer | RussIE application with inflexional gazetteer (docs) | com.ontotext.russie.apps.RussIEInflex |
RussIE + OrthoMatcher | RussIE application with orthomatcher (docs) | com.ontotext.russie.apps.RussIEOrtho |
RussIE + Inflectional Gazetteer & OrthoMatcher | RussIE application with orthomatcher and inflexional gazetteer (docs) | com.ontotext.russie.apps.RussIEOrthoInflex |
Inflectional gazetteer | Gazetteer with support for inflectional morphology (docs) | com.ontotext.russie.gazetteer.InflectionalGazetteer |
Russian Gazetteer | Customised version of the hash gazetteer (docs) | com.ontotext.russie.gazetteer.RussGazetteer |
POS Mapper | Map complex Russian morphology tags into simpler POS categories (docs) | com.ontotext.russie.morph.POSMapper |
Russian POS Tagger | Part-of-speech tagger for Russian (docs) | com.ontotext.russie.morph.POSTagger |
Lang_Welsh | ||
CYMRIE | Welsh Information Extraction Application | wnlt.CYMRIE |
Welsh Gazetteer | Welsh Gazetteer (docs) | wnlt.WelshGazetteer |
Welsh NE Transducer | Welsh named entity grammar (docs) | wnlt.WelshNE |
Welsh POS Tagger | Mark Hepple's Brill-style POS tagger, adapted for Welsh (docs) | wnlt.WelshPOSTagger |
Welsh Sentence Splitter | ANNIE sentence splitter. (docs) | wnlt.WelshSentenceSplitter |
Welsh Tokeniser | A customisable English tokeniser. (docs) | wnlt.WelshTokeniser |
Welsh Morphological Analyser | Morphological Analyzer of the Welsh Natural Language Toolkit | wnlt.morph.WelshMorph |
Learning | ||
Batch Learning PR | Supports training, application and evaluation of machine learning models for NLP tasks (docs) | gate.learning.LearningAPIMain |
LingPipe | ||
LingPipe Tokenizer PR | Provides a LingPipe tokenizer. (docs) | gate.lingpipe.TokenizerPR |
LingPipe NER PR | LingPipe Named Entity Recognizer (docs) | gate.lingpipe.NamedEntityRecognizerPR |
LingPipe Language Identifier PR | GATE PR for language identification using LingPipe (docs) | gate.lingpipe.LanguageIdentifierPR |
LingPipe POS Tagger PR | Provides a LingPipe part of speech tagger. (docs) | gate.lingpipe.POSTaggerPR |
LingPipe Sentence Splitter PR | Provides an interface to LingPipe sentence splitter API. (docs) | gate.lingpipe.SentenceSplitterPR |
Linguistic_Simplifier | ||
Simplified Text Exporter | Simplified text exporter (HTML output) | gate.corpora.export.ExportSimplifiedHTML |
Simplified Text Exporter | Simplified text exporter (plain text output) | gate.corpora.export.ExportSimplifiedText |
Linguistic Simplifier | A processing resource that takes document and corpus parameters | gate.creole.summarization.linguistic.Simplifier |
Linguistic Simplifier | Example application for the linguistic simplifier | gate.creole.summarization.linguistic.SimplifierApplication |
Machine_Learning | ||
Machine Learning PR | Trains a machine learning algorithm from a corpus. For new code, consider using the "learning" plugin instead. (docs) | gate.creole.ml.MachineLearningPR |
Ontology | ||
ConnectSesameOntology | Connect to a repository containing an ontology (docs) | gate.creole.ontology.impl.sesame.ConnectSesameOntology |
CreateSesameOntology | Create an ontology from a Sesame configuration file for a repository (docs) | gate.creole.ontology.impl.sesame.CreateSesameOntology |
OWLIM Ontology | Ontology created as a temporary OWLIM3 in-memory repository (docs) | gate.creole.ontology.impl.sesame.OWLIMOntology |
OWLIM Ontology DEPRECATED | Ontology created as a temporary OWLIM3 in-memory repository, for backwards compatibility only (docs) | gate.creole.ontology.owlim.OWLIMOntologyLR |
Ontology_BDM_Computation | ||
BDM Computation PR | Compute BDM score for each pair of concepts in the given ontology. (docs) | gate.bdmComp.BDMCompMain |
Ontology_Tools | ||
OntoGazetteer | A list lookup component based on mapping between ontology classes and gazetteer lists. (docs) | gate.creole.gazetteer.OntoGazetteerImpl |
GATE Ontology Editor | Ontology editing tool. (docs) | gate.gui.ontology.OntologyEditor |
OAT | Ontology Annotation Tool. (docs) | gate.creole.ontology.ocat.OntologyViewer |
RAT-C | Relation Annotation Tool Class view. (docs) | gate.gui.docview.OntologyClassView |
RAT-I | Relation Annotation Tool Instance view. (docs) | gate.gui.docview.OntologyInstanceView |
GAZE | Gazetteer viewer and editor (docs) | com.ontotext.gate.vr.Gaze |
OpenNLP | ||
OpenNLP NER | NER PR using a set of OpenNLP maxent models (docs) | gate.opennlp.OpenNLPNameFin |
OpenNLP Chunker | Chunker using an OpenNLP maxent model (docs) | gate.opennlp.OpenNlpChunker |
OpenNLP POS Tagger | POS Tagger using an OpenNLP maxent model (docs) | gate.opennlp.OpenNlpPOS |
OpenNLP Parser | Syntactic parser from Apache OpenNLP (docs) | gate.opennlp.OpenNlpParser |
OpenNLP Sentence Splitter | Sentence splitter using an OpenNLP maxent model (docs) | gate.opennlp.OpenNlpSentenceSplit |
OpenNLP Tokenizer | Tokenizer using an OpenNLP maxent model (docs) | gate.opennlp.OpenNlpTokenizer |
Parser_RASP | ||
RASP2 Tokenizer | RASP2 Tokenizer. Faster than the original GATE component but generates Tokens which have only a 'string' feature. Requires annotations of type Sentence. See RASP package for platform restrictions. (docs) | com.digitalpebble.rasp2.token.RASPTokenizer |
RASP POS Converter | Converts from PennTreebank POS tags to the C2 tagset used by RASP. Generates annotations of type MorphObj which hold the tag and lemma (docs) | com.digitalpebble.rasp2.tagger.C2Transducer |
RASP2 POS Tagger | RASP part-of-speech tagger, creating WordForm annotations (docs) | com.digitalpebble.rasp2.tagger.PosTagger |
RASP2 Morphological Analyser | RASP morphological analyser, which adds lemma and suffix to the WordForm annotations produced by the RASP POS tagger (or the ANNIE POS tagger plus the RASP converter) (docs) | com.digitalpebble.rasp2.morph.MorphoAnnotator |
RASP2 Parser | RASP dependency parser (docs) | com.digitalpebble.rasp2.parser.ParserAnnotator |
Parser_SUPPLE | ||
SUPPLE Parser | SUPPLE bottom-up chart parser. (docs) | shef.nlp.supple.SUPPLE |
Schema_Annotation_Editor | ||
Schema Annotations Editor | An annotation editor restricted by schemas. (docs) | gate.gui.annedit.SchemaAnnotationEditor |
Schema_Tools | ||
Schema Enforcer | Produces an annotation set whose content is restricted by the specified set of schemas (docs) | gate.creole.schema.SchemaEnforcer |
Simple Schema Viewer | A Simple Annotation Schema Viewer | gate.gui.schema.SimpleSchemaViewer |
Stanford_CoreNLP | ||
Stanford NER | Stanford Named Entity Recogniser (docs) | gate.stanford.NER |
StanfordParser | Stanford parser wrapper (docs) | gate.stanford.Parser |
Stanford POS Tagger | Stanford Part-of-Speech Tagger (docs) | gate.stanford.Tagger |
Stanford PTB Tokenizer | Stanford Penn Treebank v3 Tokenizer, for English (docs) | gate.stanford.Tokenizer |
English Dependency Parser | Ready-made application for Stanford English parser | gate.stanford.apps.EnglishDependencies |
English POS Tagger and Dependency Parser | Ready-made application for Stanford English POS tagger and parser | gate.stanford.apps.EnglishPOSDependencies |
Stemmer_Snowball | ||
Stemmer PR | Wrapper for the Snowball stemmer. (docs) | stemmer.SnowballStemmer |
Tagger_Abner | ||
ABNER Tagger | GATE wrapper over ABNER (docs) | gate.abner.AbnerTagger |
Tagger_Boilerpipe | ||
Boilerpipe Content Detection | Uses boilerpipe to determine which sections of a document are interesting content and which are just boilerplate (docs) | gate.creole.boilerpipe.BoilerPipe |
Tagger_Chemistry | ||
Chemistry Tagger | A tagger for chemical names. (docs) | mark.chemistry.Tagger |
Tagger_DateNormalizer | ||
Date Annotation Normalizer | Provides normalized values for all existing date annotations (docs) | gate.creole.dates.DateAnnotationNormalizer |
Date Normalizer | Provides normalized values for all known dates (docs) | gate.creole.dates.DateNormalizer |
Tagger_Framework | ||
GenericTagger | The Generic Tagger is Generic! (docs) | gate.taggerframework.GenericTagger |
Tagger_GATE-Time | ||
DCTParser | DCTParser finds document creation times (DCTs) so that the HeidelTime GATE wrapper can be run on news-style corpora with differing DCTs. (docs) | de.mpii.nlp.gate.heideltime.DCTParser |
HeidelTime | HeidelTime GATE wrapper, i.e. the HeidelTime plugin for GATE. (docs) | de.mpii.nlp.gate.heideltime.HeideltimeWrapper |
TimeML Event Detection | TimeML Event Detection Application | gate.creole.time.TimeMLEventDetection |
Tagger_Lupedia | ||
Lupedia Service PR | Runs a lupedia annotation service on a GATE document (docs) | gate.lupedia.LupediaServicePR |
Tagger_Measurements | ||
ANNIE+Measurements | Ready-made application for ANNIE plus the measurement tagger | gate.creole.measurements.ANNIEMeasurements |
Measurements | Ready-made application for measurement annotator | gate.creole.measurements.MeasurementsApplication |
Measurement Tagger | A measurement tagger based upon GNU Units | gate.creole.measurements.MeasurementsTagger |
Tagger_MetaMap | ||
MetaMap Annotator | This plugin uses the MetaMap Java API to send GATE document content to the MetaMap skrmedpostctl server and PrologBeans mmserver instances running on the given machine/port (docs) | gate.metamap.MetaMapPR |
Tagger_MutationFinder | ||
MutationFinder | GATE MutationFinder Wrapper (docs) | gate.creole.mutationfinder.MutationFinderPR |
Tagger_NP_Chunking | ||
Noun Phrase Chunker | Implementation of the Ramshaw and Marcus base noun phrase chunker (docs) | mark.chunking.GATEWrapper |
Noun Phrase Chunker | Ready-made NP chunking application | mark.chunking.ChunkingApp |
Tagger_Numbers | ||
Numbers Tagger | Finds numbers (in both words and digits) and annotates them with their numeric value (docs) | gate.creole.numbers.NumbersTagger |
Roman Numerals Tagger | Finds and annotates Roman numerals (docs) | gate.creole.numbers.RomanNumeralsTagger |
Tagger_PennBio | ||
Penn BioTagger | Ready-made application for the Penn BioTagger | gate.creole.pennbio.BioTagger |
Penn BioTagger: Genes | Penn BioTagger for Genes (docs) | gate.creole.pennbio.GeneTagger |
Penn BioTagger: Malignancy | Penn BioTagger for malignancy types (docs) | gate.creole.pennbio.MalignancyTagger |
Penn BioTokenizer | Tokenizer for biomedical text (docs) | gate.creole.pennbio.Tokenizer |
Penn BioTagger: Variation | Penn BioTagger for variations (docs) | gate.creole.pennbio.VariationTagger |
Tagger_TextRazor | ||
TextRazor Service PR | Runs the TextRazor annotation service (http://textrazor.com) on a GATE document (docs) | gate.textrazor.TextRazorServicePR |
Teamware_Tools | ||
QA Summariser for Teamware | The Quality Assurance PR for Teamware (docs) | gate.qa.QAForTeamwarePR |
TermRaider | ||
PMI Example (English) | Example application for the PMI (pointwise mutual information) tool | gate.termraider.PMIExample |
TermRaider English Term Extraction | Example application showing typical set-up for the TermRaider tools | gate.termraider.TermRaiderEnglish |
Termbank Score Copier | Copy scores from Termbanks back to their source annotations (docs) | gate.termraider.apply.TermScoreCopier |
AnnotationTermbank | TermRaider Termbank derived from document annotations (docs) | gate.termraider.bank.AnnotationTermbank |
DocumentFrequencyBank | Document frequency counter derived from corpora and other DFBs (docs) | gate.termraider.bank.DocumentFrequencyBank |
HyponymyTermbank | TermRaider Termbank derived from head/string hyponymy (docs) | gate.termraider.bank.HyponymyTermbank |
PMI Bank | Pointwise Mutual Information from corpora (docs) | gate.termraider.bank.PMIBank |
TfIdfTermbank | TermRaider Termbank derived from vectors in document features (docs) | gate.termraider.bank.TfIdfTermbank |
Pairbank Viewer | Viewer for the TermRaider Pairbank (docs) | gate.termraider.gui.PairbankViewer |
Termbank Viewer | Viewer for the TermRaider Termbank (docs) | gate.termraider.gui.TermbankViewer |
Text_Categorization | ||
Text Categorization PR | Classify text based on a semantic space | gate.ml.categorization.TextCategorizationPR |
Tools | ||
Gazetteer List Collector | Gazetteer lists collector. (docs) | gate.creole.GazetteerListsCollector |
ANNIE VP Chunker | ANNIE VP Chunker component. (docs) | gate.creole.VPChunker |
Annotation Set Transfer | Annotation set transfer component. (docs) | gate.creole.annotransfer.AnnotationSetTransfer |
Flexible Exporter | Exports a document with GATE annotations to its original format. (docs) | gate.creole.dumpingPR.DumpingPR |
GATE Morphological analyser | Morphological Analyzer for the English Language. (docs) | gate.creole.morph.Morph |
Flexible Gazetteer | A more flexible list lookup component. (docs) | gate.creole.gazetteer.FlexibleGazetteer |
Syntax tree viewer | Viewer for syntax trees generated by a parser. (docs) | gate.gui.SyntaxTreeViewer |
Configurable Exporter | Allows annotations to be exported according to a specified format. | gate.configurableexporter.ConfigurableExporter |
Quality Assurance PR | The Quality Assurance PR provides the functionality of the Corpus QA Tool in GATE Developer | gate.qa.QualityAssurancePR |
Hashtag Tokenizer | Tokenizes Multi-Word Hashtags (docs) | gate.twitter.HashtagTokenizer |
Tweet Normaliser | Normalise text in tweets (convert spelling mistakes, colloquialisms, typing variations and so on into standard English) (docs) | gate.twitter.Normaliser |
TwitIE (EN) | English TwitIE application | gate.twitter.apps.TwitIEEN |
TwitIE PoS Tagger | English TwitIE part-of-speech tagger | gate.twitter.apps.TwitIEPOS |
Twitter POS Tagger (deprecated) | Transitional compatibility PR for Twitter POS tagger - new applications should use the Stanford tagger directly | gate.twitter.pos.POSTaggerEN |
Twitter Tokenizer (EN) | Tokenizer tuned for Tweets (docs) | gate.twitter.tokenizer.TokenizerEN |
UIMA | ||
UIMA Analysis Engine | Wrapper for a Text Analysis Engine from UIMA. (docs) | gate.uima.AnalysisEnginePR |
Web_Crawler_Websphinx | ||
Crawler PR | GATE implementation of the Websphinx crawling API (docs) | crawl.CrawlPR |
WordNet | ||
WordNet 1.6 | Princeton WordNet 1.6. (docs) | gate.wordnet.IndexFileWordNetImpl |
WordNet | WordNet (docs) | gate.wordnet.JWNLWordNetImpl |
WordNet Viewer | WordNet viewer | gate.gui.wordnet.WordNetViewer |
Other contributed plugins
- OrganismTagger
- Multi-lingual Noun Phrase Extractor (MuNPEx)
- Durm German lemmatizer
- S4 Annotator
- XCES tools
- Sen wrapper (Japanese morphological analyser)
- Russian morph tagger
- String Annotation
- GATE Application Documentation
- VirtualCorpus - Directory- and JDBC Corpus LR
- BWP Gazetteer
- Apolda
- Reported Speech Tagger
- Keyphrase extraction module from SmILE
OrganismTagger (website, download) | ||
---|---|---|
OrganismTagger | The OrganismTagger is a hybrid rule-based/machine-learning system that extracts organism mentions from the biomedical literature, normalizes them to their scientific name, and provides grounding to the NCBI Taxonomy database. | |
Multi-lingual Noun Phrase Extractor (MuNPEx) (website, download) | ||
Munpex | MuNPEx is a multi-lingual noun phrase (NP) extraction component developed for the GATE architecture, implemented in JAPE. It currently supports English, German, French, and Spanish (in beta). | en-np_main.jape |
Durm German lemmatizer (website, download) | ||
Durm German lemmatizer | The Durm German Lemmatization System consists of a number of GATE components and resources that perform morphological analysis and lemmatization for German nouns. | |
GATE Plugin for S4 (website; download via the update site) | ||
S4 Annotator | Provides access to the text analytics services of Ontotext's Self Service Semantic Suite (S4) directly from the GATE platform, via their RESTful APIs. The PR can be integrated in any GATE processing pipeline regardless of the context and it does not have any requirements or assumptions about the type of pre-processing or post-processing of the textual data being annotated. | com.ontotext.s4.api.S4Plugin |
XCES tools (website, download) | ||
ANC Document | An XCES document. Allows loading of the document text, plus some or all of the sets of standoff markup associated with the document. | org.xces.gate.XCESDocument |
ANC Load Standoff | Loads standoff annotations into an existing document. | org.xces.creole.LoadStandoff |
ANC Save Content | Saves just the text content of a document to a file. This will work for any document - it is not specific to ANC/XCES documents. | org.xces.creole.SaveContent |
ANC Save Standoff | Saves annotations from a Document to an XCES-compliant standoff markup file. | org.xces.creole.SaveStandoff |
Sen wrapper (Japanese morphological analyser) (website, in Japanese) | ||
Sen Wrapper | Morphological analyser for Japanese | jp.co.ditlab.jgate.SenWrapper |
Russian morph tagger (website, download) | ||
Russian MorphTagger | MorphTagger for the Russian language, based on Yandex's MyStem parser | ru.itbrains.gate.morph.MorphTagger |
String Annotation Plugin (website, download) | ||
Extended List Gazetteer | Extended version of the GATE Default List Gazetteer. In addition to the features of the original, built-in version of the List Gazetteer, this version provides features for more powerful matching of partial words and annotating prefixes and suffixes as well as more versatile handling of word boundaries and whitespace. | at.ofai.gate.extendedgazetteer.ExtendedGazetteer |
Simple Regexp Annotator | Use rules based on Java regular expressions to annotate the document content. | at.ofai.gate.regexpannotator.SimpleRegexpAnnotator |
AppDoc: GATE Application Documentation (website, download) | ||
AppDoc | Visual resource for adding author/version/comment to pipelines and processing resources | at.ofai.gate.appdoc.AppDoc |
AppDocGen | Visual resource for selecting a documentation template and generating documentation files | at.ofai.gate.appdoc.AppDocGen |
VirtualCorpus: Directory- and JDBC Corpus LRs (website, download) | ||
DirectoryCorpus | Language resource for accessing GATE XML files in a directory directly via a corpus resource | at.ofai.gate.virtualcorpus.DirectoryCorpus |
JDBCCorpus | Language resource for accessing GATE XML documents stored in a field of a JDBC database table directly via a corpus resource | at.ofai.gate.virtualcorpus.JDBCCorpus |
BWP Gazetteer (website, download) | ||
BWP Gazetteer | Extended version of the transducer-based List Gazetteer | |
Apolda (website, download) | ||
Apolda Ontology Annotator | Ontology-based lookup taking terms from properties in the ontology. | telin.Apolda |
Reported Speech Tagger (website, download) | ||
Reporting Verb marker | JAPE transducer which tags reporting verbs | |
Reported Speech finder | JAPE transducer which tags reported speech | |
Keyphrase Extraction Module (website, download) | ||
FrequencyAnalyser | ||
KeywordAnalyser | ||
LanguageIdentification | Identifies the language of a document using character n-grams | |
POSTagMapper | ||
SimpleNounChunking | ||
StopwordMarker | ||