Books about GATE

Natural Language Processing for the Semantic Web

Diana Maynard, Kalina Bontcheva, Isabelle Augenstein, December 2016

Introduces the main NLP components, giving step-by-step examples of NLP pipelines from GATE and other NLP toolkits, plus several applications and a wider discussion of GATE and similar tools in real-life use. An excellent introduction to the basics of NLP in a practical setting.

On the Morgan and Claypool website.

This book introduces core natural language processing (NLP) technologies to non-experts in an easily accessible way, as a series of building blocks that lead the user to understand key technologies, why they are required, and how to integrate them into Semantic Web applications. Natural language processing and Semantic Web technologies have different but complementary roles in data management. Combining these two technologies enables structured and unstructured data to merge seamlessly. Semantic Web technologies aim to convert unstructured data to meaningful representations, which benefit enormously from the use of NLP technologies, thereby enabling applications such as connecting text to Linked Open Data, connecting texts to each other, semantic searching, information visualization, and modeling of user behavior in online networks.

The first half of this book describes the basic NLP processing tools: tokenization, part-of-speech tagging, and morphological analysis, in addition to the main tools required for an information extraction system (named entity recognition and relation extraction) which build on these components. The second half of the book explains how Semantic Web and NLP technologies can enhance each other, for example via semantic annotation, ontology linking, and population. These chapters also discuss sentiment analysis, a key component in making sense of textual data, and the difficulties of performing NLP on social media, as well as some proposed solutions. The book finishes by investigating some applications of these tools, focusing on semantic search and visualization, modeling user behavior, and an outlook on the future.

Text Processing with GATE (Version 6)

The GATE Team, 2011

Revised and expanded version of the GATE user and developer guide. On Amazon.

The blurb: "GATE is a free open-source infrastructure for developing and deploying software components that process human language. It is more than 15 years old and is in active use for all types of computational tasks involving language (frequently called natural language processing, text analytics, or text mining). GATE excels at text analysis of all shapes and sizes. From large corporations to small startups, from multi-million research consortia to undergraduate projects, our user community is the largest and most diverse of any system of this type, and is active world-wide. This book contains a highly accessible introduction to GATE Version 6 and is the first port of call for all GATE-related questions. It includes a guide to using GATE Developer and GATE Embedded, and chapters on all major areas of functionality, such as processing multiple languages and large collections of unstructured text. It also includes complete plugin documentation (e.g. named entity recognition, parsing, semantic analysis), as well as details on other members of the GATE Family: GATECloud.net, Teamware, and Mimir. To join the GATE community visit GATE.ac.uk."

Building Search Applications with Lucene, LingPipe and GATE

Manu Konchady, 2008

"A practical guide to building search applications using open source software.

Lucene, LingPipe, and GATE are popular open source tools to build powerful search applications. Building Search Applications describes functions from GATE that include entity extraction, part of speech tagging, sentence extraction, and text tokenization.

The book also explains spell check, phrase extraction, index and search, sentiment analysis, clustering, and categorization using Lucene and LingPipe."

On Amazon.

Linguistic Annotation and Text Analytics

Graham Wilcock, 2009

"The maturity of the software means that GATE is a robust and reliable platform..." [p.95]

Linguistic annotation and text analytics are active areas of research and development, with academic conferences and industry events such as the Linguistic Annotation Workshops and the annual Text Analytics Summits. This book provides a basic introduction to both fields, and aims to show that good linguistic annotations are the essential foundation for good text analytics. After briefly reviewing the basics of XML, with practical exercises illustrating in-line and stand-off annotations, a chapter is devoted to explaining the different levels of linguistic annotations. The reader is encouraged to create example annotations using the WordFreak linguistic annotation tool. The next chapter shows how annotations can be created automatically using statistical NLP tools, and compares two sets of tools, the OpenNLP and Stanford NLP tools. The second half of the book describes different annotation formats and gives practical examples of how to interchange annotations between different formats using XSLT transformations. The two main text analytics architectures, GATE and UIMA, are then described and compared, with practical exercises showing how to configure and customize them. The final chapter is an introduction to text analytics, describing the main applications and functions including named entity recognition, coreference resolution and information extraction, with practical examples using both open source and commercial tools.

Publisher site; on Amazon.

The GATE User Guide

The GATE Team, 2010

522 pages of the raciest text processing plots, subplots and appendices that the publishing world has ever known. PDF; HTML; printed version coming soon.