New GATE stuff, spring 2011
We're now releasing three new GATE products into the wild:
- GATE Mimir multiparadigm indexing:
Concept search, full-text search and annotation structure search in one scalable index: "Mimir is a multi-paradigm information management index and repository which can be used to index and search over text, annotations, semantic schemas (ontologies), and semantic meta-data (instance data). It allows queries that arbitrarily mix full-text, structural, linguistic and semantic queries and that can scale to gigabytes of text. A typical semantic annotation project deals with large quantities of data of different kinds. Mimir provides a framework for implementing indexing and search functionality across all these data types."
- GATE Teamware 1.4:
Workflow-based manual and semi-automatic annotation projects with distributed collaborating teams, QA, and process reporting: "Teamware is a web-based management platform for collaborative annotation & curation. It is a cost-effective environment for annotation and curation projects, enabling you to harness a broadly distributed workforce and monitor progress & results remotely in real time. It’s also very easy to use. A new project can be up and running in less than five minutes. (As far as we know, there is nothing else like it in this field.)" See also this manual annotation guide.
- GATE Cloud:
Highly parallel GATE applications; GATE on the Amazon cloud; software as a service (for Teamware) and platform as a service (EC2)...: "The GATE Cloud Paralleliser is a software tool aimed at parallel execution of automatic [semantic] annotation processes. It is intended to process batches comprising large numbers of documents in a robust, long-running process. We currently apply it to datasets in the 10s of millions of (long) documents. (For 100s of millions of documents we combine it with cloud distribution mechanisms.)"
We also have several new facilities for GATE Developer/Embedded, which has reached version 6.1:
- sophisticated measurements and numbers taggers
- a new JAPE backend which delivers 2-3x speedup
- a new version of the alignment tools for dealing with parallel data in multiple languages
Everything is now in our SourceForge open source repository; release announcement coming soon...