Groovy recipes
- Groovy is a dynamic scripting language for the Java platform.
- Groovy support in GATE is provided by the Groovy plugin, which provides:
  - The Groovy scripting console in the GATE Developer GUI.
  - A Groovy script PR, which lets you use an arbitrary Groovy script in a GATE application pipeline.
  - A collection of extra methods on various GATE core types that can be used from Groovy code.
- See the user guide for details of all the above.
- These pages are intended to be a repository of sample scripts.
- There are just a few for now; add yours whenever you write something useful.
1. Filter by document feature
This one is from the user guide. It builds a new corpus from the loaded documents whose annotator feature is "fred".
Factory.newCorpus("fredsDocs").addAll(
  docs.findAll{
    it.features.annotator == "fred"
  }
)
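To see how the filter predicate behaves without a running GATE instance, here is a minimal sketch in plain Groovy. The maps are stand-ins for GATE documents (they are not GATE objects), and byFeature is a helper name invented for this illustration. It shows that a document lacking the feature is simply skipped, because the missing key yields null.

```groovy
// Stand-in "documents": plain maps, NOT real GATE Document objects.
def docs = [
    [name: 'doc1', features: [annotator: 'fred']],
    [name: 'doc2', features: [annotator: 'jane']],
    [name: 'doc3', features: [:]]               // no annotator feature at all
]

// Hypothetical helper: builds a predicate closure for any feature/value pair.
def byFeature = { String key, Object value ->
    return { doc -> doc.features[key] == value }
}

def fredsDocs = docs.findAll(byFeature('annotator', 'fred'))
println fredsDocs*.name
```

Since a GATE FeatureMap is a Map, the same closure should be usable with the console's real docs list, but that is untested here.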
2. Filter if annotation sets exist - e.g. double annotated
Useful for separating out all the double-annotated docs from a corpus.
Factory.newCorpus("doubleDocs").addAll(
  docs.findAll{
    (it.annotationSetNames.contains("annotator1")
      && it.annotationSetNames.contains("annotator2"))
  }
)
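If the list of required annotation sets grows, chaining contains calls gets unwieldy. A possible variant, sketched in plain Groovy with maps standing in for GATE documents (the set names and document names are illustrative only), uses containsAll on the set of names instead:

```groovy
// Stand-in "documents": plain maps, NOT real GATE Document objects.
def docs = [
    [name: 'doc1', annotationSetNames: ['annotator1', 'annotator2'] as Set],
    [name: 'doc2', annotationSetNames: ['annotator1'] as Set],
    [name: 'doc3', annotationSetNames: ['annotator1', 'annotator2', 'Key'] as Set]
]

// Set names every selected document must have (illustrative values).
def required = ['annotator1', 'annotator2']

// Keep only the documents that have all of the required sets.
def doubleDocs = docs.findAll { it.annotationSetNames.containsAll(required) }
println doubleDocs*.name
```

Extra sets on a document (such as Key above) do not stop it from matching; only missing ones do.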
3. Choose an app to execute
You can already conditionally execute PRs based on a document feature. By placing two pipelines as PRs in a third, conditional pipeline, you can extend this to execute a whole pipeline based on a document feature. This Groovy script goes one step further and chooses a pipeline to execute based on some other aspect of the document: in this case, the existence of a particular annotation set.
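// look up the two applications by name among those loaded in GATE
app1 = apps.find{ it.name == "app1" }
app2 = apps.find{ it.name == "app2" }
// withResource deletes the temporary corpus once the closure completes
Factory.newCorpus("tempCorpus").withResource { tempCorpus ->
  docs.each{
    // choose the application according to whether the annotation set exists
    def app = it.annotationSetNames.contains("annotator1") ? app1 : app2
    tempCorpus.add(it)
    app.setCorpus(tempCorpus)
    app.execute()
    tempCorpus.clear()
  }
}
println "done"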
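The per-document dispatch at the heart of this recipe can be tried outside GATE. In the sketch below, closures play the role of the two applications and plain maps play the role of documents; none of them are real GATE objects, and processedBy is a name invented here purely to record which stand-in application saw which document.

```groovy
// Stand-ins only: closures for the applications, maps for the documents.
def processedBy = [app1: [], app2: []]
def app1 = { doc -> processedBy.app1 << doc.name }
def app2 = { doc -> processedBy.app2 << doc.name }

def docs = [
    [name: 'doc1', annotationSetNames: ['annotator1'] as Set],
    [name: 'doc2', annotationSetNames: [] as Set],
    [name: 'doc3', annotationSetNames: ['annotator1', 'Key'] as Set]
]

// Route each document to one application based on its annotation sets.
docs.each { doc ->
    def app = doc.annotationSetNames.contains('annotator1') ? app1 : app2
    app(doc)
}
println processedBy
```

The loop uses each because only the side effect matters; findAll would also build a result list that is never used.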
4. How many annotations?
def sum = 0
docs.each{
  def filteredAnnots = it.getAnnotations("Filtered")  // the annotation set named "Filtered"
  def num = filteredAnnots["Anatomy"].size()          // annotations of type "Anatomy"
  sum += num
  println it.name + " " + num
}
println "total: " + sum
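The same tally can be written without a mutable counter. This is a plain-Groovy sketch in which each stand-in document is a map listing the types of its annotations (not a real GATE Document); it collects the per-document counts into a map and sums the values at the end.

```groovy
// Stand-in "documents": each map lists the types of its annotations.
def docs = [
    [name: 'doc1', types: ['Anatomy', 'Anatomy', 'Drug']],
    [name: 'doc2', types: ['Drug']],
    [name: 'doc3', types: ['Anatomy']]
]

// Per-document count of one annotation type, keyed by document name.
def counts = docs.collectEntries { [(it.name): it.types.count('Anatomy')] }
counts.each { name, num -> println "${name} ${num}" }

// Total across all documents.
def total = counts.values().sum()
println "total: ${total}"
```

Keeping the counts in a map makes it easy to sort or filter them afterwards, for example to list only the documents with no annotations of that type.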
5. Rename annotations
This one is for the Groovy PR, but could easily be adapted to the console using the ideas above. You could also parameterise it if needed; see the user guide for details.
inputAS.findAll{
  it.type == "OldName"
}.each{
  outputAS.add(it.start(), it.end(),
               "NewName",
               it.features.toFeatureMap()) // clone the feature map
}.each{
  inputAS.remove(it)
}
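A rough sketch of what a parameterised version might look like, assuming the old and new type names arrive in a map-like variable called scriptParams (check the user guide for how the Groovy PR actually passes parameters). So that it runs outside GATE, scriptParams is a plain map here and the annotation sets are lists of maps; none of these are real GATE objects, and the keys oldName and newName are invented for the example.

```groovy
// Stand-ins so the sketch runs outside GATE: a plain map for scriptParams and
// a list of maps for the annotation sets. None of these are real GATE objects.
def scriptParams = [oldName: 'OldName', newName: 'NewName']
def inputAS = [
    [type: 'OldName', start: 0, end: 4, features: [kind: 'x']],
    [type: 'Other',   start: 5, end: 9, features: [:]]
]
def outputAS = inputAS   // input and output are the same set in this sketch

// findAll returns a new list, so changing inputAS inside the loop is safe.
inputAS.findAll { it.type == scriptParams.oldName }.each { old ->
    outputAS << [type: scriptParams.newName, start: old.start, end: old.end,
                 features: new HashMap(old.features)]   // copy the features
    inputAS.remove(old)
}
println outputAS*.type
```

With real annotation sets, the add and remove calls from the recipe above would take the place of the list operations.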




