Groovy recipes
- Groovy is a scripting language
- Groovy is not available by default in GATE, and is not used in the core of GATE
- The Groovy scripting console is available as an add-on
- There is also a Groovy script processing resource (PR), which lets you run an arbitrary Groovy script as part of a GATE application pipeline
- See the user guide for details of both
- These pages are intended as a repository of sample scripts
- There are only a few so far; add your own whenever you write something useful
1. Filter by document feature
This recipe also appears in the user guide.
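// Collect every document whose "annotator" feature is "fred" into a new corpus.
// Groovy's == is null-safe, so documents without the feature are skipped.
factory.newCorpus("fredsDocs").addAll(
    docs.findAll{
        it.getFeatures().get("annotator") == "fred"
    }
)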
2. Filter by annotation set existence, e.g. double-annotated documents
Useful for separating all double-annotated documents out of a corpus.
factory.newCorpus("doubleDocs").addAll(
    docs.findAll{
        // keep only documents that have both annotators' sets
        it.getAnnotationSetNames().containsAll(["annotator1", "annotator2"])
    }
)
3. Choose an app to execute
You can already conditionally execute PRs based on a document feature. By placing two pipelines as PRs in a third, conditional pipeline, you can extend this to choose which pipeline to execute based on a document feature. This Groovy script goes one step further and chooses a pipeline based on some other aspect of the document: in this case, the existence of a particular annotation set.
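// Look up the two applications by name among the loaded apps.
app1 = apps.find{ it.name == "app1" }
app2 = apps.find{ it.name == "app2" }
// Applications run over a corpus, so wrap each document in a temporary one.
tempCorpus = factory.newCorpus("tempCorpus")
docs.each{
    // each, not findAll: the closure is run only for its side effects
    app = it.getAnnotationSetNames().contains("annotator1") ? app1 : app2
    tempCorpus.add(it)
    app.setCorpus(tempCorpus)
    app.execute()
    tempCorpus.clear()
}
factory.deleteResource(tempCorpus)
println "done"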
4. How many annotations?
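// Count the Anatomy annotations in each document's "Filtered" set,
// printing a per-document count followed by the corpus total.
sum = 0
docs.each{
    num = it.getAnnotations("Filtered").get("Anatomy").size()
    sum += num
    println it.getName() + " " + num
}
println "total: " + sum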
5. Rename annotations
This one is for the Groovy script PR, but it could easily be adapted to the console using the ideas above. You could also parameterise it if needed; see the user guide for details.
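// findAll returns a copy, so inputAS can safely be modified while iterating.
inputAS.findAll{
    it.getType() == "OldName"
}.each{
    // add a copy of the annotation under the new type name
    outputAS.add(it.getStartNode().getOffset(),
                 it.getEndNode().getOffset(),
                 "NewName",
                 it.getFeatures())
}.each{
    // each returns the collection it iterated over, so the originals can now be removed
    inputAS.remove(it)
}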
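As a rough sketch of the console adaptation and parameterisation mentioned for this recipe, the rename logic can be wrapped in a helper that takes the annotation set and the two type names as arguments. The helper name renameAnnotations and its return value are illustrative assumptions, not part of GATE; the annotation set calls are the same ones the recipe above already uses.

```groovy
// Hypothetical helper (not a GATE API): renames every annotation of type
// oldName in the given annotation set to newName, keeping offsets and features.
// Returns the number of annotations that were renamed.
def renameAnnotations(annots, String oldName, String newName) {
    // findAll returns a copy, so the set can be modified while we iterate
    def matches = annots.findAll{ it.getType() == oldName }
    matches.each{
        // add a replacement annotation under the new type name
        annots.add(it.getStartNode().getOffset(),
                   it.getEndNode().getOffset(),
                   newName,
                   it.getFeatures())
        // then drop the original
        annots.remove(it)
    }
    return matches.size()
}
```

In the console, this could be applied to every loaded document with, for example, docs.each{ renameAnnotations(it.getAnnotations("Filtered"), "OldName", "NewName") }, where "Filtered" stands in for whichever annotation set holds the annotations.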



