Working with documents
Contents
- 1. Add a new feature to a document
- 2. Does the document already have this feature?
- 3. Remove a feature from a document
- 4. Get the value of an existing feature on a document and add it as the value of a new feature
- 5. Get string content and add it as a feature value
- 6. Alternatively, you can just do:
- 7. Get string content and add it as a feature value (if you're working with a set not an annotation)
- 8. Or alternatively,
- 9. Or the easy way (add an annotation called Fish with feature called "text" whose value is the string)
- 10. You can also do
1. Add a new feature to a document
doc.getFeatures().put("genre", "email");
2. Does the document already have this feature?
boolean genreAlreadyAssigned = doc.getFeatures().containsKey("genre");
3. Remove a feature from a document
doc.getFeatures().remove("genre");
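The three calls above are ordinary map operations, because gate.FeatureMap extends java.util.Map. As a minimal sketch, assuming a plain HashMap is an acceptable stand-in for doc.getFeatures(), the check and the assignment can be combined so that an existing value is never overwritten. The class and method names here are hypothetical and are not part of GATE.

```java
import java.util.HashMap;
import java.util.Map;

class DocumentFeatureSketch {
    // Hypothetical helper: sets "genre" only when the key is not present yet.
    // Returns true if the value was assigned, false if one already existed.
    static boolean assignGenreIfMissing(Map<Object, Object> features, String genre) {
        if (features.containsKey("genre")) {
            return false;
        }
        features.put("genre", genre);
        return true;
    }

    public static void main(String[] args) {
        // Stand-in for doc.getFeatures(); in GATE this would be a FeatureMap.
        Map<Object, Object> features = new HashMap<>();
        System.out.println(assignGenreIfMissing(features, "email"));
        System.out.println(assignGenreIfMissing(features, "letter"));
        System.out.println(features.get("genre"));
        // Removing the feature makes the key absent again.
        features.remove("genre");
        System.out.println(features.containsKey("genre"));
    }
}
```

In a JAPE right-hand side the same helper could be given doc.getFeatures() directly, since a FeatureMap is a Map.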
4. Get the value of an existing feature on a document and add it as the value of a new feature
Rule: pick
(
  {Token.category==000}
):pat
-->
{
  // get the annotation set bound to the label "pat" and its first annotation
  gate.AnnotationSet patSet = (gate.AnnotationSet)bindings.get("pat");
  gate.Annotation patAnn = (gate.Annotation)patSet.iterator().next();
  // create a FeatureMap to hold the new features
  gate.FeatureMap features = Factory.newFeatureMap();
  // copy the value of the Token's "category" feature
  // into a new feature called "Cat"
  features.put("Cat", patAnn.getFeatures().get("category"));
  // record the name of the rule (just in case)
  features.put("rule", "pick");
  // create a new annotation Hebbes, spanning the match, with these features
  annotations.add(patSet.firstNode(), patSet.lastNode(), "Hebbes",
      features);
}
5. Get string content and add it as a feature value
try {
  // ann, annSet and features are assumed to be defined earlier
  // in the same right-hand side
  String content = doc.getContent().getContent(
      ann.getStartNode().getOffset(),
      ann.getEndNode().getOffset()).toString();
  // add the string as the value of the feature "string"
  features.put("string", content);
  // create a new annotation called "Annotation"
  annotations.add(annSet.firstNode(), annSet.lastNode(), "Annotation",
      features);
} catch(InvalidOffsetException ioe) {
  // should never happen: the offsets come from an existing annotation
  throw new GateRuntimeException(ioe);
}
6. Alternatively, you can just do:
String content = gate.Utils.stringFor(doc, ann);
features.put("string", content);
7. Get string content and add it as a feature value (if you're working with a set not an annotation)
Replace the first statement of the code in section 5 (the one that builds the content string) with:
String content = doc.getContent().getContent(
    annSet.firstNode().getOffset(),
    annSet.lastNode().getOffset()).toString();
8. Or alternatively,
String content = gate.Utils.stringFor(doc, annSet);
features.put("string", content);
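The difference between the single-annotation and the set variants is only in which offsets are used: an annotation contributes its own start and end, while a set is read from its first node to its last node, so any text lying between the member annotations is included too. The following is a minimal plain-Java sketch of that idea, assuming end-exclusive character offsets; the class, the method names, and the {start, end} pair representation are hypothetical and do not come from GATE.

```java
class SpanTextSketch {
    // Text between two character offsets; the end offset is exclusive.
    static String textBetween(String documentText, long start, long end) {
        return documentText.substring((int) start, (int) end);
    }

    // Text covered by several spans, each given as a {start, end} pair:
    // read from the smallest start to the largest end, analogous to
    // using firstNode() and lastNode() on an annotation set.
    // Assumes at least one span is supplied.
    static String textCovering(String documentText, long[][] spans) {
        long start = Long.MAX_VALUE;
        long end = Long.MIN_VALUE;
        for (long[] span : spans) {
            start = Math.min(start, span[0]);
            end = Math.max(end, span[1]);
        }
        return textBetween(documentText, start, end);
    }

    public static void main(String[] args) {
        String text = "Send the report to Alice Smith today.";
        // One span covering "Alice".
        System.out.println(textBetween(text, 19, 24));
        // Two spans ("Alice" and "Smith"); the space between them is included.
        System.out.println(textCovering(text, new long[][] {{19, 24}, {25, 30}}));
    }
}
```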
9. Or the easy way (add an annotation called Fish with feature called "text" whose value is the string)
({Lookup}):match
-->
:match.Fish = {text = :match.Lookup@cleanString}
10. You can also do
({Lookup}):match
-->
:match.Fish = {text = :match@cleanString}
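For comparison with the Java snippets in sections 5 to 8, the sketch below shows what the cleanString meta-property is understood to compute: the matched text with each run of whitespace collapsed to a single space and leading and trailing whitespace removed. This is an illustration under that assumption, not GATE's own implementation, and the class and method names are hypothetical.

```java
class CleanStringSketch {
    // Collapse every run of whitespace to one space, then trim both ends.
    static String cleanString(String raw) {
        return raw.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // A match that spans a line break and some extra indentation.
        String raw = "  New\n   York ";
        System.out.println("[" + cleanString(raw) + "]");
    }
}
```

Applying the same normalization to the result of gate.Utils.stringFor(doc, ann) should give a comparable value from Java code, assuming the description above matches the installed GATE version.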




