Working with documents
Contents
- 1. Add a new feature to a document
- 2. Does the document already have this feature?
- 3. Remove a feature from a document
- 4. Get the value of an existing feature on a document and add it as the value of a new feature
- 5. Get string content and add it as a feature value
- 6. Alternatively, you can just do:
- 7. Get string content and add it as a feature value (if you're working with a set not an annotation)
- 8. Or alternatively,
- 9. Or the easy way (add an annotation called Fish with feature called "text" whose value is the string)
- 10. You can also do
1. Add a new feature to a document
doc.getFeatures().put("genre", "email");
2. Does the document already have this feature?
boolean genreAlreadyAssigned = doc.getFeatures().containsKey("genre");
3. Remove a feature from a document
doc.getFeatures().remove("genre");
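The three calls above are ordinary map operations, since a GATE FeatureMap is a java.util.Map<Object, Object>. A common combination is to set a feature only when it is not already present. The following is a minimal sketch of that pattern, not GATE code: a plain HashMap stands in for doc.getFeatures() so that it runs without GATE on the classpath, and the class name DocumentFeatureSketch and the helper putIfMissing are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

class DocumentFeatureSketch {

    // Sets the feature only if the key is not present yet.
    // Returns true when the value was added, false when the key already existed.
    static boolean putIfMissing(Map<Object, Object> features, Object key, Object value) {
        if (features.containsKey(key)) {
            return false;
        }
        features.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        // Assumption: stand-in for doc.getFeatures(); in GATE this would be a gate.FeatureMap.
        Map<Object, Object> features = new HashMap<>();

        System.out.println(putIfMissing(features, "genre", "email"));  // key was missing
        System.out.println(putIfMissing(features, "genre", "letter")); // key already set
        System.out.println(features.get("genre"));

        features.remove("genre");
        System.out.println(features.containsKey("genre"));
    }
}
```

Inside a JAPE right-hand side the same helper could be called with doc.getFeatures() as its first argument, assuming the usual doc variable is in scope.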
4. Get the value of an existing feature on a document and add it as the value of a new feature
Rule: pick
(
  {Token.category==000}
):pat
-->
{
  // get the annotation set bound to the label and its first annotation
  gate.AnnotationSet patSet = (gate.AnnotationSet)bindings.get("pat");
  gate.Annotation patAnn = (gate.Annotation)patSet.iterator().next();
  // create FeatureMap to hold new features
  gate.FeatureMap features = Factory.newFeatureMap();
  // get value of feature from Token.category and add it
  // as the value to a new feature Cat
  features.put("Cat", patAnn.getFeatures().get("category"));
  // add a rule feature (just in case)
  features.put("rule", "pick");
  // create new annotation Hebbes with the features created
  annotations.add(patSet.firstNode(), patSet.lastNode(), "Hebbes", features);
}
5. Get string content and add it as a feature value
try {
  String content = doc.getContent().getContent(
      ann.getStartNode().getOffset(),
      ann.getEndNode().getOffset()).toString();
  // add the string as the value of the feature "string"
  features.put("string", content);
  // create a new annotation called "Annotation"
  annotations.add(annSet.firstNode(), annSet.lastNode(), "Annotation", features);
} catch(InvalidOffsetException ioe) {
  // this should never happen
  throw new GateRuntimeException(ioe);
}
6. Alternatively, you can just do:
String content = gate.Utils.stringFor(doc, ann);
features.put("string", content);
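Both variants boil down to taking the document text between the annotation's start and end offsets. The following is a rough sketch of that idea in plain Java, not GATE's implementation: a String stands in for the document content, contentBetween is a hypothetical helper, and IllegalArgumentException stands in for GATE's InvalidOffsetException. It also shows why the catch block in item 5 "should never happen": offsets taken from an annotation on the same document always lie inside the text.

```java
class ContentSketch {

    // Returns the text between two offsets and rejects offsets that fall
    // outside the text (GATE signals this case with InvalidOffsetException).
    static String contentBetween(String text, long start, long end) {
        if (start < 0 || end > text.length() || start > end) {
            throw new IllegalArgumentException("Invalid offsets: " + start + ", " + end);
        }
        return text.substring((int) start, (int) end);
    }

    public static void main(String[] args) {
        String text = "Send the email to Alice today.";
        // Hypothetical annotation covering "Alice": offsets 18 to 23.
        System.out.println(contentBetween(text, 18L, 23L));
    }
}
```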
7. Get string content and add it as a feature value (if you're working with a set not an annotation)
Replace the first statement of the example in item 5 with:
String content = doc.getContent().getContent(annSet.firstNode().getOffset(), annSet.lastNode().getOffset()).toString();
8. Or alternatively,
String content = gate.Utils.stringFor(doc, annSet);
features.put("string", content);
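For a set, the string runs from the smallest start offset to the largest end offset in the set, so any text lying between the individual annotations is included as well. A small sketch of that behaviour in plain Java, under the assumption that each annotation can be reduced to a {start, end} pair; the class SetSpanSketch and the method stringForSpans are hypothetical stand-ins, not GATE API:

```java
class SetSpanSketch {

    // Each span is {start, end}. Returns the text from the smallest start
    // offset to the largest end offset. Assumes at least one span.
    static String stringForSpans(String text, int[][] spans) {
        int first = Integer.MAX_VALUE;
        int last = Integer.MIN_VALUE;
        for (int[] span : spans) {
            first = Math.min(first, span[0]);
            last = Math.max(last, span[1]);
        }
        return text.substring(first, last);
    }

    public static void main(String[] args) {
        String text = "Send the email to Alice today.";
        // Two hypothetical annotations: "email" (9 to 14) and "Alice" (18 to 23).
        int[][] spans = { {9, 14}, {18, 23} };
        System.out.println(stringForSpans(text, spans)); // prints "email to Alice"
    }
}
```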
9. Or the easy way (add an annotation called Fish with feature called "text" whose value is the string)
({Lookup}):match --> :match.Fish = {text = :match.Lookup@cleanString}
10. You can also do
({Lookup}):match --> :match.Fish = {text = :match@cleanString}
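The @cleanString meta-property differs from @string in how whitespace is handled. Assuming it behaves as described in the GATE documentation (runs of whitespace collapsed to a single space, leading and trailing whitespace removed), its effect can be approximated in plain Java as below. This is an illustrative approximation, not GATE's own code, and the class name CleanStringSketch is hypothetical.

```java
class CleanStringSketch {

    // Collapses every run of whitespace (spaces, tabs, newlines) to one
    // space, then strips leading and trailing whitespace.
    static String cleanString(String s) {
        return s.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // A match that spans a line break in the document text.
        String raw = "New\n  York   City ";
        System.out.println(cleanString(raw)); // prints "New York City"
    }
}
```

This matters mostly for matches that span line breaks or irregular spacing, where the raw string would otherwise carry the layout of the source text into the feature value.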