
Working with documents

1. Add a new feature to a document

doc.getFeatures().put("genre", "email");

2. Does the document already have this feature?

boolean genreAlreadyAssigned = doc.getFeatures().containsKey("genre");

3. Remove a feature from a document

doc.getFeatures().remove("genre");
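The three snippets above work because the document's feature map behaves like an ordinary Java map (gate.FeatureMap extends java.util.Map). As a minimal sketch that runs without GATE on the classpath, the class below uses a plain HashMap as a stand-in for doc.getFeatures(). The class name DocFeatureSketch and the helper addIfAbsent are hypothetical and only illustrate combining the containsKey check with put so that an existing value is not overwritten.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in: a plain HashMap plays the role of doc.getFeatures().
final class DocFeatureSketch {

    // Adds the feature only when the key is not set yet.
    // Returns true if the feature was added, false if it already existed.
    static boolean addIfAbsent(Map<Object, Object> features, Object key, Object value) {
        if (features.containsKey(key)) {
            return false;
        }
        features.put(key, value);
        return true;
    }

    public static void main(String[] args) {
        Map<Object, Object> features = new HashMap<>();
        addIfAbsent(features, "genre", "email");
        addIfAbsent(features, "genre", "letter"); // ignored, "genre" is already set
        System.out.println(features.get("genre"));
        features.remove("genre");
        System.out.println(features.containsKey("genre"));
    }
}
```

In a JAPE right-hand side the same helper could be called as addIfAbsent(doc.getFeatures(), "genre", "email"), assuming it is made available to the rule.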

4. Get the value of an existing feature on an annotation and add it as the value of a new feature on a new annotation

Rule: pick
(
 {Token.category==000}
):pat
-->
{
  // get the annotation set bound to the label "pat" and take one annotation from it
  gate.AnnotationSet patSet = (gate.AnnotationSet)bindings.get("pat");
  gate.Annotation patAnn = (gate.Annotation)patSet.iterator().next();

  // create a FeatureMap to hold the new features
  gate.FeatureMap features = Factory.newFeatureMap();

  // read the value of Token.category and store it
  // as the value of a new feature "Cat"
  features.put("Cat", patAnn.getFeatures().get("category"));

  // record which rule created the annotation (useful when debugging)
  features.put("rule", "pick");

  // create a new annotation "Hebbes" spanning the match, with the features created
  annotations.add(patSet.firstNode(), patSet.lastNode(), "Hebbes",
                  features);
}
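The rule above always finds a category value because the pattern itself tests that feature. When you copy a feature that the pattern does not test, get returns null for a missing key, so a guard can be useful. The following is a minimal sketch of such a guarded copy, using plain java.util.Map objects as stand-ins for the two feature maps; the class CopyFeatureSketch and the method copyFeature are hypothetical names, not part of GATE.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper: plain maps stand in for the source and target FeatureMaps.
final class CopyFeatureSketch {

    // Copies the value stored under fromKey in source to toKey in target.
    // Does nothing and returns false when the source has no (non-null) value.
    static boolean copyFeature(Map<Object, Object> source, Object fromKey,
                               Map<Object, Object> target, Object toKey) {
        Object value = source.get(fromKey);
        if (value == null) {
            return false;
        }
        target.put(toKey, value);
        return true;
    }

    public static void main(String[] args) {
        Map<Object, Object> tokenFeatures = new HashMap<>();
        tokenFeatures.put("category", "NNP");

        Map<Object, Object> newFeatures = new HashMap<>();
        copyFeature(tokenFeatures, "category", newFeatures, "Cat");
        copyFeature(tokenFeatures, "kind", newFeatures, "Kind"); // skipped, no such feature
        System.out.println(newFeatures);
    }
}
```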

5. Get string content and add it as a feature value

// ann, annSet and features are assumed to be defined earlier in the RHS
try {
  String content = doc.getContent().getContent(ann.getStartNode().getOffset(),
                   ann.getEndNode().getOffset()).toString();

  // add the string as the value of the feature "string"
  features.put("string", content);

  // create a new annotation called "Annotation"
  annotations.add(annSet.firstNode(), annSet.lastNode(), "Annotation",
                  features);
} catch(InvalidOffsetException ioe) {
  // this should never happen
  throw new GateRuntimeException(ioe);
}

6. Alternatively, use gate.Utils.stringFor, which needs no try/catch:

String content = gate.Utils.stringFor(doc, ann);
features.put("string", content);
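Both variants return the characters between the annotation's start and end offsets. GATE reports bad offsets with the checked InvalidOffsetException, which is why item 5 needs the try/catch. As a rough sketch of that offset logic without GATE on the classpath, the class below models the document content as a plain String and uses IllegalArgumentException as a stand-in for GATE's exception; SpanTextSketch and textBetween are hypothetical names.

```java
// Hypothetical model: the document content is a plain String,
// and offsets are longs, as in GATE's Node.getOffset().
final class SpanTextSketch {

    // Returns the text from start (inclusive) to end (exclusive).
    // Rejects offsets that are negative, reversed, or past the end of the content.
    static String textBetween(String content, long start, long end) {
        if (start < 0 || end < start || end > content.length()) {
            throw new IllegalArgumentException("Invalid offsets: " + start + ", " + end);
        }
        return content.substring((int) start, (int) end);
    }

    public static void main(String[] args) {
        String content = "Dear Anna, hello";
        System.out.println(textBetween(content, 5, 9));
    }
}
```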

7. Get string content and add it as a feature value (when you have an annotation set rather than a single annotation)

Replace the first statement of the try block in item 5 with:

String content = doc.getContent().getContent(annSet.firstNode().getOffset(),
                 annSet.lastNode().getOffset()).toString();

8. Or, alternatively, for an annotation set:

String content = gate.Utils.stringFor(doc, annSet);
features.put("string", content);
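For a set, the extracted text runs from the smallest start offset to the largest end offset of its members, which is what firstNode() and lastNode() provide. This means any text lying between the annotations in the set is included as well. The following is a minimal sketch of that behaviour under simplifying assumptions: the content is a plain String and each annotation is modelled as a long[]{start, end} pair. SetSpanSketch and textCovered are hypothetical names.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model: each annotation is a long[]{startOffset, endOffset}.
final class SetSpanSketch {

    // Returns the text from the smallest start offset to the largest end offset.
    static String textCovered(String content, List<long[]> spans) {
        if (spans.isEmpty()) {
            throw new IllegalArgumentException("The annotation set is empty");
        }
        long start = Long.MAX_VALUE;
        long end = Long.MIN_VALUE;
        for (long[] span : spans) {
            start = Math.min(start, span[0]);
            end = Math.max(end, span[1]);
        }
        return content.substring((int) start, (int) end);
    }

    public static void main(String[] args) {
        String content = "Mr John Smith wrote";
        // Two annotations, "Smith" and "John", given in arbitrary order.
        List<long[]> spans = Arrays.asList(new long[]{8, 13}, new long[]{3, 7});
        System.out.println(textCovered(content, spans));
    }
}
```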