
JAPE String Matching


1. Check that the first two digits of a number match a specified string

Note that this can be generalised in many ways, for example to match the first three letters of a word. The example below checks whether the first two digits of a matched phone number are "07".


// tagAnnots is assumed to be the AnnotationSet bound to the matched label,
// e.g. AnnotationSet tagAnnots = bindings.get("tag");
// get the offsets
Long phoneStart = tagAnnots.firstNode().getOffset();
Long phoneEnd = tagAnnots.lastNode().getOffset();
// check the number is at least 2 characters long (just in case)
if(phoneEnd - phoneStart >= 2) {
  try {
    // extract the first two characters of the matched text
    String firstTwoChars = doc.getContent()
        .getContent(phoneStart, phoneStart + 2).toString();

    // check it matches 07
    if("07".equals(firstTwoChars)) {
      // create the new annotation
      gate.FeatureMap features = Factory.newFeatureMap();
      features.put("kind", "mobile");
      // outputAS is the output annotation set available on a JAPE right-hand side
      outputAS.add(tagAnnots.firstNode(),
                   tagAnnots.lastNode(), "Phone", features);
    }
  }
  catch(InvalidOffsetException e) {
    // not possible
    throw new LuckyException("Invalid offset from annotation");
  }
}
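
The generalisation mentioned above (testing any prefix, not only "07") can be tried outside GATE on plain strings. The following is a minimal sketch; the class name `PrefixCheck` and the method `hasPrefix` are hypothetical names chosen for illustration and are not part of GATE or JAPE.

```java
// Standalone sketch (no GATE dependency): generalised prefix test on plain strings.
// PrefixCheck and hasPrefix are illustrative names, not GATE API.
final class PrefixCheck {
    private PrefixCheck() {}

    /** Returns true if text begins with prefix; false for null input. */
    static boolean hasPrefix(String text, String prefix) {
        // String.startsWith returns false when text is shorter than prefix,
        // so no separate length check is needed here.
        return text != null && prefix != null && text.startsWith(prefix);
    }

    public static void main(String[] args) {
        System.out.println(hasPrefix("07123456789", "07")); // true
        System.out.println(hasPrefix("02079460000", "07")); // false
        System.out.println(hasPrefix("0", "07"));           // false
    }
}
```

Inside a JAPE right-hand side the same test could be applied to the string obtained from the document content, which would make the explicit length check unnecessary.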

2. Convert a string to an integer

int x = Integer.parseInt(string);
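
`Integer.parseInt` throws a `NumberFormatException` when the string is not a valid integer, which is easy to hit with text taken from a document. A minimal sketch of a defensive wrapper follows; `SafeParse` and `parse` are hypothetical names used only for this illustration, and trimming surrounding whitespace is an assumption about what is wanted.

```java
import java.util.OptionalInt;

// Standalone sketch: parse an integer without letting bad input throw.
// SafeParse and parse are illustrative names.
final class SafeParse {
    private SafeParse() {}

    /** Returns the parsed value, or an empty OptionalInt if s is not an integer. */
    static OptionalInt parse(String s) {
        if (s == null) {
            return OptionalInt.empty();
        }
        try {
            // Assumption: leading and trailing whitespace should be ignored.
            return OptionalInt.of(Integer.parseInt(s.trim()));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("42").getAsInt());    // 42
        System.out.println(parse("seven").isPresent()); // false
    }
}
```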

3. Using a regular expression on the content of a selected annotation to add a new annotation

Rule: ExtractAuthor
(
  {Reference.type == "Literature"}
  |{Reference.type == "Patent"}
):reference
-->
{
  AnnotationSet set = (AnnotationSet)bindings.get("reference");
  Annotation ann = set.iterator().next();
  // assumption: start from a copy of the matched annotation's features
  FeatureMap fm = Factory.newFeatureMap();
  fm.putAll(ann.getFeatures());
  fm.put("postprocessing.rule", "reference-extract-author.ExtractAuthor");
  try {
    Long start = set.firstNode().getOffset();
    String text = doc.getContent().getContent(
      start, set.lastNode().getOffset()).toString();
    // replace each whitespace character (including newlines) with a space;
    // the replacement is one-for-one, so offsets are unchanged
    text = text.replaceAll("\\s", " ");
    String lastName =
       "\\b" // beginning of a word
      +"(?:\\p{Ll}{0,3} )?" // particle ?
      +"\\p{Lu}[\\p{L}-]{1,13}" // Name
      +"(?: \\p{Ll}{0,3})?"; // particle ?
    String initials = "(?: \\p{Lu}\\.){1,3}";
    java.util.regex.Matcher matcher = java.util.regex.Pattern.compile(
      lastName+"(?:(?:,"+initials+")|(?:,? and "+lastName+")|(?:,? et al\\.?))"
      ).matcher(text);
    while (matcher.find()) {
      // match positions are relative to the start of the matched text
      outputAS.add(start + matcher.start(), start + matcher.end(),
                   "Author", fm);
    }
  } catch(InvalidOffsetException e) {
    throw new GateRuntimeException(e);
  }
}
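
The author pattern can be tested on plain strings before it is embedded in a rule. The sketch below runs the same regular expression without GATE and converts each match into document offsets by adding a base offset, as the rule does with the start of the matched annotation. `AuthorRegexDemo`, `findSpans` and `baseOffset` are hypothetical names, and writing the outer group as `(?:` is an assumption about the intended grouping.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Standalone sketch (no GATE dependency) of the author-matching pattern.
// AuthorRegexDemo and findSpans are illustrative names.
final class AuthorRegexDemo {
    private AuthorRegexDemo() {}

    static final String LAST_NAME =
          "\\b"                      // beginning of a word
        + "(?:\\p{Ll}{0,3} )?"       // optional leading particle
        + "\\p{Lu}[\\p{L}-]{1,13}"   // the name itself
        + "(?: \\p{Ll}{0,3})?";      // optional trailing particle
    static final String INITIALS = "(?: \\p{Lu}\\.){1,3}";
    static final Pattern AUTHOR = Pattern.compile(
        LAST_NAME + "(?:(?:," + INITIALS + ")|(?:,? and " + LAST_NAME
                  + ")|(?:,? et al\\.?))");

    /** Returns {start, end} document offsets for each author match in text. */
    static List<long[]> findSpans(String text, long baseOffset) {
        // One-for-one whitespace replacement keeps match positions aligned
        // with the original text.
        String normalised = text.replaceAll("\\s", " ");
        List<long[]> spans = new ArrayList<>();
        Matcher m = AUTHOR.matcher(normalised);
        while (m.find()) {
            spans.add(new long[] { baseOffset + m.start(), baseOffset + m.end() });
        }
        return spans;
    }
}
```

For example, `AuthorRegexDemo.findSpans("Smith,\nJ.", 100)` yields a single span from 100 to 109, since the newline is treated as a space and the match covers all nine characters. Because the optional leading particle may contain zero letters, a match that follows another word can begin at the preceding space.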

4. Using meta-properties to get the string of an annotation

:label.New = {somefeat = :label.X@string } 

5. Using meta-properties to get the string of all annotations bound by the match

:label.New = { somefeat = :ys@string }