gate.configurableexporter
Class ConfigurableExporter

java.lang.Object
  extended by gate.util.AbstractFeatureBearer
      extended by gate.creole.AbstractResource
          extended by gate.creole.AbstractProcessingResource
              extended by gate.creole.AbstractLanguageAnalyser
                  extended by gate.configurableexporter.ConfigurableExporter
All Implemented Interfaces:
ANNIEConstants, Executable, LanguageAnalyser, ProcessingResource, Resource, FeatureBearer, NameBearer, Serializable

@CreoleResource(name="Configurable Exporter",
                comment="Allows annotations to be exported according to a specified format.")
public class ConfigurableExporter
extends AbstractLanguageAnalyser
implements ProcessingResource, Serializable

Configurable Exporter takes a configuration file that specifies the format of the output file. The configuration file consists of a single line giving the output format, with annotation names surrounded by three angle brackets. For example, the configuration line

  {index}, {class}, "{content}"
  
might produce an output file such as:
  10000004, A, "Some text .."
  10000005, A, "Some more text .."
  10000006, B, "Further text .."
  10000007, B, "Additional text .."
  10000008, B, "Yet more text .."
  
Annotation features can also be specified using dot notation, for example:
  {index}, {instance.class}, "{content}"
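
The substitution described above can be illustrated with a minimal, standalone sketch. It is not the exporter's implementation: the class name TemplateSketch, the method fill, and the map of values are hypothetical, and the sketch assumes the curly-brace placeholder form shown in the examples (the delimiter actually recognised by the PR may differ). A dotted name such as instance.class is treated simply as a lookup key.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of gate.configurableexporter.
public class TemplateSketch {
    // Assumption: placeholders look like {index} or {instance.class}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");

    /** Replaces each placeholder with its value; unknown names become empty strings. */
    public static String fill(String template, Map<String, String> values) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            // Copy the literal text before the placeholder, then the looked-up value.
            out.append(template, last, m.start());
            out.append(values.getOrDefault(m.group(1), ""));
            last = m.end();
        }
        out.append(template.substring(last)); // trailing literal text
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String> values = new LinkedHashMap<>();
        values.put("index", "10000004");
        values.put("instance.class", "A");
        values.put("content", "Some text ..");
        System.out.println(fill("{index}, {instance.class}, \"{content}\"", values));
    }
}
```

Run as written, the sketch prints the line 10000004, A, "Some text .." which matches the first row of the sample output above.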
  
The PR is intended for exporting data for use in machine learning, so each output line is treated as an "instance". The instance is specified at run time; by default it is the whole document, but it can instead be an annotation type. Instances are output one per line, and the configuration file specifies how each instance is written. For each annotation type named in the configuration file, only the first occurrence of that type within the instance is output. Outputting more than one occurrence of the same annotation type is not currently supported, though it may be added if the need arises.
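
The "first occurrence per instance" rule can be sketched without the GATE API. Everything here is hypothetical: the Ann class stands in for an annotation, "first" is assumed to mean the smallest start offset inside the instance span, and the exported value is assumed to be the covered text. The real PR may resolve these points differently.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; does not use the GATE annotation classes.
public class FirstIncidenceSketch {
    /** Stand-in for an annotation: a type plus character offsets. */
    public static final class Ann {
        final String type;
        final int start;
        final int end;
        public Ann(String type, int start, int end) {
            this.type = type;
            this.start = start;
            this.end = end;
        }
    }

    /** Returns the earliest annotation of the given type inside the instance span, or null. */
    public static Ann firstWithin(List<Ann> all, Ann instance, String type) {
        Ann best = null;
        for (Ann a : all) {
            boolean inside = a.start >= instance.start && a.end <= instance.end;
            if (inside && a.type.equals(type) && (best == null || a.start < best.start)) {
                best = a;
            }
        }
        return best;
    }

    /** Emits one line per instance, in list order, holding the text of the first match. */
    public static List<String> export(String text, List<Ann> all, String instanceType, String wantedType) {
        List<String> lines = new ArrayList<>();
        for (Ann inst : all) {
            if (!inst.type.equals(instanceType)) continue;
            Ann hit = firstWithin(all, inst, wantedType);
            lines.add(hit == null ? "" : text.substring(hit.start, hit.end));
        }
        return lines;
    }

    public static void main(String[] args) {
        String text = "Alice met Bob. Carol left.";
        List<Ann> anns = Arrays.asList(
            new Ann("Sentence", 0, 14), new Ann("Sentence", 15, 26),
            new Ann("Person", 0, 5), new Ann("Person", 10, 13), new Ann("Person", 15, 20));
        // "Bob" is the second Person in the first sentence, so it is not exported.
        System.out.println(export(text, anns, "Sentence", "Person"));
    }
}
```

With Sentence as the instance type, the sketch yields one line per sentence containing Alice and Carol respectively; Bob is skipped because it is the second Person within its instance.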

See Also:
Serialized Form

Nested Class Summary
 
Nested classes/interfaces inherited from class gate.creole.AbstractProcessingResource
AbstractProcessingResource.InternalStatusListener, AbstractProcessingResource.IntervalProgressListener
 
Field Summary
 
Fields inherited from class gate.creole.AbstractLanguageAnalyser
corpus, document
 
Fields inherited from class gate.creole.AbstractProcessingResource
interrupted
 
Fields inherited from class gate.creole.AbstractResource
name
 
Fields inherited from class gate.util.AbstractFeatureBearer
features
 
Fields inherited from interface gate.creole.ANNIEConstants
ANNOTATION_COREF_FEATURE_NAME, DATE_ANNOTATION_TYPE, DATE_POSTED_ANNOTATION_TYPE, DEFAULT_FILE, DOCUMENT_COREF_FEATURE_NAME, JOB_ID_ANNOTATION_TYPE, LOCATION_ANNOTATION_TYPE, LOOKUP_ANNOTATION_TYPE, LOOKUP_CLASS_FEATURE_NAME, LOOKUP_INSTANCE_FEATURE_NAME, LOOKUP_LANGUAGE_FEATURE_NAME, LOOKUP_MAJOR_TYPE_FEATURE_NAME, LOOKUP_MINOR_TYPE_FEATURE_NAME, LOOKUP_ONTOLOGY_FEATURE_NAME, MONEY_ANNOTATION_TYPE, ORGANIZATION_ANNOTATION_TYPE, PERSON_ANNOTATION_TYPE, PERSON_GENDER_FEATURE_NAME, PLUGIN_DIR, PR_NAMES, SENTENCE_ANNOTATION_TYPE, SPACE_TOKEN_ANNOTATION_TYPE, TOKEN_ANNOTATION_TYPE, TOKEN_CATEGORY_FEATURE_NAME, TOKEN_KIND_FEATURE_NAME, TOKEN_LENGTH_FEATURE_NAME, TOKEN_ORTH_FEATURE_NAME, TOKEN_STRING_FEATURE_NAME
 
Constructor Summary
ConfigurableExporter()
           
 
Method Summary
 void execute()
           
 URL getConfigFileURL()
           
 String getInputASName()
           
 String getInstanceName()
           
 URL getOutputURL()
           
 Resource init()
           
 void interrupt()
           
 void setConfigFileURL(URL configFileURL)
           
 void setInputASName(String iasn)
           
 void setInstanceName(String inst)
           
 void setOutputURL(URL output)
           
 
Methods inherited from class gate.creole.AbstractLanguageAnalyser
getCorpus, getDocument, setCorpus, setDocument
 
Methods inherited from class gate.creole.AbstractProcessingResource
addProgressListener, addStatusListener, cleanup, fireProcessFinished, fireProgressChanged, fireStatusChanged, getRuntimeParameterValues, getRuntimeParameterValues, isInterrupted, reInit, removeProgressListener, removeStatusListener
 
Methods inherited from class gate.creole.AbstractResource
checkParameterValues, flushBeanInfoCache, getBeanInfo, getInitParameterValues, getInitParameterValues, getName, getParameterValue, getParameterValue, getParameterValues, removeResourceListeners, setName, setParameterValue, setParameterValue, setParameterValues, setParameterValues, setResourceListeners
 
Methods inherited from class gate.util.AbstractFeatureBearer
getFeatures, setFeatures
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 
Methods inherited from interface gate.ProcessingResource
reInit
 
Methods inherited from interface gate.Resource
cleanup, getParameterValue, setParameterValue, setParameterValues
 
Methods inherited from interface gate.util.FeatureBearer
getFeatures, setFeatures
 
Methods inherited from interface gate.util.NameBearer
getName, setName
 
Methods inherited from interface gate.Executable
isInterrupted
 

Constructor Detail

ConfigurableExporter

public ConfigurableExporter()

Method Detail

setConfigFileURL

@CreoleParameter(comment="The configuration file specifying output format.",
                 defaultValue="resources/configurableexporter/example.conf",
                 suffixes=".conf")
public void setConfigFileURL(URL configFileURL)

getConfigFileURL

public URL getConfigFileURL()

setOutputURL

@RunTime
@Optional
@CreoleParameter(comment="The file to which data will be output. Leave blank for output to messages tab or standard out.")
public void setOutputURL(URL output)

getOutputURL

public URL getOutputURL()

setInputASName

@RunTime
@Optional
@CreoleParameter(comment="The name for annotation set used as input to the exporter.")
public void setInputASName(String iasn)

getInputASName

public String getInputASName()

setInstanceName

@RunTime
@Optional
@CreoleParameter(comment="The annotation type to be treated as instance. Leave blank to use document as instance.")
public void setInstanceName(String inst)

getInstanceName

public String getInstanceName()

init

public Resource init()
              throws ResourceInstantiationException
Specified by:
init in interface Resource
Overrides:
init in class AbstractProcessingResource
Throws:
ResourceInstantiationException

execute

public void execute()
             throws ExecutionException
Specified by:
execute in interface Executable
Overrides:
execute in class AbstractProcessingResource
Throws:
ExecutionException

interrupt

public void interrupt()
Specified by:
interrupt in interface Executable
Overrides:
interrupt in class AbstractProcessingResource