Corpora are lists of Document. TIPSTER equivalent: Collection.
Field Summary
static String | CORPUS_DOCLIST_PARAMETER_NAME
static String | CORPUS_NAME_PARAMETER_NAME
Method Summary
void | addCorpusListener(CorpusListener l) | Registers a new CorpusListener with this corpus.
String | getDocumentName(int index) | Gets the name of a document in this corpus.
List | getDocumentNames() | Gets the names of the documents in this corpus.
boolean | isDocumentLoaded(int index) | Returns true when the document is already loaded in memory.
void | populate(URL directory, FileFilter filter, String encoding, boolean recurseDirectories) | Fills this corpus with documents created on the fly from selected files in a directory.
void | removeCorpusListener(CorpusListener l) | Removes one of the listeners registered with this corpus.
void | unloadDocument(Document doc) | Unloads the document from memory.
Methods inherited from interface gate.LanguageResource
getDataStore, getLRPersistenceId, getParent, isModified, setDataStore, setLRPersistenceId, setParent, sync
Methods inherited from interface gate.Resource
cleanup, getParameterValue, init, setParameterValue, setParameterValues
Methods inherited from interface gate.util.FeatureBearer
getFeatures, setFeatures
Methods inherited from interface gate.util.NameBearer
getName, setName
Methods inherited from interface java.util.List
add, add, addAll, addAll, clear, contains, containsAll, equals, get, hashCode, indexOf, isEmpty, iterator, lastIndexOf, listIterator, listIterator, remove, remove, removeAll, retainAll, set, size, subList, toArray, toArray
Field Detail
public static final String CORPUS_NAME_PARAMETER_NAME
public static final String CORPUS_DOCLIST_PARAMETER_NAME
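Method Detail
public List getDocumentNames()
Gets the names of the documents in this corpus.
Returns:
a List of Strings representing the names of the documents in this corpus.
public String getDocumentName(int index)
Gets the name of a document in this corpus.
Parameters:
index - the index of the document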
public void unloadDocument(Document doc)
Unloads the document from memory. Transient Corpus objects do nothing, because there would be no way to get the document back again afterwards.
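The difference between transient and persistent behaviour can be illustrated with a minimal sketch. It does not use GATE classes: LazyCorpusSketch and its backing Map are hypothetical stand-ins, documents are plain Strings, and unloadDocument takes an index rather than a Document to keep the example short.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in, not a GATE class: documents are plain Strings here.
class LazyCorpusSketch {
  // Backing store keyed by document name; null means the corpus is transient.
  private final Map<String, String> store;
  private final List<String> names = new ArrayList<>();
  // In-memory contents; a null slot marks an unloaded document.
  private final List<String> contents = new ArrayList<>();

  LazyCorpusSketch(boolean persistent) {
    this.store = persistent ? new HashMap<String, String>() : null;
  }

  void add(String name, String content) {
    names.add(name);
    contents.add(content);
    if (store != null) store.put(name, content);
  }

  boolean isDocumentLoaded(int index) {
    return contents.get(index) != null;
  }

  // Transient case: do nothing, since the content could not be recovered.
  void unloadDocument(int index) {
    if (store == null) return;
    contents.set(index, null);
  }

  // Reloads the content from the backing store if it was unloaded.
  String get(int index) {
    if (contents.get(index) == null) {
      contents.set(index, store.get(names.get(index)));
    }
    return contents.get(index);
  }
}
```

In this sketch a persistent instance reports isDocumentLoaded(i) as false after unloadDocument(i) and reloads the content on the next get(i), while a transient instance ignores the unload request and keeps the document in memory.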
public void populate(URL directory, FileFilter filter, String encoding, boolean recurseDirectories) throws IOException, ResourceInstantiationException
Fills this corpus with documents created on the fly from selected files in a directory. The files are selected with a FileFilter (see ExtensionFileFilter).
An implementation for this method is provided as a static method at gate.corpora.CorpusImpl#populate(Corpus,URL,FileFilter,boolean).
Parameters:
directory - the directory from which the files will be picked. This parameter is a URL for uniformity. It must be a URL of type file; otherwise an IllegalArgumentException will be thrown.
filter - the file filter used to select files from the target directory. If the filter is null, all the files will be accepted.
encoding - the encoding to be used for reading the documents
recurseDirectories - whether the directory should be traversed recursively. If true, all the files from the provided directory and all its child directories (on as many levels as necessary) will be picked if accepted by the filter; otherwise the child directories will be ignored.
Throws:
IOException
ResourceInstantiationException
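The selection rules described for the directory, filter and recurseDirectories parameters can be sketched with the standard library alone. This is an illustration of those rules, not GATE's implementation: FileSelectionSketch, toDirectory and select are hypothetical names, no documents are created, and the choice of IllegalArgumentException for a non-file URL is an assumption.

```java
import java.io.File;
import java.io.FileFilter;
import java.net.URL;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of GATE: shows the file selection rules only.
class FileSelectionSketch {

  // Rejects any URL whose protocol is not "file" (exception type is an assumption).
  static File toDirectory(URL directory) {
    if (!"file".equalsIgnoreCase(directory.getProtocol())) {
      throw new IllegalArgumentException("Not a file URL: " + directory);
    }
    // Simplification: assumes the path contains no URL-escaped characters.
    return new File(directory.getPath());
  }

  // Collects the files under dir accepted by the filter; a null filter accepts all.
  static List<File> select(File dir, FileFilter filter, boolean recurseDirectories) {
    List<File> selected = new ArrayList<>();
    File[] entries = dir.listFiles();
    if (entries == null) {
      return selected; // not a directory, or not readable
    }
    Arrays.sort(entries); // keep the order deterministic
    for (File entry : entries) {
      if (entry.isDirectory()) {
        // Child directories are visited only when recursion is requested.
        if (recurseDirectories) {
          selected.addAll(select(entry, filter, true));
        }
      } else if (filter == null || filter.accept(entry)) {
        selected.add(entry);
      }
    }
    return selected;
  }
}
```

A call such as select(dir, f -> f.getName().endsWith(".txt"), true) would return every .txt file at any depth below dir, while passing false restricts the result to the top-level directory.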
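public boolean isDocumentLoaded(int index)
This method returns true when the document is already loaded in memory.
public void removeCorpusListener(CorpusListener l)
Removes one of the listeners registered with this corpus.
Parameters:
l - the listener to be removed.
public void addCorpusListener(CorpusListener l)
Registers a new CorpusListener with this corpus.
Parameters:
l - the listener to be added.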
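The register and remove pair follows the usual observer pattern, which the following sketch illustrates without GATE classes. NameListener and ObservableNames are hypothetical stand-ins, and the callback names are assumptions made for the example; the actual CorpusListener callbacks are not described on this page.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Stand-in for a corpus listener; the method names are assumptions.
interface NameListener {
  void nameAdded(String documentName);
  void nameRemoved(String documentName);
}

// Stand-in for a corpus: a list of names that notifies its listeners.
class ObservableNames {
  private final List<String> names = new ArrayList<>();
  // Copy-on-write list so listeners can be removed while events are delivered.
  private final List<NameListener> listeners = new CopyOnWriteArrayList<>();

  void addListener(NameListener l) {
    listeners.add(l);
  }

  void removeListener(NameListener l) {
    listeners.remove(l);
  }

  void add(String name) {
    names.add(name);
    for (NameListener l : listeners) {
      l.nameAdded(name);
    }
  }

  // Notifies listeners only if the name was actually present.
  boolean remove(String name) {
    boolean removed = names.remove(name);
    if (removed) {
      for (NameListener l : listeners) {
        l.nameRemoved(name);
      }
    }
    return removed;
  }
}
```

A listener registered with addListener receives one callback per change until it is passed to removeListener; changes made after that point are no longer reported to it.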