Workshop on Persisting, Indexing and Querying Multi-Paradigm Text Models

Matrixware / IRF, Vienna, May 2008


Consider the following three types of retrieval systems:

Systems for high-value content retrieval are likely to combine elements of all three styles, which poses difficult problems of representation, persistence, indexing and querying. Current systems often combine three quite different engines, perhaps putting an augmented full-text index in Lucene, Terrier or Lemur, and an ontology in OWLIM or Sesame. Search over annotation graphs is not normally exposed to end-users at present, but we believe that information professionals will benefit from this type of facility in the near future. Antecedents in the research world include CWB (the Corpus Workbench), GATE's ANNIC (Annotations in Context) and the BNC's SARA system. XML engines are certainly relevant, but a difficulty remains in expressing graph-structured data in a tree-oriented language.

This workshop is intended to cross-fertilise research streams from each of the three types and from the XML and database worlds, with a view to improving the state of the art for multi-paradigm systems.

Thursday 15th May

Friday 16th May