Bio-YODIE is a named entity linking system derived from GATE YODIE. It links mentions in biomedical text to their referents in the UMLS (Unified Medical Language System), a large and widely used medical terminology resource. Bio-YODIE has been developed as part of the KConnect project.
The screenshot below shows French Bio-YODIE after it has been run on a test document in the GATE GUI. The selected annotations (in yellow) were added to the document by the application, together with their features, which the popup window shows for the mention "utilisateurs de drogues". The output annotations appear in the "bio" annotation set.
Many of the features relate to experimental approaches to disambiguation and are in flux. Features of value to the end user include, among others, "inst" (the CUI, the UMLS concept unique identifier) and "STY" (the semantic type, according to UMLS). The CUI could, for example, be used to link to UMLS-based knowledge sources for semantic enrichment applications. The STY might be used to find, for example, all disease names in the text.
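As an illustration of how the "inst" and "STY" features might be used downstream, here is a minimal Python sketch. It assumes the annotations from the "bio" set have already been exported into plain dictionaries; that layout, the function names, the placeholder CUIs and the exact form of the STY values are all assumptions made for the example, not Bio-YODIE's actual output format.

```python
from typing import Dict, List

# Hypothetical export of annotations from the "bio" annotation set.
# The CUIs are placeholders, not real UMLS identifiers, and the STY
# strings are assumed to be full semantic type names.
annotations: List[Dict] = [
    {"text": "diabetes", "features": {"inst": "C0000001", "STY": "Disease or Syndrome"}},
    {"text": "insulin", "features": {"inst": "C0000002", "STY": "Pharmacologic Substance"}},
    {"text": "asthma", "features": {"inst": "C0000003", "STY": "Disease or Syndrome"}},
    {"text": "diabetes", "features": {"inst": "C0000001", "STY": "Disease or Syndrome"}},
]


def mentions_with_type(anns: List[Dict], semantic_type: str) -> List[Dict]:
    """Return the annotations whose "STY" feature equals semantic_type."""
    return [a for a in anns if a["features"].get("STY") == semantic_type]


def unique_cuis(anns: List[Dict]) -> List[str]:
    """Return the distinct "inst" values in order of first appearance."""
    seen = set()
    result = []
    for a in anns:
        cui = a["features"].get("inst")
        if cui is not None and cui not in seen:
            seen.add(cui)
            result.append(cui)
    return result


diseases = mentions_with_type(annotations, "Disease or Syndrome")
disease_texts = [a["text"] for a in diseases]
disease_cuis = unique_cuis(diseases)
```

The list of distinct CUIs is the kind of key one would pass on to a UMLS-based knowledge source; how that lookup is done depends on the source and is not shown here.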
The application works by finding all mentions in the text that correspond to a label in UMLS. In some cases, more than one possible interpretation may be found for a piece of text. In that case, various knowledge sources are used to choose the best interpretation.
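The lookup-then-disambiguate idea can be sketched in a few lines of Python. This is a toy illustration only: the label dictionary, the concept identifiers and the context-overlap scoring are invented for the example and are not Bio-YODIE's actual knowledge sources or scoring method. It also matches single-token labels only, whereas real labels may span several words.

```python
import re
from typing import Dict, Iterator, List, Set, Tuple

# Hypothetical label dictionary: lower-cased label -> candidate concepts.
# Identifiers and context words are invented for illustration.
LABELS: Dict[str, List[Dict]] = {
    "cold": [
        {"id": "CONCEPT_A", "context": {"virus", "cough", "infection", "nose"}},
        {"id": "CONCEPT_B", "context": {"temperature", "weather", "exposure"}},
    ],
    "cough": [
        {"id": "CONCEPT_C", "context": {"throat", "lungs"}},
    ],
}


def find_mentions(text: str) -> Iterator[Tuple[int, int, str]]:
    """Yield (start, end, label) for every token that matches a known label."""
    for match in re.finditer(r"\w+", text):
        label = match.group().lower()
        if label in LABELS:
            yield match.start(), match.end(), label


def disambiguate(candidates: List[Dict], context_words: Set[str]) -> Dict:
    """Pick the candidate sharing the most words with the document context.

    Ties go to the first candidate listed.
    """
    return max(candidates, key=lambda c: len(c["context"] & context_words))


def link(text: str) -> List[Tuple[str, str]]:
    """Return (mention text, chosen concept id) pairs for the given text."""
    context_words = set(re.findall(r"\w+", text.lower()))
    results = []
    for start, end, label in find_mentions(text):
        best = disambiguate(LABELS[label], context_words)
        results.append((text[start:end], best["id"]))
    return results
```

For example, `link("Exposure to cold weather lowers body temperature.")` selects the second candidate for "cold", because its context words overlap with the sentence while the first candidate's do not.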
Bio-YODIE is also available as a service; contact Angus Roberts (first name dot surname at sheffield dot ac dot uk) for more information.
The Bio-YODIE pipeline download is a structured directory containing a master GATE pipeline (an xgapp file), organised modularly into sub-pipelines, together with the resources they use. You can run it as you would any GATE application. See the README contained within for more information.
You will need to provide Bio-YODIE with a resources directory based on your own UMLS download. The machinery for creating this directory is provided below, again with a README.
Bio-YODIE is a research project in active development, and minor issues may arise. If you have any problems, contact Genevieve Gorrell (g dot gorrell at sheffield dot ac dot uk).
License files are included that indicate licensing status of the components. The plugins directory in the pipeline download contains all the plugins required to run the application; note that some of these have a different licensing status to the main application.