GATE Information Extraction Sample Task
The task is to extract information about succession events from the following Wall Street Journal text:
<DOC> <DOCID> wsj93_050.0203 </DOCID> <DOCNO> 930219-0013. </DOCNO> <HL> Marketing Brief: @ Noted.... </HL> <DD> 02/19/93 </DD> <SO> WALL STREET JOURNAL (J), PAGE B5 </SO> <CO> NYTA </CO> <IN> MEDIA (MED), PUBLISHING (PUB) </IN> <TXT> <p> New York Times Co. named Russell T. Lewis, 45, president and general manager of its flagship New York Times newspaper, responsible for all business-side activities. He was executive vice president and deputy general manager. He succeeds Lance R. Primis, who in September was named president and chief operating officer of the parent. </p> </TXT> </DOC>
Output from an Information Extraction system might look like the template below. (Note: the information is not meant to be easily human-readable in this form; rendering it readily comprehensible is the job of a text-summarisation programme. Output in this format may instead be entered into an on-line database for future access and analysis.)
<TEMPLATE-9302190013-1> :=
    DOC_NR: "9302190013"
    CONTENT: <SUCCESSION_EVENT-9302190013-1>
             <SUCCESSION_EVENT-9302190013-2>
             <SUCCESSION_EVENT-9302190013-3>
             <SUCCESSION_EVENT-9302190013-4>
             <SUCCESSION_EVENT-9302190013-5>
             <SUCCESSION_EVENT-9302190013-6>

<SUCCESSION_EVENT-9302190013-1> :=
    SUCCESSION_ORG: <ORGANIZATION-9302190013-2>
    POST: "president"
    IN_AND_OUT: <IN_AND_OUT-9302190013-1>
                <IN_AND_OUT-9302190013-2>
    VACANCY_REASON: REASSIGNMENT

<SUCCESSION_EVENT-9302190013-2> :=
    SUCCESSION_ORG: <ORGANIZATION-9302190013-2>
    POST: "general manager"
    IN_AND_OUT: <IN_AND_OUT-9302190013-3>
                <IN_AND_OUT-9302190013-4>
    VACANCY_REASON: REASSIGNMENT

<SUCCESSION_EVENT-9302190013-3> :=
    SUCCESSION_ORG: <ORGANIZATION-9302190013-2>
    POST: "executive vice president"
    IN_AND_OUT: <IN_AND_OUT-9302190013-5>
    VACANCY_REASON: REASSIGNMENT

<SUCCESSION_EVENT-9302190013-4> :=
    SUCCESSION_ORG: <ORGANIZATION-9302190013-2>
    POST: "deputy general manager"
    IN_AND_OUT: <IN_AND_OUT-9302190013-7>
    VACANCY_REASON: REASSIGNMENT

<SUCCESSION_EVENT-9302190013-5> :=
    SUCCESSION_ORG: <ORGANIZATION-9302190013-1>
    POST: "president"
    IN_AND_OUT: <IN_AND_OUT-9302190013-9>
    VACANCY_REASON: OTH_UNK

<SUCCESSION_EVENT-9302190013-6> :=
    SUCCESSION_ORG: <ORGANIZATION-9302190013-1>
    POST: "chief operating officer"
    IN_AND_OUT: <IN_AND_OUT-9302190013-10>
    VACANCY_REASON: OTH_UNK

<IN_AND_OUT-9302190013-1> :=
    IO_PERSON: <PERSON-9302190013-1>
    NEW_STATUS: IN
    ON_THE_JOB: UNCLEAR
    OTHER_ORG: <ORGANIZATION-9302190013-2>
    REL_OTHER_ORG: SAME_ORG

<IN_AND_OUT-9302190013-2> :=
    IO_PERSON: <PERSON-9302190013-2>
    NEW_STATUS: OUT
    ON_THE_JOB: NO
    OTHER_ORG: <ORGANIZATION-9302190013-1>
    REL_OTHER_ORG: RELATED_ORG

<IN_AND_OUT-9302190013-3> :=
    IO_PERSON: <PERSON-9302190013-1>
    NEW_STATUS: IN
    ON_THE_JOB: UNCLEAR
    OTHER_ORG: <ORGANIZATION-9302190013-2>
    REL_OTHER_ORG: SAME_ORG

<IN_AND_OUT-9302190013-4> :=
    IO_PERSON: <PERSON-9302190013-2>
    NEW_STATUS: OUT
    ON_THE_JOB: NO
    OTHER_ORG: <ORGANIZATION-9302190013-1>
    REL_OTHER_ORG: RELATED_ORG

<IN_AND_OUT-9302190013-5> :=
    IO_PERSON: <PERSON-9302190013-1>
    NEW_STATUS: OUT
    ON_THE_JOB: NO
    OTHER_ORG: <ORGANIZATION-9302190013-2>
    REL_OTHER_ORG: SAME_ORG

<IN_AND_OUT-9302190013-7> :=
    IO_PERSON: <PERSON-9302190013-1>
    NEW_STATUS: OUT
    ON_THE_JOB: NO
    OTHER_ORG: <ORGANIZATION-9302190013-2>
    REL_OTHER_ORG: SAME_ORG

<IN_AND_OUT-9302190013-9> :=
    IO_PERSON: <PERSON-9302190013-2>
    NEW_STATUS: IN
    ON_THE_JOB: YES
    OTHER_ORG: <ORGANIZATION-9302190013-2>
    REL_OTHER_ORG: RELATED_ORG

<IN_AND_OUT-9302190013-10> :=
    IO_PERSON: <PERSON-9302190013-2>
    NEW_STATUS: IN
    ON_THE_JOB: YES
    OTHER_ORG: <ORGANIZATION-9302190013-2>
    REL_OTHER_ORG: RELATED_ORG

<ORGANIZATION-9302190013-1> :=
    ORG_NAME: "New York Times Co."
    ORG_DESCRIPTOR: "the parent"
    ORG_TYPE: COMPANY

<ORGANIZATION-9302190013-2> :=
    ORG_NAME: "New York Times"
    ORG_DESCRIPTOR: "its flagship New York Times newspaper"
                    / "flagship New York Times newspaper"
                    / "the newspaper"
                    / "the paper"
    ORG_TYPE: COMPANY

<PERSON-9302190013-1> :=
    PER_NAME: "Russell T. Lewis"

<PERSON-9302190013-2> :=
    PER_NAME: "Lance R. Primis"
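
To illustrate how a template in this notation could be loaded for the database access and analysis mentioned above, here is a minimal Python sketch. It is not part of GATE or of any official tooling; the function names (clean, parse_templates, describe) are hypothetical, and the parsing rules are assumptions read off the sample above: a line of the form "<ID> :=" starts an object, a line of the form "SLOT: value" starts a slot, and any other non-blank line continues the previous slot (a leading "/" marks an alternative string). The embedded sample is an abridged excerpt of the template shown above.

```python
import re

# Assumed line shapes, inferred from the sample template only.
HEADER = re.compile(r"^<([A-Z_]+-\d+-\d+)>\s*:=$")   # e.g. <PERSON-9302190013-1> :=
SLOT = re.compile(r"^([A-Z_]+):\s*(.*)$")            # e.g. POST: "president"


def clean(value):
    """Normalise one slot value: drop a leading '/', quotes, or pointer brackets."""
    value = value.lstrip("/").strip()
    if value.startswith("<") and value.endswith(">"):
        return value[1:-1]          # pointer to another object, kept as its id
    return value.strip('"')         # quoted string or bare keyword


def parse_templates(text):
    """Return {object_id: {slot_name: [values, ...]}} for the given template text."""
    objects = {}
    current = None      # slots of the object currently being read
    last_slot = None    # most recent slot name, used for continuation lines
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        header = HEADER.match(line)
        if header:
            current = objects.setdefault(header.group(1), {})
            last_slot = None
            continue
        if current is None:
            continue    # ignore anything before the first object header
        slot = SLOT.match(line)
        if slot:
            last_slot = slot.group(1)
            current.setdefault(last_slot, []).append(clean(slot.group(2)))
        elif last_slot is not None:
            current[last_slot].append(clean(line))   # continuation of previous slot
    return objects


def describe(objects):
    """List (person, new_status, post, organisation) for every succession event."""
    rows = []
    for obj_id, slots in objects.items():
        if not obj_id.startswith("SUCCESSION_EVENT"):
            continue
        org = objects[slots["SUCCESSION_ORG"][0]]["ORG_NAME"][0]
        post = slots["POST"][0]
        for io_id in slots["IN_AND_OUT"]:
            io = objects[io_id]
            person = objects[io["IO_PERSON"][0]]["PER_NAME"][0]
            rows.append((person, io["NEW_STATUS"][0], post, org))
    return rows


# Abridged excerpt of the template above (some slots omitted for brevity).
SAMPLE = """
<SUCCESSION_EVENT-9302190013-1> :=
SUCCESSION_ORG: <ORGANIZATION-9302190013-2>
POST: "president"
IN_AND_OUT: <IN_AND_OUT-9302190013-1>
<IN_AND_OUT-9302190013-2>
VACANCY_REASON: REASSIGNMENT
<IN_AND_OUT-9302190013-1> :=
IO_PERSON: <PERSON-9302190013-1>
NEW_STATUS: IN
<IN_AND_OUT-9302190013-2> :=
IO_PERSON: <PERSON-9302190013-2>
NEW_STATUS: OUT
<ORGANIZATION-9302190013-2> :=
ORG_NAME: "New York Times"
ORG_DESCRIPTOR: "its flagship New York Times newspaper"
/ "flagship New York Times newspaper"
<PERSON-9302190013-1> :=
PER_NAME: "Russell T. Lewis"
<PERSON-9302190013-2> :=
PER_NAME: "Lance R. Primis"
"""

if __name__ == "__main__":
    for row in describe(parse_templates(SAMPLE)):
        print(row)
```

Pointers are stored as plain object ids, so following one is a dictionary lookup. Every slot holds a list because slots such as CONTENT, IN_AND_OUT and ORG_DESCRIPTOR can carry several values.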




