GATE

Discovering, inter-relating and navigating cross-media campaign knowledge

Summary

Knowing how much money a competitor has invested in a specific media campaign is very important to the top management of companies. Such information is gathered through global advertisement expenditure measurement, which is performed by media monitoring companies. This type of business intelligence is a very complex task that is currently performed manually and is therefore very expensive.

A media campaign is defined as the set of all measures taken to fulfill a specific objective. MediaCampaign's scope is discovering, inter-relating and navigating cross-media campaign knowledge, and automating to a large degree the detection and tracking of media campaigns on television, on the Internet and in the press. The pilot system developed within the project focuses on one concrete type of media campaign: advertisement campaigns. However, the approach taken and the component implementations will allow easy setup of a system for monitoring and analysing other campaigns (e.g. political and social ones).

The main user groups of MediaCampaign are executives and analysts who need up-to-date information on how much money their competitors invest in different media and countries. Having this information quickly at hand enables decision makers to measure their competitors' activity and market performance and, last but not least, to budget and forecast their own advertisement campaigns more accurately.

To deliver this information as fast as possible, the MediaCampaign framework will significantly accelerate the currently manual process of acquiring the necessary data. Scientifically, there are a number of high-risk research objectives to meet: (1) creation of a knowledge model for the semantic description of media campaigns in general, (2) identification and tracking of new media campaigns in different media and (3) modeling of domain-specific ontologies which relate media campaigns across different media and countries.

MediaCampaign is funded as a Specific Targeted Research Project under the European Commission's 6th Framework Programme (FP6), with a budget of around 2.5 million euros. It starts in April 2006 and runs for 2.5 years. MediaCampaign is coordinated by Dr. Herwig Rehatschek, Joanneum Research, Austria.

Contact: Hamish Cunningham (PI).


Project Objectives:

MediaCampaign's scope is discovering, inter-relating and navigating cross-media campaign knowledge. A media campaign is defined as the set of all measures taken to fulfill a specific objective. For example, if the objective is to introduce a new car, the measures can include advertisements on TV, in the press, on the Internet and on radio, alongside magazine articles and interviews.

The project’s main goal is to automate to a large degree the detection and tracking of media campaigns on television, Internet and in the press. This will lead to new business cases in media monitoring and analysis, and positively impact the European advertising sector. In support of this goal, the project will address the following objectives:

For the pilot system developed within the project we will focus on one concrete type of media campaign: advertisement campaigns, e.g. the introduction of a new car model into the market. However, the system will be designed so that it can be extended to monitor and analyse other campaigns as well, such as political campaigns. Hence a major technical objective is to make the system architecture as flexible as possible by using well-defined interfaces and utilizing standards for all components where possible.

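Scientifically, there are a number of high-risk research objectives to meet:

  1. Creation of a knowledge model for the semantic description of media campaigns in general.
  2. Identification and tracking of new media campaigns in different media.
  3. Modeling of domain-specific ontologies which relate media campaigns across different media and countries.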

Our Role

Text analysis (press, Internet, speech transcript)
Text analysis in MediaCampaign will apply GATE to the results of televisual and image analyses, using the following algorithms:

  1. Speech recognition / OCR techniques derive text from the audiovisual and image materials.
  2. Term finding / unsupervised clustering decides which elements from step 1 are most significant.
  3. Significant terms from step 2 are used as part of a focussed web search targeting campaign-related materials.
  4. Information extraction (IE) processes the results from step 3 to spot related text-borne facts indicating whether the campaign is new or not.
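
Steps 2 and 3 above can be illustrated with a minimal sketch. It assumes TF-IDF ranking against a background collection as a simple stand-in for the term finding / unsupervised clustering stage; the project description does not say which method is used, and the function names and the sample text are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; a deliberately simple stand-in for real tokenisation."""
    return re.findall(r"[a-z0-9]+", text.lower())


def significant_terms(transcript, background_docs, top_n=3):
    """Rank the terms of an ASR/OCR transcript by TF-IDF (assumed method)."""
    tf = Counter(tokenize(transcript))
    doc_sets = [set(tokenize(d)) for d in background_docs]
    n_docs = len(doc_sets)
    scores = {}
    for term, freq in tf.items():
        # Document frequency: in how many background documents the term occurs.
        df = sum(1 for s in doc_sets if term in s)
        # Smoothed inverse document frequency, so unseen terms do not divide by zero.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[term] = freq * idf
    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:top_n]


def build_query(terms):
    """Join the selected terms into a plain keyword query for the web search (step 3)."""
    return " ".join(terms)
```

Terms that are frequent in the transcript but rare in the background collection rise to the top, so function words such as "the" tend to drop out of the generated query.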

The IE software itself is a hybrid system that combines a finite state transduction engine with an SVM-based probabilistic engine. The components have been entered in many of the leading evaluation forums for this technology over the last decade or so, including: MUC, the Message Understanding Conference; TREC QA, the Text Retrieval Conference Question Answering track; DUC, the Document Understanding Conference (summarisation); and the Pascal challenge on learning IE for the semantic web. Recent work has focused on using IE to populate ontologies (within the SEKT project), and this new approach will form the basis of the work in this project. For MediaCampaign the ontology-based IE system will be tailored to the domain of advertising campaigns and to the materials from ASR, video and image OCR and captioning, and it will target the MediaCampaign ontologies.

Product knowledge interlinking
The semantic analysis tools from WP4 produce results showing all mentions of ontology concepts in the different media (e.g., brand names appearing in newspaper, TV, and radio adverts). The next step, to be carried out in this task, is to interlink all these different mentions via their link to the ontology. In other words, this task will make available the composite knowledge model of the various campaign materials across media. This will be based on the preservation of links back into sources through the analysis stages and their linking to the shared domain ontology. We will use a hybrid statistical and rule-based system, which applies several SVM-based models to gain a probabilistic view of the significance of the various evidence sources, then selects which model to apply in different circumstances based on domain-specific heuristics.
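
The interlinking idea can be sketched as an index from ontology concepts to their mentions, with each mention keeping a link back to its source. This is only an illustration: the `Mention` fields, the function names and the URI format are assumptions, and the SVM-based weighting of evidence sources described above is not modelled.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    concept_uri: str  # ontology concept the mention was annotated with (hypothetical URI)
    medium: str       # e.g. "press", "tv", "radio"
    source_id: str    # identifier of the source document or clip
    offset: int       # position in the source, kept so the link back is preserved


def interlink(mentions):
    """Group mentions by ontology concept, keeping the links back to the sources."""
    index = defaultdict(list)
    for m in mentions:
        index[m.concept_uri].append(m)
    return dict(index)


def media_for(index, concept_uri):
    """Return the set of media in which a concept was mentioned."""
    return {m.medium for m in index.get(concept_uri, [])}
```

Because every mention carries its source identifier and offset, a user navigating from a concept can reach the original newspaper page or TV clip in which it appeared.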

Unification of two or more (partial) descriptions of an instance
For instance, there may be four descriptions of one and the same car model: two coming from different product databases (e.g. from the UK and Germany), one extracted from a press advertisement, and one from a TV spot.
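
A minimal sketch of such a unification follows, assuming each partial description is a flat mapping from attribute names to values. The representation, the function name and the policy of reporting conflicts rather than resolving them are all assumptions made for illustration.

```python
def unify(descriptions):
    """Merge partial descriptions of one instance.

    Returns (merged, conflicts). merged holds the attributes on which all
    sources that mention them agree; conflicts maps an attribute to the set
    of differing values so that a later step can resolve them.
    """
    seen = {}
    for desc in descriptions:
        for key, value in desc.items():
            seen.setdefault(key, set()).add(value)
    merged = {k: next(iter(v)) for k, v in seen.items() if len(v) == 1}
    conflicts = {k: v for k, v in seen.items() if len(v) > 1}
    return merged, conflicts
```

Attributes supplied by only one source are carried over unchanged, while disagreements between, say, two product databases are surfaced instead of being silently overwritten.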

Knowledge fusion for campaign discovery (WP5)
Tracking of campaigns is based on the association of media presence events (e.g. a particular ad) with campaigns that are already running. A combination of "hard" ontology-based retrieval and "soft" SVM classification will be used. In more detail, the approach combines formal reasoning on the basis of the semantic information on the one hand with context-based evidence from the media on the other. Formal reasoning will be applied to the semantic content in the media repository and will use identity criteria and axioms. Context-based evidence from the media will be used to build a model of the co-occurring entities and terms; a context-based similarity metric will then be computed to determine identity. As we are dealing with both text and images, the context model will include features derived from both.
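
The combination of a hard constraint with soft context evidence might look roughly like the sketch below. It assumes an exact brand match as the "hard" ontology-based filter and cosine similarity over bags of co-occurring terms as the "soft" part; cosine similarity is swapped in here for the SVM classification named above, and the data layout, function names and threshold are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors held as Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def assign_campaign(event, campaigns, threshold=0.5):
    """Associate a media presence event with a running campaign.

    event: {"brand": str, "context": list of co-occurring terms}
    campaigns: mapping from campaign name to a dict of the same shape.
    Returns the best-matching campaign name, or None if nothing passes.
    """
    event_ctx = Counter(event["context"])
    best_name, best_score = None, 0.0
    for name, camp in campaigns.items():
        # "Hard" step: the identity criterion must hold (assumed: same brand).
        if camp["brand"] != event["brand"]:
            continue
        # "Soft" step: similarity of the surrounding context.
        score = cosine(event_ctx, Counter(camp["context"]))
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= threshold else None
```

A result of `None` means no running campaign matched well enough, which in this sketch is the signal that the event may belong to a new campaign.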


Funding: