Social media is fast becoming a crucial part of our everyday lives, not only as a fun and practical way to share our interests and activities with geographically distributed networks of friends, but also as an important part of our business lives. On the one hand, it affords companies a highly effective means of promotion and advertising, and supports market watch activities to keep an eye on competitors and collaborators, while on the other hand, it enables companies and institutions to acquire valuable feedback by analysing what their customers have to say. All kinds of predictions can be made based on such knowledge, from gauging current political opinions and predicting stock price movements relative to public mood, to the more frivolous but still highly lucrative predictions about Oscar winners and film revenues. Processing social media is particularly problematic for NLP tools, firstly because it is a strong departure from the newswire tradition that many tools were developed with and evaluated against, and secondly because of the terse and low-context language it typically comprises. Effective opinion mining from social media is still very much a hot research topic rather than a solved problem.
This tutorial will address these needs by introducing some of the problems faced when using NLP tools on social media, together with solutions to these problems. It will demonstrate techniques for extracting the relevant information from unstructured text in social media, so that participants will be equipped with the necessary building blocks of knowledge to build their own tools and tackle complex issues. The tutorial will cover state-of-the-art research for important subtasks. Since all of the NLP tools to be presented are open source, the tutorial will provide attendees with skills which are easy to apply and do not require special software or licenses.
The tutorial will be divided into 3 sections, as follows:
The tutorial will run from 9am-1.30pm. There will be a coffee break around 11am.
The target audience will consist of researchers from any background looking to perform analysis of social media. No particular skills or knowledge are necessary, but an understanding of basic natural language processing concepts and techniques is useful, as is a general familiarity with Twitter, Facebook and other social media.
Dr Diana Maynard is a Research Fellow at the University of Sheffield, UK. She has a PhD in Automatic Term Recognition from Manchester Metropolitan University, and has been involved in research in NLP since 1994. Her main interests are in Information Extraction, opinion mining, social media and Semantic Web technology. Since 2000 she has led the development of USFD's open-source multilingual Information Extraction tools, and has led research teams on a number of UK and EU projects. She is chair of the annual GATE training courses, teaches modules on Advanced Information Extraction, opinion mining and social media analysis, and leads the GATE consultancy on Information Extraction and opinion mining. She has published extensively, organised a number of national and international conferences, workshops and tutorials, and given keynote speeches, invited talks, tutorials, lectures and courses on a number of NLP topics, including information extraction, opinion mining and social media analysis, at international NLP and Semantic Web conferences.
Dr Leon Derczynski is a Research Associate in the Natural Language Processing Group at the University of Sheffield, where he has worked for five years. He holds a PhD in Computational Linguistics from the University of Manchester, and has worked in computational linguistics and natural language processing for 10 years. He has published scientific papers and reviewed conference and journal papers, and regularly teaches on GATE courses and tutorials. He has worked on the implementation of linguistic annotation standards and the development of NLP applications for information extraction, opinion mining and the semantic web, particularly in national and international research projects.