Anglo-Irish Politics Monitor

This service is designed to enrich text with entities related to Anglo-Irish politics. The application builds on previous work which focused on UK politics but has been extended significantly.

This version currently recognises the following political concepts/entities:

The application also includes the entities from the ANNIE enrichment service, but has again been expanded to include extra entities relevant to Anglo-Irish politics, specifically:

The application also builds upon our pre-existing TwitIE service to enable sensible processing of social media text as well as more normal prose. This means we recognise URLs, Hashtags, and Twitter UserIDs. This has again been augmented so that the accounts of politicians (those outlined in the list above) can be recognised and tagged appropriately.

The complete set of available annotations is as follows:

This work to extend the service to cover Irish politics has been produced as part of the EDMO Ireland project: https://edmohub.ie/

1. New Information Sources

As discussed above, the text enrichment envisaged for the EDMO Ireland project has required the collection of a number of new data sets from which we can build the relevant tools. These new data sets fall into one of two categories: a data set is either relatively static (e.g. the names of towns) or more dynamic (e.g. the names of the currently elected politicians). The rest of this section briefly discusses both types of data set and the specific sources we have included within the new text enrichment application.

1.1. Static Data Sets

Static data sets are usually the easier type to collect, often because someone else has already done the hard work. For this application, we’ve collected three different sets of data which fall into this category: place names, political institutions, and far-right hate groups.

1.1.1. Place Names

The application already includes ANNIE which contains a pretty good gazetteer for British place names, but only covers a few major towns/cities in Northern Ireland and the Republic of Ireland.

Extending this to cover more of the places in Northern Ireland is very easy as the government provides such a gazetteer (https://www.opendatani.gov.uk/@land-property/osni-open-data-gazetteer-place-names1) under the Open Government Licence (https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).

Unfortunately, there does not seem to be an equivalent gazetteer for the Republic of Ireland (at least not one that is free and easily accessible). Fortunately Wikipedia has a single page listing the towns and villages within the Republic of Ireland (https://en.wikipedia.org/wiki/List_of_towns_and_villages_in_the_Republic_of_Ireland) which could be easily turned into a gazetteer.
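Turning a list of place names into a gazetteer list is mostly a matter of cleaning and deduplicating the names. The following is a minimal sketch, assuming a plain one-entry-per-line list file; the function name and the sample names are illustrative only and are not taken from the application itself.

```python
def build_gazetteer_list(names):
    """Return the text of a one-entry-per-line gazetteer list."""
    cleaned = set()
    for name in names:
        # Collapse runs of whitespace and drop empty entries.
        entry = " ".join(name.split())
        if entry:
            cleaned.add(entry)
    # Sort for stable, diff-friendly output.
    return "".join(entry + "\n" for entry in sorted(cleaned))


# Illustrative input: duplicates, stray whitespace and an empty entry.
place_names = ["Athlone", "  Ballina ", "Athlone", "", "Carrick-on-Shannon"]
print(build_gazetteer_list(place_names), end="")
```

Scraping the names from the source page is deliberately left out, as that step depends on the layout of the page at the time it is collected.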

1.1.2. Political Organizations

ANNIE already had a “government” category for Organizations, but this didn’t include the Irish parliament. For this we’ve taken the structure from this Wikipedia page: https://en.wikipedia.org/wiki/Oireachtas

Also missing was a comprehensive set of EU institution names. Fortunately, the EU provides a number of web pages which document all the main institutions and agencies: https://european-union.europa.eu/institutions-law-budget/institutions-and-bodies/types-institutions-and-bodies_en

1.1.3. Far-Right Hate Groups

Whilst new far-right hate groups do appear, it is fortunate that this only happens periodically, and as such we are viewing this as a static data set. Wikipedia maintains a long list of British-based groups which it was trivial to turn into a gazetteer for use within the app. It was hard to find any details on Irish groups, partially as there aren’t very many. What details we have been able to include come from a report by Civic Nation, which also discusses why there are so few groups: https://civic-nation.org/ireland/society/radical_right-wing_political_parties_and_groups/

1.2. Dynamic Data Sets

The majority of the dynamic data sets we are interested in relate to people. Specifically, politicians change on a regular basis via elections (both scheduled elections and by-elections) and we would like to keep our application up to date in the easiest way possible. Previously we had been taking data from EveryPolitician (https://everypolitician.org) but the project was frozen in June of 2019. Since then a lot of the data has been added to Wikidata. Whilst it is great that the data has not been lost, it is no longer quite as easy to access. EveryPolitician provided easy-to-use CSV files for each parliamentary session which could be turned into a gazetteer with little to no work. Given Wikidata’s more general UI, such files are no longer available.

Fortunately as well as being able to browse Wikidata you can run SPARQL queries against it to select subsets of data which you can then download as CSV files. So for our purposes we simply need to build a query per gazetteer we wish to generate.
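The download step can also be scripted rather than done through the browser. The sketch below is one possible approach, not necessarily how the application's lists were produced: it assumes the public query service endpoint at https://query.wikidata.org/sparql returns CSV when asked via the Accept header, and the function names and User-Agent string are made up for illustration.

```python
import urllib.parse
import urllib.request

# Assumed endpoint of the public Wikidata query service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_csv_request(sparql, user_agent="gazetteer-builder-example/0.1"):
    """Build a GET request asking the query service for CSV results."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode({"query": sparql})
    # Ask for CSV and identify the client with a (hypothetical) User-Agent.
    headers = {"Accept": "text/csv", "User-Agent": user_agent}
    return urllib.request.Request(url, headers=headers)


def fetch_csv(sparql):
    """Run the query and return the CSV response as text (needs network)."""
    with urllib.request.urlopen(build_csv_request(sparql), timeout=60) as resp:
        return resp.read().decode("utf-8")
```

Calling fetch_csv with the text of a query should return the same rows that the query UI offers as a CSV download, which can then be saved or passed straight on to whatever builds the gazetteer.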

It wouldn’t be appropriate to include every SPARQL query within this description, but here is one example to explain how we get the information, as well as how minor tweaks can easily allow us to collect data for other sets of politicians should the need arise. This specific example is for locating all the current members of the 33rd sitting of the Dáil (the lower house of Ireland’s parliament).

 SELECT DISTINCT ?itemLabel ?groupLabel ?districtLabel ?twitter_handle ?twitter_id ?genderLabel WHERE {
   # P39 = position held, P2937 = parliamentary term
   ?item p:P39 ?mem .
   ?mem ps:P39 wd:Q654291 ; pq:P2937 wd:Q85677302 .
   OPTIONAL { ?item wdt:P21 ?gender }     # P21 = sex or gender
   OPTIONAL { ?mem pq:P768  ?district }   # P768 = electoral district
   OPTIONAL { ?mem pq:P4100 ?group }      # P4100 = parliamentary group
   # P2002 = Twitter username, P6552 = Twitter numeric user ID
   OPTIONAL { ?item p:P2002 ?twitter .
              ?twitter ps:P2002 ?twitter_handle .
              ?twitter pq:P6552 ?twitter_id }
   SERVICE wikibase:label {
     bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
   }
   # P582 = end time; skip memberships that have already ended
   FILTER NOT EXISTS { ?mem pq:P582 ?end }
 } ORDER BY ?itemLabel

Whilst this query may look rather cryptic (especially given the use of numeric predicates), it can essentially be broken down into three main parts. It first selects all the people who are members of the 33rd sitting (ps:P39 wd:Q654291 means they hold the position of member of the Irish parliament, and pq:P2937 wd:Q85677302 limits this to the 33rd sitting). Having found the correct set of people, we then extract the information about them that we want: gender, political party, the constituency they represent, and their Twitter account. This information is all optional, so the resulting sheet is only guaranteed to have the person's name in the first column. Finally, we ask the label service for human-readable labels (in the language of the query UI, falling back to English) and ignore anyone who is no longer in the parliament (i.e. an end date has been given for their membership).
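To illustrate how the downloaded CSV might become a gazetteer, here is a minimal sketch. It assumes the CSV header uses the query's variable names, and that each gazetteer entry is a name followed by feature=value pairs; the tab separator, the feature names and the sample rows are all placeholders rather than details of the actual application.

```python
import csv
import io

# Hypothetical mapping from CSV column to gazetteer feature name.
FEATURE_COLUMNS = {
    "groupLabel": "party",
    "districtLabel": "constituency",
    "twitter_handle": "twitter",
    "genderLabel": "gender",
}


def csv_to_gazetteer_lines(csv_text, separator="\t"):
    """Convert query results into 'name<sep>feature=value...' lines."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("itemLabel") or "").strip()
        if not name:
            continue  # the name is the only column we rely on
        parts = [name]
        for column, feature in FEATURE_COLUMNS.items():
            value = (row.get(column) or "").strip()
            if value:  # optional columns may be empty
                parts.append(f"{feature}={value}")
        lines.append(separator.join(parts))
    return lines


# Placeholder rows, not real query output.
sample = (
    "itemLabel,groupLabel,districtLabel,twitter_handle,twitter_id,genderLabel\n"
    "Example Person,Example Party,Example District,example_handle,123,female\n"
    "Another Person,,,,,\n"
)
for line in csv_to_gazetteer_lines(sample):
    print(line)
```

Because every column other than the name is optional in the query, the sketch only emits a feature when the corresponding cell is non-empty.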

Whilst this query is clearly not easy to read, the Wikidata query UI (https://query.wikidata.org/) does help by providing tooltips over the different identifiers. Hopefully you can imagine how, given this query, we could modify it for a different session of the Dáil or even for a different legislative body; for example, we use an almost identical query to extract the set of current MEPs into a gazetteer as well. The specific queries used for each gazetteer list generated this way are saved along with the app and are available on request.
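One way to make such modifications less error-prone is to treat the query as a template in which only the position and the parliamentary term change. The sketch below assumes this is the only difference between queries, which may not hold for every legislative body; the function name is hypothetical and the identifiers used in the usage line are the two from the example above.

```python
import re

QUERY_TEMPLATE = """\
SELECT DISTINCT ?itemLabel ?groupLabel ?districtLabel ?twitter_handle ?twitter_id ?genderLabel WHERE {{
  ?item p:P39 ?mem .
  ?mem ps:P39 wd:{position} ; pq:P2937 wd:{term} .
  OPTIONAL {{ ?item wdt:P21 ?gender }}
  OPTIONAL {{ ?mem pq:P768  ?district }}
  OPTIONAL {{ ?mem pq:P4100 ?group }}
  OPTIONAL {{ ?item p:P2002 ?twitter .
             ?twitter ps:P2002 ?twitter_handle .
             ?twitter pq:P6552 ?twitter_id }}
  SERVICE wikibase:label {{
    bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en".
  }}
  FILTER NOT EXISTS {{ ?mem pq:P582 ?end }}
}} ORDER BY ?itemLabel
"""

# Wikidata item identifiers look like Q followed by digits.
_QID = re.compile(r"Q[1-9][0-9]*")


def build_membership_query(position, term):
    """Fill in the position and parliamentary term item identifiers."""
    for qid in (position, term):
        if not _QID.fullmatch(qid):
            raise ValueError(f"not a Wikidata item identifier: {qid!r}")
    return QUERY_TEMPLATE.format(position=position, term=term)


# The two identifiers from the example query above.
dail_query = build_membership_query("Q654291", "Q85677302")
```

Checking the identifiers before substitution keeps stray text out of the query; the identifiers for any other body or session would need to be looked up on Wikidata first.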