Anglo-Irish Politics Monitor
- Service Description: https://cloud.gate.ac.uk/shopfront/displayItem/anglo-irish-politics
- REST API Endpoint: https://cloud-api.gate.ac.uk/process/anglo-irish-politics
- REST API Documentation: https://cloud.gate.ac.uk/info/help/online-api.html
This service is designed to enrich text with entities related to Anglo-Irish politics. The application builds on previous work which focused on UK politics but has been extended significantly.
This version currently recognises the following political concepts/entities:
- Members of the 2015, 2017, and 2019 UK parliaments
- Candidates for the 2017 and 2019 UK parliamentary elections
- Members of the 2020 Irish Dáil Éireann
- Members of the 2022 Northern Ireland Assembly
- Members of the 2014 and 2019 EU parliaments
- Irish political institutions
- EU institutions
- Political topics
- UK far-right groups
- Irish far-right groups
The application also includes the entities from the ANNIE enrichment service, but has again been expanded to include extra entities relevant to Anglo-Irish politics, specifically:
- Place names for both Northern Ireland and the Republic of Ireland
The application also builds upon our pre-existing TwitIE service to enable sensible processing of social media text as well as more normal prose. This means we recognise URLs, Hashtags, and Twitter UserIDs. This has again been augmented so that the accounts of politicians (those outlined in the list above) can be recognised and tagged appropriately.
The complete set of available annotations is as follows:
- :Topic — Mentions of topics relevant to politics. The topic list was initially based largely on the topic classification used on gov.uk, although the terms within each topic are no longer solely UK-focused.
- :Politician — This is a general catch-all annotation that covers any politician recognised by the service. Each annotation will have a minorType feature recording the type of politician. For example, someone elected to the UK parliament in 2019 would have the value mp58, as they are members of the 58th sitting of the UK parliament.
- :MP — The subset of people annotated Politician who are UK members of parliament. This includes both current and former MPs.
- :DailMP — The subset of people annotated Politician who are members of the Irish Dáil Éireann.
- :StormontMLA — The subset of people annotated Politician who are members of the Northern Ireland Assembly.
- :EuroparlMEP — The subset of people annotated Politician who are members of the EU parliament.
- :Party — Political parties: currently includes all parties from both the UK (including Northern Ireland) and the Republic of Ireland
- :Group — Groups which are political in nature but are not political parties. This currently includes far-right groups from both the UK and the Republic of Ireland
- :Organization — Entities found by GATE's ANNIE named entity recogniser. This has been extended to cover Irish and EU institutions and these will have the orgType feature ‘government’.
- :Location — Entities found by GATE's ANNIE named entity recogniser, extended to further cover places within Northern Ireland and the Republic of Ireland.
- :Person — Entities found by GATE's ANNIE named entity recogniser.
- :Date — Entities found by GATE's ANNIE named entity recogniser.
- :Hashtag — Entities found by the TwitIE enrichment service.
- :UserID — Entities found by the TwitIE enrichment service.
- :URL — Entities found by the TwitIE enrichment service.
- :Sentence — Entities found by GATE's ANNIE named entity recogniser.
- :Tweet — If documents are loaded from Tweet JSON this will include the original metadata.
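As a rough illustration of how these annotations might be consumed, the Python sketch below prepares a plain-text POST request for the endpoint listed at the top of this page and groups Politician annotations by their minorType feature. The response shape it assumes (a "text" string plus an "entities" object keyed by annotation type, each entry carrying an "indices" pair alongside its features) is an assumption that should be checked against the REST API documentation linked above. The helper names and the sample data are hypothetical and are not real service output.

```python
import json
import urllib.request
from collections import defaultdict

ENDPOINT = "https://cloud-api.gate.ac.uk/process/anglo-irish-politics"


def build_request(text, endpoint=ENDPOINT):
    """Prepare (but do not send) a POST request carrying plain text."""
    return urllib.request.Request(
        endpoint,
        data=text.encode("utf-8"),
        headers={
            "Content-Type": "text/plain; charset=UTF-8",
            "Accept": "application/json",
        },
        method="POST",
    )


def annotate(text):
    """Send the text to the service and decode the JSON reply (needs network)."""
    with urllib.request.urlopen(build_request(text)) as reply:
        return json.load(reply)


def politicians_by_minor_type(response):
    """Group the covered text of each Politician annotation by minorType.

    Assumes the response shape described in the lead-in paragraph.
    """
    text = response.get("text", "")
    grouped = defaultdict(list)
    for ann in response.get("entities", {}).get("Politician", []):
        start, end = ann["indices"]
        grouped[ann.get("minorType", "unknown")].append(text[start:end])
    return dict(grouped)


# Hand-written sample in the assumed response shape (not real service output).
sample = {
    "text": "Jane Example spoke in the Commons.",
    "entities": {"Politician": [{"indices": [0, 12], "minorType": "mp58"}]},
}
grouped_sample = politicians_by_minor_type(sample)
```

Keeping the request construction separate from the parsing means the grouping logic can be tried on saved responses without calling the service.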
This work to extend the service to cover Irish politics was produced as part of the EDMO Ireland project: https://edmohub.ie/
1. New Information Sources
As discussed above, the text enrichment envisaged for the EDMO Ireland project has required the collection of a number of new data sets from which we can build the relevant tools. These new data sets fall into one of two categories: a data set is either relatively static (e.g. the names of towns) or more dynamic (e.g. the names of the currently elected politicians). The rest of this section briefly discusses both types of data set and the specific sources we have included within the new text enrichment application.
1.1. Static Data Sets
Static data sets are usually the easier type to collect, often because someone else has already done the hard work. For this application, we’ve collected three different sets of data which fall into this category: place names, political institutions, and far-right hate groups.
1.1.1. Place Names
The application already includes ANNIE which contains a pretty good gazetteer for British place names, but only covers a few major towns/cities in Northern Ireland and the Republic of Ireland.
Extending this to cover more of the places in Northern Ireland is very easy, as the government provides such a gazetteer (https://www.opendatani.gov.uk/@land-property/osni-open-data-gazetteer-place-names1) under the Open Government Licence (https://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/).
Unfortunately, there does not seem to be an equivalent gazetteer for the Republic of Ireland (at least not one that is free and easily accessible). Fortunately Wikipedia has a single page listing the towns and villages within the Republic of Ireland (https://en.wikipedia.org/wiki/List_of_towns_and_villages_in_the_Republic_of_Ireland) which could be easily turned into a gazetteer.
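To make the "turned into a gazetteer" step concrete, here is a minimal Python sketch, assuming the place names have already been extracted into a list of strings (the scraping of the Wikipedia page is not shown). It relies on the plain-text GATE gazetteer list format of one entry per line. The function names and the output path are hypothetical, and this is not necessarily how the lists shipped with the application were produced.

```python
def to_gazetteer_lines(names):
    """Return sorted, de-duplicated entries, one place name per line.

    Whitespace inside each name is normalised so that "Dún  Laoghaire"
    and "Dún Laoghaire" collapse to a single entry.
    """
    cleaned = {" ".join(name.split()) for name in names}
    cleaned.discard("")  # drop blank rows left over from extraction
    return sorted(cleaned)


def write_gazetteer(names, path):
    """Write the entries to a UTF-8 .lst file (path is chosen by the caller)."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(to_gazetteer_lines(names)) + "\n")


# Illustrative input only; the real list would come from the source page.
example_entries = to_gazetteer_lines(["Athlone", " Cork ", "Athlone", ""])
```

A list file like this would then be registered in the gazetteer's index file (lists.def) with a major and minor type, for example a hypothetical entry such as ireland_places.lst:location:city.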
1.1.2. Political Organizations
ANNIE already had a “government” category for Organizations, but this didn’t include the Irish parliament. For this we’ve taken the structure from this Wikipedia page: https://en.wikipedia.org/wiki/Oireachtas
Also missing was a comprehensive set of EU institution names. Fortunately, the EU provides a number of web pages which document all the main institutions and agencies: https://european-union.europa.eu/institutions-law-budget/institutions-and-bodies/types-institutions-and-bodies_en
1.1.3. Far-Right Hate Groups
Whilst new far-right hate groups do appear, this fortunately only happens periodically, and as such we are treating this as a static data set. Wikipedia maintains a long list of British-based groups which it was trivial to turn into a gazetteer for use within the app. It was hard to find any details on Irish groups, partly because there aren’t very many. The details we have been able to include come from a report by Civic Nation, which also discusses why there are so few groups: https://civic-nation.org/ireland/society/radical_right-wing_political_parties_and_groups/
1.2. Dynamic Data Sets
The majority of the dynamic data sets we are interested in relate to people. Specifically, politicians change on a regular basis via elections (both scheduled elections and by-elections) and we would like to keep our application up to date in the easiest way possible. Previously we had been taking data from EveryPolitician (https://everypolitician.org) but the project was frozen in June of 2019. Since then a lot of the data has been added to Wikidata. Whilst it is great that the data has not been lost, it is no longer quite as easy to access. EveryPolitician provided easy-to-use CSV files for each parliamentary session which could be turned into a gazetteer with little to no work. Given Wikidata’s more general UI, such files are no longer available.
Fortunately as well as being able to browse Wikidata you can run SPARQL queries against it to select subsets of data which you can then download as CSV files. So for our purposes we simply need to build a query per gazetteer we wish to generate.
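The step from a downloaded CSV file to a gazetteer list might look something like the Python sketch below. It assumes the CSV has an itemLabel column holding the person's name, with any other non-empty columns written out as name=value features after a tab separator (GATE gazetteer lists allow features of this kind once a feature separator is configured). The function name, the sample row, and the choice to reuse column names as feature names are illustrative assumptions, not a description of the scripts actually used.

```python
import csv
import io


def csv_to_gazetteer(csv_text, separator="\t"):
    """Turn CSV rows into gazetteer lines of the form name<sep>key=value..."""
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        name = (row.get("itemLabel") or "").strip()
        if not name:
            continue  # the name is the only column we require
        features = [
            f"{key}={value.strip()}"
            for key, value in row.items()
            # skip the name column, overflow cells (key None) and empty values
            if key not in (None, "itemLabel")
            and isinstance(value, str)
            and value.strip()
        ]
        lines.append(separator.join([name] + features))
    return lines


# Made-up row for illustration; real rows would come from the query download.
sample_csv = (
    "itemLabel,groupLabel,districtLabel,twitter_handle\n"
    "Jane Example,Example Party,Example North,\n"
)
sample_lines = csv_to_gazetteer(sample_csv)
```

Skipping empty cells matters here because most of the columns come from optional data, so many rows will have gaps.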
It wouldn’t be appropriate to include every SPARQL query within this description, but here is one example, to explain how we get the information, as well as how minor tweaks can easily allow us to collect data for other sets of politicians should the need arise. This specific example is for locating all the current members of the 33rd sitting of the Dáil (Ireland’s parliament).
    SELECT DISTINCT ?itemLabel ?groupLabel ?districtLabel ?twitter_handle ?twitter_id ?genderLabel
    WHERE {
      ?item p:P39 ?mem .
      ?mem ps:P39 wd:Q654291 ;
           pq:P2937 wd:Q85677302 .
      OPTIONAL { ?item wdt:P21 ?gender }
      OPTIONAL { ?mem pq:P768 ?district }
      OPTIONAL { ?mem pq:P4100 ?group }
      OPTIONAL {
        ?item p:P2002 ?twitter .
        ?twitter ps:P2002 ?twitter_handle .
        ?twitter pq:P6552 ?twitter_id
      }
      SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
      FILTER NOT EXISTS { ?mem pq:P582 ?end }
    }
    ORDER BY ?itemLabel
Whilst this query may look rather cryptic (especially given the use of numeric predicates), it can essentially be broken down into three main parts. It first selects all the people who are members of the 33rd sitting (ps:P39 wd:Q654291 means they are a member of the Irish parliament, and pq:P2937 wd:Q85677302 limits this to the 33rd sitting). Having found the correct set of people, we then extract all the information about them that we want, such as gender, political party, which constituency they represent, and their Twitter account. This information is all optional, so the resulting sheet is only guaranteed to have the person's name in the first column. Finally, we state that we want English labels (rather than any translations) and that we want to ignore anyone who is no longer in the parliament (i.e. an end date has been given for them leaving).
Whilst this query is clearly not easy to read, the Wikidata query UI (https://query.wikidata.org/) does help by providing tooltips over the different identifiers. Hopefully you can imagine how, given this query, we could modify it for a different session of the Dáil or even for a different legislative body; for example, we use an almost identical query to extract the set of current MEPs into a gazetteer as well. The specific queries used for each gazetteer list generated this way are saved along with the app and are available on request.
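One way to picture the "minor tweaks" is to treat the two Wikidata identifiers as parameters. The Python sketch below templates the query from this section on a position item and a parliamentary-term item, and prepares a request to the public Wikidata SPARQL endpoint asking for CSV. The function names and the User-Agent string are hypothetical, and the saved queries that ship with the app may well be maintained differently.

```python
import re
import urllib.parse
import urllib.request
from string import Template

# The query from this section, with the two item IDs turned into placeholders.
QUERY_TEMPLATE = Template("""\
SELECT DISTINCT ?itemLabel ?groupLabel ?districtLabel ?twitter_handle ?twitter_id ?genderLabel
WHERE {
  ?item p:P39 ?mem .
  ?mem ps:P39 wd:$position ;
       pq:P2937 wd:$term .
  OPTIONAL { ?item wdt:P21 ?gender }
  OPTIONAL { ?mem pq:P768 ?district }
  OPTIONAL { ?mem pq:P4100 ?group }
  OPTIONAL {
    ?item p:P2002 ?twitter .
    ?twitter ps:P2002 ?twitter_handle .
    ?twitter pq:P6552 ?twitter_id
  }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  FILTER NOT EXISTS { ?mem pq:P582 ?end }
}
ORDER BY ?itemLabel
""")

QID = re.compile(r"Q[1-9]\d*")


def build_query(position, term):
    """Fill in the position held and the parliamentary term (both Wikidata QIDs)."""
    for qid in (position, term):
        if not QID.fullmatch(qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    return QUERY_TEMPLATE.substitute(position=position, term=term)


def build_csv_request(query):
    """Prepare (but do not send) a GET request that asks for CSV results."""
    url = "https://query.wikidata.org/sparql?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url,
        headers={
            "Accept": "text/csv",
            # Placeholder value: identify your own tool and contact details here.
            "User-Agent": "gazetteer-builder-example/0.1",
        },
    )


# The two IDs used in the example query: members of the 33rd Dáil.
dail_query = build_query("Q654291", "Q85677302")
```

Swapping in the item IDs for a different position and term would, under these assumptions, be the only change needed to target another legislative body; the request can then be sent with urllib.request.urlopen and the body saved as a CSV file.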