This document outlines an approach for getting basic information about all members of the European Parliament (MEPs). The data collected will then be fed to the Quote Finder.

First, let’s get the list of all MEPs, which the European Parliament provides in a convenient XML format.

all_MEPs <- XML::xmlToDataFrame(doc = "http://www.europarl.europa.eu/meps/en/full-list/xml",
                                stringsAsFactors = FALSE)

These are the fields provided:

colnames(all_MEPs)
## [1] "fullName"               "country"                "politicalGroup"        
## [4] "id"                     "nationalPoliticalGroup"
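To see what the conversion does without contacting the server, here is a minimal sketch that parses a small inline sample. The element names `<meps>` and `<mep>` and both records are hypothetical; only the five field names come from the output above.

```r
# Hypothetical two-record sample imitating the five fields listed above.
# The wrapper element names (<meps>, <mep>) are assumptions.
sample_xml <- '<meps>
  <mep>
    <fullName>Jane DOE</fullName>
    <country>Malta</country>
    <politicalGroup>Renew Europe Group</politicalGroup>
    <id>1</id>
    <nationalPoliticalGroup>Independent</nationalPoliticalGroup>
  </mep>
  <mep>
    <fullName>John ROE</fullName>
    <country>Spain</country>
    <politicalGroup>Non-attached Members</politicalGroup>
    <id>2</id>
    <nationalPoliticalGroup>Independent</nationalPoliticalGroup>
  </mep>
</meps>'

# Parse the string, then turn each child of the root node into one row.
parsed <- XML::xmlParse(sample_xml, asText = TRUE)
sample_MEPs <- XML::xmlToDataFrame(doc = parsed, stringsAsFactors = FALSE)

# All columns come back as character; convert the id explicitly if needed.
sample_MEPs$id <- as.integer(sample_MEPs$id)
colnames(sample_MEPs)
```

Each child element of a record becomes a column, which is why the real file yields exactly the five columns shown above.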

And this is what the actual data looks like:

pander::pander(head(all_MEPs))
| fullName | country | politicalGroup | id | nationalPoliticalGroup |
|---|---|---|---|---|
| Magdalena ADAMOWICZ | Poland | Group of the European People’s Party (Christian Democrats) | 197490 | Independent |
| Asim ADEMOV | Bulgaria | Group of the European People’s Party (Christian Democrats) | 189525 | Citizens for European Development of Bulgaria |
| Isabella ADINOLFI | Italy | Non-attached Members | 124831 | Movimento 5 Stelle |
| Matteo ADINOLFI | Italy | Identity and Democracy Group | 197826 | Lega |
| Alex AGIUS SALIBA | Malta | Group of the Progressive Alliance of Socialists and Democrats in the European Parliament | 197403 | Partit Laburista |
| Mazaly AGUILAR | Spain | European Conservatives and Reformists Group | 198096 | VOX |

This is useful information, but it does not include the MEPs’ social media accounts, which are listed only on each MEP’s individual page. They can be extracted from there, although some minor data cleaning is needed.

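library("castarter")

# Set up a castarter project for the EP website
SetCastarter(project = "ep", website = "meps")
# One URL per MEP, built from the numeric id in the list above
index_links <- paste0("http://www.europarl.europa.eu/meps/en/", all_MEPs$id)
CreateFolders()
# Download the individual page of each MEP
DownloadContents(links = index_links, type = "index")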
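Once the pages are stored locally, the extraction and cleaning step could look roughly like the sketch below. It works on a hypothetical HTML fragment, since the actual markup of the EP pages is not shown here and may differ; it uses `xml2` rather than castarter's own extraction functions.

```r
# Hypothetical fragment of an MEP page; the real markup may differ.
sample_page <- '<html><body>
  <a href="https://twitter.com/Example_MEP/">Twitter</a>
  <a href="https://www.facebook.com/example.mep">Facebook</a>
  <a href="mailto:someone@example.org">E-mail</a>
</body></html>'

# Collect the target of every link on the page.
page <- xml2::read_html(sample_page)
hrefs <- xml2::xml_attr(xml2::xml_find_all(page, ".//a"), "href")

# Keep only links pointing to Twitter.
twitter_links <- hrefs[grepl("twitter\\.com/", hrefs, ignore.case = TRUE)]

# Minor cleaning: drop the URL prefix, then anything after the handle
# (trailing slash, query string, fragment).
handles <- sub("^https?://(www\\.)?twitter\\.com/", "", twitter_links,
               ignore.case = TRUE)
handles <- sub("[/?#].*$", "", handles)
handles
```

Pages without a Twitter link simply yield an empty vector, which makes it easy to flag MEPs with no listed account later on.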

Now, as it’s both more common and more practical to refer to EP groups by the shortened version of their name, we’ll need to match the long version provided on the official website with the standard short version.

| GroupLong | GroupShort |
|---|---|
| Group of the European People’s Party (Christian Democrats) | EPP |
| Non-attached Members | NA |
| Identity and Democracy Group | ID |
| Group of the Progressive Alliance of Socialists and Democrats in the European Parliament | S&D |
| European Conservatives and Reformists Group | ECR |
| Group of the Greens/European Free Alliance | Greens–EFA |
| Renew Europe Group | RE |
| Confederal Group of the European United Left - Nordic Green Left | GUE-NGL |

readr::write_rds(x = as.list(groups$GroupShort), path = "EPGroupShort.rds")
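The long-to-short matching described above can be sketched with a simple lookup. The `groups` data frame here is a shortened copy of the table (the code that originally built it is not shown), and the `meps` rows are hypothetical.

```r
# Shortened copy of the lookup table; column names follow the table above.
groups <- data.frame(
  GroupLong = c("Identity and Democracy Group",
                "Renew Europe Group",
                "European Conservatives and Reformists Group"),
  GroupShort = c("ID", "RE", "ECR"),
  stringsAsFactors = FALSE
)

# Hypothetical MEP rows, including one group missing from the lookup.
meps <- data.frame(
  fullName = c("MEP A", "MEP B", "MEP C"),
  politicalGroup = c("Renew Europe Group",
                     "Identity and Democracy Group",
                     "Some unlisted group"),
  stringsAsFactors = FALSE
)

# match() returns the row of the lookup table for each MEP, or NA if absent.
meps$GroupShort <- groups$GroupShort[match(meps$politicalGroup, groups$GroupLong)]

# Long names with no short version, to be added to the lookup by hand.
unmatched <- unique(meps$politicalGroup[is.na(meps$GroupShort)])
meps
```

One caveat: the short label for non-attached members is the string "NA". If the lookup table is ever stored as CSV and read back with default settings, that string may be parsed as a missing value, so it is worth checking after import.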

It’s now time to combine all these data in tabular format to facilitate further use.

Does this adequately capture all the MEPs who are on Twitter?

| On Twitter | n |
|---|---|
| FALSE | 318 |
| TRUE | 432 |

Unfortunately, not quite: 318 MEPs have not listed a Twitter account on their official page on the European Parliament website, perhaps because they prefer to give more visibility to their Facebook account, or for other unknown reasons.
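A count like the one above can be obtained by checking whether a handle was found for each MEP. This is a minimal sketch on hypothetical rows; the column name `twitter` is an assumption about how the combined table is laid out.

```r
# Hypothetical combined table: one row per MEP, handle missing or empty
# when no Twitter link was found on the official page.
meps <- data.frame(
  fullName = c("MEP A", "MEP B", "MEP C", "MEP D"),
  twitter = c("handle_a", NA, "", "handle_d"),
  stringsAsFactors = FALSE
)

# An MEP counts as "on Twitter" only if the handle is neither NA nor empty.
on_twitter <- !is.na(meps$twitter) & nzchar(meps$twitter)

table(on_twitter)
```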

Where else would it be possible to find a complete list of MEPs on Twitter?

The EP’s Newshub collects information from social media accounts of MEPs and other EP-related figures, but does not seem to offer anything like an actual list of accounts.

The press service of the European Parliament conveniently offers a Twitter list of current MEPs.

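library("rtweet")

# Previously created Twitter API token, stored locally
twitter_token <- readRDS(file = "twitter_token.rds")

# Query the API only if there is no cached copy; otherwise reuse the cache
if (!fs::file_exists("mep_df.rds")) {
  mep_df <- rtweet::lists_members(slug = "meps-2019-2024",
                                  owner_user = "EuroParlPress",
                                  token = twitter_token)
  # Save under the same file name that is checked above and read below
  saveRDS(object = mep_df, file = "mep_df.rds")
} else {
  mep_df <- readRDS(file = "mep_df.rds")
}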

The list includes 527 MEPs, about a hundred of whom did not list their account on the official website of the EP. It seems likely, however, that this is not yet a complete list.

## Parsed with column specification:
## cols(
##   NAME = col_character(),
##   TWITTER_URL = col_character(),
##   SCREEN_NAME = col_character(),
##   NATIONALITY = col_character(),
##   GROUP = col_character()
## )

Eliflab maintains a nicely formatted list of MEPs on Twitter, which includes the Twitter accounts of 748 MEPs.

By matching these three sources, and after some manual checks, it is possible to reach a reasonably complete and updated coverage of MEPs who are on Twitter.
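The matching itself is not shown here, but one way to approach it is to normalise the handles from each source and record which sources mention each handle. The sketch below uses hypothetical handles and source names; handles found in only one source are natural candidates for the manual checks.

```r
# Reduce URLs, "@handle" and mixed-case variants to a comparable form.
normalise_handle <- function(x) {
  x <- trimws(x)
  x <- sub("^https?://(www\\.)?twitter\\.com/", "", x, ignore.case = TRUE)
  x <- sub("^@", "", x)
  x <- sub("[/?#].*$", "", x)
  x <- tolower(x)
  x[!nzchar(x)] <- NA_character_
  x
}

# Hypothetical handles from the three sources discussed above.
ep_site  <- normalise_handle(c("https://twitter.com/Handle_A", "@handle_b"))
ep_press <- normalise_handle(c("handle_b", "Handle_C"))
eliflab  <- normalise_handle(c("https://twitter.com/handle_a/", "handle_c",
                               "handle_d"))

# One row per distinct handle, with a flag for each source.
all_handles <- sort(unique(c(ep_site, ep_press, eliflab)))
coverage <- data.frame(
  handle = all_handles,
  ep_site = all_handles %in% ep_site,
  ep_press = all_handles %in% ep_press,
  eliflab = all_handles %in% eliflab,
  stringsAsFactors = FALSE
)
coverage$n_sources <- rowSums(coverage[, c("ep_site", "ep_press", "eliflab")])

# Handles confirmed by a single source only deserve a manual look.
to_check <- coverage$handle[coverage$n_sources == 1]
coverage
```

Matching on handles alone will miss accounts that were renamed between the dates the sources were compiled, which is one reason manual checks remain necessary.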

Creating a list

While it is possible to request tweets for each user individually, regular collection from a relatively large number of users, as in this case, is probably easier with a Twitter list that includes all relevant accounts: it is then enough to ask Twitter for new tweets in that list, rather than ask ~600 times, once for each individual MEP.

The Twitter list is now available at the following link: https://twitter.com/EdjQuoteFinder/lists/current-meps
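When a list is filled programmatically, members usually have to be added in batches rather than all at once. The sketch below only shows the batching step, on hypothetical handles; the batch size of 100 is an assumption about the API's per-request limit, and the actual call (for example via `rtweet::post_list()`) is left out because it requires a valid token and its exact arguments should be checked against the installed rtweet version.

```r
# Split a vector of handles into consecutive batches of at most `size`.
batch_handles <- function(handles, size = 100) {
  split(handles, ceiling(seq_along(handles) / size))
}

# Hypothetical handles standing in for the matched MEP accounts.
handles <- paste0("hypothetical_mep_", seq_len(250))
batches <- batch_handles(handles)

# Number of handles in each batch.
lengths(batches)

# Each element of `batches` would then be sent to Twitter in its own
# request, ideally with a short pause between requests.
```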

Get back some data from Twitter

The problem is that, as anyone who has been analysing Twitter for a while will know, Twitter users can change their handle. Many MEPs do so, for different reasons: some add “MEP” or “EU” at the end of their handle, some include another reference to their political affiliation, and some remove something that was fine during the campaign but does not sound serious enough once elected.

To facilitate consistent pairing of the MEP with a Twitter profile, it is useful to retrieve the unchangeable numeric user id that Twitter assigns to every user.
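As an illustration of why the numeric id helps, the sketch below refreshes stored handles by matching on `user_id`. Both data frames are hypothetical: `fresh` stands in for the result of a new lookup (for example from `rtweet::lookup_users()`), and the column names `user_id` and `screen_name` are assumptions about its layout.

```r
# Stored pairing of MEPs and Twitter accounts. Ids are kept as character,
# since Twitter ids are too large for R's integer type.
stored <- data.frame(
  fullName = c("MEP A", "MEP B"),
  user_id = c("1001", "1002"),
  screen_name = c("candidate_a", "handle_b"),
  stringsAsFactors = FALSE
)

# Hypothetical result of a fresh lookup: MEP A has changed handle.
fresh <- data.frame(
  user_id = c("1002", "1001"),
  screen_name = c("handle_b", "a_MEP"),
  stringsAsFactors = FALSE
)

# Pair rows by the stable id, not by the handle.
idx <- match(stored$user_id, fresh$user_id)

# Flag accounts whose handle differs from the stored one.
changed <- !is.na(idx) & stored$screen_name != fresh$screen_name[idx]

# Update handles for every account found in the fresh lookup.
found <- !is.na(idx)
stored$screen_name[found] <- fresh$screen_name[idx[found]]
stored
```

Accounts missing from the fresh lookup keep their stored handle, so deleted or suspended accounts can be reviewed separately.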