Barcelona Europeana Hackathon!!!

Over two days (June 8th and 9th), Jaume Nualart and I played around with the Europeana API (although we think "API" is too big a word for such a search engine). We were very lucky to be selected as two of the twenty-something participants in the Barcelona hackathon.

More than 19,000,000 records are waiting to be used in innovative ways that go beyond just searching and browsing. Most hackers at the event tried to visualize records, create mobile apps and other very nice stuff; we took a very different approach: we are interested in better understanding the quality of the metadata describing such records and the “aggregated” quality of the data providers. So we took the data set description file and tried to retrieve a small subset of records for each source (a sample of 384 records gives a margin of error of +/- 5% at a 95% confidence level). Then we analyzed the metadata present in those records in order to find out, for instance:

  1. What is the size of the original collections / providers?
  2. Which licenses are the most used in Europeana?
  3. What about the quality of the extended record metadata, i.e. including DC terms?
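
For the curious, the 384 figure mentioned above matches the usual sample size formula for estimating a proportion. The following is only a minimal sketch, not the script we ran at the hackathon: the function name and its parameters are made up for illustration, and it assumes a simple random sample, the worst-case proportion p = 0.5 and z = 1.96 for a 95% confidence level.

```python
def sample_size(margin=0.05, z=1.96, p=0.5, population=None):
    """Approximate sample size needed to estimate a proportion.

    margin     -- desired margin of error (0.05 means +/- 5%)
    z          -- z-score for the confidence level (1.96 for about 95%)
    p          -- assumed proportion; 0.5 is the most conservative choice
    population -- optional collection size, used for the finite
                  population correction
    """
    # Classic formula for a very large (effectively infinite) population.
    n0 = (z ** 2) * p * (1 - p) / (margin ** 2)
    if population is None:
        return n0
    # Finite population correction: small collections need fewer records.
    return n0 / (1 + (n0 - 1) / population)


if __name__ == "__main__":
    print(round(sample_size(), 2))
    print(round(sample_size(population=1000)))
```

For large collections the correction hardly changes anything, so the same sample size works for most sources; for a provider with only a thousand records or so, noticeably fewer records would be enough.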

Here is a link to our working demo. Some quick-and-dirty scripts for retrieving records and data sets can be found here.

Regarding our experience with the Europeana API, here are some thoughts we would like to share:

  1. A simple search engine is not a true API.
  2. There are known issues when searching by provider name or collection name (spaces, HTML special characters, and so on). Even after working around such issues, some searches just fail.
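
To give an idea of the kind of workaround we mean for spaces and HTML special characters, here is a small sketch. It is not the Europeana API itself: the base URL and the `query` parameter name are placeholders invented for the example, and the only thing it shows is decoding HTML entities and then URL-encoding a provider name with the Python standard library.

```python
from html import unescape
from urllib.parse import urlencode


def build_search_url(base_url, provider_name):
    """Build a search URL for a provider or collection name.

    base_url and the "query" parameter name are hypothetical
    placeholders, not the real Europeana endpoint.
    """
    # Names copied from a description file may contain HTML entities
    # such as "&amp;"; decode them to plain characters first.
    clean = unescape(provider_name).strip()
    # urlencode escapes spaces, ampersands and other reserved characters.
    return base_url + "?" + urlencode({"query": clean})


if __name__ == "__main__":
    print(build_search_url("https://example.org/search",
                           "Museu Picasso &amp; Friends"))
```

Even with this kind of cleaning, some of our searches still returned nothing, so take it as a partial fix at best.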

Anyway, it was **cking fun!!!

And thank you to the Museu Picasso for hosting the event in such a nice place, with good food (but no alcohol LOL) and helpful staff!!!

