Last Monday, I had the chance to participate in the workshop ‘Big Data, Better Data’, organised by the Culture Metrics research project. So, here are my ramblings inspired by it; for a precise review of the workshop and other related posts, check the project blog.
During this workshop, I presented my work on the spinning statuette, an Egyptian statuette recorded spinning on itself in its case in the Manchester Museum. The video went viral in summer 2013, and I am examining online and on-site reactions to the episode in a chapter of my thesis. Presenting this work in a Big Data workshop was an interesting experience, since in the past I had presented it at archaeology conferences, focusing on the influence of popular representations of archaeology emerging from social media comments (at TAG Bournemouth 2013) or on the use of the notion of ‘magic’ to explain the statue’s rotation (at TAG Manchester 2014).
This time, instead, I was supposed to discuss it as a sort of big data case study. I was totally surprised to be invited to this workshop, since I had never thought that I was actually dealing with big data. And indeed, this was one of the questions that emerged in the discussion: is my analysis really based on big data?
I don’t think so: the ‘quantity’ of data (around 15k tweets and 600 news articles) may seem somewhat overwhelming to me, but it would probably be considered quite limited by any IT person working in big data. However, this dataset appears comparable with those listed in the Trends Watch 2014 report (p.30), so the question of how big data should be to become ‘big data’ is still open. But that is probably the wrong question: to me, the methodologies and uses through which these data are gathered and analysed seem more relevant than quantity. Therefore, a second and more relevant (self-)criticism, which thankfully was not discussed in the workshop, is that I am still analysing these data with a qualitative approach, whereas big data would require different methods, and probably a much more advanced knowledge of maths and coding! I might have been the ‘mathematician’ of the group in most digs I participated in (guess who was always designated to draw all the plans and keep track of the Harris matrix?), but this doesn’t prevent a totally blank stare when someone starts speaking of logarithms and advanced coding!
Anyway, back to the spinning statuette: the main concern was probably that this is a one-off episode, whereas museums and cultural organisations usually need to work with other types of data, such as visitor numbers and demographic data on their audiences. In this sense, the example I discussed after Christmas, the data collected by Dolomiti Superski, is probably closer to what museum managers would be interested in. As I discussed in that post, it would indeed be interesting to see something like that applied to museums, besides the already existing examples of big data being used for visitor tracking (see here for a few examples).
But this raises another question: whereas this could be doable in cases where multiple museums are well connected and share common needs, audience analysis frameworks and funding concerns, how could it be applied in practice in a context where everyone is following different segmentations and recording different data with different priorities? I guess that, although funding bodies are pushing in the direction of big data, the definition of big data still needs a lot of translation and adaptation work before being applied in the cultural sector: what data? For what purposes? What use for the public, and what for the funding bodies? As often happens, especially within the UK funding system, buzzwords precede and impose the changes.
In the case of big data, this is further complicated by the fact that big data have so far been developed by web giants such as Google or the usual social networks, mostly for marketing purposes. Sure, they have sometimes also provided much information to government bodies or researchers (and I find their application in medical research especially inspiring), but their use is still discussed and debated mostly in terms of its economic implications. The problem with cultural organisations is that they operate on a different scale and are expected to support economic growth while offering different values and using a different language from that of big multinational companies. Therefore, the concept of big data is still in its infancy there, and the ‘Big Data, Better Data’ workshop was indeed an interesting experience that surely raised my interest in the potential and challenges of adopting big data within the sector.
A second issue that I found interesting is that, while big data originate online, in our everyday online activity (and in my paper I tried to highlight how many different types of data were left behind by those who participated in the discussion of this viral video), most museums do not seem to draw much on this type of data: as the Museum Analytics platform shows, likes and followers are still the main data collected for evaluating museums on social media. Despite a long series of papers and research investigating museum engagement via social media (see for example the various Museums & the Web conferences), these data do not seem to be valued as much as traditional visitor numbers and the like. So the application of big data involves a huge amount of preliminary work: first putting them high on the managers’ agenda, and then identifying what data could actually be collected and analysed and how they could be used. In conclusion, I somehow found myself agreeing again with Nathan Jurgenson’s essay on the New Inquiry.
To complicate things further, open data were also added to the discussion. This was interesting because it highlighted again the ambiguity of these concepts in the museum sector: shall we use big data and make them open, as research institutions could do, or shall we use big data for marketing purposes and to evaluate and compare ‘economic’ performances, and keep them ‘secret’? What about privacy? Or transparency? Do museums need to use big data to develop more stressful benchmarking processes, or to collaborate in advocating their relevance with funders? Indeed, while it could be said that open big data is the next ‘big thing’, each of these types of data implies different challenges. At the moment, I think that big data implies identifying data sets and appropriate research questions; open data implies finding long-term repositories, making the data accessible (i.e. easy for others to find and use), and accepting that others could see them.
But I’d better stop here with my ramblings, and I will hopefully discuss open data again whenever I manage to participate in an Open Data challenge such as this one.