Ellen Blogs Research: August 2013

My mum has appeared on this blog before. But not, until now, as the hero of a post. She works in the fundraising department of a reasonably large charitable organisation in the medium-sized town where I grew up. Earlier this year, she emailed and asked me whether I had access to a certain publisher’s website, as she couldn't get hold of an article she wanted to read.

My first reaction was pleasure that my repeated explanations of what I do for a living have actually sunk in. Then, snapping into web-native daughter mode, I rolled my eyes and clicked on the link she sent me, ready to tell her that she would probably have to pay to get at the article. As it turned out, the piece was published way back in 1998 by an author based at an organisation which does research, but isn’t a university. When I requested the PDF, I was taken to a page offering a range of options for access, all of which required an institutional login. No pay-per-view option at all.

So, I took to Twitter. What should she do? Several suggestions came back. How about DeepDyve? Good idea: I had a look. The journal’s not on there. What about walk-in access at the nearest university? Well, discounting the fact that it’s a good half-hour journey to get there plus the time she’d have to spend registering and finding the hard copy of the article, it wouldn't do much good; their holdings of the journal start in 1999. Email the author, said someone else. Maybe – we didn’t try. But a Google search reveals that he’s now left the organisation he was affiliated with and moved to the US. I wonder how easily he'll be able to find that PDF from 1998 – probably three or four computers ago? Of course there’s no repository at his old organisation – it’s not a university.

I had some ideas of my own. Mum’s charity is very loosely connected to the NHS. Perhaps that’s a way in? But – unsurprisingly – the NHS library services don't subscribe to niche journals on fundraising. This is probably good news for the taxpayer, but not for my mother. BL Document Supply? The journal’s not there.

This fairly mundane little anecdote is so interesting to me. We know, from previous studies, that the paywall can be a big, big barrier to researchers outside academia who are seeking to get hold of published information. (I should add that ‘Ask your daughter’ was not a survey response option on that particular piece of RIN research.) In this case, though, the real challenge is that there is no straightforward way for my mother to access that journal article legally. She could ask her employer to take out a multi-thousand-dollar subscription which includes back issues, but they don’t have a library or librarian to manage the process and have no need for any other content from the journal. The article doesn't appear in a Copac search, but she could scour university library websites in case she finds one which holds the 1998 edition of the journal that she wants, travel there, register as a walk-in user and hope that she’s allowed onto the computers to read it. Neither of these options is likely to be approved by her budget-conscious line manager.

I find myself wondering who this situation benefits. Not the publisher – they’re not making any money out of Mum and in fact are denying themselves the opportunity to make a small profit (not a big one – Mum is working for a charity after all). Not the author of the research, who would presumably be happy to know that it is available and being used. Certainly not my mother, and not the institution that she works for.

As I've said before, arguments for public access to research outputs are often built on a kind of moral foundation: the public have paid for it, so they should be entitled to see it. But there's another argument - maybe even a bigger one - about re-opening an enormous archive: articles which are currently shut off because they're not sufficiently in demand for publishers to invest in one-off access solutions. That archive might be especially useful to people working outside the fast-moving, detailed world of academia, although in some subjects the half-lives of articles would easily cover the fifteen years I'm discussing here. But in an age of 'impact', it seems sad to limit the reach and availability of articles simply because a publisher hasn't implemented a pay-per-view or rental option. (We can haggle about the price another time...)

I hate to leave a story unfinished, but I'm afraid I can't tell you whether Mum did eventually get her article. As I've said, she couldn't do it legally. Perhaps I emailed a friendly librarian to ask for a copy? Hmmm. You might very well think that; I couldn't possibly comment.

Today, I fulfilled a lifetime ambition by appearing in the Guardian. Well, OK, not lifetime. I've only been reading it for about seventeen years. And when I say 'appearing' what I obviously mean is having some research that I worked on alluded to, without any citation, quotation or link to the findings. But still... you take your victories where you find them, right?

Actually, it was a rather dispiriting experience. The journalist had picked up on one finding from our two-year project and used it as a hook for her piece on how universities are engaging with big data. The finding was one that I blogged about quite early in the project. At the start of that post, there is a big line in bold type which essentially says 'this finding is dodgy! Don't use it!'. We subsequently did some further analysis and came up with a more nuanced interpretation of the data which told a more ambiguous story.

Guess which one made it into the piece?

This being the Guardian, any Tom, Dick or Harriet can weigh in with his or her two penn'orth in the comments section. This makes for pretty fun and occasionally informative reading on some of the articles. But most comments on our work fell into one of two categories. First: 'well durr! how much time and money went into proving this extremely obvious finding?' and second: 'surely these idiot researchers can see that not using the library is a symptom of failure, not a cause?'.

This whole situation relates to some things I've been considering for a while about public access to research, one of the Government's big arguments in favour of open access. I know that people hold quite strong views about the public's ability to engage with academic outputs. I don't have any evidence on that to sway me either way. But this one experience highlights a few points that I'm not sure we really talk about enough when it comes to openness.

First: research is messy. Being open about this messiness is good, but it carries some heavy risks. Before we blogged the early, flawed but headline-grabbing finding, we had a long conversation about whether it was right to share it. We knew that because it was a hard number telling a positive story about libraries, people would pick it up and use it. I was afraid that the message about its flaws would get lost in re-tellings. But we decided that the project was about being open, and openness means showing your working. Unfortunately, I've been proven right. The later, better results are ignored, and so is the clear health warning on the early, messy ones, because the simple story is too compelling.

Second: just because we make something open doesn't mean people will actually read it. (Is this the publishing version of horses - water - drinking?) We fell at the first hurdle when the Guardian journalist neglected to link to our blog, which shows all the findings. But it's not that hard to find via Google (there it is, result number five). Instead, people simply engaged with the journalist's flawed and partial representation of our results. If they had read - even glanced at - the project blog, they would have seen that the finding about dropping out was one tiny part of a much bigger research project on supporting student library usage, which answers the 'what a waste of time' objection in the comments. And they would also have seen that almost every blog post about findings stresses that correlation is not causation; our findings are indicators to support interventions or areas for further research, not explanations for student outcomes. So they don't need to tell us that the relationship isn't causal - we know. But because people only want or have time to engage with the journalist's interpretation, they have a very incomplete understanding of the research.

Third: what is a researcher supposed to do when this kind of thing happens? I'm trying to clear up some errors in this post but, even on a good day, I can't claim that Ellen Blogs Research has the Guardian's reach. Should I go into the comments section and respond to the same misunderstanding each of the seventeen times it occurs? Should I contact the journalist with a hissy-fit email and demand right of reply in the well-read Corrections and Clarifications column?

Finally: some people are really stupid. What's that? Our findings about undergraduates must be nonsense because you finished your postgraduate degree and got a first without using the library once? Well, thank GOODNESS you were around to clear that one up for us! Our two years of statistical analysis completely fall apart in the face of your single anecdote.

Deep breath. I am aware that this isn't life-or-death stuff. Nobody is going to suffer because a few hundred Guardian readers go away with a misunderstanding about a fairly specialist research project on student library usage. But these are questions we need to consider as we begin to open up all scientific research for public access, because some of it will be life-or-death stuff. Let's consider Andrew Wakefield and the MMR nightmare. In this case, a person may have died because of irresponsible scientific reporting and the public's inability to engage with the messiness of science. People want a clear and simple story, and journalists are happy to provide it. And once that story was in the public domain, it proved extremely difficult to counteract, even among people who, by their own confession, ought to have known better.

Now, we might argue that open access could be a solution to these problems. We no longer have to rely on journalists to interpret the findings; we can go back to them ourselves and see what they actually say. But my experience today suggests that this overestimates the enthusiasm and ability of the general public (or, at least, that bit of it which reads and comments on the Guardian website). And, even if people did go back to the original research, would they understand the findings? I'm pretty sure the chap with the anecdata about his degree success wouldn't.

I believe in open access. I think it is a good thing that the general public should be able to see the results of scientific research. But I think we also need to acknowledge that making this complicated, messy, highly technical content open to people who don't have the expertise - or perhaps even the inclination - to explore it properly, is a risk. And that if we are serious about openness we need to do more to help people find, read, understand and critique the original research outputs. I don't know how we do this. But I'd certainly like to start trying to find out.

Monday 19 August 2013

How my mother almost became an international copyright criminal

Monday 5 August 2013

Is openness always best?