Tuesday, November 06, 2012

Scholarly metrics with a heart

Last week I attended the PLOS workshop on Article Level Metrics (ALM). As a disclaimer, I am part of the PLOS ALM advisory Technical Working Group (not sure why :). Alternative article level metrics refer to any set of indicators that might be used to judge the value of a scientific work (or researcher or institution, etc). As a simple example, how often an article is read, relative to the average, might correlate with scientific interest in or popularity of the work. There are many interesting questions around ALMs, starting with the simplest: do we need any metrics? The only clear observation is that more and more of the scientific process is being captured and measured online, so we should at least explore the uses of this information.

Do we need metrics? What are ALMs good for?

Like any researcher, I dislike the fact that I am often evaluated by the impact factor (IF) of the journals I publish in. When a position has hundreds of applicants it is not practical to read each candidate's research and evaluate it carefully. As a shortcut, the evaluators (wrongly) estimate the quality of a researcher's work from the IFs of the journals. I won't discuss the merit of this practice since even the journal Nature has spoken out against the value of IFs. So one of the driving forces behind the development of ALMs is this frustration with the current metrics of evaluation. If we cannot have a careful peer evaluation of our work then the hope is that we can at least have better metrics that reflect the value/interest/quality of our work. This is really an open research question and, as part of the ALM meeting, PLOS announced a PLOS ONE collection of research articles on ALMs. The collection includes a very useful introduction to ALMs by Jason Priem, Paul Groth and Dario Taraborelli.

Beyond the need for evaluation metrics, ALMs should also be more broadly useful for developing filtering tools. A few years ago I noticed that articles that were being bookmarked or mentioned in blog posts had an above average number of citations. This has now been studied in much more detail. Even if you are not persuaded by the value of quantitative metrics (number of mentions, PDF downloads, etc) you might be interested instead in referrals from trustworthy sources. ALMs might be useful here by tracking the identity of those reading, downloading or bookmarking an article. There are several researchers I follow on social media sites because they mention articles that I consistently find interesting. In relation to identity, I also learned at the meeting that the ORCID author ID initiative finally has a (somewhat buggy) website that you can use to claim an ID. ALMs might also be useful for filtering if they can be used, along with natural language processing methods, to improve automatic classification of an article's topic. This last point, on the importance of categorization, was brought up at the meeting by Jevin West, who had some very interesting ideas on the topic (e.g. clustering, automatic semantic labeling, tracking ideas over time). If the trend for the growth of mega-journals (PLOS ONE, Scientific Reports, etc) continues, we will need these filtering tools to find the content that matters to us.

Where are we now with ALMs?

In order to work with different metrics of impact we need to be able to measure them, and the measurements need to be made available. On the publishers' side, PLOS has led the way in making several metrics available through an API and there is some hope that other publishers will follow. Nature, for example, has recently made public a few of the same metrics for 20 of their journals although, as far as I know, they cannot be automatically queried. The availability of this information has allowed for research on the topic (see the PLOS ONE collection) and even the creation of several companies/non-profits that develop ALM products (Altmetrics, ImpactStory, Plum Analytics, among others). Other established players have also been in the news recently. For example, the reference management tool Mendeley has recently announced that they have reached 2 million users, whose actions can be tracked via their API, and Springer announced the acquisition of Mekentosj, the company behind the reference manager Papers. The interest surrounding ALMs is clearly on the rise as publishers, companies and funders try their best to gauge the usefulness of these metrics and position themselves to have an advantage in using them.

The main topics at the PLOS meeting

It was in this context that we got together in San Francisco last week. I enjoyed the meeting format, with a mix of loose topics but strict requirements for deliverables. It was worth attending even just for that and the people I met. After some introductions we got together in groups and quickly jotted down on post-its the sort of questions/problems we thought were worth discussing. The post-its were clustered on the walls by commonality and a set of broad problems was defined (see the list here).

Problems for discussion included:

  • how do we increase awareness of ALMs?
  • how do we prevent gaming (i.e. cheating to increase the metrics of my papers)?
  • what can be measured, and what is worth measuring?
  • how do we exchange metrics across providers/users (standards)?
  • how do we give context/meaning/story to the metrics?

We were then divided into parallel sessions where we further distilled these problems into more specific action lists and very concrete steps that can be taken right now.

Metrics with a heart

From my own subjective view of the meeting, it felt like we spent a considerable amount of time discussing how to give more meaning to the metrics. I think it was Ian Mulvany who wrote on the board in one of the sessions: "What does 14 mean?". The idea of context came up several times and from different viewpoints. We have some understanding of what a citation means and from our own experience we can make some sense of what 10 or 100 citations mean (for different fields etc). We lack a similar sense for any other metric. As far as I know, ImpactStory is the only one trying to give context to the metrics shown, by comparing the metrics of your papers with random sets of papers from the same year. Much more can be done along these same lines. We arrived at a similar discussion from the point of view of how we present ourselves as researchers to the rest of the world. Ethan Perlstein talked about how engaging his audience through social media and giving feedback on how his papers were being read and mentioned by others was enough to tell a story that increased interest in his work. The context and story (e.g. who is talking about my work) is more important than the number of views. We came back to the same sort of discussion when we talked about tracking and using the semantic meaning or identity/source of the metrics. For most use cases of ALMs we can think of, we would benefit from, or downright need, more context, and this is likely to drive the next developments and research in this area.
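To make the "What does 14 mean?" question a bit more concrete, here is a minimal sketch of the kind of comparison described above: ranking one paper's metric against a reference set of papers from the same year. The function name and the sample counts are made up for illustration, and this is not ImpactStory's actual method, just one simple way to turn a raw count into something with context.

```python
from bisect import bisect_right


def percentile_rank(value, reference):
    """Return the percentage of reference values that are <= value."""
    if not reference:
        raise ValueError("reference set must not be empty")
    ordered = sorted(reference)
    # bisect_right counts how many reference values are <= value
    return 100.0 * bisect_right(ordered, value) / len(ordered)


# Hypothetical bookmark counts for a random sample of papers from the same year
same_year_sample = [0, 1, 1, 2, 3, 5, 8, 14, 20, 40]

rank = percentile_rank(14, same_year_sample)
print(f"14 bookmarks is at or above {rank:.0f}% of the sample")
```

With a reference set like this, "14" stops being a bare number and becomes a position within a comparable group. A real comparison would also need to control for field and article age.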

The devil we don't know

Heather Piwowar asked me at some point if I had any reservations about ALMs. My main one is that, in particular from the point of view of evaluation (and to a lesser extent filtering), we might end up replacing a poor evaluation metric (journal impact factor) with an equally poor evaluation criterion: our capacity to project influence online. In this context it is interesting to follow some experiments that are being done in scientific crowdfunding. Ethan Perlstein has one running right now with a very catchy title: "Crowdfund my meth lab, yo". Success in crowdfunding should depend mostly on the capacity to project your influence or "brand" online. It is an exercise in personal marketing. Crowdfunding is an extreme scenario where researchers are trying to side-step the grant system and get funding directly from the wider public. However, I fear that evaluation by ALMs will tend to reward exactly the sort of skills that relate to online branding. This is not to say that personal marketing is not important already (this is why researchers network at conferences and get to know editors), but ALMs might reward personal (online) branding to an even higher degree.