Showing posts with label publishing.

Wednesday, July 23, 2025

Why do we still publish in scientific journals?

We publish in scientific journals to disclose our discoveries so that others can build upon them. But we now have preprint servers, which let us quickly make our discoveries available to others. So maybe we publish in scientific journals because we value the peer review they organize. However, we now also have journal-independent peer review systems, like Review Commons, which allow us to perform peer review on top of preprints in a way that does not require subsequent submission to a scientific journal. So why do we still publish in scientific journals?

Once in a while someone online complains about the cost of open access publication fees, the so-called article processing charge (APC). Looking at this simplistically, it does seem ridiculous that a journal might ask authors $5-10k USD to publish a paper when all the work is apparently done by the scientists who write and review the articles. Of course, the APC is a lot more complicated than this, and there is historical context and background knowledge needed to discuss it. In reality, a lot of the cost goes into sustaining the editorial salaries of journals with high rejection rates. I covered this in detail in a previous blog post discussing the costs from EMBO Press. In addition to the editorial salary costs of journals with high rejection rates, we also don't have a free market, since we don't pick journals based on price and service quality but on how publishing in certain journals will be perceived by others.

So, for many reasons, the major costs of scientific publishing do not come from the act of peer review or from making knowledge public. If I had to guess, the actual cost of publishing a peer-reviewed article with a near-0% rejection rate would be below $500 USD per paper if done in high volume. The main costs of publishing are those linked to the system of filtering scientific publications into tiers of perceived "impact". For a long time it was nearly impossible to evolve scientific publishing, and I have argued for almost 20 years that we needed to split the publishing process into modular bits that would allow for much more innovation. With the rise of preprints, social media and dedicated peer-review services, I think we could now work towards getting rid of scientific journals. Or at least, we now have a clear direction of focus on what is missing in this potential alternative system - a new reward infrastructure.

The reward infrastructure in science

So why do we still publish in scientific journals? The reality is that people still want to chase high-impact journals. Pretending that we don't is not going to change anything. Despite having tenure and secure funding for my group, I feel that I cannot stop trying to publish in some journals because of what it means for the careers of my lab members; for how my peers perceive and evaluate our work; and for establishing new collaborations and applying for additional funding. So how are we going to change this, and what could the consequences be?

Unfortunately, there is no incentive for any single individual to change the reward system. At least as of now, this would require a large number of labs within a sub-field to jointly commit to a change in practice, perhaps assisted by some external entity. We could assume that social media, conferences and recommendation engines (e.g. Google Scholar) are enough to spread knowledge, and that within a specific sub-field it is possible to evaluate each other without the need for journal proxies. I am not sure this is really true, but if we accepted it, then a number of labs in a field could commit to no longer publishing in scientific journals. This could be assisted by, at the same time, creating an overlay journal for their field, where academic editors would select the subset of peer-reviewed preprints that represent particularly strong advances in the field.

Unfortunately, this idea is unlikely to work because it relies on collective action by a majority of groups within a field. I don't have better ideas, but this is for me the last remaining barrier. We still need to work out how we would pay for the peer-review service, but what is needed now are ideas that would help change the reward system in a way that does not require collective action.

What could go wrong if it happened

Despite all that we complain about in our current system of tiered journals, they do aim to improve science. They might not work as intended, but they aim to filter science by accuracy and perceived value to others. If we managed to get rid of them, we could have an even worse problem with the sheer number and quality of scientific outputs. As an almost anecdotal piece of evidence, our group has become a lot worse at working through the revisions of our papers in a timely fashion. If our manuscripts were not out as preprints, I think we would be in much more of a hurry to do the revisions.

The other important caveat is that time and attention are always limiting. There will always be a need to filter and evaluate science by proxies. If we didn't have science journals, we might be complaining about how attention on social media is being used as a bad proxy for the value of research.

I am truly curious to know how scientific interactions would change without scientific journals. Would people still want to apply to our group, collaborate on projects, or invite us to conferences if our outputs were essentially peer-reviewed preprints? For my lab members who might read this - don't worry, this is not a declaration of intent.

Wednesday, February 02, 2022

A closer look at the costs of EMBO publishing

There has been a lot of discussion on social media about the prices that some publishers are setting for publishing a paper in their journals - the so-called article processing charges (APCs) - with some journals asking for values on the order of $10k and many scientists finding these values outrageous. Given that journals don't do the work of producing the research articles and get academics to do the evaluation, how can they claim that the cost of publishing a paper is anywhere close to $10k? While I agree that these are outrageous values, I don't really believe that the price is mostly profit. A good source of information on the costs associated with running a publisher are the figures that have been disclosed by EMBO Press. Before we go into these I need to disclose that I serve on the Publications Advisory Board of EMBO Press. I don't receive anything from EMBO and this is merely an advisory committee, but it has given me some insight into what is a very real attempt by a non-profit publisher to come up with a low APC, and into what they might have to compromise in their current set-up to achieve it.

With that out of the way, let's just look at the most recent numbers that EMBO has disclosed, which were for 2019 (see here). EMBO has (or had in 2019) 17 professional scientific editors and 6 support staff, who handled a total of 5,766 submissions in 2019. That is on the order of 28 submissions handled per month per editor, or about 1.3 per working day. I don't know about you, but making a call on one paper per day plus finding and chasing reviewers is not easy if you try to do it properly, even if some rejections can be made fairly quickly. Of these submissions they ended up publishing 472 (8%). This part is not totally transparent; for example, maybe some of the submissions included the reviews and news & views articles that were ultimately also published. If that is the case, then the total number published would be 681 (12%). It is also not totally clear whether the submissions include revision submissions. Regardless, this shows that EMBO publishing as a whole ends up with acceptance rates that are quite low (10-20%). I should stress that I truly don't know the actual number. As we will see, this rejection rate is really key to the high estimated cost per paper.

The costs that they have disclosed include ~2.5 million euro for the EMBO Press office, of which around 2 million is listed as salaries and benefits. The number of staff is there as well, so you can guesstimate the average salary for the 23 staff, and you can also look up EMBO editor salaries on Glassdoor to get an idea. I truly don't know what the salary is, but I guess on average it could be on the order of 4-6k net per month. The other costs include 1,723,639 euro that EMBO Press pays to Wiley, which does the actual publishing. The majority of this cost is listed as "Wiley publishing services (incl. production, sales and marketing)" (1,281,552 euro). This is certainly a place where costs are not very transparent, at least to me, and where profit to Wiley is included, likely with a decent margin. I certainly don't know enough about finances to figure this out, but Wiley is said to have an operating profit margin of around 30%; for the purposes of some later calculations, let's assume that maybe 50% of these costs are profit that could be magically removed (e.g. if EMBO set up its own publishing infrastructure). Finally, EMBO also lists 1,342,374 euro in "surplus", which is re-invested into publishing-related activities like the EMBO SourceData project and other pilots trying to innovate on the publishing side, and which flows back to EMBO itself to further support EMBO programme activities (fellowships, etc.).

With these numbers, the total cost includes the 4,225,920 euro of actual costs and the 1,342,374 euro for EMBO activities (5,568,294 euro in total). So if you don't take anything out of this, you would need a price of 11,797 euro for each of the 472 papers published in 2019 to finance it. If you exclude the EMBO surplus, that would be 8,953 per paper, and excluding 50% of the Wiley costs it would get down to 7,127 per paper. Even without anything going to Wiley you would only get to 5,301 per paper. Of course, you can also argue that the salary costs could be lower, but what can't really be argued is that academic editors can do this for "free", since their time is most likely even more expensive and less efficiently used.

So the 10k APC number certainly contains parts that can be reduced, but we are not talking about a 1k per-paper cost. For that you would need to change the rejection rates, and this is what really starts to matter in the end. If you went to something like a 50% acceptance rate, which in this case could correspond to something like 2,000 papers published, then the APC could be somewhere on the order of 1,500-2,500 euro. Keep in mind also that submission numbers would tend to decrease over time if impact factors go down with higher acceptance rates (yes, some people still care about those). Of course, this scales across multiple journals, and this is where the big publishers take advantage: the overall acceptance rate across a large portfolio of journals is much higher than 10%, and high-acceptance-rate journals (e.g. Scientific Reports) can cross-subsidise low-acceptance-rate journals (e.g. Nature).
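For anyone who wants to check the arithmetic, here is a minimal sketch that reproduces the back-of-the-envelope numbers above. The 2019 figures are the ones quoted in this post; the working-days assumption, the 50%-of-Wiley scenario and the 2,000-papers scenario are my own assumptions, not anything EMBO has proposed.

```python
# Back-of-the-envelope APC estimates from the 2019 EMBO Press figures quoted above.
# The scenarios at the end are my own assumptions, not EMBO proposals.

submissions = 5766        # submissions handled in 2019
editors = 17              # professional scientific editors
papers_published = 472    # research papers published (~8% of submissions)

wiley = 1_723_639         # paid to Wiley for publishing services
surplus = 1_342_374       # surplus re-invested in EMBO activities
actual_cost = 4_225_920   # Press office (~2.5M, mostly salaries) plus Wiley costs
total = actual_cost + surplus   # 5,568,294 euro

print(f"submissions per editor per month: {submissions / editors / 12:.1f}")        # ~28
print(f"per editor per working day:       {submissions / editors / 12 / 21:.1f}")   # ~1.3, assuming ~21 working days/month
print(f"APC covering everything:          {total / papers_published:,.0f}")         # ~11,800
print(f"APC excluding the surplus:        {actual_cost / papers_published:,.0f}")   # ~8,950
print(f"APC excluding 50% of Wiley:       {(actual_cost - 0.5 * wiley) / papers_published:,.0f}")  # ~7,100
print(f"APC excluding all of Wiley:       {(actual_cost - wiley) / papers_published:,.0f}")        # ~5,300

# Hypothetical scenario: same cost base but ~50% acceptance (~2,000 papers published).
print(f"APC at ~2,000 papers published:   {actual_cost / 2000:,.0f}")               # ~2,100
```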

It is important to keep in mind that all of these prices per paper have been there for decades, but they were paid via journal subscription charges instead of APCs, and therefore they were not transparent and people were not really paying attention. In the end, the discussion for me is not really about the 30% savings we could have by pushing the publishers to lower their prices, but more about how we go about doing the filtering (i.e. target audience) and the subjective evaluation of value to science (i.e. impact). Revolutions are not real solutions in academic publishing. If you propose a solution that requires a majority of people to change their habits in the span of 3 years, it is dead on arrival.


Thursday, June 10, 2021

A not so bold proposal for the future of scientific publishing

Around 15 years ago I wrote a blog post about how we could open up more of the scientific process. The particular emphasis I had in mind was to increase the modularity of the process in order to make it easier to change parts of it without needing a revolution. The idea was that manuscripts would be posted to preprint servers, where they could accumulate comments and be revised until they were considered suitable for accreditation as a peer-reviewed publication. At the time I also thought we could be even more extreme and have all lab notebooks open to anyone, which I no longer consider to be necessarily useful.

Around 15 years have passed, and while I was on point with the direction of travel, I was very off the mark in terms of how long it would take us to get there. Quite a lot has happened in the last 15 years, the biggest changes being the rise of open access, preprint servers and social media. PLoS One started as a journal that wanted us to do post-publication peer review. It started with peer review focused on accuracy, wanting then to leverage the magic of Web 2.0 to rank articles by how important they were, through likes and active commenting by other scientists. The post-publication peer review aspect was a total failure, but the journal was an economic success that led to the great PLoS One Clone Wars, with consequences that are still being felt today - just go and see how many new journals your favourite publisher opened this year.

The rise of preprint servers has been the real magic for me. We live in each other's scientific past by at least 2 years or so. If you sit down and have a science chat with me, I can tell you about all of the work we are doing that won't be public for some 2 years. If I didn't put our group's papers out as preprints, you would be waiting at least 6-12 months to know about them. Preprint servers are a time machine: they move everyone forward in time by 12 months and speed up the exchange of ideas as they are being generated around the globe. If you don't post your manuscripts as preprints, you are letting others live in the past and you are missing out on increased visibility for your own research.

Preprint servers also serve the crucial need to dissociate the act of making a manuscript public from the processes of peer review, certification as a peer-reviewed paper, and dissemination. This matters because it allows the whole scientific publishing system to innovate, and innovation is needed because we waste too much money and time on a system that currently does not serve authors or readers efficiently.

So, after nearly 15 years, the updated version of the proposal is almost unchanged:

I no longer think it would be that useful to have lab notebooks freely available for anyone to read. There are parts of research that are too unclear, and I suspect that the noise-to-information ratio would be too high for this to be of value. However, useful datasets that are not yet published could be more readily made available prior to publication. Along these lines, ideas in the form of funded grant proposals should be disclosed after the funding period has lapsed. As for the flow from manuscript to publication, the main ideas remain, and the systems already exist to make these more than just ideas. There are already independent peer review systems like Review Commons. Such systems could eventually be paid and could lead to the establishment of professional paid peer reviewers. Such costs would then be deducted from other publishing costs, depending on how the accreditation was done. Eventually, "traditional" publishing could be replaced by overlay journals, like preLights, whose job would be to identify peer-reviewed preprints that are of interest to a certain community.

Social media has for me been the most surprising change in scientific communication. I didn't expect so many scientists to join online discussions via social media. Then again, I didn't foresee the geekification of society. In many ways social media is already acting as a "publishing" system in the sense of distribution. Most of the articles I read today I find through Twitter or Google Scholar recommendations. As we are all limited by the attention we can give, I think one day, instead of complaining about how impact factors distort hiring decisions, we will be complaining about how social media biases distort what we think is high-value science.

So finally, what can you do to move things along if you feel this is important? If you think we have too many wasteful rounds of peer review across different journals, that the cost of open access publishing is too high, or simply that publicly funded research should be free to read and openly available to mine, then the single best thing you can do today is make your manuscripts available via preprint servers.

 

Thursday, May 30, 2019

Plan S, the cost of publishing, diversity in publishing and the unbundling of services


A few days ago I had another conversation about Plan S with someone involved in a non-profit scientific publisher. I am still sometimes surprised at how much these publishers have been merely reacting to the changes in the landscape. In hindsight I can understand it: the flipping of the revenue model to author fees has been threatened for a long time but always seemed to be moving along slowly. Without going into Plan S at all, the issue for many of the smaller publishers is that they simply cannot survive under an author-fee model, because their current subscription revenue would translate into an unacceptable cost per article (given that they reject most articles). These smaller publishers typically use their profit to fund community activities (e.g. EMBO Press). The big publishers will do just fine because they have a structure that captures most articles in *some* journal, so their average cost per article would end up being acceptable in a world without subscriptions.

I don't want to go into the specifics of Plan S at all, but I see clearly the perspective of the funders and wider society in wanting open access and even reduced costs of publishing. The publishers have been given quite a lot of time to adapt, and maybe some amount of disruption is now needed. One potential outcome of fully flipping the payment model might be that we simply lose the smaller publishers, and consequently also their community activities, if they can't find alternative ways to fund them. There are enough journals in scientific publishing that, to be honest, I think the disruption will not be large.

Fewer publishers means less innovation in publishing


What I fear we will lose with a reduction in the number of publishers is the potential to generate new ideas in scientific publishing. Publishers like EMBO Press, eLife and others have been a great engine for positive change. Examples include more transparent peer review, protection from scooping, cross-commenting among peer reviewers, checks on image manipulation, and surfacing the data underlying the figures (see SourceData). While this innovation tends to spread across all publishers, it is not rewarded by the market. Scientific publishing does not work within a well-functioning economic market. We submit to the journals that have the highest perceived "impact", and such perceived impact is then self-sustaining. It would take an extraordinary amount of innovation to disrupt the leaders in the market. For me, this is a core problem of publishing: the market is not sensitive to innovation.

To resolve this problem we would have to continue the work of reducing the evaluation of scientists by the journals they publish in. Ideas around altmetrics have not really moved the needle much. Without any data to support this, my intuition is that the culture has changed somewhat because people are discussing the issue, but the change is very slow. I still feel that working on article recommendation engines would be a key part of reducing the "power" of journal brands (see previous post). Surprisingly, preprints and Twitter are already working for me in terms of getting reasonable recommendations, but peer review is still a critically important aspect of science.

Potential solutions for small publishers


Going back to the small publishers, one thing that has been on my mind is how they can survive the coming change in revenue model. Several years ago I think the recommendation would have been to just grow and find a way to capture more articles across a scale of perceived impact (previous post). However, there might not be space for more PLOS One clones. An alternative to growing in scale would be to merge with other like-minded publishers. This is probably not achievable in practice, but some cooperation is being tested, for example in the Life Science Alliance journal. Another thought I had was to try to get the market to appreciate the costs around some of the added value of publishing. This is essentially the often-discussed idea of unbundling the services provided by publishers (the Ryanair model?).

Maybe the most concrete example of unbundling a valuable service would be the checks on unethical behaviour such as image manipulation or plagiarism. These checks are extremely valuable, but right now their costs are not really considered part of the cost of publishing. Publishers could consider developing a package of such checks, which they already use internally, as a service that could be sold to institutions that would like to have their outgoing publications checked. Going forward, some journals could start demanding certification of ethical checks, or funding agencies could demand that such checks be made on articles resulting from their funded research. Other services could be considered for unbundling in the same way (e.g. peer review), but these checks on unethical practices seem the most promising.

(disclosures: I currently serve on the editorial board of Life Science Alliance and the Publications Advisory Board for EMBO Press)

Monday, December 14, 2015

Replace journals with recommendation engines

There was another round of interesting discussions on Twitter after Mike Eisen decided to scrub all journal titles from his lab's publication list. Part of the discussion was summarized in this Nature news story. The general idea is that our science should not be evaluated by where it is published but should stand on its own merit. We all want this to happen, but unfortunately we don't have infinite time to read papers. Megajournal and open access advocates often dismiss this problem. They will often say that journal rankings are not adequate filters and that we should be able to form our own opinions and search for whatever we want to read. This is the line of argument that just drives me crazy. It basically implies that any defence of journal rankings is an admission of an inability to evaluate science. The biomedical scientific community is producing over 100 thousand articles every month. Any suggestion that we don't need some sort of filtering mechanism is in turn an admission that you are not aware of the extent of science that is being produced. If you are not scanning tables of contents yourself, you are being fed suggestions by someone who does.

Imagine a world without any science journals, just a single pot where all articles are deposited. I think that the spread of knowledge would slow down. I can barely keep track of advances and authors that are closely related to my work by using keyword searches. I would not think one day to just search for "clustered regularly interspaced short palindromic repeats" or to have a curious look into advances in cryoEM. In the absence of good filters we would risk becoming even more isolated in our small little corners of science and miss out on cross-fertilization. We would tend to focus even more on the science of the few labs that we knew from past work or from personal contact. I would not know where to look for important new discoveries in other fields that could impact my own. The current system of journals serves this role of trying to assign a piece of science to a target audience. If nothing else, journals can filter through the self-selection of topics at submission for specific communities. Less specific journals try to promote the advances in science that should reach a broader audience. I think that we are not even aware of how much the current system of journals facilitates the exchange of information within and across fields. In my opinion, the best way and probably the only way to get rid of the current system is to replace it with something that can do the equivalent job.

One way to replace the current system with something less frustrating would be to use automated recommendation engines. I have tried Google Scholar recommendations and PubChase and both work really well. If we want to get rid of journals, we need to figure out a way for such automated systems to mimic the journals' transfer of knowledge within and across communities. I can easily imagine the steps needed to come up with article similarity metrics, clustering of users and so on. One can also easily imagine that the recommendation engines could react to user feedback such that a niche community would "bump up" - for example through click-through counts - the perceived value of a piece of science to such an extent that it gets recommended to a wider community. This would require a hierarchical recommendation engine that is widely used. The biggest advantage of such a system would be that it can work post-publication, on top of megajournals. Scientists could stop focusing their energy on submitting to journal X and just focus on producing good science that would spread widely. I am convinced that the fastest way to get to a world without journals is to come up with this replacement. If we really want to get rid of impact factors and journal rankings, we need to start talking about what we will do instead.
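To make this a bit more concrete, here is a toy sketch of what such a recommendation engine could look like at its simplest: content-based similarity between article abstracts, with popular articles "bumped up" by click-through counts. The papers, click counts and scoring rule are all invented for illustration; this is not a description of how Google Scholar or PubChase actually work.

```python
# A toy content-based recommender: TF-IDF similarity between abstracts,
# with frequently clicked articles bumped up in the ranking.
# Purely illustrative -- all data and weights are made up.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

abstracts = {
    "paper_A": "kinase signalling phosphorylation networks in yeast",
    "paper_B": "phosphoproteomics reveals kinase substrate interactions",
    "paper_C": "cryoEM structure of a membrane transporter",
    "paper_D": "genome engineering with CRISPR-Cas9 nucleases",
}
clicks = {"paper_A": 3, "paper_B": 40, "paper_C": 5, "paper_D": 250}  # hypothetical click-through counts

def recommend(read_ids, n=2, popularity_weight=0.2):
    """Rank unread articles by similarity to what the user has read, plus a popularity bump."""
    ids = list(abstracts)
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(abstracts.values())
    sim = cosine_similarity(tfidf)                 # article-by-article similarity matrix
    read_idx = [ids.index(i) for i in read_ids]
    log_clicks = np.log1p([clicks[i] for i in ids])
    popularity = log_clicks / log_clicks.max()     # normalised 0..1 "bump" from click-throughs
    scores = sim[read_idx].mean(axis=0) + popularity_weight * popularity
    ranked = [i for i in np.argsort(-scores) if ids[i] not in read_ids]
    return [ids[i] for i in ranked[:n]]

print(recommend(["paper_A"]))  # ['paper_B', 'paper_D']: the similar kinase paper first, then the popular CRISPR paper
```

A real system would obviously need user clustering and some hierarchy of communities so that a paper valued by a niche audience can surface to a wider one, but the scoring idea is the same.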

One thing we won't be able to change - we don't have enough time to read all of the science in the world. Unfortunately, we don't even have enough time to read all of the articles of job applicants. It is not hard to predict that whatever solution replaces journal rankings will too often be used to make hiring decisions.



Thursday, October 16, 2014

Science publishers' pyramid structure and lock-in strategies

It is not recent news that AAAS will start a digital open access multidisciplinary journal. It is called Science Advances and it will be the 4th journal in the AAAS family. As I have described in the past, this is part of a trend in science publishing to cover a wider range of perceived impact. Publishers are aiming to have journals that are highly selective and drive brand awareness, but also journals that can publish almost any article that passes a fairly low bar of being scientifically sound. This trend was spurred by the success of PLOS One, which showed that it is financially viable to have large open access journals. Financially, the best strategy today would be to have some highly selective journals with a subscription model and then a set of less selective journals that operate with an author-paying model.

The Nature Publishing Group has implemented this strategy very well. They increased their portfolio of open access journals with the addition of Nature Communications, Scientific Data, Scientific Reports, the partner journals initiative and the investment/partnership with the Frontiers journals. NPG has also expanded its set of subscription-based, Nature-branded research and review journals.

This combined approach is not just financially interesting; it also protects the publishers from the future imposition of immediate open access via a mandate from funding agencies. Publishers that have such a structure in place would be able to survive the mandate, while others that only have subscription-based journals would struggle to adjust. This has the useful side-effect of actually speeding up the transition, since the bulk of papers will increasingly be published in the larger, less restrictive open access journals. If most research is open access, there will be less justification for subscription journals. This will be true even for the most well-cited papers, as reported recently by Google Scholar.

So, many publishers are trying to build this pyramid-type structure. Even AAAS is doing it, albeit (apparently) very reluctantly. One consequence of these changes is that there will be an abundance of large and permissive open access journals. Therefore, we will increasingly need better article filters, such as PubChase, as I previously discussed. These mega-journals will compete on price and features, but the publishers as a whole will increasingly try to lock authors into the structure. Any submission to the pyramid should be publishable in *some* journal of the publisher. If I were working in one of these publishing houses, I would be thinking of ways to use brand power to attract submissions while adding such lock-in mechanisms.

Currently practiced lock-in strategies

The best-known lock-in strategy is the transfer of referee reports between journals of the same publishing house. This is a common occurrence that I have experienced before. An editor at Cell might suggest that authors transfer their rejected article and reviewer comments to Molecular Cell or Cell Reports. Nature research journals might suggest Nature Communications or Scientific Reports. Science might suggest Science Signaling or Science Advances. This can be very tempting since it can shorten the time to get the paper accepted. The usefulness of this mechanism is going to work against the idea of having peer review outsourced to an independent accredited entity.

Cell Press has an interesting mechanism that allows for co-submission to two journals at the same time (Guide to authors - Cosubmission). I have never tried this, but apparently one can select two journals and the article will be evaluated jointly by both editorial teams. This looks more relevant for articles that fall between two different topics covered by different journals. It is still an interesting way to improve the chances that a given article will find a home within this publisher.

Another, more subtle approach might be to issue a topic-specific call for articles. Back in February, Nature Genetics had an editorial with a call for data analysis articles. Note that the articles will not necessarily be published in Nature Genetics; the editorial explicitly mentions Nature Communications, Scientific Data and Scientific Reports. This allows NPG to use the Nature brand power to issue a call and then spread the resulting submissions along their pyramid structure according to input from reviewers and editors. The co-publication of a large number of articles on the same topic also almost guarantees a marketing punch.

Other ideas

I am curious to see what other ideas arise, and please share other similar mechanisms that you might know of. One potential additional idea, similar to the co-submission, would be to have a chained submission: at submission, the publisher could already ask for an ordered list of journal preferences. We might also start to see publishers requesting comments from reviewers with a group of journals in mind from the start.
Obviously, an alternative that would place less of a burden on reviewers and editors would be a mechanism similar to what the Frontiers journals have been promoting. Articles could be initially published in a large PLOS One-like journal and then increase in awareness depending on article-level metrics. This approach is probably going to take a much longer time to spread widely.

  



Friday, November 01, 2013

Introducing BMC MicroPub – fast, granular and revolutionary

(Caution: Satire ahead)

I am happy to be able to share some exciting science publishing news with you. As you know, in the past few years there has been tremendous progress in open access publishing. The author-paying model has been shown to be viable in large part thanks to the pioneering efforts of BMC and PLOS. In particular, PLOS One has been an incredible scientific and business success story that many others are trying to copy. Although these efforts are a great step forward, they don't do enough to set all of scientific knowledge free in a timely fashion. Sure, you can publish almost anything today, such as metadata, datasets, negative results and the occasional scientific advance, but the publishing process still takes too much time. In addition, we are forced to build a story around the bits and pieces in some laborious effort to communicate our findings in a coherent fashion. Many of us feel that this publishing format is outdated and does not fit our modern, quick-paced internet age. What I am sharing with you today is going to change that.

Introducing BMC MicroPub
In coordination with BMC, we are going to launch soon the pilot phase of a new online-only publishing platform. It was thought out from the ground up to allow for the immediate publishing of granular scientific information. Peer review happens after online publication of the content, and evaluation is not going to be based on trivial and outdated notions of scientific impact. Best of all, it is tightly integrated with the social tools we already use today. In fact, authors are asked to register with the system using their Twitter account and to link it to an ORCID author ID. From then on, their Twitter feed is parsed and any tweet containing the #micropub tag will be considered a submission. Authors are themselves reviewers, and any submission that gets re-tweeted by at least 3 other MicroPub-registered scientists is considered "peer-reviewed" and a DOI is issued for that content. An author can create a longer communication by replying to a previous #micropub tweet and in this way create a chain that the journal can track and group into MicroPub stacks (TM). What the team involved here has done is nothing short of amazing. We are taking the same platform we use to share cute pictures of cats and revolutionizing scientific publishing. To start using the journal, authors pay a one-time registration fee followed by a modest publication charge for each piece of published content. However, the journal is waiving all charges for the first 100 authors and the first 100 publications. We hit a snag in discussions with PubMed, but with your support we will be tracked by them starting next year.

Pioneering granularity
The project started a few months ago after a first attempt I covered in a previous blog post. Right now we also have an exciting experiment in granular sharing of genome content underway. You can follow the tweets of @GenomeTweet to get an idea of the future of this brave new world. The current front page of the journal gives you an indication of some of the cool science being published by early adopters. The site is currently only available to beta-testers, so here is a screenshot of the current version:

I sat down with the open access advocate Dr Mark Izen from UC Merced to discuss the new journal.

Dear Mark, given your enthusiasm for open access, what do you think of this initiative?
I think that experimentation in scientific publishing is fantastic. Any attempt to promote open access and get rid of the current established closed access, impact factor driven system is a great thing. One concern I have is that, although the content is published under a CC0 license, the publishing process is currently reliant on Twitter which is a closed proprietary technology. We should really ask ourselves if this goes all the way in terms of open science.

Some would say that they don't have time to read science at this level of granularity, so devoid of any significant advances. In the words of an anonymous tenured prof: "You must be joking, right!?". What would you say to these naysayers?
To be blunt, I think they lack vision. Ultimately we owe it to all taxpayers to be as transparent as possible about our publicly funded science. Now we can do that, 140 characters at a time. Moreover, the possibility of driving science forward by making information available as quickly as possible is amazing. Scientists are already using Twitter to share information; we are just going one step further and starting to share science as it happens. You could get corrections on your protocol before your experiments have finished running! If you blink your eyes you may literally lose the next big discovery.

So you are not concerned that this increasing level of granularity, small-bite research, is going to drown everyone in noise and actually slow down the spread of information in science?
Absolutely not. It is a filter failure, and we are sure that someone is bound to come up with a solution. In the future, all sorts of different signals will allow us to filter through all this content and weed out the most interesting parts for you. You will be able to get to your computer and receive, in your email or on some website, just the information that an algorithm thinks you should read. I am sure it is doable; it is a question of setting it up.


Disclaimer: If you have not noticed by now, this is a fictional post meant to entertain. 

Saturday, October 19, 2013

Scientific Data - ultimate salami slicing publishing

Last week a new NPG journal called Scientific Data started accepting submissions. Although I have discussed this new journal with colleagues a few times, I realized that I never argued here why I think this is a very strange idea for a journal. So what is Scientific Data? In short, it is a journal that publishes metadata for a dataset along with data quality metrics. From the homepage:
Scientific Data is a new open-access, online-only publication for descriptions of scientifically valuable datasets. It introduces a new type of content called the Data Descriptor designed to make your data more discoverable, interpretable and reusable.
So what does that mean? Is this a journal for large-scale data analysis? For the description of methods? Not exactly. Reading the guide to authors, we can see that an article "should not contain tests of new scientific hypotheses, extensive analyses aimed at providing new scientific insights, or descriptions of fundamentally new scientific methods". So instead one assumes that this journal is some sort of database where articles are descriptors of the data content and data quality. The added value of the journal would then be to store the data and provide fancy ways to allow for re-analysis. That is also not the case, since the data is meant to be "stored in one or more public, community recognized repositories". Importantly, these publications are not meant to replace, and do not preclude, future research articles that make use of these data. Here is an example of what these articles would look like. This example more likely represents what the journal hopes to receive as submissions, so let's see how this shapes up in a year when people try to test the limits of this novel publication type.

In summary, articles published by this journal are mere descriptions of data with data quality metrics. This is the same information that any publication should already have, except that Scientific Data articles are devoid of any insight or interpretation of the data. One argument in favor of this journal would be that it is a step towards micro-publication and micro-attribution in science. Once the dataset is published, anyone, not just the producers of the data, can make use of this information. A more cynical view would be that NPG wants to squeeze as much money as it can from scientists (and funding agencies) by promoting salami-slicing publishing.

Why should we pay $1000 for a service that does not even handle data storage? That money is much better spent supporting data infrastructures (disclaimer: I work at EMBL-EBI). There is no added value from this journal that is not, or cannot be, provided by data repository infrastructures. Yet this journal is probably going to be a reasonable success, since authors can essentially publish their research twice for an added $1000. In fact, anyone doing a large-scale, data-driven project can these days publish something like 4 different papers: the metadata, the main research article, the database article and the stand-alone analysis tool that does 2% better than the others. I am not opposed to a more granular approach to scientific publication, but we should make sure we don't waste money in the process. Right now I don't see any incentives to limit this waste, nor any real progress in updating the way we filter and consume this more granular scientific content.


Saturday, June 08, 2013

Doing away with scientific journals

I got into a bit of an argument with Björn Brembs on Twitter last week because of a statement I made in support of professional editors. I was mostly saying that professional editors were no worse than academic editors, but our discussion went mostly into the general usefulness of scientific journals. Björn was arguing his position that journal rankings, in the form of the well-known impact factor, are absolutely useless. I was trying to argue that (unfortunately) we still need journals to act as filters. Having a discussion on Twitter is painful, so I am giving my arguments some space in this blog post.

Björn's arguments are based on this recently published review regarding the value of journal ranking (see paper and his blog post). The one-line summary would be:
"Journal rank (as measured by impact factor, IF) is so weakly correlated with the available metrics for utility/quality/impact that it is practically useless as an evaluation signal (even if some of these measures become statistically significant)."
I have covered some of my arguments regarding the need for journals as filters before, here and here. In essence, I think we need some way to filter through the continuous stream of scientific literature, and the *only* filter we currently have available is the journal system. So let's break this argument into parts. Is it true that: we need filters; journals are working as filters; and there are no working alternatives?

We need filters

I hope that few people will try to argue that we have no need for filters in scientific publishing. On PubMed there are 87,551 abstract entries for May, which is getting close to 2 papers per minute. It is easy to see that the rate of publishing is not going down any time soon. All current incentives on the author and publishing side will keep pushing this rate up. One single unfiltered feed of papers would not work, and it is clear we need some way to sort out what to read. The most immediate way to sort would be by topic. Assuming authors would play nice and not try to tag their papers as broadly as possible (yeah, right), this would still not solve our problem. For the topics that are very close to what I work on, I already have feeds with fairly broad PubMed queries that I go through myself. For topics that might be one or several steps removed from my area of work, I still want to be updated on method developments and discoveries that could have an impact on what I am doing. I already spend an average of 1 to 2 hours a day scanning abstracts, and I don't want to increase that.

Journals as filters

If you have followed me this far, then you might agree that we need filtering processes that go beyond simple topic tagging. Without even considering journal "ranking", journals already do more than topic tagging, since journals are also communities that form around areas of research. To give a concrete example, both Bioinformatics and PLOS Computational Biology publish papers in bioinformatics, but while the former tends to publish more methods papers, the latter tends to publish more biological discoveries. Subjectively, I tend to prefer the papers published in the PLOS journal because of its community, and that has nothing to do with perceived impact.

What about impact factors and journal ranking? In reviewing the literature, Björn concludes that there is almost no significant association between impact factors and future citations. This is not in agreement with my own subjective evaluation of the different journals I pay attention to. To give an example, the average paper in journals of the BMC series is not the same to me as the average paper published in Nature journals. Are there many of you who have a different opinion? Obviously, this could just mean that my subjective perception is biased and incorrect. It would also mean that journal editors are doing a horrible job and that the time they spend evaluating papers is useless. I have worked as an editor for a few months and I can tell you that it is hard work, and it is not easy to imagine that it is all useless work. In his review, Björn points to, for example, the work by Lozano and colleagues. In that work, the authors correlated the impact factor of the journal with the future citations of each paper in a given year. For biomedical journals, the coefficient of determination has been around 0.25 since around 1970. Although the correlation between impact factor and future citations is not high (r ~ 0.5), it is certainly highly significant given that they looked at such large numbers (25,569,603 articles for biomed). Still, this also tells us that evaluating the impact/merit of an individual publication by the journal it is published in is prone to error. However, what I want to know is: given that I have to select what to read, do I improve my chances of finding potentially interesting papers by restricting my attention to subsets of papers based on the impact factor?

I tried to get my hands on the data used by Lozano and colleagues, but unfortunately they could not give me the dataset they used. Over email, Lozano said I would have to pay Thomson Reuters on the order of $250,000 for access (not so much reproducible research). I wanted to test the enrichment over random of highly versus lowly cited papers in relation to impact factors. After a few more emails, Lozano pointed me to this other paper, where they calculated enrichment for a few journals in their Figure 4, which I am reproducing here under a "don't sue me" licence. For these journals, they calculated the fraction of a journal's papers that are among the top 1% most cited, divided by the fraction of top-1%-cited papers across all papers. This gives you an enrichment over random expectation that, for journals like Science/Cell/Nature, turns out to be around 40 to 50. So there you go: high impact factor journals, on average, tend to be enriched in papers that will be highly cited in the future.
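To be explicit about what "enrichment over random" means here, a small worked example (the journal numbers below are invented for illustration; only the 1% baseline follows from the definition):

```python
# "Enrichment over random": the fraction of a journal's papers that land in the
# global top 1% most-cited, divided by the 1% expected by chance.
# The journal numbers below are made up for illustration.
papers_in_journal = 800
journal_papers_in_global_top1pct = 360

fraction_in_journal = journal_papers_in_global_top1pct / papers_in_journal  # 0.45
enrichment = fraction_in_journal / 0.01                                     # 45x over random
print(f"enrichment over random: {enrichment:.0f}x")
```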

As an author, I hate to be evaluated by the journals I publish in instead of by the actual merit of my work. As a reader, I admit to my limitations and I need some way to direct my attention to subsets of articles. Both the data and my own subjective evaluation tell me that journal impact factors can be used as a way to enrich for potentially interesting articles.

But there are better ways ... 

Absolutely! The current publishing system is a waste of everyone's time as we try to submit papers down a ladder of perceived impact. The papers get reviewed multiple times in different journals, reviewers think that articles need to be improved with year-long experiments, and discoveries stay hidden in this reviewing limbo for too long. We can do better than this, but I would argue that the best way to do away with the current journal system is to replace it with something else. Instead of just shouting for the destruction of journal hierarchies and the death of the impact factor, talk about how you are replacing them. I try out every filtering approach I can find, and I will pay for anything that works well and saves me time. Google Scholar has a reasonably good recommendation system, and it is great to see people developing applications like the Recently app. PLOS is doing a great job of promoting the use of article-level metrics that might help others to build recommendation systems. There is work to do, but the information and technology for building such recommendation systems is all out there already. I might even start using some of my research budget to work on this problem, just out of frustration. I have some ideas on how I would go about this, but this blog post is already long. If anyone wants to chat about this, drop me a line. At the very least we can all start using preprint servers and put our work out before we bury it for a year in the publishing limbo.


  

Sunday, April 07, 2013

The case for article submission fees

For scientific journal articles, the cost of publishing is almost exclusively covered by the articles that are accepted for publication, either by the published authors or by the libraries. Advertising and other items, like the organization of conferences, are probably not a very significant source of income. I don't want to argue here again about the value of publishers and how we should be decoupling the costs of publishing (close to zero) from peer review, accreditation and filtering. Instead, I just want to explore the idea of a very obvious form of income that is not used - submission fees. Why don't journals charge all potential authors a fixed cost per submission, even if the article ends up being rejected? I am sure publishers have considered this option and reached the conclusion that it is not viable. I would like to know why, and maybe someone reading this can give a strong argument against it - hopefully someone from the publishing side who has crunched the numbers.

The strongest reason against it that I can imagine would be a reduction in submission rates. If only some publishers adopt this fee, authors will send their papers to journals that don't charge for submission. Would the impact be that significant? For journals with high rejection rates this might even be useful, since it would preferentially deter authors who are less confident about the value of their work. For journals with lower rejection rates, the impact of the fee would be small since authors are less concerned about a rejection. Publishers might even benefit from implementing a submission charge in the form of a lock-in effect, if they do not charge when transferring articles between their journals. Publishers already use this practice of transferring articles and peer-review comments between their journals. It already functions as a form of lock-in, since authors, wishing to avoid another lengthy round of peer review, will tend to accept. If the submission fee is only charged once, authors are even more likely to keep their articles within the publisher. Given the current trend of publishers trying to own the full stack of high-to-low rejection rate journals, these lock-in effects are going to be increasingly valuable.
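As a rough, purely illustrative sketch of why rejection rates make a submission fee so attractive to selective journals: every published paper effectively carries the submission fees of all the rejected submissions, so revenue per published paper is the APC plus the fee divided by the acceptance rate. The fee, APC and acceptance rates below are made-up numbers, not a proposal from any publisher.

```python
# Revenue per *published* paper = APC + submission_fee / acceptance_rate,
# since each published paper carries the fees of the rejected submissions.
# All numbers are invented for illustration.
def revenue_per_published_paper(apc, submission_fee, acceptance_rate):
    return apc + submission_fee / acceptance_rate

# A selective journal (10% acceptance) with a modest 300 euro submission fee
print(revenue_per_published_paper(apc=2000, submission_fee=300, acceptance_rate=0.10))  # 5000.0
# A less selective journal (70% acceptance) barely notices the same fee
print(revenue_per_published_paper(apc=2000, submission_fee=300, acceptance_rate=0.70))  # ~2428.6
```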

The overall benefit would be increased viability of open access. A submission fee might also accelerate the decoupling of peer review from the act of publishing. If we get used to paying separately for publishing and for submission/evaluation, we might get used to having these activities performed by different entities. Finally, if it also results in less slicing of work into ever-smaller publishable units, we might all benefit.


Update: Anna Sharman sent me a link to one of her blog posts where she covers this topic in much more detail.

Photo adapted from: http://www.flickr.com/photos/drh/2188723772/

Thursday, March 28, 2013

The glacial pace of innovation in scientific publishing


Nature today made available a collection of articles about the future of publishing. One of these is a comment by Jason Priem on "Scholarship: Beyond the paper". It is beautifully written and inspirational. It is clear that Jason has a finger on the pulse of the scientific publishing world and is passionate about it. He sees a future with a "decoupled" journal, where modular, distributed data streams can be built into stories openly and in real time, and where certification and filtering are not tied to the act of publishing and can happen on the fly by aggregating social peer review. While I was reading it I could not contain a sigh of frustration. This is a future that several of us, like Neil and Greg, debated at Nodalpoint many years ago. Almost 7 years ago I wrote in a blog post:

"The data streams would be, as the name suggests, a public view of the data being produced by a group or individual researcher.(...) The manuscripts could be built in wikis by selection of relevant data bits from the streams that fit together to answer an interesting question. This is where I propose that the competition would come in. Only those relevant bits of data that better answer the question would be used. The authors of the manuscript would be all those that contributed data bits or in some other way contributed for the manuscript creation. (...) The rest of the process could go on in public view. Versions of the manuscript deemed stable could be deposited in a pre-print server and comments and peer review would commence."

I hope Jason won't look back some 10 years from now and feel the same sort of frustration I feel now with how little scientific publishing has changed. So what happened in the past 7 years? Not much, really. Nature had an open peer review trial with no success. Publishers were slow to allow comments on their websites and we have been even slower at making use of them. Euan had a fantastic science blog/news aggregator (Postgenomic) but it did not survive long after he went to Nature. Genome Biology and Nature both tried to create preprint servers for biomed authors but ended up closing them for lack of users. We had a good run at an online discussion forum with FriendFeed (thank you, Deepak) before Facebook took the steam out of that platform. For most publishers we can't even know the total number of times an article we wrote has been seen, something that blog authors have taken for granted for many years. Even in cases where progress has been made, it has taken (or is taking) way too long. The most obvious example is the unique author ID, where after many (oh so many) years there is a viable solution in sight. All that said, some progress was made in the past few years. Well, mainly two things - PLOS One and Twitter.

Money makes the world go round

PLOS One had a surprising and successful impact on the science publishing world. Its initial stated mission was to change the way peer review was conducted. The importance of a contribution would be judged by how readers rated or commented on the article. Only it turns out that few people take the time to rate or comment on papers. Nevertheless, thanks to some great management, first by Chris Surridge and then by Peter Binfield, PLOS One was a huge hit as a novel, fast, open access (at a fair price) journal. PLOS One's catch-all approach saw a steady increase in the number of articles published (and very healthy profits) and got the attention of all the other publishers.

If open access is suitable as a business model, then funding sources might feel that it is OK to mandate immediate open access. If that were to happen, then only publishers with a structure similar to PLOS would survive. So, to make a profit and to hedge against a mandate for open access, all the other publishers are creating (or buying) a PLOS One clone. This change is happening at an amazing pace. It is great for open access and it goes in the direction of a more streamlined and modular system of publishing. It is not so great for filtering and discoverability. I have said in the past that PLOS One should stop focusing on growth and go back to its initial focus on filtering and the related problem of credit attribution. To their credit, they are one of the few very actively advocating for the development of these tools. Jason, Heather, Euan and others are doing a great job of developing tools that report these metrics.

1% of the news scrolling by 

Of the different tools that scientists could have picked up to be more social, Twitter was the last one I would have expected to see take off. 140 characters?! Seriously? How geeky is that? No threaded discussions, no groups, some weird hashtag-somethings. In what world is this picked up by established tenured university professors who don't have time to leave a formal comment on a journal website? I have no clue how it happened, but it did. Maybe the simple interface with a single use case; the asymmetric (i.e. flattering) network structure; the fact that updates don't accumulate like email. Whatever the reason, scientists are flocking to Twitter to share articles, discuss academia and science (within the 140 characters) and rebel against the Established System. It is not just the young, naive students and disgruntled postdocs. Established group leaders are picking up this social media megaphone. Some of them are attracting audiences that might rival some journals, so this alone might make them care less about that official seal of approval from a "high-impact" journal.

The future of publishing ? 

So after several years of debates about what the web can do for science we have: 1) a growing trend for "bulk" publishing with no solid metrics in place to help us filter and provide credit to authors; and 2) a discussion forum (Twitter) that is clunky for actual discussions but is at least being picked up by a large fraction of scientists. Where do we go from here ? I still think that a more open and modular scientific process would be more productive and enjoyable (less scooping). I am just not convinced that scientists in general even care about these things. For my part, I am going to continue sharing ideas on this blog and, now that I coordinate a research group, start posting articles to arXiv. I hope that Jason is right and that we will all start to take better advantage of the web for science.


Clock image adapted from tinyurl.com/cmy9fn5

Tuesday, November 06, 2012

Scholarly metrics with a heart


Last week I attended the PLOS workshop on Article Level Metrics (ALM). As a disclaimer, I am part of the PLOS ALM advisory Technical Working Group (not sure why :). Alternative article level metrics refer to any set of indicators that might be used to judge the value of a scientific work (or researcher, institution, etc). As a simple example, an article that is read more than average might reflect the scientific interest in or popularity of the work. There are many interesting questions around ALMs, starting with the simplest - do we need any metrics at all ? The one clear observation is that more and more of the scientific process is being captured and measured online, so we should at least explore the uses of this information.

Do we need metrics ? What are ALMs good for ?

Like any researcher, I dislike the fact that I am often evaluated by the impact factor (IF) of the journals I publish in. When a position has hundreds of applicants it is not practical to read each candidate's research and carefully evaluate it. As a shortcut, the evaluators (wrongly) estimate the quality of a researcher's work by the IFs of the journals. I won't discuss the merit of this practice since even the journal Nature has spoken out against the value of IFs. So one of the driving forces behind the development of ALMs is this frustration with the current metrics of evaluation. If we cannot have a careful peer evaluation of our work then the hope is that we can at least have better metrics that reflect the value/interest/quality of our work. This is really an open research question and, as part of the ALM meeting, PLOS announced a PLOS ONE collection of research articles on ALMs. The collection includes a very useful introduction to ALMs by Jason Priem, Paul Groth and Dario Taraborelli.

Beyond the need for evaluation metrics, ALMs should also be more broadly useful for developing filtering tools. A few years ago I noticed that articles that were being bookmarked or mentioned in blog posts had an above average number of citations. This has now been studied in much more detail. Even if you are not persuaded by the value of quantitative metrics (number of mentions, PDF downloads, etc) you might be interested instead in referrals from trustworthy sources. ALMs might be useful for tracking the identity of those reading, downloading or bookmarking an article. There are several researchers I follow on social media sites because they mention articles that I consistently find interesting. In relation to identity, I also learned at the meeting that the ORCID author ID initiative finally has a (somewhat buggy) website that you can use to claim an ID. ALMs might also be useful for filtering if they can be used, along with natural language processing methods, to improve the automatic classification of an article's topic. This last point, on the importance of categorization, was brought up in the meeting by Jevin West, who had some very interesting ideas on the topic (e.g. clustering, automatic semantic labeling, tracking ideas over time). If the trend for the growth of mega-journals (PLOS ONE, Scientific Reports, etc) continues, we will need these filtering tools to find the content that matters to us.

Where are we now with ALMs ? 

In order to work with different metrics of impact we need to be able to measure them, and they need to be made available. On the publishers' side, PLOS has led the way in making several metrics available through an API, and there is some hope that other publishers will follow. Nature, for example, has recently made public a few of the same metrics for 20 of their journals although, as far as I know, they cannot be queried automatically. The availability of this information has allowed for research on the topic (see the PLOS ONE collection) and even the creation of several companies/non-profits that develop ALM products (Altmetric, ImpactStory, Plum Analytics, among others). Other established players have also been in the news recently. For example, the reference management tool Mendeley recently announced that it has reached 2 million users, whose actions can be tracked via their API, and Springer announced the acquisition of Mekentosj, the company behind the reference manager Papers. The interest surrounding ALMs is clearly on the rise as publishers, companies and funders try their best to gauge the usefulness of these metrics and position themselves to have an advantage in using them.
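To make this concrete, pulling these numbers programmatically might look roughly like the sketch below. Treat it as an illustration only: the alm.plos.org endpoint, the API version and the parameter names are my assumptions about the PLOS ALM service and should be checked against their documentation, and the API key and DOI are placeholders.

# Rough sketch: fetch article-level metrics for one DOI from the PLOS ALM API.
# The endpoint, API version and parameter names are assumptions for illustration;
# the API key and DOI below are placeholders.
import requests

ALM_URL = "http://alm.plos.org/api/v3/articles"  # assumed endpoint
API_KEY = "YOUR_API_KEY"

def get_article_metrics(doi):
    """Return the ALM summary (views, downloads, citations, ...) for a DOI."""
    params = {"api_key": API_KEY, "ids": doi, "info": "summary"}
    response = requests.get(ALM_URL, params=params, timeout=30)
    response.raise_for_status()
    return response.json()

print(get_article_metrics("10.1371/journal.pone.0000000"))  # placeholder DOI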

The main topics at the PLOS meeting

It was in this context that we got together in San Francisco last week. I enjoyed the meeting format, with a mix of loose topics but strict requirements for deliverables. It was worth attending even just for that and for the people I met. After some introductions we got together in groups and quickly jotted down on post-its the sort of questions/problems we thought were worth discussing. The post-its were clustered on the walls by commonality and a set of broad problem areas was defined (see the list here).

Problems for discussion included:

  • how do we increase awareness for ALMs ?
  • how to prevent the gaming (i.e. cheating to increase the metrics of my papers) ?
  • what can be and is worth measuring ?
  • how to exchange metrics across providers/users (standards) ?
  • how to give context/meaning/story to the metrics ?

We were then divided into parallel sessions where we further distilled these problems into more specific action lists and very concrete steps that can be taken right now.

Metrics with a heart

From my own subjective view of the meeting, it felt like we spent a considerable amount of time discussing how to give more meaning to the metrics. I think it was Ian Mulvany who wrote on the board in one of the sessions: "What does 14 mean ?". The idea of context came up several times and from different viewpoints. We have some understanding of what a citation means and, from our own experience, we can make some sense of what 10 or 100 citations mean (for different fields, etc). We lack a similar sense for any other metric. As far as I know, ImpactStory is the only one trying to give context to the metrics it shows, by comparing the metrics of your papers with those of random sets of papers from the same year. Much more can be done along these same lines. We arrived at a similar discussion from the point of view of how we present ourselves as researchers to the rest of the world. Ethan Perlstein talked about how engaging his audience through social media and giving them feedback on how his papers were being read and mentioned by others was enough to tell a story that increased interest in his work. The context and story (e.g. who is talking about my work) is more important than the number of views. We reached the same sort of discussion again when we talked about tracking and using the semantic meaning or the identity/source of the metrics. For most use cases of ALMs we can think of, we would benefit from, or downright need, more context, and this is likely to drive the next developments and research in this area.
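As a toy illustration of what giving context could mean, the snippet below turns a raw count into a percentile against a reference set of papers from the same year, in the spirit of what ImpactStory does. The reference numbers are made up; a real version would pull them from an ALM provider.

# Toy example: give context to a raw metric ("what does 14 mean ?") by turning it
# into a percentile relative to a reference sample of papers from the same year.
# The reference values below are made up.
def percentile(value, reference_sample):
    """Percentage of reference papers with a metric at or below `value`."""
    below = sum(1 for x in reference_sample if x <= value)
    return 100.0 * below / len(reference_sample)

same_year_downloads = [3, 7, 9, 12, 15, 20, 31, 45, 60, 120]  # made-up reference set
print(percentile(14, same_year_downloads))  # 14 downloads -> 40.0 (40th percentile here)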

The devil we don't know

Heather Piwowar asked me at some point if I had any reservations about ALMs. In particular, from the point of view of evaluation (and to a lesser extent filtering), it might turn out that we are replacing a poor evaluation metric (the journal impact factor) with an equally poor criterion - our capacity to project influence online. In this context it is interesting to follow some of the experiments being done in scientific crowdfunding. Ethan Perlstein has one running right now with a very catchy title: "Crowdfund my meth lab, yo". Success in crowdfunding should depend mostly on the capacity to project your influence or "brand" online - an exercise in personal marketing. Crowdfunding is an extreme scenario where researchers are trying to side-step the grant system and get funding directly from the wider public. However, I fear that evaluation by ALMs will tend to reward exactly the sort of skills that relate to online branding. Not to say that personal marketing is not important already - this is why researchers network at conferences and get to know editors - but ALMs might reward personal (online) branding to an even higher degree.

Wednesday, May 09, 2012

The Minimal Publishable Unit

What constitutes a minimal publishable unit in scientific publishing ? The transition to online publishing and the proliferation of journals are creating a setting where anything can be published. Every week spam emails almost beg us to submit our next piece of research to some journal. Yes, I am looking at you Bentham and Hindawi. At the same time, the idea of a post-publication peer review system also promotes an increase in the number of publications. With the success of PLoS ONE and its many clones we are in for another large increase in the rate of scientific publishing. Publish-then-sort, as they say.

With all these outlets for publication and the pressure to build up your CV, it is normal that researchers try to slice their work into as many publishable units as possible. One very common trend in high-throughput research is to see two to three publications that relate to the same work: the main paper for the dataset and biological findings, and 1 or 2 off-shoots that might include a database paper and/or a data analysis methods paper. Besides these quasi-duplicated papers there are the real small bites, especially in bioinformatics research. You know, those that you read and think to yourself that it must have taken no more than a few days to get done. So what is an acceptable publishable unit ?

I mapped phosphorylation sites to ModBase models of S. cerevisiae proteins and just sent this tweet with a small fact about protein phosphosites and surface accessibility:
Should I add that tweet to my CV ? This relationship is expected and probably already published with a smaller dataset, but I would bet that it would not take much more to get a paper published. What is stopping us from adding trivial papers to the flood of publications ? I don't have an actual answer to these questions. There are many interesting and insightful "small-bite" research papers that start from a very creative question that can be quickly addressed. It is also obvious that the amount of time/work spent on a problem is not proportional to the interest and merit of a piece of research. At the same time, it is very clear that the incentives in academia and publishing are currently aligned to increase the rate of publication. This increase is only a problem if we can't cope with it, so maybe instead of fighting against these aligned incentives we should be investing heavily in filtering tools.
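Coming back to the phosphosite example above, the quick analysis behind that kind of observation could be sketched roughly as follows. This is a minimal illustration rather than the actual script: the model file and the phosphosite positions are placeholders, and it assumes Biopython plus a local DSSP installation.

# Sketch: compare relative solvent accessibility (RSA) of phosphosites against
# all Ser/Thr/Tyr residues in a structural model. "model.pdb" and the phosphosite
# positions are placeholders; requires Biopython and the DSSP executable.
from Bio.PDB import PDBParser
from Bio.PDB.DSSP import DSSP

phosphosites = {("A", 15), ("A", 42)}  # (chain id, residue number) placeholders

structure = PDBParser(QUIET=True).get_structure("model", "model.pdb")
dssp = DSSP(structure[0], "model.pdb")  # per-residue values include amino acid and relative ASA

phospho_rsa, background_rsa = [], []
for (chain_id, res_id) in dssp.keys():
    aa, rsa = dssp[(chain_id, res_id)][1], dssp[(chain_id, res_id)][3]
    if aa not in "STY" or rsa == "NA":
        continue
    target = phospho_rsa if (chain_id, res_id[1]) in phosphosites else background_rsa
    target.append(rsa)

print("mean RSA, phosphosites:", sum(phospho_rsa) / max(len(phospho_rsa), 1))
print("mean RSA, all S/T/Y:", sum(background_rsa) / max(len(background_rsa), 1))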


Thursday, February 23, 2012

Academic value, jobs and PLoS ONE's mission

Becky Ward from the blog "It Takes 30" just posted a thoughtful comment regarding the Elsevier boycott. I like the fact that she adds some perspective as a former editor to the ongoing discussion. This also follows a recent blog post by Michael Eisen regarding academic jobs and impact factors. The title very much summarizes his position: "The widely held notion that high-impact publications determine who gets academic jobs, grants and tenure is wrong". Eisen is trying to play down the value of the "glamour" high impact factor magazines and fighting for the success of open access journals. It should be a no-brainer really. Scientific studies are mostly paid for by public money, they are evaluated by unpaid peers and they are published and read online. There is really no reason why scientific publishing should be behind paywalls.

Obviously it is never as simple as it might appear at first glance. If putting science online were the only role publishers played, I could just put all my work up on this blog. While I do write up some results as blog posts, I can guarantee you that I would soon be out of a job if that was all I did. So there must be other roles that scientific publishing plays, and even if these roles might be outdated or performed poorly, they are needed and must be replaced before we can have a real change in scientific publishing.

The value of scientific publishing

In my view there are 3 main roles that scientific journals currently play: filtering, publishing and providing credit. The act of publishing itself is very straightforward and these days could easily cost close to zero if the publishers have access to the appropriate software. But if publishing itself has benefited greatly from the shift online, filtering and credit attribution are becoming increasingly complex in the online world.

Filtering
Moving to the digital world created a great attention crash that we are still trying to solve. What great scientific advances happened last year in my field ? What about in unrelated fields that I cannot evaluate myself ? I often hear that we should be able to read the literature and come up with answers to these questions directly, without regard to where the papers were published. However, try to imagine for a second that there were no journals. If PLoS ONE and its clones get what they are aiming for, this might be on the way. A quick check on PubMed tells me that 87134 abstracts were made available in the past 30 days. That is something like 2900 abstracts per day ! Which of these are relevant to me ? The current filtering system of tiered journals with increasing rejection rates is flawed, but I think it is clear that we cannot do away with it until we have another one in place.
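The PubMed number above is easy to reproduce with NCBI's E-utilities. A minimal sketch is below; the "hasabstract" query term and the date handling are my own choices for illustration, not the exact query behind that number, and NCBI asks that heavier users identify themselves with an email or API key.

# Sketch: count PubMed records with an abstract published in the past 30 days,
# using NCBI's esearch E-utility. The query term and date handling are
# illustrative choices.
import datetime
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"
today = datetime.date.today()
start = today - datetime.timedelta(days=30)

params = {
    "db": "pubmed",
    "term": "hasabstract",                  # records that have an abstract
    "datetype": "pdat",                     # filter on publication date
    "mindate": start.strftime("%Y/%m/%d"),
    "maxdate": today.strftime("%Y/%m/%d"),
    "retmax": 0,                            # only the total count is needed
    "retmode": "json",
}

count = int(requests.get(ESEARCH, params=params, timeout=30).json()["esearchresult"]["count"])
print(count, "abstracts in the past 30 days, roughly", count // 30, "per day")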

Credit attribution
The attribution of credit is also intimately linked to the filtering process. Instead of asking about individual articles or research ideas, credit is about giving value to researchers, departments or universities. The current system is flawed because it overvalues the impact/prestige of the journals where the research gets published. Michael Eisen claims that impact factors are not taken into account when researchers are picked for group leader positions, but honestly this idea does not ring true to me. From my personal experience of applying for PI positions (more on that later), those that I see getting shortlisted for interviews tend to have papers in high-impact journals. On Twitter Eisen replied to this comment by saying "you assume interview are because of papers, whereas i assume they got papers & interviews because work is excellent". So either high impact factor journals are being incorrectly used to evaluate candidates or they are working well to filter excellent work. In either case, if we are to replace the current credit attribution system we need some other system in place.

Article level metrics
So how do we do away with the current focus on impact factors for both filtering and credit attribution ? Both problems could be solved if we focused on evaluating articles instead of journals. The mission of PLoS ONE was exactly to develop article level metrics that would allow for a post-publication evaluation system. As they claim on their webpage, they want "to provide new, meaningful and efficient mechanisms for research assessment". To their credit, PLoS has been promoting the idea and making some article level indicators easily accessible, but I have yet to see a concrete plan to provide readers with a filtering/recommendation tool. As much as I love PLoS and try to publish in their journals as much as possible, in this regard PLoS ONE has so far been a failure. If PLoS and other open access publishers want to fight Elsevier and promote open access, they have to invest heavily in filtering/recommendation engines. Partner with academic groups and private companies with similar goals (e.g. Mendeley ?) if need be. With PLoS ONE they are contributing to the attention crash and (finally) making a profit off of it. It is time to change your tune: stop saying how big PLoS ONE is going to be next year and start saying how you are going to get back on track with your mission of post-publication filtering.

Summary
Without replacing the current filtering and credit attribution roles of traditional journals we won't do away with the need for a tiered structure in scientific publishing. We could still have open access tiered systems, but the current trend among open access journals appears to be the creation of large journals focused on the idea of post-publication peer review, since this is economically viable. However, without filtering systems, PLoS ONE and its many clones can only contribute to the attention crash problem and do not solve the issue of credit attribution. PLoS ONE's mission demands that they work on filtering/recommendation, and I hope that, if nothing else, they can focus their message, marketing efforts and partnerships on this problem.

Friday, January 07, 2011

Why would you publish in Scientific Reports ?

The Nature Publishing Group (NPG) is launching a fully open access journal called Scientific Reports. Like the recently launched Nature Communications, this journal is online only and the authors cover (or, for Nature Communications, can choose to cover) the cost of publishing the articles in an open access format. Where Scientific Reports differs most is that the journal will not reject papers based on their perceived impact. From their FAQ:
"Scientific Reports publishes original articles on the basis that they are technically sound, and papers are peer reviewed on this criterion alone. The importance of an article is determined by its readership after publication."

If that sounds familiar, it should. This idea of post-publication peer review was introduced by PLoS ONE, and Nature appears to be essentially copying the format of this successful PLoS journal. Even the reviewing practices are the same, whereby the academic editors can choose to accept/reject based on their own opinion or consult external peer reviewers. In fact, if I were working at PLoS I would have walked into work today with a bottle of champagne and celebrated. As they say, imitation is the sincerest form of flattery. NPG is increasing their portfolio of open access or open choice journals and hopefully they will start working on article level metrics. In all, this is a victory for the open-access movement and for science as a whole.

As I mentioned in a previous post, PLoS has shown that one way to sustain the costs of open access journals with high rejection rates is for the publisher to also publish higher-volume journals. Both BioMed Central and, more recently, PLoS have also shown that high-volume open access publishing can be profitable, so Nature is now trying to get the best of both worlds: brand power from high-rejection-rate journals with a subscription model, and a nice added income from higher-volume open access journals. If by some chance funders force a complete move to immediate open access, NPG will have a leg to stand on.

So why would you publish in Scientific Reports ? Seriously, can someone tell me ? Since the journal will not filter on perceived impact, they won't be playing the impact factor game. They did not go as far as naming it Nature X, so brand power will not be that high. It is priced similarly to PLoS ONE (until January 2012) and offers less author feedback information (i.e. article metrics). I really don't see any compelling reason why I would choose to send a paper to Scientific Reports over PLoS ONE.

Updated June 2013 - Many of you reach this page searching for the impact factor of Scientific Reports. It is now out, and it is ~3. Yes, it is lower than PLOS ONE's, so you have yet another reason not to publish there.