Friday, March 27, 2015

Scientific Reports partners with Research Square to unbundle peer-review

Apparently, Scientific Reports has sent out emails to their editorial board members about an upcoming trial for a new peer-review track. The email is available at this link for now. According to the email: "a selection of authors submitting a biology manuscript to Scientific Reports will be able to opt-in to a fast-track peer-review service at an additional cost".

They claim that this service will speed up the response time and they commit to delivering the editorial decision and peer-review comments within 3 weeks of submission. To do this, they will partner with Research Square, which offers a third-party peer-review system called Rubriq. This service has previously been covered in the news by The Economist and Nature. Rubriq appears to be essentially a web-based reviewing platform. Scientists can register on the system and get matched to submitted articles. Reviewing is paid on a per-article basis. The process is described well on their website and they have an example report online.

Some of the reactions to this partnership have been negative. See for example this Twitter thread. One of the negative comments appears to be that Scientific Reports is trying to sell a fast track on top of their existing peer-review track. I honestly think this was just a bad PR move and the wrong focus in their email to editors. I assume Scientific Reports works like PLOS ONE, with many academic editors and academic reviewers that are not paid at all. They still need some editorial staff to make sure the papers move along the process, which is probably what costs them money. Rubriq is currently charging $500-$650 per article, although I assume they might have some cheaper deal with Scientific Reports for this trial. If this trial works out I can imagine that Scientific Reports could cut down on costs per paper significantly, but in the long run unbundled peer review would probably hurt them. If peer-reviewing is external and editorial decisions are based on scientific soundness, then the journal becomes just a branded specialized blogging platform.

With the bad PR of the paid "fast-track" notion out of the way, I think the best discussion is really about the merit of unbundling peer review. What I don't like about it is that it sounds like Amazon Mechanical Turk for peer review. I don't review articles because I get paid to do it. However, if I needed to make a living out of it I would not reject as many requests as I do now. I think Rubriq is currently paying $100 per referee report which, for now, is just an extra incentive. This extra incentive is apparently still very important, as Rubriq found out when running a survey. If we imagine this sort of marketplace scaling up, we would need some assurance that professional reviewing was up to some required standard. How to define and evaluate these standards is really worth thinking about.

The positive aspects of third-party peer review should be clear to anyone who has gone through the process. So much time is wasted when the same paper is re-submitted to several different journals and gets reviewed by a different set of reviewers each time. Separating the evaluation of the soundness and merit of research from its dissemination would be a clear innovation in the scientific process. This would also make publishing costs more transparent and would probably result in lower prices.

So, overall I think it is great that Scientific Reports is doing this trial. Many people have talked about third-party peer review and paid peer review. We do want more transparency about the costs of publishing. Maybe it turns out that Amazon Mechanical Turk for peer review is a bad model, but if we don't try new things we won't find out.

P.S. - Dear NPG PR people, feel free to use the title of this post when you announce this partnership and put less of an emphasis on speed. You're welcome.

Tuesday, March 10, 2015

A Borg moment and the end of Friendfeed

Apparently Facebook finally decided to shut down Friendfeed after several years of declining usage. I only found out because Neil, Deepak and Cameron wrote posts about this. Although I was a heavy user, I ended up moving with the crowd after the Facebook acquisition. For those that never used it but are familiar with Twitter or Facebook, it might be hard to understand why some people like myself are so disappointed with its decline. Friendfeed was simply leaps ahead of anything at the time as a mechanism for sharing information and organizing discussions around the shared items. In fact, although there has been no further development for 5 years, it is still much better than Twitter for these things. As Neil mentioned in his post, it is hard to understand why this is the case. Maybe because comments were attached to a shared item and not limited to 140 characters, so you could actually have meaningful discussions. Unlike forums, the shared items formed a feed/river, so there was the same impression and emphasis on immediacy as Twitter. However, recently commented items would jump up on your feed, which tended to foster discussions. It is possible that it only worked because those that joined were the right people at the right time. Maybe it would not have scaled with the trolls. We will never know.

For those that never used it, I want to write down the best experience I ever had on Friendfeed. I was attending the ISMB conference in Toronto in 2008. The number of geeks at this conference is understandably high and there were many Friendfeed users attending. At the time Friendfeed had already introduced the notion of a "room", which was a separate public feed that anyone could join, similar to tracking a hashtag on Twitter. A feed for the conference was set up and many people at the conference joined and started participating. In fact, the feed is still available here so you can go have a look for the time being. This was the first time I really had the impression of connecting to a hive-mind. In this back channel, tens of people were taking notes and giving comments about the several simultaneous talks. During keynotes you could even see, as the speaker changed topics, different people taking up the slack of note-taking and commenting according to their own expertise. Unlike Twitter, it didn't feel like we were drowning in a sea of uncoordinated messages. You could always focus your attention on just one thread (i.e. a shared item) and its comments at any time. It worked so well that we ended up using the notes to write up a conference report that got published in PLOS Comp Bio.

That community of scientists and other open science advocates moved on to Twitter after the Facebook acquisition. Twitter usage by scientists and in particular by prominent established scientists also really took off at around the same time. Although it serves a similar purpose Twitter really is more of a broadcasting mechanism than a discussion forum. It is a pity that a lesser solution won out. Still, the amount of open scientific discussions that are going on online these days is just phenomenal and a drastic change from my PhD days. 

Tuesday, February 10, 2015

Group member profile - Brandon Invergo



I had mentioned previously that we should do a better job of using the web to describe our group and work. As part of this effort I will try to run a recurring blog post series introducing the lab members more extensively. The first group member to give this a try is Brandon Invergo (website, twitter, GScholar), who is currently doing a postdoc in the group with an ESPOD fellowship. Here are Brandon's answers to a few questions I asked him.


What was the path that brought you to the group? Where are you from and what did you work on before arriving in the group?

I originally studied Computer Science, but by the time I was finishing my degree, I was more interested in doing something Biology-related than in working for a software company. Not sure yet what I wanted to do specifically, after receiving my degree I was fortunate to get a job working in the lab of Lawrence H. Pinto at Northwestern University (Evanston, IL, USA). There, I performed electrophysiological and behavioral assays of the mouse visual system, in the context of a functional genomics program. After a few years, I decided that it was time to go back to school and to start on the path towards a career in academic research. So, I moved to the Netherlands to pursue a master's degree in Biology at Leiden University. I specialized in evolutionary and ecological sciences and I did my primary research project under the supervision of Bas Zwaan. I investigated how the dynamics of hormonal signaling during pupal development of a tropical butterfly change in response to environmental conditions (temperature) and how those changes give rise to distinct adult phenotypes (polyphenism).

For my PhD, I wanted to perform research where I could combine my backgrounds in computer science and evolution and a nascent interest in systems biology (bonus points if I could also tie in my background in vision research). For this, I moved to the Institute of Evolutionary Biology (Pompeu Fabra University / Spanish National Research Council) in Barcelona, Spain, where I joined the group of Jaume Bertranpetit and worked under his supervision with the co-supervision of Ludovica Montanucci. My thesis, which I successfully defended in November 2013, was entitled "A system-level, molecular evolutionary analysis of mammalian phototransduction". In it, I combined techniques from bioinformatics and computational biology for molecular evolution with network- and modelling-based tools from systems biology. I sought to uncover the influence of the structure and dynamics of the visual phototransduction pathway on the evolution of the proteins that comprise it. The work also resulted in the improvement of the most comprehensive mathematical model of the system produced to date (currently under review at Biomodels), as well as a Biopython module for working with codeml and other programs from the PAML package (which are notoriously annoying to work with in analysis pipelines).

What are you currently working on?

I joined the EBI and the Sanger Institute in December 2013 as an ESPOD fellow, one week after my thesis defense. Here, I am continuing to explore how complex signaling systems function and evolve, except now I'm working in the context of malarial parasites (Plasmodium spp.).

In particular, I'm studying post-translational modifications (PTMs) on a proteome scale in the parasites, with an eye towards how the parasite uses reversible PTMs (mainly phosphorylation) for cellular signaling during key transitions in its complex lifecycle. This work involves performing both the mass-spectrometry experiments to collect the data and the computational analyses on these and other datasets. I'm finishing up a rather big experiment now and in a few weeks I expect to be neck-deep in data.

What are some of the areas of research that excite you right now?

Really anything at the intersection (well, more generally, the union) of molecular evolution and systems biology immediately catches my attention, such as the evolvability of pathways or the patterning of natural selection across systems. Of course, I'm reading a lot right now about detecting and describing PTMs at the proteomic scale. I'm also excited by developments in biochemical system modelling, particularly right now in methods for Bayesian inference of parameters from large-scale datasets. Finally, though it's not directly my field, I like to keep an eye on what's happening in complexity research at the most fundamental, mathematical level.

What sort of things do you like outside of the science?

I'm very active in the Free Software community and within GNU in particular. I help out a lot behind the scenes: working with (read: pestering) GNU software maintainers, evaluating new software that has been offered to us, and being on the advisory board. I also maintain some GNU packages (GSRC & pyconfigure) and some of my own software projects in my free time. My other main passion is music. I have written many mediocre electronic music songs over the years, some of which have even been released, and I was a moderately successful DJ for nearly a decade. Sadly, my music-writing died off as my PhD thesis gained steam and I haven't written anything recently. When I decide to take a break from all that or to be social, I like to play board games.

Friday, January 16, 2015

How many referee reports do you write per year?

If peer-review is a fundamental aspect of how science is done, then we in academia are required to act as reviewers as part of our jobs. I assume there is no real argument as to whether peer-review is needed. Instead one can argue about when to do it, in the lifetime of a research project, and who should do it. Do we do it before the results are made public (pre-publication) or after (post-publication)? Do we have dedicated reviewers or should active scientists do this? The current dominant form of peer-review is done by active scientists and before disclosure of the results. This is all done anonymously and hidden from view. There are many issues with this process, such as the many repeated peer-review rounds required when an article bounces from journal to journal. Another drawback is that nobody gets credit for doing the work and, conversely, nobody gets shunned for not contributing sufficiently to the process.

This week, Alex Bateman (@Alexbateman1) directed me to the Publons company that is attempting to create reviewer profiles. Publons is attacking the problem in a couple of different ways: they mine journals that provide open peer reviews; they curate the journals' confirmation emails sent in by reviewers; and they are apparently in talks with publishers to automate this process. The reviewer controls the degree of information displayed by the site. The minimum information shared is the journal name and the month, which is what I expect most people will opt for. Alex had apparently kept the journals' emails acknowledging the receipt of his reviews going back many years and he has created an extensive profile that demonstrates his contributions as a reviewer (and editor).

The company was profiled last October in a Nature news article where Andrew Preston, one of the co-founders, states the aim of making peer-review a measurable research output. It is useful to have a verified account of our reviewing activities but I am not sure if Publons is the best way of getting there. Given that we have ORCID, we could imagine that publishers would be able to skip over Publons and report reviewing activities directly to ORCID. On the other hand, Publons may serve as a focal point to get publishers to provide this information in a standardized and automated fashion. For now, the closed reviews are coming from the reviewers themselves, which I assume is why the company has a reward program that has been giving out awards worth $3000 to the top reviewers in a given cycle. I do wish that companies like this one, that collect information without an obvious source of income, would make their business plans more transparent. Even their terms and privacy statements are currently empty. Are we putting effort into something that will last, or into something that will sell this information?

So how many referee reports should we do per year? I guess we should aim to do at least as many as the articles we publish. The truth is that this probably varies widely, and having some feedback and accountability will be good for the system. I keep all my referee reports on file, so apparently since 2007 I have done 53 referee reports, or about 7.5 per year. This has varied a lot, with years where I have done as few as 2. With 24 articles published and only 2 years into a group-leader position, I think I have been contributing well. I have set up my profile in Publons and sent in a couple of recent reviews to see how the process works. So far it has all been very straightforward and I will give it a try for a while.
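As a back-of-the-envelope check on those numbers, here is a minimal Python sketch. Note the seven-year reviewing span is an assumption for illustration; the post only states the 53-report total and the rough "since 2007" time frame:

```python
# Rough per-year reviewing rate. The reviewing span (7 years) is an
# assumption inferred from the stated average; the post only says
# "since 2007" and "about 7.5 per year".
total_reports = 53
years_reviewing = 7

rate = total_reports / years_reviewing
print(f"{rate:.1f} reports per year")  # prints "7.6 reports per year"
```

Comparing the rate against the 24 articles published over the same period is what supports the "at least as many reviews as articles" rule of thumb above.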

Saturday, December 20, 2014

State of the lab, year 2 – reaching steady state

CC BY, Jason Paul Smith
At the end of last year I wrote up a short description of what it was like to start a group at the EMBL-EBI. I thought it would be interesting to try to make it a yearly event, so here is the second installment. It is always scary how fast a year passes by and it is interesting to note how my perspective on managing a research group is changing.

During this year we said our first goodbyes as Vicky Kostiou (linkedin) finished her internship. We also welcomed several new members including Rahuman Sheriff (postdoc, linkedin), Haruna Imamura (postdoc, pubmed), Marta Strumillo (intern, linkedin) and Juan A Cordero Varela (master student, linkedin). Sheriff is working on a collaboration with Silvia Santos' group at the MRC-CSC in London to study cell-cycle regulation. Haruna came initially on a postdoc fellowship in collaboration with Yasushi Ishihama's lab (Kyoto University, Japan) and she has recently been awarded an EIPOD postdoc fellowship to study post-translational regulation of Salmonella in collaboration with Nassos Typas and Jeroen Krijgsveld at EMBL-Heidelberg. Marta is studying the functional role of PTMs in the context of protein structural information and Juan is participating in a project led by Marco Galardini (postdoc, @mgalactus, webpage) to model and predict bacterial phenotypes from sequence. These new members join the people I already mentioned last year: David Ochoa (postdoc, @d0choa, webpage), Romain Studer (postdoc, @RomainStuder, blog), Brandon Invergo (postdoc, webpage) and Omar Wagih (PhD student, @omarwagih).

Shaking off that postdoc feeling

In the first year my concerns were dominated by the stress of facing an empty room that I needed to fill. It felt like a mistake to take 6 months to find the first person, since I felt I was wasting time. This year I had to come to terms with the fact that I no longer have time to do my own research projects. After over 10 years of measuring my own productivity by the progress in my research projects, it is strange to try to let that go. I am certainly doing work that I enjoy. The progress in the group has been fantastic this year, but it took me time to accept that the management activities I am doing are something I should count internally as productive work.

Reaching steady-state

Any new group, especially one that starts in a place like EMBL with very generous core funding, will grow to occupy a space in research. Any movement from this position will then only happen with a slower turnover of projects and people. That seems to be one of the trade-offs of managing research as a group versus as an individual. Changing directions for a whole group has to be slower than for an individual. However, as a group it is still possible to explore opportunities while maintaining a common theme of research underway. This year I think we have reached this steady-state. Although we got significant new funding starting next year, I don't expect the group to grow much larger. I am curious to see how the research theme of the group will change with time.

The bad and the good of 2014

So I will start off by summarizing some of the aspects I wish had been different this year. Above all, I had hoped to publish the first article(s) from the group in 2014. I am happy with the progress of the projects so far (see below) but I am still amazed at how long it takes to get a group up and running. Most of the group joined towards the end of last year, so it has not been that much time objectively. The second aspect I think we could have done better on was communicating more online about what we have been up to. This has been one of the years with the fewest blog posts since I started blogging about 11 years ago. We should do better than this, both because we are publicly funded and because the people (and projects) in the group deserve better exposure. So I will try to change this next year.

On a more positive note, this has been a great scientific year for me and the group, even if not very visible to the outside. The last two papers that started back at UCSF are finally under revision and should come out next year. One is about studying the function and evolution of X. laevis phosphosites (biorxiv) and the second about conditional genetic interactions in S. cerevisiae. We also have 3 projects from Omar, David and Romain that are getting close to being finished and that I hope we will submit early next year. If possible we will put them up on biorxiv as well before submission. It is obviously a great privilege to see this work take shape and I hope some of you will also be excited about it when we make it public.

Regarding funding, I had mentioned already that Haruna got an EIPOD fellowship. In addition, we were awarded a 5-year ERC starting grant. I am very excited about the starting grant since it will allow us to start doing yeast genetics work to complement the proteomics and genome analysis we have been doing. This will feed into and complement almost every project in the group, so I really have to thank the committee for this opportunity. For this purpose we will be hiring for 2 positions (postdoc and/or technician) early next year. Since the EBI does not have lab space, the work will be done at the Genome Biology unit in Heidelberg. This means I will be traveling (even) more to Heidelberg next year. Those hired for these positions will have the opportunity to interact with the Typas lab, which conducts similar genetics studies in bacterial species. If you know anyone with PhD and/or postdoc experience in yeast genetics who is looking for a job, please do let them know about these positions.

Friday, December 05, 2014

Alumni from a small PhD program you never heard about got 4 ERC grants this year

Last Friday I heard the amazing news that our group will be awarded an ERC starting grant to support our ongoing studies of the function and evolution of protein phosphorylation. I will write more about this soon. I also got the very exciting news that two other fellow alumni from the GABBA PhD program were awarded starting grants this year. Ana Carvalho and Nuno Alves both have their groups at the IBMC in Porto. Earlier in the year, another GABBA alumnus, Rui Costa, was awarded an ERC consolidator grant to support his neuroscience research at the Champalimaud institute in Lisbon. Rui had previously also received international funding from an ERC Starting Grant and from the HHMI international early career program.

I had mentioned the GABBA program before in a previous post. As I had described, this program has, for almost 18 years, allowed Portuguese PhD students to do their work abroad with no return clause or any strings attached. Unfortunately, this has changed a bit recently as the Portuguese government has been revising and seriously cutting science funding. GABBA students can still do their work abroad but they are now required to work between two groups with some time spent in Portugal. The funding was also reduced from 12 to 9 students per year.

This program is not the only one that has allowed students to do their PhD thesis abroad and it is easy to question whether the investment made by the government is worthwhile. Not surprisingly, many PhD students end up doing their postdoctoral work away from Portugal as well, and even fewer end up setting up their groups there. However, Portugal did create a pool of talented researchers and some do end up returning. Measuring the return on this investment is very hard since most of the benefit is a gain in knowledge and talent from those that return and networking possibilities with those that stay abroad. This year's ERC grants are a very obvious demonstration that this investment pays off. The 3 GABBA alumni that have set up labs in Portugal are together going to bring in around 5 million euros of EU funding. By itself, this funding does not cover the costs of the whole life of the GABBA program, but it is a very concrete validation of the investment made that hopefully even politicians will understand.


Thursday, October 16, 2014

Science publishers' pyramid structure and lock-in strategies

It is not recent news that AAAS will start a digital open access multidisciplinary journal. It is called Science Advances and it will be the 4th journal of the AAAS family of journals. As I have described in the past, this is part of a trend in science publishing to cover a wider range of perceived impact. Publishers are aiming to have journals that are highly selective and that drive brand awareness, but also journals that can publish almost any article that passes a fairly low bar of being scientifically sound. This trend was spurred by the success of PLOS One, which showed that it is financially viable to have large open access journals. Financially, the best strategy today would be to have some highly selective journals with a subscription model and then a set of less selective journals that operate with an author-paying model.

The Nature Publishing Group has implemented this strategy very well. They increased their portfolio of open access journals with the addition of Nature Communications, Scientific Data, Scientific Reports, the partner journals initiative and the investment/partnership with the Frontiers journals. NPG has also expanded their set of subscription-based Nature branded research and review journals.

This combination approach is not just financially interesting; it also protects publishers from the future imposition of immediate open access via a mandate from funding agencies. Publishers that have such a structure in place would be able to survive the mandate while others that only have subscription-based journals would struggle to adjust. This has the useful side-effect of actually speeding up the transition, since the bulk of the papers will increasingly be published in the larger, less restrictive open access journals. If most of the research is open access there will be less justification for having subscription journals. This will also be true even for the most well-cited papers, as reported recently by Google Scholar.

So, many publishers are trying to build this pyramid-type structure. Even AAAS is doing it, albeit (apparently) very reluctantly. One consequence of these changes is that there will be an abundance of large and permissive open access journals. Therefore, we will increasingly need better article filters, such as PubChase, as I previously discussed. These mega-journals will compete on price and features, but the publishers as a whole will increasingly try to lock authors into the structure. Any submission to the pyramid should be publishable in *some* journal of the publisher. If I were working in one of these publishing houses I would be thinking of ways to use brand power to attract submissions while adding such lock-in mechanisms.

Currently practiced lock-in strategies

The best-known lock-in strategy is the transfer of referee reports within journals of the same publishing house. This is a common occurrence that I have experienced before. An editor at Cell might suggest that authors transfer their rejected article and reviewer comments to Molecular Cell or Cell Reports. Nature research journals might suggest Nature Communications or Scientific Reports. Science might suggest Science Signaling or Science Advances. This can be very tempting since it can shorten the time to get the paper accepted. The usefulness of this mechanism works against the idea of having peer review outsourced to an independent accredited entity.

Cell Press has an interesting mechanism that allows for co-submission to two journals at the same time (Guide to authors - Cosubmission). I never tried this but apparently one can select two journals and the article will be evaluated jointly by both editorial teams. This looks most relevant for articles that fall between two topics covered by different journals. It is still an interesting way to improve the chances that a given article will find a home within this publisher.

Another, more subtle, approach might be to issue a topic-specific call for articles. Back in February, Nature Genetics had an editorial with a call for data analysis articles. Note that the articles will not necessarily be published in Nature Genetics; the editorial explicitly mentions Nature Communications, Scientific Data and Scientific Reports. This allows NPG to use the Nature brand power to issue a call and then spread the resulting submissions along their pyramid structure according to input from reviewers and editors. The co-publication of a large number of articles on the same topic also almost guarantees a marketing punch.

Other ideas

I am curious to see what other ideas arise, and please share other similar mechanisms that you might know of. One potential additional idea, similar to the co-submission, would be a chained submission. At submission, the publisher could already ask for an ordered list of journal preferences. We might also start to see publishers requesting comments from reviewers with a group of journals in mind from the start.

Obviously, an alternative that would place less of a burden on reviewers and editors would be a mechanism similar to what the Frontiers journals have been promoting. Articles could initially be published in a large PLOS One-like journal and then gain awareness depending on article-level metrics. This approach is probably going to take a much longer time to spread widely.

Friday, September 05, 2014

Collaborative postdoc fellowship opportunities


I interrupt this long blogging hiatus to point out two potential postdoc fellowship opportunities to work with our group at the EMBL-EBI. One is the EIPOD program, an EMBL-wide interdisciplinary program. For this fellowship the project is a collaboration with Nassos Typas' (genetics) and Jeroen Krijgsveld's (proteomics) groups at the EMBL in Heidelberg. Successful candidates would be studying how Salmonella uses post-translational modification effector proteins to regulate and subvert the host cell. It is important to note that EIPOD applicants must be interested in doing both the computational and experimental aspects of the project. Applicants are only expected to have experience in one of the areas (bioinformatics, proteomics, genetics) and an interest in learning about the others. The deadline for the EIPOD application is 11 September.

The other fellowship opportunity is the newly created EBPOD program. This is a collaborative program set up between the EMBL-EBI and the NIHR Cambridge Biomedical Research Centre (BRC). As described on the program webpage, this is a program meant to explore and develop computational approaches for translational clinical research involving human subjects. Our project proposal (PDF link) aims to study and identify cell surface markers of primed/activated neutrophils obtained from patients with chronic inflammatory diseases. The project is a collaboration with Paul J Lehner and Edwin Chilvers. The applicant would be in charge of the computational analysis, which would focus on proteomics data of protein composition changes in the membrane vs the total cell (as in Weekes et al. Cell 2014). Prior expertise in protein-related computational research would be ideal.



Wednesday, December 18, 2013

State of the lab, year 1 – setting up

I have used this blog in the past to keep track of my academic life, where I can give a less formal perspective on papers I have published or ideas I am working on. Starting a group has made me think a bit about what I blog about. I have more responsibilities towards the people that have decided to work with me, towards the institution that has hired me (EMBL-EBI) and towards the funding sources that support our work. At least for now I have decided to keep on sharing my personal view, and in that context I thought it could be interesting to write down my path as a group leader in academia. This might become a yearly “thing”.

I started at the EMBL-EBI on January 7 and in a blink of an eye one year has gone by. I have just arrived in Portugal for a conference and holidays, having said goodbye to four people that very courageously decided to work with an unknown newbie group leader. I could sum up what happened in this first year by saying that the group-leader title now makes sense – I am coordinating an actual group. Most of this year was spent applying for funding, recruiting and getting to know the different groups working on campus.

From an empty room to a research group

EMBL-EBI is really a great place to start a group. For those that don't know the EMBL system, group leaders are given very generous core funding for 5 years, plus an additional 4 years after a review process. The chances of failing the review are small but there is essentially no tenure. Core funding and additional “internal” postdoc fellowships are sufficient to run a small group without external grants. We are encouraged to apply for funding but money is not the most immediate source of stress. So for me, since I started recruiting only after arriving in January, facing that empty room where a group should be working was the first thing on my mind. Recruiting postdocs for an unknown and empty group is particularly challenging. I tried some of the obvious things, like emailing related groups that could have people about to finish their PhD and promoting the vacancies at conferences. It is hard to quantify but I do have the impression that my online presence has been an advantage here. Once the first couple of people started and group meetings made sense, the empty-room stress went away. I know people starting experimental labs right now and I have to say that computational people have it way too easy. We can buy a few computers and the “lab” is set up.
I spent a considerable amount of time applying for funding, which is always somewhat frustrating. I don't mind writing grants but I am happier doing actual research. Around 6 months into the job I managed to re-start doing research and I have managed to keep working on a fairly constant basis. I hope I will keep having/making time for research for as long as possible.

Meet the gang

This year we got an HFSP CDA and an ESPOD fellowship which, together with the core funding, allowed me to grow the group fairly quickly. The first to join was David Ochoa (postdoc, @d0choa, webpage) who will be working initially on PTM dynamics under different conditions. He also introduced me to the amazing Black Mirror series, the best fiction I have seen in a long time. Vicky Kostiou (intern) joined after and is doing a great job of improving the PTMfunc website, which should be updated late January (stay tuned). The most recent arrivals were Romain Studer (postdoc, @RomainStuder, blog) and Brandon Invergo (postdoc, webpage). Romain will be using his phylogenetic and structural experience to study PTM evolution and Brandon was awarded the ESPOD fellowship to work with Jyoti Choudhary and malaria groups at the Sanger on Plasmodium PTMs. Omar Wagih (@omarwagih) will be the first PhD student, joining in January. Finally, although we have still not signed a contract, Marco Galardini (@mgalactus, webpage) will likely join in February to work on a collaborative project with Nassos Typas' group at the EMBL-Heidelberg.

To be, or not to be, an experimental group

One of my concerns when I joined the EMBL-EBI was that, although the Sanger is just next door, the EBI is a purely computational institute. Doing computational work is pretty amazing but progress can often be limited by lack of data. High-throughput research is somewhat removing this limitation, since there are probably more observations made than we can all analyze. Still, if you are really interested in going in a specific direction then an experimental group simply has more power to make the right observations. My solution for this problem, for now, will be to co-supervise people with experimental groups, including Brandon's ESPOD project, Marco's project with Nassos Typas and a future hire with Silvia Santos' lab in London. This is an experiment in itself and I guess in 2 to 3 years I will be able to evaluate how practical it is. One alternative is to make use of research services such as the ones listed on Science Exchange. I have discussed with a couple of companies what the prices would be for some of the work I am interested in doing. These are fairly expensive but might be a good complement to the collaborations.

Summary

So overall, the group is off to a good start. It is funded for a few years at a reasonable level and we have collaborations with other groups that share some common interests. There were some things I wish could have gone better. I didn't get all the funding I applied for, which is expected. I also didn't manage to submit the two last manuscripts that still contain work from my postdoc. It would have been great to start the second year with that off my back. Still, I am happy with how things look for the next few years. It is a privilege to be able to coordinate this group of people and this level of resources around topics that I find so interesting.



Friday, November 01, 2013

Introducing BMC MicroPub – fast, granular and revolutionary

(Caution: Satire ahead)

I am happy to be able to share some exciting science publishing news with you. As you know, in the past few years there has been tremendous progress in open access publishing. The author-paying model has been shown to be viable, in large part thanks to the pioneering efforts of BMC and PLOS. In particular, PLOS ONE has been an incredible scientific and business success story that many others are trying to copy. Although these efforts are a great step forward, they don't do enough to set all of the scientific knowledge free in a timely fashion. Sure, you can publish almost anything today, such as metadata, datasets, negative results and the occasional scientific advancement, but the publishing process still takes too much time. In addition, we are forced to build a story around the bits and pieces in some laborious effort to communicate our findings in a coherent fashion. Many of us feel that this publishing format is outdated and does not fit our modern, quick-paced internet age. What I am sharing with you today is going to change that.

Introducing BMC MicroPub
In coordination with BMC we are going to launch soon the pilot phase of a new online-only publishing platform. It was thought out from the ground up to allow for the immediate publishing of granular scientific information. Peer-review happens after online publication of the content and evaluation is not going to be based on trivial and outdated notions of scientific impact. Best of all, it is tightly integrated with the social tools we already use today. In fact, authors are asked to register with the system using their Twitter account and to link it to an ORCID author ID. From then on, their Twitter feed is parsed and any tweet containing the #micropub tag will be considered a submission. Authors are themselves reviewers, and any submission that gets re-tweeted by at least 3 other MicroPub-registered scientists is considered to be “peer-reviewed” and a DOI is issued for that content. An author can create a longer communication by replying to a previous #micropub tweet and in this way create a chain that the journal can track and group into MicroPub stacks (TM). What the team involved here has done is nothing short of amazing. We are taking the same platform we use to share cute pictures of cats and revolutionizing scientific publishing. To start using the journal, authors pay a one-time registration fee followed by a modest publication charge for each piece of published content. However, the journal is waiving all charges for the first 100 authors and the first 100 publications. We hit a snag in discussions with Pubmed but with your support we will be tracked by them starting next year.
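In the spirit of the satire, the entire "peer review" workflow described above is simple enough to fit in a tweet itself. A minimal sketch (all names, handles and the DOI prefix are made up for illustration):

```python
from dataclasses import dataclass, field
from itertools import count

# Hypothetical registry of MicroPub "reviewers"
REGISTERED = {"@alice", "@bob", "@carol", "@dave"}
_doi = count(1)

@dataclass
class Tweet:
    author: str
    text: str
    retweeters: set = field(default_factory=set)

def review(tweet):
    """Issue a DOI iff the tweet carries the #micropub tag and has been
    re-tweeted by at least 3 registered scientists (the 'peer review')."""
    if "#micropub" not in tweet.text:
        return None  # not a submission at all
    if len(tweet.retweeters & REGISTERED) >= 3:
        return f"10.9999/micropub.{next(_doi)}"
    return None  # submitted, but not yet "peer-reviewed"
```

Three retweets from registered accounts and your cat picture is citable literature.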

Pioneering granularity
The project started a few months ago after a first attempt I covered in a previous blog post. Right now we also have an exciting experiment in granular sharing of genome content underway. You can follow the tweets of @GenomeTweet to get an idea of the future of this brave new world. The current front page of the journal gives you an indication of some of the cool science being published by early adopters. The site is currently only available to beta-testers, so here is a screenshot of the current version:

I sat down with the open access advocate Dr Mark Izen from UC Merced to discuss the new journal.

Dear Mark, given your enthusiasm for open access what do you think of this initiative?
I think that experimentation in scientific publishing is fantastic. Any attempt to promote open access and get rid of the current established closed access, impact factor driven system is a great thing. One concern I have is that, although the content is published under a CC0 license, the publishing process is currently reliant on Twitter which is a closed proprietary technology. We should really ask ourselves if this goes all the way in terms of open science.

Some would say that they don't have time to read science at this level of granularity, so devoid of any significant advances. In the words of an anonymous tenured prof: “You must be joking, right!?”. What would you say to these naysayers?
To be blunt, I think they lack vision. Ultimately we owe it to all tax payers to be as transparent as possible about our publicly funded science. Now we can do that, 140 characters at a time. Moreover, the possibility of driving science forward by making information available as quickly as possible is amazing. Scientists are already using Twitter to share information; we are just going one step further here and going to start sharing science as it happens. You could get corrections on your protocol before your experiments have finished running! If you blink your eyes you may literally lose the next big discovery.

So you are not concerned that this increasing level of granularity, small-bite research, is going to drown everyone in noise and actually slow down the spread of information in science?
Absolutely not. It is a filter failure and we are sure that someone is bound to come up with a solution. In the future all sorts of different signals will allow us to filter through all this content and weed out the most interesting parts for you. You will be able to get to your computer and receive, in your email or on some website, just the information that an algorithm thinks you should read. I am sure it is doable; it is a question of setting it up.


Disclaimer: If you have not noticed by now, this is a fictional post meant to entertain. 

Tuesday, October 29, 2013

Sysbio postdoc fellowship: spatio-temporal control of cell-cycle regulation


Funding is available for a 3-year postdoctoral fellowship to study spatio-temporal control in cell-cycle regulation. This is a joint project between our group at the EMBL-EBI and the Quantitative Cell Biology group headed by Silvia Santos at the MRC Clinical Sciences Centre in London. More information about the groups' interests can be found on the respective webpages.

The main objective of this project will be to study how the spatial and temporal control of key cell-cycle proteins changes in different biological contexts. Examples of these different contexts include different differentiation states and/or different species.


We are looking for candidates that are interested in doing both experimental and computational work; previous experience in cell biology, microscopy, programming, image analysis and/or modelling of dynamical systems is considered an asset. We will consider candidates that have stronger expertise in either experimental or computational methods but are interested in learning and using both approaches. Additional information and the application link are here, with a closing date of 24 November 2013. We are available for further clarification regarding suitability of background or information about the projects.


Tuesday, October 22, 2013

Pubmed Commons - the new science water-cooler

Pubmed has decided to dip its toes into social activities by adding a commenting feature to its website (named Pubmed Commons). It will start off in a closed pilot phase, where you have to receive an invite in order to be able to comment, but it should eventually be widely available. The implementation is simple and everything works as you would expect. Here is a screenshot with an example comment:

As you would expect, you get options to add a comment, to edit or delete previous comments you have made and to up-vote other comments. In future versions you will be able to reply to comments in a threaded discussion. The comments, at least for now, cannot be anonymous, and in the pilot phase you have to be invited to join. It is also restricted to authors that have at least one abstract on Pubmed already. There are arguments for and against anonymity but I lean in favour of identifiable comments to keep the trolls at bay. In this way the comments are also associated with you (via your NCBI profile) and can be listed. Unfortunately NCBI accounts cannot yet be linked to an ORCID ID but that should be easily fixed. You will be able to search for articles that have comments, and these will be made available through their APIs.

I am sure there will be several criticisms, such as the fact that it is invite-only or that you are adding comments to articles that you might not even have access to. Overall, I think this is a great development. Commenting systems have, for the most part, failed to work on the publishers' side and the hope is that this might finally create a discussion forum with higher participation. The advantages here are higher visibility and lower friction when compared with most publishers' existing commenting systems. For ALMs it might also be very positive, assuming this does increase the level of participation. I for one would like to have useful opinions attached to articles while I search for them online.


You can get the whole back-story from this post by Rob Tibshirani and from the many other blog posts and press releases that I am sure will be hitting the web today.



Monday, October 21, 2013

Project management (online) tools

I am currently looking for a tool to centralize project management across the group. I asked on Twitter for suggestions and received a number of useful tips. In case this is of use to others, here are a few notes I took when exploring a few of these options. The features I am particularly interested in are: low/no set-up or upkeep requirements, intuitive use, rich project notebooks with the possibility to add images, and back-up support. Nice features to have: possibility to share with the public; integration with Dropbox and/or Google Drive.

Here are the notes in no particular order with my preferences at the end.

Basecamp
Simple, intuitive and well designed project management and collaboration tool. Each project can have: project updates (activity list), text documents (simple text documents, cannot add images), to-do lists (linked to the calendar) and discussion items (text and embedded images that can stand alone or be linked to any other item, including other discussions). The group view can quickly show you updates across all projects you are involved in. The group and projects views are great but it would be nicer to have notebooks within each project, as implemented in Evernote. Discussions can be used as notebooks but they get mixed in with comments on any item, such as a to-do list item. All projects can be downloaded for back-up, but automation requires a 3rd-party service or coding via the API. iOS app available, Android via a 3rd-party app. No free account (60-day trial); plans start at $20/month for 10 projects with a 3GB limit, up to $3,000/year for unlimited projects with a 500GB limit. Basecamp can be extended with a list of additional services (mostly 3rd party), usually for additional fees.

Freedcamp
Project views with to-do lists, discussions, milestones and file attachments. Dashboard view with group activity. Marketplace with additional group and project widgets (e.g. group chat and wikis). Free account with a 20MB limit; paid accounts start at $2.5/month for 1GB, up to $40/month for unlimited storage. Fairly cheap, but below-average design and somewhat sluggish.

Evernote
This tool is centred on the idea of notebooks (collections of notes). Notes can contain text, embedded images, to-do lists and voice clips. Has a stand-alone program that facilitates copy-paste actions into the notebooks (Mac and Windows, but works well under Wine). Notebooks from free accounts cannot be edited by others. Premium accounts (£35 per year) can have notebooks edited by others, so one premium account could be used to centralise group notebooks. Business accounts (£8.00/user/month) are needed for group management features. Limited tools for group interactions (no comments, chat or activity dashboard) when compared with the others.

Redmine
Free but requires local installation. Fully fledged project management tool: activity, roadmap, issue tracker, Gantt charts, calendar, news, documents, wiki, forum, files. Recommended by several people on Twitter. I only had a quick look since I would prefer an online tool without set-up.

Trello
Card concept – each card can have activities (which could be text descriptions of project entries), to-do lists, files, due dates and attachments (including Google Drive and Dropbox), and can be assigned to specific people. Cards can be stacked in groups, moved around, and tagged with colour codes, stickers and the individuals responsible for them. It looks nice but I don't like the design for project management. Android and iOS apps. 10MB limit standard; 250MB for gold (plus additional customization features) at $5/month or $45 per year.

Teambox
Dashboard concept; users can be assigned to projects. The dashboard view has the list of tasks and notifications for the day. Projects can have activities, conversations, tasks, notes, files and members. Notes are where the project/sub-project/task notes can be added; they have version history, can be shared publicly and can contain embedded images. Additional group tools: calendar, Gantt chart, time tracking and video conferencing (by Zoom). iOS and Android apps available. Free for 5 users/5 projects. Pro accounts are $5 per user per month (20% discount for annual billing, 30% for two years) and add unlimited projects, Dropbox integration, workload views, group chat and priority support.

Labguru
Project management with a specific focus on science labs. Very large number of features, including: dashboard with activity feed; projects (organized into past/present/future milestones, notes with embedded and resizeable images, attachments, Pubmed integration, automatic report generation); and lab equipment/reagents inventory. Organizing science into milestones makes more sense than organizing it into tasks, as it better fits the spirit of research versus engineering. Android and iOS apps are meant to be used to follow protocols, take pictures, check storage, etc. Overkill for a computational group. Not very smooth, as every action results in a full webpage refresh. Expensive ($12 per person/month, yearly billing).

Projecturf
Dashboard view and project view. Projects have: overview, calendar, tasks, tickets, time tracking (could be useful for contract work or grant reporting), files, conversations and notes. Files can be integrated with Google Drive and Dropbox. Notes can have embedded images. Pricing starts at 5 projects and 5GB for $20/month, up to unlimited projects and 100GB for $200/month (1 month free with annual billing). Very much directed towards engineering code-based projects.

Summary
My favourites at this point are Basecamp, Teambox and Evernote. Evernote is clearly lacking as a group tool but has a nice focus on notebooks (as in lab notebooks). Basecamp is more polished and intuitive than Teambox but is missing a proper "notebook" within each project and is somewhat expensive. Teambox is not as well designed as Basecamp but should work well, is cheaper and has integration with Google Drive.

Saturday, October 19, 2013

Scientific Data - ultimate salami slicing publishing

Last week a new NPG journal called Scientific Data started accepting submissions. Although I have discussed this new journal with colleagues a few times, I realized that I never argued here why I think this is a very strange idea for a journal. So what is Scientific Data? In short, it is a journal that publishes metadata for a dataset along with data quality metrics. From the homepage:
Scientific Data is a new open-access, online-only publication for descriptions of scientifically valuable datasets. It introduces a new type of content called the Data Descriptor designed to make your data more discoverable, interpretable and reusable.
So what does that mean? Is this a journal for large-scale data analysis? For the description of methods? Not exactly. Reading the guide to authors we can see that an article "should not contain tests of new scientific hypotheses, extensive analyses aimed at providing new scientific insights, or descriptions of fundamentally new scientific methods". So instead one assumes that this journal is some sort of database where articles are descriptors of the data content and data quality. The added value of the journal would then be to store the data and provide fancy ways to allow for re-analysis. That is also not the case, since the data is meant to be "stored in one or more public, community recognized repositories". Importantly, these publications are not meant to replace, and do not preclude, future research articles that make use of these data. Here is an example of what these articles would look like. This example more likely represents what the journal hopes to receive as submissions, so let's see how this shapes up in a year when people try to test the limits of this novel publication type.

In summary, articles published by this journal are mere descriptions of data with data quality metrics. This is the same information that any publication should already contain, except that Scientific Data articles are devoid of any insight or interpretation of the data. One argument in favor of this journal would be that it is a step towards micro-publication and micro-attribution in science. Once the dataset is published anyone, not just the producers of the data, can make use of this information. A more cynical view would be that NPG wants to squeeze as much money as it can from scientists (and funding agencies) by promoting salami-slicing publishing.

Why should we pay $1000 for a service that does not even handle data storage? That money is much better spent supporting data infrastructures (disclaimer: I work at EMBL-EBI). There is no added value from this journal that is not, or cannot be, provided by data repository infrastructures. Yet, this journal is probably going to be a reasonable success, since authors can essentially publish their research twice for an added $1000. In fact, anyone doing a large-scale data-driven project can these days publish something like 4 different papers: the metadata, the main research article, the database article and the stand-alone analysis tool that does 2% better than the others. I am not opposed to a more granular approach to scientific publication but we should make sure we don't waste money in the process. Right now I don't see any incentives to limit this waste, nor any real progress in updating the way we filter and consume this more granular scientific content.


Monday, September 23, 2013

Single-cell genomics: taking noise into account

[Figure: Technical variation versus average read counts. Reprinted by permission from Macmillan Publishers Ltd: Nat Methods, advance online (doi:10.1038/nmeth.2645)]
Sequencing throughput and amplification strategies have improved to a point where single-cell sequencing has become feasible. There was a recent review in Nat Rev Gen covering the progress in single-cell genomics and some of its potential applications that is worth a read. However, the required amplification steps are likely to introduce significant variation for small amounts of starting material. A group of investigators from the EMBL-Heidelberg, EMBL-EBI and the Sanger had a look at this problem and developed an approach to quantify and account for such technical variability. The method is described in a paper that is now in press and makes use of spike-ins to estimate technical variation across a range of different mean expression strengths (see Figure). As with most of these short communications, a lot of the work is in the supplementary materials, including a detailed R workflow description that should allow anyone to recreate the main figures from the paper.
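The core idea can be sketched in a few lines. This is a rough illustration, not the authors' actual model (their published workflow is in R and fits a generalized linear model; here a plain least-squares fit and the function names are my own): fit the squared coefficient of variation (CV²) of the spike-ins as a function of mean expression, which captures technical noise only, then flag endogenous genes whose CV² exceeds that technical expectation.

```python
import numpy as np

def technical_noise_fit(spike_counts):
    """Fit CV^2 = a1/mean + a0 across spike-in rows (cells as columns).
    Spike-ins carry no biological variability, so this curve estimates
    the technical noise at each expression level."""
    mean = spike_counts.mean(axis=1)
    cv2 = spike_counts.var(axis=1) / mean**2
    # least-squares fit of CV^2 against 1/mean (stand-in for the paper's GLM)
    A = np.vstack([1.0 / mean, np.ones_like(mean)]).T
    (a1, a0), *_ = np.linalg.lstsq(A, cv2, rcond=None)
    return a1, a0

def variable_genes(gene_counts, a1, a0, fold=2.0):
    """Flag genes whose CV^2 exceeds `fold` times the technical expectation."""
    mean = gene_counts.mean(axis=1)
    cv2 = gene_counts.var(axis=1) / mean**2
    return cv2 > fold * (a1 / mean + a0)
```

For pure counting noise (Poisson), CV² = 1/mean, so the fitted a1 should come out near 1 and a0 near 0; genes sitting well above the curve carry variability beyond the technical floor.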

This paper is a starting point for more things to come. It is focused on the method and there are clearly a lot of biological findings to be made from those data. More broadly, the Sanger and the EMBL-EBI have recently set up a joint single cell genomics centre to acquire and develop the required technology. From the EBI side this is headed by Sarah Teichmann (also affiliated with the Sanger) and John Marioni. Unfortunately for my interests in post-translational regulation, single-cell proteomics is still lagging way behind. CyTOF comes closest but still requires antibodies for detection.



Tuesday, July 02, 2013

Interdisciplinary EMBL postdoc fellowship in genome evolution and chemical-biology

The EMBL Interdisciplinary Postdocs (EIPOD) program is now accepting applications (deadline 12 September). This program funds interdisciplinary research projects between different units of the EMBL. Applicants are encouraged to discuss self-defined project ideas with EMBL scientists or to select up to two project ideas available at the EIPOD website.

One of the project ideas listed this year is for a joint project between our group (EMBL-EBI) and the group of Nassos Typas at the EMBL Genome Biology Unit in Heidelberg. Here is a short description of the project idea, entitled "Modeling genotype-to-phenotype relationships in a bacterial cell":
Understanding how phenotypic variability originates from mutations at the level of the DNA is one of the fundamental problems in biology. Sequencing of genomes for multiple individuals along with rich phenotypic profiling data allows us to pose the question of how the sum of mutations in each individual genome results in the observed phenotypic differences. The goal of this project is to develop computational methods to predict the consequences of mutations and gene-content variation on fitness in different conditions for different strains of E. coli.
The Typas group develops high-throughput approaches to study gene function via chemical-genetics and genetic-interaction screening. Previous publications and current research interests are listed on the group webpage. Our group is generally interested in studying the evolution of cellular interaction networks and, in this context, in understanding how mutations and gene-content variation result in phenotypic consequences for different individuals.

Potential applicants are encouraged to get in touch to discuss a project proposal that relates to this topic. We are particularly keen on applicants with previous experience in any of the following: chemical-informatics, chemical-biology, protein and genome evolution, sequence/structural based prediction of effect of mutations, bacterial pan-genome studies.

Saturday, June 08, 2013

Doing away with scientific journals

I got into a bit of an argument with Björn Brembs on Twitter last week because of a statement I made in support of professional editors. I was mostly saying that professional editors are no worse than academic editors, but our discussion turned mostly to the general usefulness of scientific journals. Björn was arguing his position that journal rankings, in the form of the well-known impact factor, are absolutely useless. I was trying to argue that (unfortunately) we still need journals to act as filters. Having a discussion on Twitter is painful, so I am giving my arguments some space in this blog post.

Björn's arguments are based on a recently published review regarding the value of journal ranking (see paper and his blog post). The one-line summary would be:
"Journal rank (as measured by impact factor, IF) is so weakly correlated with the available metrics for utility/quality/impact that it is practically useless as an evaluation signal (even if some of these measures become statistically significant)."
I covered some of my arguments regarding the need for journals as filters here and here. In essence, I think we need some way to filter through the continuous stream of scientific literature and the *only* current filter we have available is the journal system. So let's break this argument into parts. Is it true that we need filters, that journals work as filters, and that there are no working alternatives?

We need filters

I hope that few people will try to argue that we have no need for filters in scientific publishing. On Pubmed there are 87,551 abstract entries for May, which is getting close to 2 papers per minute. It is easy to see that the rate of publishing is not going down any time soon. All current incentives on the author and publishing side will keep pushing this rate up. One single unfiltered feed of papers would not work and it is clear we need some way to sort out what to read. The most immediate way to sort would be by topic. Assuming authors would play nice and not try to tag their papers as broadly as possible (yeah right), this would still not solve our problem. For the topics that are very close to what I work on I already have feeds with fairly broad Pubmed queries that I go through myself. For topics that might be one or several steps removed from my area of work I still want to be updated on method developments and discoveries that could have an impact on what I am doing. I already spend an average of 1 to 2 hours a day scanning abstracts, and I don't want to increase that.

Journals as filters

If you follow me this far then you might agree that we need filtering processes that go beyond simple topic tagging. Without even considering journal "ranking", journals already do more than topic tagging, since journals are also communities that form around areas of research. To give a concrete example, both Bioinformatics and PLOS Computational Biology publish papers in bioinformatics, but while the former tends to publish more methods papers the latter tends to publish more biological discoveries. Subjectively, I tend to prefer the papers published in the PLOS journal due to its community, and that has nothing to do with perceived impact.

What about impact factors and journal ranking? In reviewing the literature, Björn concludes that there is almost no significant association between impact factors and future citations. This is not in agreement with my own subjective evaluation of the different journals I pay attention to. To give an example, the average paper in journals of the BMC series is not the same to me as the average paper published in Nature journals. Are there many of you that have a different opinion? Obviously, this could just mean that my subjective perception is biased and incorrect. It would also mean that journal editors are doing a horrible job and that the time they spend evaluating papers is useless. I have worked as an editor for a few months and I can tell you that it is hard work; it is not easy to imagine that it is all useless. In his review, Björn points to, for example, the work by Lozano and colleagues. In that work the authors correlated the impact factor of the journal with the future citations of each paper in a given year. For biomedical journals the coefficient of determination has been around 0.25 since around 1970. Although the correlation between impact factor and future citations is not high (r ~ 0.5), it is certainly highly significant given that they looked at such large numbers (25,569,603 articles for biomed). Still, this also tells us that evaluating the impact/merit of an individual publication by the journal it is published in is prone to error. However, what I want to know is: given that I have to select what to read, do I improve my chances of finding potentially interesting papers by restricting my attention to subsets of papers based on the impact factor?

I tried to get my hands on the data used by Lozano and colleagues but unfortunately they could not give me the dataset they used. Over email, Lozano said I would have to pay Thomson Reuters on the order of $250,000 for access (not so reproducible research). I wanted to test the enrichment over random of highly versus lowly cited papers in relation to impact factors. After a few more emails Lozano pointed me to this other paper where they calculated enrichment for a few journals in their Figure 4, which I am reproducing here under a Don't Sue Me licence. For these journals they calculated the fraction of each journal's papers that are among the top 1% most cited, divided by the fraction expected by chance (1%). This gives you an enrichment over random expectation that for journals like Science/Cell/Nature turns out to be around 40 to 50. So there you go: high impact factor journals, on average, tend to be enriched in papers that will be highly cited in the future.
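To make the statistic concrete, here is a small sketch of how that enrichment over random can be computed. This is not the authors' code, and the journal and citation numbers below are made up for illustration:

```python
def top1_enrichment(journal_citations, all_citations):
    """Fraction of a journal's papers that fall in the global top 1%
    most cited, divided by the 1% expected by chance."""
    n = len(all_citations)
    # citation count needed to make the global top 1%
    cutoff = sorted(all_citations, reverse=True)[max(1, n // 100) - 1]
    frac_top = sum(c >= cutoff for c in journal_citations) / len(journal_citations)
    return frac_top / 0.01

# Made-up example: 1000 papers overall; a journal with 2 of its 4
# papers in the global top 1% has an enrichment of 50 over random.
print(top1_enrichment([995, 991, 100, 50], list(range(1000))))
```

An enrichment of 1 would mean the journal does no better than picking papers at random; the 40 to 50 reported for Science/Cell/Nature is what makes the impact factor usable as a crude filter.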

As an author I hate to be evaluated by the journals I publish in instead of the actual merit of my work. As a reader I admit to my limitations and I need some way to direct my attention to subsets of articles. Both the data and my own subjective evaluation tell me that journal impact factors can be used as a way to enrich for potentially interesting articles.

But there are better ways ... 

Absolutely! The current publishing system is a waste of everyone's time as we submit papers down a ladder of perceived impact. Papers get reviewed multiple times in different journals, reviewers think that articles need to be improved with year-long experiments, and discoveries stay hidden in this reviewing limbo for too long. We can do better than this, but I would argue that the best way to do away with the current journal system is to replace it with something else. Instead of just shouting for the destruction of journal hierarchies and the death of the impact factor, talk about how you are going to replace them. I try out every filtering approach I can find and I will pay for anything that works well and saves me time. Google Scholar has a reasonably good recommendation system and it is great to see people developing applications like the Recently app. PLOS is doing a great job of promoting the use of article-level metrics that might help others to build recommendation systems. There is work to do, but the information and technology for building such recommendation systems are already out there. I might even start using some of my research budget to work on this problem just out of frustration. I have some ideas on how I would go about this but this blog post is already long; if anyone wants to chat about it, drop me a line. At the very least we can all start using preprint servers and put our work out before we bury it for a year in the publishing limbo.
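To show how little machinery a first filtering prototype actually needs, here is a toy content-based recommender. It is entirely hypothetical - just bag-of-words cosine similarity over abstracts, nothing like what Google Scholar actually does - but it illustrates that the building blocks are mundane:

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def recommend(read_abstracts, candidate_abstracts, top_n=3):
    """Rank candidate papers by word overlap with what the user has read."""
    profile = Counter(w for text in read_abstracts for w in text.lower().split())
    scored = [(cosine(profile, Counter(text.lower().split())), text)
              for text in candidate_abstracts]
    return [text for score, text in sorted(scored, reverse=True)[:top_n]]
```

A real system would add citation links, article-level metrics and co-reading signals on top of this, but the point stands: the raw ingredients are freely available.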


  

Monday, May 13, 2013

EBI-Sanger postdoctoral fellowship on Plasmodium kinase regulatory networks

I am happy to announce a call for applications for an EBI-Sanger postdoctoral fellowship to study the kinase regulatory networks of Plasmodium. This is one of four currently open calls in the EBI-Sanger Postdoctoral (ESPOD) Programme and the call closes on the 26th of July. This interdisciplinary programme is meant to foster collaborations between the EBI and the Wellcome Trust Sanger Institute, both at the Genome Campus near Cambridge, UK. Our project is a collaboration between myself (EBI), Jyoti Choudhary (mass-spectrometry group leader at Sanger) and Oliver Billker (group leader at Sanger studying malaria parasites). The postdoctoral fellow will have the opportunity to work at the interface between bioinformatics, mass spectrometry (MS) and Plasmodium biology. A description of the project can be found online (PDF) but, briefly, the objective is to characterize the kinase regulatory network of the malaria parasite by combining quantitative phosphoproteomics with computational analysis. There will be a strong emphasis on the computational analysis of the MS data, so some prior computational experience is a plus. The ideal candidate would have either prior experience in phosphoproteomics with a strong interest in learning the required computational skills, or prior experience with the relevant computational methods and an interest in learning/performing some of the experimental work. Feel free to contact me if you would like more information about the project or the ESPOD fellowship.

Sunday, April 07, 2013

The case for article submission fees

For scientific journal articles the cost of publishing is almost exclusively covered by the articles that are accepted for publication, paid either by the published authors or by the libraries. Advertisement and other items, like the organization of conferences, are probably not a very significant source of income. I don't want to argue here again about the value of publishers and how we should be decoupling the costs of publishing (close to zero) from peer-review, accreditation and filtering. Instead I just want to explore a very obvious form of income that is not used - submission fees. Why don't journals charge all potential authors a fixed cost per submission, even if the article ends up being rejected? I am sure publishers have considered this option and have reached the conclusion that it is not viable. I would like to know why, and maybe someone reading this can give a strong argument against - hopefully someone from the publishing side who has crunched the numbers.

The strongest argument against that I can imagine would be a reduction in submission rates. If only some publishers adopt this fee, authors will send their papers to journals that don't charge for submission. Would the impact be that significant? For journals with high rejection rates this might even be useful, since it would preferentially deter authors who are less confident about the value of their work. For journals with lower rejection rates the impact of the fee would be small, since authors are less concerned about a rejection. Publishers might even benefit from a submission charge in the form of a lock-in effect if they do not charge again when transferring articles between their journals. Publishers already transfer articles and peer-review comments between their journals, and this already functions as a form of lock-in since authors, wishing to avoid another lengthy round of peer-review, will tend to accept. If the submission fee is only charged once, the authors are even more likely to keep their articles within the publisher. Given the current trend of publishers trying to own the full stack of high-to-low rejection rate journals, these lock-in effects are going to be increasingly valuable.
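The arithmetic behind this is simple. A quick sketch, with fee amounts and rejection rates invented purely for illustration:

```python
def revenue_per_submission(submission_fee, apc, rejection_rate):
    """Expected revenue per submitted manuscript: every author pays the
    submission fee, but only accepted papers pay the publication charge."""
    return submission_fee + (1 - rejection_rate) * apc

# Hypothetical selective journal: $1500 APC, 90% rejection rate.
# Without a submission fee it earns about $150 per submission on
# average; adding a $100 submission fee brings that to roughly $250.
print(revenue_per_submission(0, 1500, 0.90))    # APC only
print(revenue_per_submission(100, 1500, 0.90))  # with a submission fee
```

The effect is largest exactly where rejection rates are highest, which is why a submission fee changes the economics of selective journals much more than it does for high-acceptance megajournals.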

The overall benefit would be an increased viability of open access. A submission fee might also accelerate the decoupling of peer-review from the act of publishing: if we get used to paying separately for publishing and for submission/evaluation, we might get used to having these activities performed by different entities. Finally, if it also results in less slicing of work into ever smaller publishable units, we might all benefit.


Update: Anna Sharman sent me a link to one of her blog posts where she covers this topic in much more detail.

Photo adapted from: http://www.flickr.com/photos/drh/2188723772/

Tuesday, April 02, 2013

Benchmark the experimental data not just the integration

There was a paper out today in Molecular Systems Biology with a resource of kinase-substrate interactions obtained from in-vitro kinase assays using protein microarrays. It is clear that there is a significant difference between what a kinase regulates inside a cell and what it could phosphorylate in-vitro given appropriate conditions. In fact, reviewer number 1 in the attached comments (PDF) explains at length why these protein-array based kinase interactions may be problematic. The authors are aware of this and integrate the protein-array data with additional data sources to derive a higher-confidence dataset of kinase interactions. The authors then provide computational and experimental benchmarks of the integrated dataset. What I have an issue with is that the original protein-array data itself is not clearly benchmarked in the paper. How are we to know the contribution of that feature, and of all the hard experimental work behind it, to the final integrated predictor?

A very similar procedure was used in a recent Cell paper where co-complex membership was predicted based on the elution profiles of proteins detected by mass-spectrometry. Here again, the authors do not present benchmarks of the interactions predicted solely from the co-elution data. Instead they integrate it with around 15 other features before evaluating and studying the final result. In this case, the supplementary material does give some indirect indication of the value of the experimental data on its own, by providing the rank of each feature in the predictor.

I don't think these papers are incorrect. In both cases the authors provide an interesting final result, with the integrated set of interactions benchmarked and analysed. However, in both cases, we are left unsure of the value of the experimental data being presented. I don't think it is an unreasonable request: there are many reasons why this information should be clearly presented before additional data-integration steps are applied. At the very least it is important for other groups thinking about setting up similar experimental approaches.
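What I am asking for is also cheap to report. A generic sketch - not taken from either paper, with invented scores and labels - of benchmarking a raw experimental feature and the integrated score against the same gold standard:

```python
def auc(scores, labels):
    """Area under the ROC curve via the rank-sum (Mann-Whitney) formula."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy gold standard: 1 = true interaction, 0 = negative control.
labels      = [1, 1, 1, 0, 0, 0]
raw_feature = [0.9, 0.4, 0.6, 0.5, 0.3, 0.2]   # e.g. raw array signal alone
integrated  = [0.95, 0.7, 0.8, 0.4, 0.2, 0.1]  # after data integration

# On this toy data the raw feature scores ~0.89 and the integrated
# predictor 1.0 - reporting both is what tells readers what the
# experimental data contributed.
print(auc(raw_feature, labels), auc(integrated, labels))
```

Two numbers side by side, computed on the same gold standard, would answer exactly the question these papers leave open.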