Cellular Consequences of Genetic variation

Thursday, October 20, 2016

Group member profile - Marco Galardini

Marco Galardini (webpage, Gscholar, twitter, EMBL-EBI page), a postdoc in the group, is the next member who kindly volunteered to write a group profile page. He is currently one of the few people in the group who are not working directly on protein PTM regulation; instead he is looking more generally at the consequences of mutations on cellular growth phenotypes.

What was the path that brought you to the group? Where are you from and what did you work on before arriving in the group?

I like to think of my career so far as a simulated annealing process, where the temperature parameter is substituted by curiosity. I started by studying applied chemistry in high school; we had to spend lots of time in the lab and we got plenty of opportunities to get our hands dirty with both inorganic and organic chemistry. The latter is probably the reason why I then pursued a bachelor's degree in biotechnology at the University of Florence, with a focus on industrial and environmental processes; during that time I also got interested in microbiology, mostly because of the great diversity and versatility of the bacterial kingdom. When I discovered that the University of Bologna was offering a master's degree in Bioinformatics I jumped into it with great enthusiasm, eventually combining it with the interest in microbiology during an internship at the Nijmegen university.

After a short break as a software developer in a company I started a PhD in Florence, carrying out a comparative genomics study in the nitrogen-fixing plant symbiont Sinorhizobium meliloti (PhD thesis). Since this project combined computational biology, microbiology and the impact on the environment, I can say that it succeeded in combining the various academic interests I had developed during the years. Following the simulated annealing analogy I can say that I sometimes felt like I was in a local optimum. Under the supervision of Marco Bazzicalupo, Emanuele Biondi and Alessio Mengoni (lab page) I was lucky enough to ride the wave of genomics in a moment where getting bacterial genomes was becoming increasingly easy; I was therefore able to describe the interesting functional and evolutionary features of the (relatively) complex genome of S. meliloti, while developing some computational methods on the side.

What are you currently working on?

I'm currently two years into a very exciting project that aims to develop models to predict phenotypes for the Escherichia coli species, in close collaboration with Nassos Typas (EMBL). Bacterial species are known to harbor striking genetic variability between strains, both in the form of point mutations and with respect to their gene content (the so-called pangenome), due to recombination and lateral gene transfer. Understanding how this variability translates to differences in phenotypes has therefore been the focus of this project. This has proven to be both a challenging and valuable experience, as we had to build a strain collection from scratch, phenotype it on different growth conditions and sequence a large fraction of those strains.

For this I owe a great deal of gratitude to various members of the Typas group who have helped me out in running the wet-lab experiments, namely Lucia Herrera and Anja Telzerow. I am now in the process of testing the predictive models, which have shown very promising results, with potential applications to other species, in and outside the bacterial kingdom.

What are some of the areas of research that excite you right now?

Despite the common claim that no great discoveries are made anymore, I think that science is moving faster and getting bigger every day; if we want to be optimistic it should only be a matter of time before this starts to have an impact on our everyday lives. Some examples involving microbiology include real-time tracking of infectious diseases (e.g. WGSA, or NextFlu) and microbial communities as environmental sensors (e.g. Smith et al. mBio 2015). I'm therefore very excited to see how the lag time between a discovery and its application shrinks; there are legitimate concerns of course (e.g. laws not catching up, democratization of new technologies), but I can't help being thrilled about it. I also enjoy reading about how human activities are becoming a new powerful selective pressure in evolution; antibiotic resistance is the best known example, but there are also positive examples like the reports of bacterial species evolving the ability to degrade plastic. This shows that the natural world is still worth exploring and that evolution can also act on very short time-scales.

What sort of things do you like outside of the science?

I used to be quite active in photography, with a preference for analogue media such as black and white films and polaroids; despite not being very active right now, I still pack my camera when going on a short trip. I also have an interest in small DIY projects involving music; I have built some experimental synths running on Arduino, which were used in a band I used to play in. Apart from that, I enjoy reading and watching movies, going to contemporary art exhibitions, and a bit of cycling.

Thursday, October 13, 2016

Phylogenetic history of fungal protein phosphorylation – the anti-press release

I have long been interested in studying the rate by which protein interactions change during evolution. A new chapter in this ongoing research agenda has been published this week (article & perspective) in collaboration with the group of Judit Villén at the University of Washington and many contributions from the labs of Maitreya J. Dunham, Eulàlia de Nadal and Francesc Posas. For the first time I tried to engage with the press by putting out a press-release and it was interesting to work with Mary Todd Bergman at EMBL-EBI to digest the work to its core message. However, to atone for my sins of not being able to give sufficient context and credit to the work that has come before this, I decided that I could use this blog to write a sort of anti-press release. Grab some coffee, get comfy and don’t expect a punchy fast message here because this manuscript has a long and branched root.

Cue flashback …

For me, this started 15 years ago (gasp, I can't believe it has been this long) when Andreas Wagner published some work trying to measure the conservation of protein interactions after gene duplication. This in turn was made possible by the first protein interaction mapping efforts. In my PhD lab I was using conservation to predict interactions for SH3 domains that bind short linear proline rich peptides. Influenced by Andreas Wagner’s papers, “linear-motif” research at EMBL and the field of evolution of gene expression, I hypothesized that domain-peptide interactions could be poorly conserved since they are mediated only by a few residues in a linear unstructured peptide. This idea was first reported in the literature in a perspective by Neduva and Russell, also at the EMBL at the time. I tried to generalize the concept that specificity and evolvability could be related such that very unspecific interactions may be more prone to change during evolution (article, blog post). Other groups have also shown that linear motif interactions can be fast evolving (e.g. Chica et al., Edwards et al.).

Mass spectrometry to the rescue

The problem with trying to compare protein interactions is that you need to measure them first. The domain-peptide interactions mediated by linear motifs are particularly hard to identify because they are usually of low affinity. So, the work described above was based on predicted interface sites for linear motifs. At this point, improvements in mass spectrometry and enrichment strategies really made a difference. The identification of protein phosphorylation sites made it possible to find, at large scale, thousands of sites that represent high-confidence interaction sites. The back-story that resulted in these developments in MS is a story our collaborator Judit Villén has been a part of and that I can’t tell as well.

Kinase-target interactions are also linear motif interactions and, if the previous linear motif research was correct, the phosphosites that represent these interactions should be rapidly evolving. That was exactly what I ended up testing when I started my postdoc. We were just one of several groups working on it and in 2009 several papers got published on the topic including our work (Beltrao et al., blog post) and others (Landry et al., Tan et al., Holt et al., Amoutzias et al.). All of these together made a really strong case for the fast divergence of protein phosphorylation, although other articles followed to also note the constraints (Nguyen Ba & Moses and Gray & Kumar). At this point the conversation was also shifting to the consequences of these evolutionary changes. Mirroring similar discussions around the consequences of changes in gene-expression there was a sense that some of these phosphosites, and therefore kinase-substrate interactions, do not play a functional role (Gustav E. Lienhard 2008, Landry et al. 2009). I also tried to contribute to the debate on functional relevance by trying to assign functions to PTM sites computationally and extending the conservation analysis to other PTMs (Beltrao et al. 2013, blog post).

What was left to find then?

Most of the studies mentioned so far have relied on pairwise species comparisons. What we tried to do in this more recent study was to obtain a phylogenetic history of protein phosphorylation across a very broad phylogeny. For this, Judit’s lab obtained phosphorylation data for 18 fungal species that shared a common ancestor hundreds of millions of years ago. Romain Studer in our group then tried to combine the phosphorylation observations, which are known to be incomplete, with sequence based predictions of phosphorylation potential and the species phylogenetic tree. This allowed us to predict a likely evolutionary history for thousands of phosphosites.
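To give a flavour of what inferring a site's history on a tree means, here is a minimal sketch using plain Fitch parsimony for a single site. This is not the method used in the study, which also had to account for incomplete observations and sequence-based predictions; the tree, species names and observations below are made up for illustration.

```python
# Toy sketch: Fitch parsimony for one phosphosite on a hypothetical four-species tree.
# Species names, tree shape and observations are invented for illustration only.

def fitch(tree, states):
    """Return (candidate ancestral states at this node, minimum changes below it).

    `tree` is either a leaf name (str) or a (left, right) tuple.
    `states` maps leaf name -> True/False (site observed as phosphorylated or not).
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    shared = left_set & right_set
    if shared:
        # Children agree on at least one state: no extra change needed here.
        return shared, left_cost + right_cost
    # Children disagree: keep both options and count one change.
    return left_set | right_set, left_cost + right_cost + 1

toy_tree = (("sp_A", "sp_B"), ("sp_C", "sp_D"))
observed = {"sp_A": True, "sp_B": True, "sp_C": False, "sp_D": False}
root_states, n_changes = fitch(toy_tree, observed)
print(root_states, n_changes)
```

In this toy case the root state is ambiguous and a single gain or loss explains the data, which is the kind of ambiguity that additional evidence (such as predicted phosphorylation potential) can help resolve.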

If you happen to have kept up with the literature that I mentioned above then you might expect some of the findings we observed next - most phosphosites are recent acquisitions and the small fraction of ancient phosphosites is enriched in functionally relevant sites. From the ancient sites we tested a few cases for fitness and functional consequences and we think these serve as a great resource for future cell signaling studies (and yes we are chasing that). Given the breadth of species we studied we could also measure the changes in phosphorylation “motifs” that are found across species. Kinases recognize their target sites, in part, by the sequences around the phospho-acceptor residue, so-called kinase target motifs. We could observe that the types of target motifs used across species showed changes that we think relate to changes in the types of kinases or their activities. We are now interested in better understanding what determines kinase specificity so that we can study their evolution - what did the first protein kinase look like?

So who the hell cares?

Many of the methods we are working on are useful to better understand the impact of mutations related to these signaling circuits in cancer or other diseases. We are working on this too but I care about this because I want to know how nature comes up with all these beautiful diverse mechanisms and forms. Coming up with a history of how these phosphosites have been changing across species is really just the first step. We have almost no clue as to what the thousands of observed phosphosites are doing, if anything. Are the signaling pathways changing in a neutral way that conserves the functional outcomes?

On a personal note, it is fantastic to be able to connect this work to things that I did all the way back to my first PhD paper, and to connect this blog post to a chain of several other blog posts covering the research I have done and that our research group is doing now.

Friday, June 17, 2016

Group member profile - Romain Studer

Next up in this series of group member profiles is Romain Studer (blog, scholar profile, twitter), a postdoc in the group who is very interested in protein evolution, combining sequence and structural information.

What was the path that brought you to the group? Where are you from and what did you work on before arriving in the group?

My main interest in biology is the study of proteins in a broad diversity of organisms. My PhD work, as well as my postdoctoral research, was focused on protein evolution, at the primary sequence level and at the tertiary structure level.

I did my undergraduate studies and PhD work at University of Lausanne, Switzerland. My undergraduate studies were focused on immunology and biochemistry, with a dash of bioinformatics. My PhD research, with Prof. Marc Robinson-Rechavi, was more on evolution and mainly focused on the comparison between paralogs (i.e. genes that diverged after a duplication event) and orthologs (i.e. genes that diverged after a speciation event). Positive selection can be used as a mechanism to fix advantageous mutations between paralogs, as well as between orthologous genes. The conclusion of my analyses was threefold: (1) positive selection affects diverse phylogenetic branches and diverse gene categories during vertebrate evolution; (2) positive selection concerns only a small proportion of sites (1%-5%); and (3) whole genome duplication had no detectable impact on the prevalence of this positive selection (Studer RA et al. 2008, Studer RA et al 2010).

After my PhD, I stayed a few months in Lausanne to work with Prof. Bernard C. Rossier to explore the evolution of sodium pumps and channels, involved in the regulation of blood pressure. I found that the sequential emergence of the different subunits of these proteins could be directly linked to the emergence of multicellularity in animals (Studer RA et al. 2011; Rossier BC et al. 2015).

In 2010, I then obtained two successive fellowship grants from the SNSF to move to the UK. I worked in the group of Prof. Christine Orengo, where I explored in more detail the influence of structure on protein evolution. I contributed to the evolutionary aspect of the CATH database, a classification of protein domains. I also explored the evolution of RubisCO, the enzyme responsible for photosynthesis. I reconstructed the ancestral 3D structures of RubisCO and estimated the stability effect (ΔΔG) of mutations during evolution. The essential conclusion of this work was that mutations providing an increase in catalytic rate tend to be destabilising, but are rapidly followed by stabilising mutations during the course of evolution (Studer et al. 2014).

My SNSF funding finished by the end of Summer 2013 and I then started to work as a senior postdoctoral fellow with Pedro Beltrao at the European Bioinformatics Institute (EMBL-EBI).

What are you currently working on?

My current project is to estimate the level of conservation of posttranslational modifications (PTMs) in proteins, in particular phosphorylation. Phosphorylation is an important mechanism to quickly regulate protein function. Combining phylogenomics methods and experimental phosphoproteomics data, I am evaluating the replacement rate of phosphorylated residues during the evolution of multiple yeast species. I found that (1) most phosphosites are quite recent, (2) ancient phosphosites are very likely to be important for function and (3) motif preferences have diverged across species.

What are some of the areas of research that excite you right now?

One interesting field is the application of experimental analyses to ancestral characters, such as ancestral amino acid mutations, phosphorylation states or whole ancestral proteins. Evolutionary frameworks allow the prediction of ancestral sequences with good accuracy. Such sequences can then be modelled in 3D by homology modelling, or can even be resurrected in vitro by protein synthesis. These ancient proteins can be subjected to the same analyses as their modern counterparts to explore the differences over time. This framework has the potential to reveal important properties.

Monday, January 11, 2016

State of lab, year 3 - the first group outcomes

Lab poster made by Omar for the EMBL lab day

This is the third blog post of what I hope will be a very long series. Even in just three years it is fun to go back and read the past yearly entries (year 1 and year 2). I am sure I will enjoy reading back over 5 and 10 of these yearly reports. This report marks the end of the third year of the lab. I have to stop thinking of how quickly a year goes by. We will have a review in March 2017 that will likely dictate our extension after the first 5 years and, if extended, the group then has an additional 4 years before having to leave the EMBL-EBI (after a maximum of 9 years).

During the third year we said goodbye to Juan A Cordero Varela (master student, linkedIn). Marta Strumillo, who was doing an internship, stayed on to do her PhD in the group. Towards the very end of last year we were joined by two additional postdoctoral fellows, Bede Busby and Cristina Vieitez. As I had mentioned last year they will be working at the Genome Biology unit in Heidelberg in a close partnership with Nassos Typas' lab. Bede and Cristina are setting up yeast genetics methods to study protein modifications. This year I also started a blog series on our group members and I will try to get everyone to participate.

Group size and grant applications

At least for one year I have let myself apply for fewer funding opportunities. The group now has 12 members with one additional person joining this March. I am not sure what the best strategy is to manage the size of a group. Most grants and fellowships have very low success rates (10% to 30%) and if the objective is to maintain a specific group size then one would have to be very lucky to get just enough funding to stay at steady-state. I suspect that many group leaders just keep applying to all available funding and let the group size increase and collapse according to the success of the applications. I would be curious to hear from others what their thoughts are on this. My current impression is that somewhere between 5 and 15 people is a manageable and efficient group size, but does anyone limit growth to stabilize group size?
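The "very lucky" part can be made concrete with a small back-of-the-envelope calculation. The sketch below assumes, purely for illustration, that applications are independent and share the same success rate, so the number of funded applications follows a binomial distribution; the numbers (10 applications, 20% success rate, 2 awards needed) are hypothetical.

```python
from math import comb

def prob_exact(n_applications, success_rate, n_needed):
    """Binomial probability of exactly `n_needed` successes out of `n_applications`.

    Assumes independent applications with an identical success rate,
    which is a simplification made for this illustration.
    """
    return (comb(n_applications, n_needed)
            * success_rate ** n_needed
            * (1 - success_rate) ** (n_applications - n_needed))

# Hypothetical scenario: 10 applications at a 20% success rate, needing exactly 2 awards.
p = prob_exact(10, 0.2, 2)
print(f"P(exactly 2 of 10 funded) = {p:.3f}")
```

Under these assumptions, hitting the exact target happens only about 30% of the time, so most of the time the group would either overshoot or undershoot the intended size.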

To be, or not to be, an experimental group (revisited)

Cristina and Bede at the visitor lab space in EMBL-EBI

As described in the first year report, we don't have lab space at the EMBL-EBI. To be able to have access to lab space my initial solution was to co-supervise group members with experimental groups. This has been useful, particularly in creating closer collaborations with some of the groups involved. Haruna and Brandon have worked with Jyoti Choudhary to have access to mass-spec instruments. Sheriff has been working in London in the lab of Silvia Santos where he has contributed to some microscopy experiments and Marco spent some time in Nassos Typas' lab learning how to do chemical genetic screens. In all of these projects the group members are spending >50% of the time analysing the data. Bede and Cristina will be the first group members that will be primarily dedicated to experimental work, although I am sure they will also have an opportunity to further develop their computational skills. So far, these arrangements have been working out scientifically. However, I am now sure that, when I move out of the EMBL-EBI, I will aim to have access to lab space.

Projects as science, stories and publishable units

As I had mentioned in the second year report, I am no longer working on a research project myself. I had two periods of time last year where I emptied my to-do list but it didn't stay down long enough to be able to pick up a project. I am more at ease with the management role in the sense that I have convinced myself that it is actual work. It took me a while not to feel guilty about just doing management tasks. It is actually great to be able to help guide the flow of the projects of all of the lab members. From the inception, through the initial stumbles, turns in direction, building up the promising results, up until there is enough progress to be worth communicating it. This also means deciding to quit an idea when the research direction is no longer promising. In this process of managing a large set of projects I have felt a very clear temptation to focus on the publishable units as the outcomes. Although science is nothing if not communicated there is a risk of losing track of the priority of moving science forward. Asking questions and gathering evidence happens always in a scientific context. This context or story is also important for properly communicating your results to others. The problem is when the focus shifts too much into thinking about what are the experiments that are needed to write a paper instead of what are the best experiments to answer the scientific question at hand. These two things are hopefully aligned but the publishable unit should not be the goal in itself.

The first group outcomes

In the past year I finally managed to publish the last papers still involving my postdoctoral lab. The two articles reflect the two strands of research in our group. One paper describes a set of phosphorylation sites collected for X. laevis and an analysis of its conservation and structural features. We found that the degree of conservation of phosphosites and putative kinase-protein interactions is predictive of functionally relevant sites and interactions. We also describe a potential way to identify PTM sites that may control protein conformations. The second article is a large effort to identify conditional genetic interactions in S. cerevisiae. The main message of that work was that there is a substantial amount of genetic interactions that are condition specific. These conditional genetic data allowed us to identify novel roles for yeast genes in the cell wall integrity pathway. Besides these studies we also published the first articles from work that was started within the group. I mentioned before Omar's method to predict kinase specificity from interaction networks. In addition to this we also published a news and views article highlighting recent work from Stelzl's lab and a review on the feasibility of using rational design strategies to create novel PTM regulatory sites in proteins of interest. I was anxious about the time it was taking to get the group to this point. Three years to have research outputs coming from the group feels slow but when talking with others it is apparently not unusual.

Preprints and open science

We have two additional manuscripts that are now making their way through journals: David's project on a map of human signalling states based on conditional phosphoproteomics data and Romain's phylogeny-based analysis of fungal phosphorylation sites. I am personally very much in favour of preprint servers. Although I think I have been ahead of others in suggesting the use of preprints in biology (blog post 2006) I have been slow to actually do it. My current policy in the lab is to first ask the authors in the group if they want to submit and then make sure all collaborators are ok with it. Unfortunately, so far, there was no consensus among the authors. I will start to push more strongly for future manuscripts to be submitted to preprint servers. When possible, we will also experiment with making a project's data and initial analysis available online before the preprints.

Tuesday, December 15, 2015

Group member profile - Haruna Imamura

Here is the second entry into what I hope will be a very long series where I introduce our lab's members. Next up is Haruna Imamura (pubmed), an interdisciplinary postdoc with experience in mass-spectrometry and informatics.

What was the path that brought you to the group? Where are you from and what did you work on before arriving in the group?

I first joined the biological network analysis group in my undergraduate course in the lab of Masaru Tomita at Keio University (Japan). I launched a project, which applied the concept of network analysis to a dataset of phosphorylation dynamics. Because of this experience, I grew increasingly interested in resolving the biological importance of phosphorylation in the context of signal transduction and began to study phosphoproteomics. From my master’s course, I joined the proteome group led by Yasushi Ishihama, in the same university, and learned proteomics-related experimental skills, including phosphorylation enrichment and mass spectrometry (MS) manipulation. As Prof. Ishihama moved to Kyoto University (Japan), I also moved and started my PhD course there. My PhD project was to determine the protein kinase selectivity towards their substrates (Imamura et al. 2014, Imami et al. 2012, Imamura et al. 2012) . We analysed lysates after in vitro kinase reactions and identified phosphorylation sites with MS to obtain kinase/substrate relationships in a high-throughput manner. The information obtained in the study would allow connecting already accumulated phosphorylation data to kinases.

As MS has been improved dramatically, nowadays there are more research studies coming up with a long list of identified phosphopeptides. However, it is revealing that only a small fraction of modification sites seem to have an important function in biological systems. So the next challenge in this field is mining functionally important phosphorylation among the pool of ‘junk’ phosphorylation. In this context, I mainly had three wishes for my post-doc project: (1) to be able to contribute through proteome experience, (2) to learn more about informatics, and (3) to reveal important phosphorylation in biological systems. I found Pedro’s group to be a great environment for it, and I asked him for position availability. Fortunately, there was a project that matched my background, and here I am.

What are you currently working on?
I am working on a project to study how phosphorylation in host cells is changed by Salmonella infection. Salmonella is a facultative intracellular pathogen that is one cause of diarrhoea in humans. The process of infection is like a series of offensive and defensive battles between Salmonella and the host cells. Salmonella tries to hijack and utilise the host’s cellular system for its proliferation, while the host cells try to eliminate it by activating an immune response. Among various changes happening in the cells, post-translational modifications, including phosphorylation, play important roles.

We use Salmonella enterica serovar Typhimurium as a model system and study host cell-lines that have been infected in a time-course. Their phosphoproteomes are analysed using MS, and the experimental dataset is combined with other publicly available information to find key regulatory events in Salmonella infection. I am an EIPOD fellow, which is a programme from the EMBL Interdisciplinary Postdocs (EIPOD) initiative, promoting interdisciplinary research at EMBL. This work is a collaboration with the Typas lab in EMBL-Heidelberg, which has expertise in microbiology and genetic interactions. The MS analysis has been done with the help of the Proteome core facility led by Jyoti Choudhary at the Sanger Institute.

What are some of the areas of research that excite you right now?
With the current technology, phosphoproteome analysis with MS still requires a group of cells. It means the outcome is averaged across a variety of cell populations. So I am interested in projects attempting single-cell whole-proteome (or even phosphoproteome) analysis. Cellular imaging also interests me, as it would be a complementary technology to MS. For example, mass cytometry could capture and quantify phosphorylation at single-cell resolution in a systematic way, which enables the study of phosphorylation signalling in intercellular communication.

Besides, out of curiosity, I am interested in research which raises doubt regarding ‘self-consciousness’. For example, at the molecular scale, ‘behavioural epigenetics’, which describes how nurture shapes nature, is one of the attractive topics for me. Also, the ‘gut-brain axis’ is gaining more attention, as it has been shown that the gut microbiota communicate with the central nervous system and influence the brain. How true is ‘you are what you eat’? Finally, at the macro scale, one of my favourite videos from TED (Suicidal crickets, zombie roaches and other parasite tales) talks about some surprising instances where parasites control the host brain and can change its behaviour.

What sort of things do you like outside science?
I have been enjoying horse riding for about a year. I had wanted to do it ever since I was in Japan, and the environment here inspired me to start. The stable is about 15 mins by bike from the institute, so I can go there for a class once a week after work. It is good exercise and riding horses is relaxing. Also, it is fun to talk with people there who love horses and to learn hands-on biology. I am trying to build a better relationship with the horses, who have a variety of personalities. In daily life, I usually go to the gym for running. It has become a routine in my life since I began as a PhD student. It helps me to clear my mind and gives my brain a chance to refresh. Running a marathon is one of the things on my bucket list, but I have to put more effort in to achieve it.

Monday, December 14, 2015

Replace journals with recommendation engines

There was another round of interesting discussions on twitter after Mike Eisen decided to scrub all journal titles from his lab's publication list. Part of the discussion was summarized in this Nature news story. The general idea is that our science should not be evaluated by where it is published but should stand on its own merit. We all want this to happen but unfortunately we don't have infinite time to read papers. Megajournal and open access advocates often dismiss this problem. They will often say that journal rankings are not adequate filters and that we should be able to make our own opinions and to search for whatever we want to read. This is the line of argument that just drives me crazy. It is basically implying that any defence of journal rankings is an admission of inability to evaluate science. The biomedical scientific community is producing over 100 thousand articles every month. Any suggestion that we don't need some sort of filtering mechanism is in turn an admission that you are not aware of the extent of science that is being produced. If you are not scanning tables of contents yourself, you are being fed suggestions by someone who does.

Imagine a world without any science journals. Just a single pot where all articles are deposited. I think that the spread of knowledge would slow down. I can barely keep track of advances and authors that are closely related to my work by using keyword searches. I would not think one day to just search for "clustered regularly interspaced short palindromic repeats" or to have a curious look into advances in cryoEM. In the absence of good filters we would risk becoming even more isolated in our small little corners of science and miss out on cross-fertilization. We would tend to focus even more on the science of a few labs that we knew from past works or from personal contact. I would not know where to look for important new discoveries in other fields that could impact my own. The current system of journals serves this role of trying to assign a piece of science to a target audience. If nothing else, journals can filter through self-selection of topics at submission for specific communities. Less specific journals try to promote the advances in science that should reach a broader audience. I think that we are not even aware of how much the current system of journals facilitates the exchange of information within and across fields. In my opinion, the best way and probably the only way to get rid of the current system is to replace it with something that can do the equivalent job.

One way to replace the current system with something less frustrating would be to use automated recommendation engines. I have tried Google Scholar recommendations and Pubchase and both work really well. If we want to get rid of journals we need to figure out a way for such automated systems to mimic the journal's transfer of knowledge within and across communities. I can easily imagine the steps needed to come up with article similarity metrics and clustering of users and so on. One can also easily imagine that the recommendation engines can react to user feedback such that a niche community will "bump up" - for example by click-through counts - the perceived value of a piece of science to such an extent that it gets recommended to a wider community. This would require a hierarchical recommendation engine that is widely used. The biggest advantage of such a system would be that it can work post publication on top of megajournals. Scientists could stop focusing their energy on submitting to journal X and just focus on producing good science that would spread widely. I am convinced that the fastest way to get to a world without journals is to come up with this replacement. If we really want to get rid of impact factors and journal rankings we need to start talking about what we will do instead.
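To make the idea a bit more concrete, here is a minimal sketch in Python of how an article similarity metric could be combined with a community "bump". Everything in it is an assumption made for illustration: the bag-of-words representation, the `recommend` function, the `click_weight` parameter and the toy article ids are all hypothetical, and this is not how Google Scholar or Pubchase work.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Lowercased word counts; a crude stand-in for a real text representation."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recommend(read_texts, candidates, clicks, click_weight=0.1):
    """Rank candidate articles for one reader (hypothetical scoring scheme).

    read_texts: abstracts the reader has already liked
    candidates: dict of article id -> abstract
    clicks: dict of article id -> click-through count from any community
    """
    # The reader profile is simply the sum of the word counts of what they read.
    profile = Counter()
    for text in read_texts:
        profile.update(bag_of_words(text))

    scores = {}
    for article_id, text in candidates.items():
        similarity = cosine(profile, bag_of_words(text))
        # Log damping: heavy community feedback can lift an article into
        # the lists of readers outside its niche without swamping similarity.
        bump = click_weight * math.log1p(clicks.get(article_id, 0))
        scores[article_id] = similarity + bump
    return sorted(scores, key=scores.get, reverse=True)


read = ["kinase phosphorylation signaling"]
candidates = {
    "a": "kinase phosphorylation networks",
    "b": "cryoEM structure of ribosome",
    "c": "crispr repeats in bacteria",
}
print(recommend(read, candidates, clicks={}))
print(recommend(read, candidates, clicks={"c": 1000}))
```

With no feedback the reader only sees the article closest to their own field. Once another community clicks heavily on article "c", it climbs the ranking even though it shares no words with the reader's history, which is the cross-field transfer that journals provide today.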

One thing we won't be able to change - we don't have enough time to read all of the science in the world. Unfortunately we don't even have enough time to read all of the articles of job applicants. It is not hard to predict that any other solution that replaces journal rankings will too often be used to make hiring decisions.

Friday, November 27, 2015

Predicting PTM specificities from MS data and interaction networks

Around four years ago I wrote this blog post where I suggested that it might be possible to combine protein interaction data with phosphosites from mass-spectrometry (MS) data to infer the specificity of protein kinases. I did a very simple pilot test and invited others to contribute to the idea. Nobody really picked up on it until Omar Wagih, a PhD student in the group, decided to test the limits of the approach. To his credit, I didn't even ask him to do it; his main project was supposed to be on individual genomics. I am glad that he deviated long enough to get some interesting results that have now been published.

As I described four years ago, the main inspiration for this project was the work of Neduva and colleagues. They showed that motif enrichment applied to the interaction partners of peptide binding domains can reveal the binding specificity of the domain. One step of their method was to filter out regions of proteins that were unlikely to be target sequences before doing motif identification. For PTM enzymes or binding domains we should be able to take advantage of the MS derived PTM data to select the peptides for motif identification by just taking the peptide sequences around the PTM sites. This was exactly what Omar set out to do by focusing on human kinases as a test case.
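The peptide selection step can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Omar's actual code: the function names, the 7-residue flank, the `_` padding character and the toy protein identifiers and sequences are all assumptions.

```python
def site_windows(sequence, positions, flank=7, pad="_"):
    """Return peptides centred on each 1-based site position.

    Sites close to the protein ends are padded so that every
    window has the same length (2 * flank + 1).
    """
    padded = pad * flank + sequence + pad * flank
    # Site p sits at index p - 1 + flank in the padded string,
    # so the window starts at index p - 1.
    return [padded[p - 1: p + 2 * flank] for p in positions]


def candidate_peptides(kinase, interactions, sequences, phosphosites, flank=7):
    """Collect phosphosite windows from the interaction partners of a kinase.

    interactions: dict of protein id -> list of interacting protein ids
    sequences:    dict of protein id -> amino-acid sequence
    phosphosites: dict of protein id -> list of 1-based modified positions
    """
    peptides = []
    for partner in interactions.get(kinase, []):
        sequence = sequences.get(partner)
        if sequence is None:
            continue  # no sequence available for this partner
        sites = phosphosites.get(partner, [])
        peptides.extend(site_windows(sequence, sites, flank))
    return peptides


# Toy data with made-up identifiers and sequences.
interactions = {"KIN1": ["SUB1", "SUB2"]}
sequences = {"SUB1": "MKRSPSLQ", "SUB2": "AARRASPLG"}
phosphosites = {"SUB1": [4], "SUB2": [6]}

print(candidate_peptides("KIN1", interactions, sequences, phosphosites, flank=2))
```

The resulting list of equal-length peptides is what would then be handed to a motif enrichment tool, together with a suitable background set.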

To summarize the outcome of this project, the method works with some limitations. For around a third of human kinases that could be benchmarked he got very good predictions (AUC>0.7). For some kinase families the predictions are better than others and we think it is due to how specific the kinase is for the residues around the target site. It is known that kinases find their targets via multiple mechanisms (e.g. docking sites, shared interactions, co-localization, etc). This specificity prediction approach will work better for kinases that find their targets mostly by recognizing amino-acids near the phosphosite. With the help of Naoyuki Sugiyama in Yasushi Ishihama's lab we validated the specificity predictions for 4 understudied human kinases. One advantage of using this approach is that it could be very general. Omar tried it also on 14-3-3 domains, which bind phosphosites, and also on a bromodomain containing protein that is known to bind acetylated peptides. Finally, we also tried to use this to compare kinase specificity between human and mouse but given the current limitations of the method I don't think it is possible to use these predictions alone to find divergent cases of specificity.
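For readers less familiar with this kind of benchmark, the sketch below shows one way a predicted specificity could be scored against known target sites and summarized as an AUC. It is a simplified illustration and not the published procedure: the log-odds matrix against a uniform background, the pseudocount and the toy peptides are all assumptions.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(peptides, pseudocount=1.0):
    """Per-position log-odds scores against a uniform background (an assumption)."""
    background = 1.0 / len(AMINO_ACIDS)
    pwm = []
    for i in range(len(peptides[0])):
        counts = Counter(p[i] for p in peptides)
        # Only standard residues are counted, so padding characters are ignored.
        total = sum(counts[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
        pwm.append({
            aa: math.log(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return pwm


def score(pwm, peptide):
    """Sum of per-position log-odds; unknown characters contribute zero."""
    return sum(column.get(aa, 0.0) for column, aa in zip(pwm, peptide))


def auc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Toy example: peptides behind a predicted motif, then known targets vs. other sites.
predicted = ["RRASP", "RRLSP", "KRASP"]
known_targets = ["RRGSP"]
other_sites = ["GGASG", "DEAEE"]

pwm = build_pwm(predicted)
print(auc([score(pwm, p) for p in known_targets],
          [score(pwm, p) for p in other_sites]))
```

An AUC of 0.5 means the prediction separates known targets from other sites no better than chance, which is why a value above 0.7 is treated as a good prediction here.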

The predictions for human kinase specificity can be found here and a tutorial on how to repeat these predictions is here. The motif enrichment was done using the motif-x algorithm. Given that we could not really use the web version, Omar implemented the algorithm in R and a package is available here.

There are many other ways to predict specificities for PTM enzymes and binding domains. If you have many known target sites the best way is to train a predictor such as Netphorest or GPS. There is also the possibility of using the known target sites in conjunction with structural data to infer rules about specificity and the specificity determining residues. A great example of this is Predikin and more recently KINspect. Ongoing work in the group now aims to combine what Omar did with some aspects of Predikin to study the evolution of kinase specificity.

Going back to the beginning of the post, this idea was my second attempt at an open science project. The first attempt was a project on the evolution and function of protein phosphorylation (described here). This ended up being one of the main projects of my postdoc and now the main focus of the group. I am still curious to know if distributed open science projects will ever take off. I don't mean big project consortia but smaller scale research where several people could easily contribute with their expertise almost as "spare cycles". Often when you are an expert in some analysis or method you could easily add a contribution with little effort. However, there was much more excitement about open science a few years ago whereas now most of the discussions have shifted to pre-prints and doing away with the traditional publishing system. Maybe we just don't have time to pay attention or to contribute to such open projects.

Thursday, August 13, 2015

EBPOD postdoctoral fellowship to study mutational properties in human cancers

Applications are open for the EMBL-EBI / Cambridge Computational Biomedical Postdoctoral Fellowships (EBPOD programme). This program, now in its second year, aims to foster collaborations between the EMBL-EBI, the NIHR Cambridge Biomedical Research Centre (BRC) and the University of Cambridge’s School of the Biological Sciences (SBS). Every year, groups from these institutions devise potential collaborative research areas and a set of project ideas is put forward. This year there are 8 projects to which applicants can apply. The deadline is the 3rd of September and applications should be sent via the Cambridge jobs website (http://www.jobs.cam.ac.uk/job/7770/).

This year our group is teaming up with Martin Miller's and Pippa Corrie's groups to study the mutational properties that associate with anti-tumour immune response and immunotherapy in human cancers. The full project description is available here. We are looking for applicants with a background in bioinformatics and an interest in genomics, DNA and protein evolution, sequence analysis and cancer biology. Extensive past experience (PhD) in bioinformatics is required.

We welcome any queries regarding the project, including potential other directions that relate to the theme of the described project.

Wednesday, April 08, 2015

Positions available to study the functional relevance of protein phosphorylation

Photo by leg0fenris. Disclaimer: this photo should not be taken as implicit support for the actions of the empire

In the past few years, thanks to advances in mass-spectrometry, tens to hundreds of thousands of phosphorylation sites have been discovered across different species. However, even for very well studied model organisms like yeast we know the function of only a very small number of these. Along with other groups, we have shown that these modifications can diverge quickly (Landry, Beltrao, Tan) leading to the hypothesis that some of these phosphorylation sites might even serve no purpose in extant species. Given these evolutionary observations and the large number of sites that are now routinely identified per study, how do we go about identifying which ones are indeed functionally relevant ? In what environmental contexts ? How many might be "non-functional" ? If these questions sound interesting then we have two posts (postdoc and technician) currently open to develop genetic approaches that we think are going to be important to answer these questions. The work will be conducted at the EMBL Genome Biology unit in Heidelberg (Germany) in collaboration with the Typas lab.

Answering these questions will take a combination of different approaches ranging from proteomics to genetics and bioinformatics. These positions, although focused on the genetics aspects, will offer the possibility to explore and learn from the other areas of expertise. The deadline for application is the 17th of May. Additional information about our group can also be seen at the EBI webpage and we welcome informal questions about the project and positions by email.

Friday, March 27, 2015

Scientific Reports partners with Research Square to unbundle peer-review

Apparently, Scientific Reports has sent out emails to their editorial board members about an upcoming trial for a new peer-review track. The email is available at this link for now. According to the email: "a selection of authors submitting a biology manuscript to Scientific Reports will be able to opt-in to a fast-track peer-review service at an additional cost".

They claim that this service will speed up the response time and they commit to have the editorial decision and peer-review comments within 3 weeks of submission. To do this, they will partner with Research Square, which offers a third-party peer-review system called Rubriq. This service has been previously covered in the news by The Economist and Nature. Rubriq appears to be essentially a web-based reviewing platform. Scientists can register on the system and get matched to submitted articles. Reviewing is paid on a per-article basis. The process is described well on their website and they have an example report online.

Some of the reactions to this partnership have been negative. See for example this twitter thread. One of the negative comments appears to be that Scientific Reports is trying to sell a fast track on top of their existing peer-review track. I honestly think this was just a bad PR move and the wrong focus on their email to editors. I assume Scientific Reports is working like PLOS ONE with many academic editors and academic reviewers that are not paid at all. They still need some editorial staff to make sure the papers move along the process which is what probably costs them money. Rubriq is currently charging $500-$650 per article although I assume they might have some cheaper deal with Scientific Reports for this trial. If this trial works out I can imagine that Scientific Reports could cut down on costs per paper significantly but in the long run unbundled peer review would probably actually hurt them. If peer-reviewing is external and editorial decisions are based on scientific soundness then the journal becomes just a branded specialized blogging platform.

With the bad PR paid "fast-track" notion out of the way I think that the best discussion is really about the merit of unbundling peer review. What I don't like about it is that it sounds like Amazon Turk for peer review. I don't review articles because I get paid to do it. However, if I needed to make a living out of it I would not reject so many requests as I do now. I think Rubriq is currently paying $100 per referee report which, for now, is just an extra incentive. This extra incentive is apparently still very important, as Rubriq found out when running a survey. If we imagine this sort of marketplace scaling up we would need to have some assurance that professional reviewing was up to some required standard. How we define and evaluate these standards is really worth thinking about.

The positive aspects of third party peer review should be clear to anyone that has gone through the process. There is so much time wasted from having the same paper re-submitted to several different journals and getting reviewed by a different set of reviewers each time. Having the evaluation of the soundness and merit of research separate from dissemination would be a clear innovation in the scientific process. This would also make the publishing costs more transparent and probably would result in lower prices.

So, overall I think it is great that Scientific Reports is doing this trial. Many people have talked about third party peer review and paid peer review. We do want more transparency about the costs of publishing. Maybe it turns out that Amazon Turk for peer review is a bad model but if we don't try new things we won’t find out.

P.S - Dear NPG PR people, feel free to use the title of this post when you announce this partnership and put less of an emphasis on speed. You’re welcome.

Tuesday, March 10, 2015

A Borg moment and the end of Friendfeed

Apparently Facebook finally decided to shut down Friendfeed after several years of declining usage. I only found out because Neil, Deepak and Cameron wrote posts about this. Although I was a heavy user I ended up moving with the crowd after the Facebook acquisition. For those that never used it but are familiar with Twitter or Facebook it might be hard to understand why some people like myself are so disappointed with its decline. Friendfeed was simply leaps ahead of anything at the time as a mechanism for sharing information and organizing discussions around these shared items. In fact, although there has been no further development for 5 years it is still much better than Twitter for these things. As Neil mentioned in his post, it is hard to understand why this is the case. Maybe because comments were attached to a shared item and not limited to 140 characters so you could actually have meaningful discussions. Unlike forums, the shared items were a feed/river so there was the same impression and emphasis on immediacy as twitter. However, recently commented items would jump up on your feed which would tend to foster discussions. It is possible that it only worked because those that joined were the right people at the right time. Maybe it would not scale with the trolls. We will never know.

For those that never used it I want to write down the best experience I ever had on Friendfeed. I was attending the ISMB conference in Toronto in 2008. The number of geeks at this conference is understandably high and there were many Friendfeed users attending. At the time Friendfeed had already introduced the notion of a "room" which was a separate public feed that anyone could join, similar to tracking a hashtag on twitter. A feed for the conference was set up and many people at the conference joined and started participating. In fact, the feed is still available here so you can go have a look for the time being. This was the first time I really had the impression of connecting to a hive-mind. In this back channel tens of people were taking notes and giving comments about the several simultaneous talks. During keynotes you could even see, as the speaker was changing topics, different people taking up the slack of taking notes and commenting according to their own expertise. Unlike twitter, this didn't feel like we were drowning in a sea of uncoordinated messages. You could always focus your attention on just one thread (i.e. a shared item) and its comments at any time. It worked so well that we ended up using the notes to write up a conference report that got published in PLOS Comp Bio.

That community of scientists and other open science advocates moved on to Twitter after the Facebook acquisition. Twitter usage by scientists and in particular by prominent established scientists also really took off at around the same time. Although it serves a similar purpose Twitter really is more of a broadcasting mechanism than a discussion forum. It is a pity that a lesser solution won out. Still, the amount of open scientific discussions that are going on online these days is just phenomenal and a drastic change from my PhD days.

Tuesday, February 10, 2015

Group member profile - Brandon Invergo

I had mentioned previously that we should do a better job of using the web to describe our group and work. As part of this effort I will try to have a recurrent blog post series to introduce the lab members more extensively. The first group member to give this a try is Brandon Invergo (website, twitter, GScholar) who is currently doing a postdoc in the group with an ESPOD fellowship. Here follow Brandon's answers to a few questions I asked him.

What was the path that brought you to the group? Where are you from and what did you work on before arriving in the group?

I originally studied Computer Science, but by the time I was finishing my degree, I was more interested in doing something Biology-related than in working for a software company. Not sure yet what I wanted to do specifically, after receiving my degree I was fortunate to get a job working in the lab of Lawrence H. Pinto at Northwestern University (Evanston, IL, USA). There, I performed electrophysiological and behavioral assays of the mouse visual system, in the context of a functional genomics program. After a few years, I decided that it was time to go back to school and to start on the path towards a career in academic research. So, I moved to the Netherlands to pursue a master's degree in Biology at Leiden University. I specialized in evolutionary and ecological sciences and I did my primary research project under the supervision of Bas Zwaan. I investigated how the dynamics of hormonal signaling during pupal development of a tropical butterfly change in response to environmental conditions (temperature) and how those changes give rise to distinct adult phenotypes (polyphenism).

For my PhD, I wanted to perform research where I could combine my backgrounds in computer science and evolution and a nascent interest in systems biology (bonus points if I could also tie in my background in vision research). For this, I moved to the Institute of Evolutionary Biology (Pompeu Fabra University / Spanish National Research Council) in Barcelona, Spain, where I joined the group of Jaume Bertranpetit and worked under his supervision with the co-supervision of Ludovica Montanucci. My thesis, which I successfully defended in November 2013, was entitled "A system-level, molecular evolutionary analysis of mammalian phototransduction". In it, I combined techniques from bioinformatics and computational biology for molecular evolution with network- and modelling-based tools from systems biology. I sought to uncover the influence of the structure and dynamics of the visual phototransduction pathway on the evolution of the proteins that comprise it. The work also resulted in the improvement of the most comprehensive mathematical model of the system produced to date (currently under review at Biomodels), as well as a Biopython module for working with codeml and other programs from the PAML package (which are notoriously annoying to work with in analysis pipelines).

What are you currently working on?

I joined the EBI and the Sanger Institute in December 2013 as an ESPOD fellow, one week after my thesis defense. Here, I am continuing to explore how complex signaling systems function and evolve, except now I'm working in the context of malarial parasites (Plasmodium spp.).

In particular, I'm studying post-translational modifications (PTMs) on a proteome scale in the parasites, with an eye towards how the parasite uses reversible PTMs (mainly phosphorylation) for cellular signaling during key transitions in its complex lifecycle. This work involves performing both the mass-spectrometry experiments to collect the data and the computational analyses on these and other datasets. I'm finishing up a rather big experiment now and in a few weeks I expect to be neck-deep in data.

What are some of the areas of research that excite you right now?

Really anything at the intersection (well, more generally, the union) of molecular evolution and systems biology immediately catches my attention, such as the evolvability of pathways or the patterning of natural selection across systems. Of course, I'm reading a lot right now about detecting and describing PTMs at the proteomic scale. I'm also excited by developments in biochemical system modelling, particularly right now in methods for Bayesian inference of parameters from large-scale datasets. Finally, though it's not directly my field, I like to keep an eye on what's happening in complexity research at the most fundamental, mathematical level.

What sort of things do you like outside of the science?

I'm very active in the Free Software community and within GNU in particular. I help out a lot behind the scenes: working with (read: pestering) GNU software maintainers, evaluating new software that has been offered to us, and being on the advisory board. I also maintain some GNU packages (GSRC & pyconfigure) and some of my own software projects in my free time. My other main passion is music. I have written many mediocre electronic music songs over the years, some of which have even been released, and I was a moderately successful DJ for nearly a decade. Sadly, my music-writing died off as my PhD thesis gained steam and I haven't written anything recently. When I decide to take a break from all that or to be social, I like to play boardgames.

Friday, January 16, 2015

How many referee reports do you write per year ?

If peer-review is a fundamental aspect of how science is done then we in academia are required to act as reviewers as part of our jobs. I assume there is no real argument as to whether peer-review is needed. Instead one can argue about when to do it, in the life-time of a research project, and who should do it. Do we do it before the results are made public (pre-publication) or after (post-publication) ? Do we have dedicated reviewers or should active scientists do this ? The current dominant form of peer-review is done by active scientists and before disclosure of the results. This is all done anonymously and hidden from view. There are many issues with this process, such as the many repeated peer-review rounds required when an article bounces from journal to journal. Another drawback is that nobody gets credit for doing the work and conversely nobody gets shunned for not contributing sufficiently to the process.

This week, Alex Bateman (@Alexbateman1) directed me to the Publons company that is attempting to create reviewer profiles. Publons is attacking the problem in a couple of different ways: they mine journals that provide open peer reviews; they curate the journal's confirmation emails sent in by reviewers and they are apparently in talks with publishers to automate this process. The reviewer controls the degree of information displayed by the site. The minimum information shared is the journal name and the month which is what I expect most people will opt for. Alex had apparently kept the journal's emails acknowledging the receipt of his reviews going back for many years and he has created an extensive profile that demonstrates his contributions as a reviewer (and editor).

The company was profiled last October in a Nature news article where Andrew Preston, one of the co-founders, states the aim of making peer-review a measurable research output. It is useful to have a verified account of our reviewing activities but I am not sure if Publons is the best way of getting there. Given that we have ORCID we could imagine that publishers would be able to jump over Publons and report reviewing activities directly to ORCID. On the other hand, Publons may serve as a focus point to get publishers to provide this information in a standardized and automated fashion. For now, the closed reviews are coming from the authors so that is why I assume the company has a reward program that has been giving out awards worth $3000 for the top reviewers in a given cycle. I do wish that companies like this one, that collect information without an obvious source of income, would make their business plans more transparent. Even their terms and privacy statements are currently empty. Are we putting effort into something that will last, that will sell this information ?

So how many referee reports should we do per year ? I guess that we should aim to do at least as many as the articles we publish. The truth is that this probably varies widely and having some feedback and accountability will be good for the system. I keep all my referee reports on file so apparently since 2007 I have done 53 referee reports or about 7.5 per year. This has varied a lot with years where I have done as few as 2. With 24 articles published and only 2 years into a group-leader position I think I have been contributing well. I have set up my profile in Publons and sent in a couple of recent reviews to see how the process works. So far it has all been very straightforward and I will give it a try for a while.

Saturday, December 20, 2014

State of the lab, year 2 – reaching steady state

CC BY, Jason Paul Smith

At the end of last year I wrote up a short description of what it was like to start a group at the EMBL-EBI. I thought it would be interesting to try to make it a yearly event so here is the second installment. It is always scary how fast a year passes by and it is interesting to note how my perspective of managing a research group is changing.

During this year we said our first goodbyes as Vicky Kostiou (linkedin) finished her internship. We also welcomed several new members including Rahuman Sheriff (postdoc, linkedin), Haruna Imamura (postdoc, pubmed), Marta Strumillo (intern, linkedin) and Juan A Cordero Varela (master student, linkedin). Sheriff is working on a collaboration with Silvia Santos' group at the MRC-CSC in London to study cell-cycle regulation. Haruna came initially on a 1 postdoc fellowship in collaboration with Yasushi Ishihama's lab (Kyoto University, Japan) and she has recently been awarded an EIPOD postdoc fellowship to study post-translational regulation of Salmonella in collaboration with Nassos Typas and Jeroen Krijgsveld at the EMBL-Heidelberg. Marta is studying the functional role of PTMs in the context of protein structural information and Juan is participating in a project led by Marco Galardini (postdoc, @mgalactus, webpage) to model and predict bacterial phenotypes from sequence. These new members join the group of people that I already mentioned last year: David Ochoa (postdoc, @d0choa, webpage), Romain Studer (postdoc, @RomainStuder, blog), Brandon Invergo (postdoc, webpage) and Omar Wagih (PhD student, @omarwagih).

Shaking off that postdoc feeling

In the first year my concerns were dominated by the stress of facing an empty room that I needed to fill. It was a mistake to take 6 months to find the first person since I felt like I was wasting time. This year I had to come to terms with the fact that I no longer have time to do my own research projects. After over 10 years of measuring my own productivity by the progress in my research projects it is strange to try to let it go. I am certainly doing work that I enjoy. The progress in the group has been fantastic this year but it took me time to accept that the management activities I am doing are something I should count internally as productive work.

Reaching steady-state

Any new group, especially one that starts in a place like EMBL with very generous core funding, will grow to occupy a space in research. Any movement from this position will then only happen with a slower turnover of projects and people. That seems to be one of the trade-offs of managing research as a group versus as an individual. Changing directions for a whole group has to be slower than for an individual. However, as a group it is still possible to explore opportunities while maintaining a common theme of research underway. This year I think we have reached this steady-state. Although we got significant new funding starting next year I don't expect the group to grow much larger. I am curious to see how the research theme of the group will change with time.

The bad and the good of 2014

So I will start off by summarizing some of the aspects I wish had been different this year. Above all I had hoped to publish the first article(s) from the group in 2014. I am happy with the progress of the projects so far (see below) but I am still amazed at how long it takes to get a group up-and-running. Most of the group joined towards the end of last year so it has not been that much time objectively. The second aspect I think we could have done better was to communicate more online about what we have been up to. This has been one of the years with fewest blog posts since I started blogging about 11 years ago. We should do better than this, both because we are publicly funded and because the people (and projects) in the group deserve better exposure. So I will try to change this next year.

On a more positive note this has been a great scientific year for me and the group even if not very visible to the outside. The last two papers that were started at UCSF are finally under revision and should come out next year. One is about studying the function and evolution of X. laevis phosphosites (biorxiv) and the second about conditional genetic interactions in S. cerevisiae. We also have 3 projects that are getting close to being finished from Omar, David and Romain that I hope we will submit early next year. If possible we will put them up on biorxiv as well before submission. It is obviously a great privilege to see this work take shape and I hope some of you will also be excited about it when we make it public.

Regarding funding, I had mentioned already that Haruna got an EIPOD fellowship. In addition we were awarded a 5 year ERC starting grant. I am very excited about the starting grant since this will allow us to start doing yeast genetics work to complement the proteomics and genome analysis we have been doing. This will feed in and complement almost every project in the group so I really have to thank the committee for this opportunity. For this purpose we will be hiring 2 positions (postdoc and/or technician) early next year. Since the EBI does not have lab space, the work will be done at the Genome Biology unit in Heidelberg. This means I will be traveling (even) more to Heidelberg next year. Those hired to these positions will have the opportunity to interact with the Typas lab, which conducts similar genetics studies in bacterial species. If you know anyone looking for jobs with PhD and/or postdoc experience in yeast genetics please do let them know about these positions.

Friday, December 05, 2014

Alumni from a small PhD program you never heard about got 4 ERC grants this year

Last Friday I heard the amazing news that our group will be awarded an ERC starting grant to support our ongoing studies of the function and evolution of protein phosphorylation. I will write more about this soon. I also got the very exciting news that two other fellow alumni from the GABBA PhD program were also awarded a starting grant this year. Ana Carvalho and Nuno Alves both have their groups at the IBMC in Porto. Earlier in the year, another GABBA alumnus, Rui Costa, was awarded an ERC consolidator grant to support his neuroscience research at the Champalimaud institute in Lisbon. Rui had previously also received international funding from an ERC Starting Grant and from the HHMI international early career program.

I had mentioned the GABBA program before in a previous post. As I had described, this program has, for almost 18 years, allowed Portuguese PhD students to do their work abroad with no return clause or any strings attached. Unfortunately, this has changed a bit recently as the Portuguese government has been revising and seriously cutting science funding. GABBA students can still do their work abroad but they are now required to work between two groups with some time spent in Portugal. The funding was also reduced from 12 to 9 students per year.

This program is not the only one that has been allowing students to do their PhD thesis abroad and it is easy to question if the investment made by the government is worthwhile. Not surprisingly, many PhD students also end up doing their postdoctoral work away from Portugal, and even fewer end up setting up their groups in Portugal. However, Portugal did create a pool of talented researchers and some do end up returning. Measuring the return on this investment is very hard since most of the benefit is a gain in knowledge and talent from those that return and network possibilities with those that stay abroad. This year's ERC grants are a very obvious demonstration that this investment pays off. The 3 GABBA alumni that have set up labs in Portugal are together going to bring in around 5 million Euro of EU funding. By itself, this funding does not cover the funding costs of the whole life of the GABBA program but it is a very concrete validation of the investment made that hopefully even politicians will understand.

Thursday, October 16, 2014

Science publishers' pyramid structure and lock-in strategies

It is not recent news that AAAS will start a digital open access multidisciplinary journal. It is called Science Advances and it will be the 4th journal of the AAAS family of journals. As I have described in the past, this is part of a trend in science publishing to cover a wider range of perceived impact. Publishers are aiming to have journals that are highly selective and that drive brand awareness but also have journals that can publish almost any article that passes a fairly low bar of being scientifically sound. This trend was spurred by the success of PLOS One, which showed that it is financially viable to have large open access journals. Financially, the best strategy today would be to have some highly selective journals with a subscription model and then a set of less selective journals that operate with an author-pays model.

The Nature Publishing Group has implemented this strategy very well. They increased their portfolio of open access journals with the addition of Nature Communications, Scientific Data, Scientific Reports, the partner journals initiative and the investment/partnership with the Frontiers journals. NPG has also expanded their set of subscription based Nature branded research and review journals.

This combination approach is not just financially interesting; it also protects the publishers from the future imposition of immediate open access via a mandate from funding agencies. Publishers that have such a structure in place would be able to survive the mandate while others that only have subscription based journals would struggle to adjust. This has the useful side-effect of actually speeding up the transition since the bulk of the papers will be increasingly published in the larger, less restrictive open access journals. If most of the research is open access there will be less justification to have subscription journals. This will also be true even for the most well cited papers, as reported recently by Google Scholar.

So, many publishers are trying to build this pyramid type structure. Even AAAS is doing it, albeit (apparently) very reluctantly. One consequence of these changes is that there will be an abundance of large and permissive open access journals. Therefore, we will increasingly need better article filters, such as PubChase, as I previously discussed. These mega-journals will compete on price and features but the publishers as a whole will increasingly try to lock authors into the structure. Any submission to the pyramid should be publishable in *some* journal of the publisher. If I were working in one of these publishing houses I would be thinking of ways to use brand power to attract submissions while adding such lock-in mechanisms.

Currently practiced lock-in strategies

The best-known lock-in strategy is the transfer of referee reports between journals of the same publishing house. This is a common occurrence that I have experienced before. An editor at Cell might suggest that authors transfer their rejected article and reviewer comments to Molecular Cell or Cell Reports. Nature research journals might suggest Nature Communications or Scientific Reports. Science might suggest Science Signaling or Science Advances. This can be very tempting since it can shorten the time to get the paper accepted. The usefulness of this mechanism is going to work against the idea of having peer review outsourced to an independent accredited entity.

Cell Press has an interesting mechanism that allows for co-submission to two journals at the same time (Guide to authors - Cosubmission). I never tried this but apparently one can select two journals and the article will be evaluated jointly by both editorial teams. This looks more relevant for articles that fall in between two different topics that are covered by different journals. It is still an interesting way to improve the chances that a given article will find a home within this publisher.

Another more subtle approach might be to issue a topic specific call for articles. Back in February, Nature Genetics had an editorial with a call for data analysis articles. Note that the articles will not necessarily be published in Nature Genetics and the editorial explicitly mentions Nature Communications, Scientific Data and Scientific Reports. This allows NPG to use the Nature brand power to issue a call and then spread the resulting submissions along their pyramid structure according to input from reviewers and editors. The co-publication of a large number of articles on the same topic also almost guarantees a marketing punch.

Other ideas

I am curious to see what other ideas arise, so please share other similar mechanisms that you might know of. One potential additional idea that is similar to the co-submission would be to have a chained submission. At submission the publisher could already ask for an ordered list of journals by preference. We might also start to see publishers requesting comments from reviewers with a group of journals in mind from the start.

Obviously, an alternative that would place less of a burden on reviewers and editors would be a mechanism similar to what the Frontiers journals have been promoting. Articles could be initially published in a large PLOS One-like journal and then increase in awareness depending on article level metrics. This approach is probably going to take much longer to spread widely.