This blog post is part of a yearly series and marks the end of my 5th year as a group leader at EBI. In March we had an external evaluation of all research groups at EMBL-EBI. It was an interesting experience and overall it was judged a great success for EBI. For our group it was also part of the evaluation towards the standard contract renewal, and I got the 4-year extension. Since there is essentially no tenure at EMBL, this also means that I have 4 years until I have to find a senior PI position. This is still a long time but it will increasingly be on my mind going forward. I am not particularly worried, but I feel like there are now many more places in Europe with fixed-term junior group leader positions. The postdoc bubble will turn into the junior PI bubble, and we will have another big barrier and more competition in the transition between junior and senior positions.
Personally it is almost strange to stay in the same place after 5 years, since I have typically stayed 4-5 years in each place during university (Coimbra), PhD (Heidelberg) and postdoc (San Francisco). It looks like I will have to find some other excuse to thin out the pile of papers on my desk instead of simply moving to a new country and trashing everything.
The end of a cycle
Last year was our most productive year so far, as measured by the number of publications. This year is going to top it, based on the manuscripts that I should be working on at the moment instead of writing this post (sorry guys). The research in the group is just flowing, with more synergies among the group members. Just when everything is working so well, many in the group are leaving. Last year our first PhD student finished (Omar, now at DeepGenomics) and two postdocs left (Romain moved to benevolentAI and Sheriff is now a project leader at EBI). This year there will potentially be even more people leaving. It is going to be a new challenge to keep the science going through the turnover. On the other hand, new arrivals signal the start of new projects and are an opportunity to move the group in new directions. Just at the end of the year, we had 3 new members starting: Allistair (PhD student), Inigo (postdoc) and Abel (visiting PhD student). Abel and Inigo will be working on the impact of mutations on protein interactions and the control of protein abundance, while Allistair will likely work on the evolution of regulatory networks.
Highlight from 2017 – Predicting condition specific phenotypes from genomes
Most of the work in the group is focused on understanding the function and impact of genetic variants on protein post-translational regulation, in particular for phosphorylation and ubiquitin. However, we have also been working more generically on the genotype to phenotype problem. I think these analyses could make more use of prior knowledge and we are trying to contribute in this direction.
Part of this work, led by Marco (GScholar, Twitter) and in collaboration with the Typas lab in Heidelberg, was finally published at the end of this year. The question we wanted to address was to what extent we can predict condition-specific phenotypes of a strain of E. coli based on its genome and what we know from the well-studied E. coli K-12 lab strain. This is inspired by work that Rob Jelier and Ben Lehner did in S. cerevisiae, but on a larger scale. To set the project up, imagine we know that a given gene X of E. coli is required for growth under high heat. Then, if gene X is not present or is severely mutated in a strain of E. coli, we would expect that this mutated strain should not survive well in high heat. To test this at large scale we assembled a panel of hundreds of strains of E. coli for which we obtained genomes and fitness measurements under many conditions. We modelled the consequence of mutations using different methods and we collected prior knowledge of which genes are supposed to be important for each condition. In the end we could only predict which strains would tend to grow poorly for around 40% of conditions. This level of success may not be surprising since we didn't take into account, for example, issues like gene expression levels or compensation by new genes. It could be that gene function is a lot more plastic than currently assumed, but to prove this we will need different experiments.
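To make the gene X example concrete, here is a toy sketch of the kind of rule described above. It is only an illustration under stated assumptions and not the method used in the paper: the function name, the gene and condition names, the per-gene disruption scores and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, Set


def predict_growth_defect(
    required_genes: Dict[str, Set[str]],
    disruption: Dict[str, Dict[str, float]],
    threshold: float = 0.5,  # hypothetical cut-off, chosen for illustration
) -> Dict[str, Dict[str, bool]]:
    """Flag strain/condition pairs where a required gene looks disrupted.

    required_genes: condition -> genes known (from the reference strain)
        to be needed for growth in that condition.
    disruption: strain -> gene -> score in [0, 1], where 1 means the gene
        is absent or severely mutated. Genes not listed are assumed intact.
    """
    predictions: Dict[str, Dict[str, bool]] = {}
    for strain, gene_scores in disruption.items():
        predictions[strain] = {}
        for condition, genes in required_genes.items():
            # Take the most disrupted required gene as the strain's score.
            worst = max((gene_scores.get(g, 0.0) for g in genes), default=0.0)
            predictions[strain][condition] = worst >= threshold
    return predictions


# Made-up prior knowledge and made-up strains.
required_genes = {"high_heat": {"geneX"}, "drug_A": {"geneY", "geneZ"}}
disruption = {
    "strain_1": {"geneX": 1.0},  # gene X missing
    "strain_2": {"geneY": 0.2},  # mild variant in gene Y
}

predictions = predict_growth_defect(required_genes, disruption)
for strain, by_condition in predictions.items():
    print(strain, by_condition)
```

A rule this simple ignores things like gene expression levels or compensation by other genes, which are the same gaps mentioned above.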
Besides testing the central question expressed above, this collection of E. coli strains with associated data will hopefully serve as a resource for future studies. Any additional layer of molecular data (e.g. gene expression) or phenotype (e.g. motility) we measure can make use of all of the pre-existing information. We could ask, for example, if motility correlates with growth under several of the drugs we tested. All of the resources for this collection are freely available and of course this would not be possible without the hard work of the scientists who collected the strains to begin with (listed here).
Highlights for the year ahead
We have 3 different projects close to completion that relate to the functional relevance of protein phosphorylation. This is probably going to be our biggest contribution of 2018. We continue to work with the cancer-related datasets, primarily using these data to study protein post-translational regulation. The aim is not necessarily to better understand cancer but to make use of the large genetic and molecular variation that exists in cancer to better understand the regulatory processes of normal cells. Additionally, we will have some progress to report on the evolution of protein kinases and potentially the evolution and regulation of ubiquitylation.