Thursday, June 10, 2021
A not so bold proposal for the future of scientific publishing
Friday, May 21, 2021
Lab move to ETH Zurich, the job search and fixed term PI positions
| ETH Zurich (credit) |
Teaching, scientific integration and group structure
With any move there are always some thoughts about the challenges ahead. Professionally, the types of things on my mind are that I will need to set up the group, integrate myself scientifically and prepare myself for teaching. Setting up the group and integrating myself within the local environment won't be new experiences. I feel I was too slow with both of these things when I first joined EMBL-EBI so I am curious if I will be able to move things along faster this time. Coming from EMBL and the local EBI/Sanger campus I have the impression that ETH is less collaborative but there were clearly many people interested in collaborating just from the small sample I got during interviews. There is an interesting difference in group structure between EMBL and ETH where at ETH a group can have sub-groups with junior PIs that can have varying degrees of independence as per the decision of the more senior PI. Organising a lab in this way will be something new. Finally, I will have to teach at the undergraduate level for the first time. I have always said that students coming out of biology or related topics need to have better training in bioinformatics. While daunting, this will be my chance to contribute to this training directly.
The interview process and decisions
For those less familiar with the EMBL, group leaders are hired for a maximum period of 9 years with only a few exceptions (around 10%) that end up having an open-ended contract. We get generous core funding and get to tap into a great scientific network which more than compensates for the lack of tenure. This means that around year 7 your thoughts start moving into the future. At faculty presentations I would often write how many years I had left in the title slide as a personal reminder. Towards the end of year 7 I started applying and spent most of year 8 applying and interviewing. The first time I applied for PI positions it was all very unidirectional, with myself looking broadly for possible places. This time it felt more like dating a potential future university/institute with expressions of interest on both sides. One of the issues in going into this is that I didn't really know what my value would be in the market. I knew I had a good CV and would certainly find a job, I just didn't know where I could aim for in terms of seniority and resources. That became clearer only after the first interview and the expression of interest of places I felt were really fantastic.
The second half of 2020 then became about trying to find the best place professionally and personally. I ended up applying to 10 places, interviewed in 8 and received 5 offers. I tried to find a job in my home country (Portugal) but of the two places I was interested in, one picked another candidate and the other could not make an offer that was not fixed term. The decision ended up being among 3 places with the major differentiation factor being between 2 offers that had less core funding but higher management responsibilities and ETH with incredibly generous core funding and the best scientific fit (but less seniority). Personally the decisions were about staying in the UK or moving to France or Switzerland. There is quite a lot to be said about this choice (safety, adventure, integration, kid friendly, jobs for partner, etc) and in the end we went with Switzerland. While excited I am also anxious about yet another move to what will be my 5th home country, the now almost familiar sense of uprooting and new beginnings. But this is not yet time for goodbyes.
Non-tenure group leader positions (in Europe)
I don't know who invented the fixed term, non tenure track, group leader positions in academia. It may have been EMBL and this model has clearly spread across Europe with many research institutes having some form of junior positions that have a variable number of years (5 to 12) to set up a group and then necessarily need to move on to a different place. EMBL does this because it is funded by many member state countries to train the next generation of "academic leaders" that will lead research groups across the member states. The obvious advantage of hosting these positions is that it keeps the institute forever young if you manage the turnover well. I think these positions can work well if they remain a relatively small proportion of the total PI/faculty positions; there is some level of support to at least kick start the group; and the positions last a sufficient number of years. Having gone through this at EMBL my impression is that 7 years would be the bare minimum and 9-10 years would be ideal. This also depends on the level of support beyond the PI salary. If conditions are not met then it is not worth setting up people for failure with the selfish goal of using the higher turnover to bring in new ideas/methods. Don't give people super postdoc positions for 3-5 years with no funding and no chances of tenure just because you want fresher ideas around. If there is some mechanism for tenure or open ended contract then it should be crystal clear from the start how (un)likely this is and what are the transparent criteria for achieving it.
Friday, January 29, 2021
State of the lab 7 & 8 - The last years at EMBL
This is usually part of a yearly series of posts where I note down thoughts related to managing a research group in academia over the years. This post covers years 7 and 8 and it brings me now to the start of year 9, my last at EMBL. While I usually do one of these posts every year, with all of the craziness of 2020 I ended up skipping one.
Year 7, group turnover
2019 was the year where the group fully turned over all lab members that were with us since the earlier years with 2 postdocs (Haruna Imamura and David Ochoa) and 3 PhD students (David Bradley, Claudia Hernandez-Armenta and Marta Strumillo) leaving. Haruna is now a Research Scientist at the Systems Biology Institute in Japan, David O is the platform coordinator at Open Targets and Claudia and David B are now doing postdocs. Marta is finding her way through consulting. We were joined by 2 postdocs (David Burke and Miguel Correa) and 2 PhD students (Eirini Petsalaki and Rosana Garrido). This constant turnover of group members is quite difficult to manage both personally and professionally. Year 7 was really the year with the largest amount of changes in the group and there is something to be considered about trying to make sure that changes remain gradual. However, it is not always possible to plan for this to happen. While I think that this change in academia is generally positive for science, I do wonder what could be achieved if this was not a requirement (see earlier post).
Managing research focus over the years
Over the last few years, the research in the group had some dispersion in terms of the group research topics. At the start, the group was named "Evolution of cellular interactions" with a primary focus on the evolution and functional relevance of protein phosphorylation. While this remained the central focus there were other areas we worked on including cancer genomics and genetics of human disease and microbial trait diversity. We also have work that is not yet visible on drug mode-of-action predictions. This led me to change the group name to "Cellular consequences of genetic variation" which could better serve as an umbrella to the different topics. This is, at least in part, a simple reflection of funding opportunities but also a reflection of true movement in my research interests and the environment I have been working in (Genome Campus). On one hand I feel this dispersion is detrimental in that we could do more with a single minded focus, but on the other hand these extensions have not really been the majority of our work and also act as a way for the group to explore new directions. My visual reference for this is a cell sending out protrusions in some directions to feel out the environment around. On some of these new areas (e.g. microbial trait diversity) I feel we have done enough, even with a small total investment, to make the work stand on its own.
I have to say that, without explicitly planning for it, the dispersion worked to my advantage when applying for positions last year as it allowed me to present the group through slightly different lenses depending on where I was interviewing. Of course, this is only beneficial if there is sufficient research progress made by the group not to appear superficial or unfocused. I suspect that this movement in research topics is normal but I haven't had many deep conversations with others about how this has happened to them in their research groups. In some cases, the changes in topics for some groups seem more abrupt from the outside but it could be just a perception. I will soon have an opportunity to rethink where we put most of our research efforts and likely cut back on some of these extensions.
Year 8 - A new group, the pandemic and the job market
At the start of last year, I was finally getting comfortable with the idea that the group had changed so much and I was truly excited about the new beginning. Just as the year was starting and I was enjoying this excitement the pandemic hit. As I had described before, we ended up devoting some effort in the group to work on SARS-CoV-2 projects which I think was also good for group morale. However, the changes in working conditions, the effort on the SARS projects and my need to go back to the job market made me less capable of keeping up with some of the projects in the group. While most of the work has kept going there are at least 3 projects/manuscripts that have been neglected simply for my own lack of time/effort. We all know these stories of PIs that let work pile up on their desk and I feel it as a failure although I can rationalise why I really didn't have the time to fully keep up.
Finally, over the last year I was fully back on the job market and I am so relieved that this is now over. Since there is nothing official that I can announce I will wait to write up in detail what the process was like and compare it to my first attempt to secure a PI position. I can at least say that I will leave EMBL-EBI at the end of this year and I will certainly write more about the 9 years of EMBL. I do want to look back to all that has been good (mostly) and bad, make a summary of what I feel were the biggest advances we made, perhaps discuss the finances, and more broadly go over the issues of this lack of tenure for junior PIs now implemented in so many European research institutions.
Friday, December 04, 2020
A year of SARS-CoV-2 research
This post may be premature but I feel like writing down some thoughts about the roller coaster that this year has been. At the start of the year, with the number of reported cases rising in Europe the EMBL and our institute (EMBL-EBI) decided to send everyone home as a precautionary measure. As most of our group is computational, this has meant we have been working from home for most of this year. Early on, somewhat frustrated by not being able to help, I emailed a few people that could be working on the virus. Nevan Krogan replied saying our help would be useful and we joined the global effort to contribute to solving this crisis.
Science at science fiction speed
Over the course of 9 months we took part in 4 projects, some of these being the most thrilling science I have ever taken part in. We condensed what would easily be a 3 to 5 year research project into something done in 3-4 months, involving typically 10-20 research groups with a few key people helping to direct the research. We were collecting data, analysing and suggesting new experiments in the span of days with some of the best scientists in the world. Contributing to the direction of this level of resources has been an amazing experience that I wish every scientist could try at least once in their life. These projects were all geared towards studying how SARS-CoV-2 takes control of its target cells to be able to suggest human targeting drugs that could counter the infection. Several of the compounds identified in these studies are in clinical trials for COVID-19 so I feel the projects met their main objective.
While this has been my perspective from working on these specific projects we are all aware of the amazing scientific progress that has been made over the course of this year. I remember seeing the movie Contagion and almost laughing at the unrealistically fast pace of research in the movie. However, SARS-CoV-2 research has in fact happened at an incredibly fast pace that probably matches the movie.
Why don't we do this for disease X?
One discussion point that has come up often is whether we can learn from this period to apply it to research into other diseases. Science is an international endeavour but the degree of collaboration for SARS-CoV-2 research has been higher than usual. The effort put into this was also high among the projects I have seen personally but this eventually results in some exhaustion and it is not sustainable. I don't think this is easy to repeat for other diseases without the same external sense of urgency. Most scientists won't just drop what they are working on to fully focus on some other research question. Maybe it is an argument for an even higher degree of collaboration, in particular between academia and biotech/pharma. There may be some small increase in productivity of collaborations through the use of online tools like Slack and Zoom but overall I don't see that the way we do science has been dramatically changed going forward.
The case for higher spending in research
| I'm gonna have to science the s**t out of this |
Over the last 10 years academic science budgets have been squeezed and a lot has been said about how academic science needs to be more applied and how much we should justify the investment that is being made. This week, DeepMind, a private research institute funded by what is essentially an advertising company (Alphabet/Google) has made headlines with their impressive research into predicting the structure of a protein from its sequence. An advertising company finds the money to invest into what are fundamental biological problems and in the middle of a pandemic that is being solved by a global scientific infrastructure we can't get the EU science budget to increase. We should be ready to make our case over the course of the next months.
Thursday, May 30, 2019
PlanS, the cost of publishing, diversity in publishing and unbundling of services
Less publishers means less innovation in publishing
Potential solutions for small publishers
Friday, March 29, 2019
Research summary - Predicting phenotypes of individuals based on missense variants and prior knowledge of gene function
Predicting phenotypes of individuals from coding variants and gene deletion phenotypes
Modes of failure – variant effect predictions and genetic background dependencies
Perspectives and different directions on genotype-to-phenotype mapping
Wednesday, January 09, 2019
State of the lab 6 – group turnover and getting back in the job market
Wednesday, January 10, 2018
State of the lab 5 – in the flow with 4 years to go
Personally it is almost strange to stay in the same place after 5 years since I have been typically staying 4-5 years in each place during university (Coimbra), PhD (Heidelberg) and postdoc (San Francisco). It looks like I will have to find some other excuse to thin out my pile of papers on the desk instead of simply moving to a new country and trashing everything.
The end of a cycle
Last year was our most productive year so far, as measured by the number of publications. This year is going to top it based on the manuscripts that I should be working on at the moment instead of writing this post (sorry guys). The research in the group is just flowing with more synergies among the group members. Just when everything is working so well is when so many in the group are leaving. Last year our first PhD student finished (Omar, now at DeepGenomics) and two postdocs have left (Romain moved to benevolentAI and Sheriff is now a project leader at EBI). This year there will be even more people potentially leaving. It is going to be a new challenge to try to keep the science going through the turnover. On the other hand, new arrivals signal the start of new projects and are an opportunity to move the group in new directions. Just at the end of the year, we had 3 new members starting: Allistair (PhD student), Inigo (postdoc) and Abel (visiting PhD student). Abel and Inigo will be working on the impact of mutations in protein interactions and control of protein abundance while Allistair will likely work on the evolution of regulatory networks.
Highlight from 2017 – Predicting condition specific phenotypes from genomes
Most of the work in the group is focused on understanding the function and impact of genetic variants on protein post-translational regulation, in particular for phosphorylation and ubiquitin. However, we have been also working more generically on the genotype to phenotype problem. I think these analyses could use more prior knowledge information and we are trying to contribute in this direction.
Part of this work, led by Marco (GScholar, Twitter) and in collaboration with the Typas lab in Heidelberg was finally published at the end of this year. The question we wanted to address was to what extent we can predict condition specific phenotypes of a strain of E. coli based on its genome and what we know from the well-studied E. coli K-12 lab strain. This is inspired by work that Rob Jelier and Ben Lehner did in S. cerevisiae but on a larger scale. To set the project up, imagine we know that a given gene X of E. coli is required for growth under high heat. Then, if that gene X is not present or severely mutated in a strain of E. coli, we would expect that this mutated strain should not survive well in high heat. To test this on a large scale we assembled a panel of hundreds of strains of E. coli for which we obtained genomes and fitness measurements under many conditions. We modelled the consequence of mutations using different methods and we collected prior knowledge of which genes are supposed to be important for each condition. In the end we could only predict which strains would tend to grow poorly for around 40% of conditions. This level of success may not be surprising since we didn't take into account, for example, issues like gene expression levels or compensation by new genes. It could be that gene function may be a lot more plastic than currently assumed but to prove this we will need different experiments.
Besides testing the central question expressed above this collection of E. coli strains with associated data will hopefully serve as a resource for future studies. Any additional layer of molecular data (e.g. gene expression) or phenotype (e.g. motility) we measure can make use of all of the pre-existing information. We could ask if motility correlates with the growth under several drugs we tested for example.
All of the resources for this collection are freely available and of course this would not be possible without the hard work of the scientists who collected the strains to begin with (listed here).
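To make the gene X example above concrete, here is a minimal sketch of how prior knowledge about condition-required genes could be combined with per-gene disruption predictions for a strain. It is only an illustration and not the model used in the published work: the gene names, the `REQUIRED_GENES` table, the `growth_defect_score` function and the rule of treating genes as independent are all assumptions made for this example.

```python
from math import prod

# Hypothetical prior knowledge: genes the reference strain needs for growth
# in each condition. Gene and condition names are made up for illustration.
REQUIRED_GENES = {
    "high_heat": {"geneX", "geneY"},
    "drug_A": {"geneZ"},
}


def growth_defect_score(disruption_prob, condition, required=REQUIRED_GENES):
    """Probability that at least one condition-required gene is disrupted.

    disruption_prob maps gene -> probability (0..1) that the gene is absent
    or functionally impaired in the strain, for instance from a variant
    effect predictor. Genes missing from the mapping are assumed intact.
    Assumption: genes are treated as independent of each other.
    """
    genes = required[condition]
    p_all_intact = prod(1.0 - disruption_prob.get(g, 0.0) for g in genes)
    return 1.0 - p_all_intact


# A strain where gene X is very likely lost should score high under heat.
strain = {"geneX": 0.9, "geneZ": 0.1}
for condition in REQUIRED_GENES:
    print(condition, growth_defect_score(strain, condition))
```

A score like this would then be compared against the measured fitness of each strain in each condition. It ignores exactly the things mentioned above as likely sources of error, such as gene expression levels or compensation by new genes.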
Highlights for the year ahead
We have 3 different projects that are close to completion that relate to the functional relevance of protein phosphorylation. This is probably going to be our biggest contribution of 2018. We continue to work with the cancer related datasets, primarily using these data to study protein post-translational regulation. Not necessarily to better understand cancer but making use of the large genetic and molecular variation that exists in cancer to better understand the regulatory processes of normal cells. Additionally we will have some progress to report on the evolution of protein kinases and potentially the evolution and regulation of ubiquitylation.
Friday, January 05, 2018
Group member profile - Omar Wagih
The latest instalment of this blog post series is by Omar Wagih (@omarwagih, Gscholar) who has just last month successfully defended his PhD. Along with Marco, Omar has been part of the group working on studying how DNA variants relate to phenotypes. He developed the mutfunc resource and the fantastic guess the correlation game.
What was the path that brought you to the group? Where are you from and what did you work on before arriving in the group?
My love of genetics is, in more ways than one, biologically ingrained. Growing up in a family of scientists, I was always surrounded by a wealth of information which I instinctually sought to organise. For this, I pursued my undergraduate and masters degree at the University of Toronto, majoring in computational biology and computer science, respectively. Along the way, I was fortunate to work in some of the leading computational biology labs in Canada including those of Gary Bader, Philip Kim, Charlie Boone, Brenda Andrews, Andrew Fraser and Andrew Emili. I worked on a range of projects which ranged from analysing images of genetic screens of yeast to determining the impact of disease mutations on kinase-substrate phosphorylation. These experiences led me to develop an interest in understanding how changes in the genome translate to variability in cellular physiology, and ultimately phenotype, which prompted me to pursue my PhD.
What are you currently working on?
My current project involves working towards a deeper understanding of how changes in the genome propagate to phenotypic variability by predicting which cellular mechanisms are likely to be impacted. For the past several years I have been developing and using computational methodologies to assess the mechanistic impact of natural and disease-causing mutations. I have been applying these to yeast, human and bacteria models in hopes of streamlining hypothesis-driven variant annotation. I have also been utilising these predictions to assess the overall burden these mutations impose on gene function and putting such information towards conducting gene-phenotype associations.
What are some of the areas of research that excite you right now?
I'm intrigued by novel mutagenesis technologies that are allowing us to experimentally assess the impact of genetic variants on cellular fitness and function in a massively parallel fashion. Technologies like deep mutational scanning and CRISPR are becoming increasingly common in achieving this and their off-target effects are steadily being reduced.
With such massive amounts of mutagenesis data, I'm also interested in how machine learning methodologies such as deep learning can be applied to learn how mutations collectively impinge on cellular function and ultimately phenotype. This would significantly improve the precision of variant impact predictors and, in my opinion, will have crucial roles in shaping the development of novel and personalised drug therapies.
What sort of things do you like outside of the science?
Whether I'm skiing, hiking, camping or exploring the city, you'll more likely than not find me outdoors. I often partake in sports. During my time in Cambridge, I rowed for my college and was part of the university boxing team.
I have been fascinated by drones for a while and own a DJI Phantom 3, which I often use for aerial filming. I also enjoy landscape and portrait photography, particularly with my 50mm lens. If I still have extra time on my hands, you'll find me implementing silly ideas that come to mind into apps or games. Here are a few I've made: genewords, pubtex, and guess the correlation.
Monday, June 26, 2017
Building rockets in academia - big goals from individual projects
The big goals and peripheral bets
Individualized contributions to group goals
Friday, April 28, 2017
Postdoc positions on context dependent cell signalling (wet and/or dry)
Why do some mutations cause cancer in some tissues and not others? What happens to the cell signalling pathways during differentiation? Why are some genes essential in some cell types and not others or why are some drugs more effective at killing some cell types than others?
We think that this is a great time to be asking these questions of how the genetic background or tissue of origin changes cell states. More precisely for us, how this re-wires cell signalling. It has become routine to measure changes in phosphorylation across different conditions, including different cancer types. The Sanger and others are establishing panels of human cell lines that are being profiled with an increasing array of omics technologies with drug sensitivity and CRISPR based gene essentiality information. These panels offer a great opportunity to address these questions.
We want to combine the work we have been doing in studying human signalling with phosphoproteomic data, with variant effect predictors, microscopy based studies of cell signalling and network modelling to address this question of context dependent changes in cell signalling.
To support this research we have 2 postdoc positions available: one would be primarily computational and would involve image analysis and network modelling in collaboration with microscopy groups (see here for project and application); the second would be primarily experimental with a focus on microscopy. The latter would be available via the ESPOD fellowship scheme in collaboration with Leopold Parts' group at Sanger (see here for project description and here to apply). The split between computational and experimental is open and wet/dry mixed candidates are also encouraged to apply to both.
These projects complement existing work in the group using cancer Omics data to study the genetic determinants of changes in protein abundance and phosphorylation and will be in collaboration with work developed by the Petsalaki group at EBI that is also recruiting. Email me if you have any questions/concerns about the positions.
Monday, April 10, 2017
17 years of systems biology
Friday, February 10, 2017
Predicting E3 or protease targets with paired protein & gene expression data (negative result)
Cancer datasets as a resource to study cell biology
The amazing resources that have been developed in the context of cancer biology can serve as tools to study "normal" cell biology. The genetic perturbations that happen in cancer can be viewed almost as natural experiments that we can use to ask varied questions. Different cancer consortia have produced, for the same patient samples or the same cancer cell lines, data that ranges from genomic information, such as exome sequencing, to molecular, cellular and disease traits including gene expression, protein abundance, patient survival and drug responses. These datasets are not just useful to study cancer biology but more globally to study cell biology processes. If we were interested in asking what is the impact of knocking out a gene we could look into these data to have, at least, an approximate guess of what could happen if this gene is perturbed. We can do this because it is likely that almost any given gene will have changes in copy number or deleterious mutations given a sufficiently large sample of tumours or cell lines. Of course, there will be a whole range of technical issues to deal with since it would not be a "clean" experiment comparing the KO with a control.
Studying complex assembly using protein abundance data
More recently the CPTAC consortium and other groups have released proteomics measurements for some of the reference cancer samples. Given the work that we have been doing in studying post-translational control we started a few projects making use of these data. One idea that we tried and have recently made available online via a pre-print was to study gene dosage compensation. When there are copy number changes, how often are these propagated to changes in gene expression and then to protein level? This was work done by Emanuel Gonçalves (@emanuelvgo), jointly with the Julio Saez-Rodriguez lab. There were several interesting findings from this project, one of which was that we could identify members of protein complexes that indirectly control the degradation of other complex subunits. This was done by measuring, in each sample, how much of the change in a protein's abundance is not explained by the change in its gene expression. This residual abundance change is most likely explained either by changes in the translation or degradation rate of the protein (or noise). We think that, for protein complex subunits, this residual mainly reflects degradation rates. Emanuel then searched for complex members that had copy number changes that predicted the "degradation" rate of other subunits of the same complex. We think this is a very robust way to identify such subunits that act as rate-limiting factors for complex assembly.
Predicting E3 or protease targets
If what I described above works to find some subunits that control the "degradation" of other subunits of a complex then why not use the exact same approach to find the targets of E3 ligases or proteases? Emanuel gave this idea a try but in some (fairly quick) tests we could not see a strong predictive signal. We collected putative E3 targets from a few studies in the literature (Kim et al. Mol Cell Biol. 2015; Burande et al, Mol Cell Proteomics. 2009; Lee et al. J Biol Chem. 2011; Coyaud et al. Mol Cell Proteomics. 2015; Emanuele MJ et al. Cell 2011). We also collected protease targets from the MEROPS database. We then tried to find a significant association between the copy number or gene expression changes of a given E3 with the proxy for degradation, as described above, of any other protein. Using the significance of the association as the predictor, we would expect a stronger association between an E3 and its putative substrates than with other random genes. Using a ROC curve as descriptor of the predictive power, we didn't really see robust signals. The figure above shows the results when using gene expression changes in the E3 to associate with the residuals (i.e. abundance change not explained by gene expression change) of the putative targets. The best result was obtained for CUL4A (AUC=0.59) in this case but overall the predictions are close to random.
A similarly poor result was generally observed for protease targets from the MEROPS database although we didn't really make a strong effort to properly map the MEROPS interactions to all human proteins. Emanuel tried a couple of variations. For the E3s he tried restricting the potential target list to proteins that are known to be ubiquitylated in human cells but that did not improve the results. Also, surprisingly, the genes listed as putative targets of these E3s are not very enriched in genes that increase in ubiquitylation after proteasome inhibition (from Kim et al. Mol Cell. 2011) with the clearest signal observed in the E3 targets proposed by Emanuele MJ and colleagues (Emanuele MJ et al. Cell 2011).
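As a rough illustration of the analysis described in this post, the sketch below computes the residual protein abundance (the part not explained by gene expression), scores each candidate protein by how strongly its residual associates with the expression of an E3, and summarises the ranking with an AUC. This is not the code used for the pre-print. The simple linear fit, the use of absolute Pearson correlation as a stand-in for the significance of the association, and all function names and toy numbers are assumptions made for this example.

```python
from statistics import mean


def residuals(protein, mrna):
    """Residuals of a least-squares line protein ~ a + b * mrna across samples."""
    mx, my = mean(mrna), mean(protein)
    sxx = sum((x - mx) ** 2 for x in mrna)
    b = sum((x - mx) * (y - my) for x, y in zip(mrna, protein)) / sxx
    a = my - b * mx
    return [y - (a + b * x) for x, y in zip(mrna, protein)]


def pearson(u, v):
    """Pearson correlation; returns 0.0 if either vector has no variance."""
    mu, mv = mean(u), mean(v)
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u) * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den if den else 0.0


def auc(scores, labels):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data for five samples; all values are invented.
e3_expression = [1.0, 2.0, 3.0, 4.0, 5.0]
candidates = {  # name -> (protein abundance, mRNA level, is putative target)
    "target_like": ([5.0, 4.2, 3.1, 2.0, 0.9], [2.0, 2.1, 1.9, 2.0, 2.2], True),
    "unrelated": ([3.0, 3.4, 2.9, 3.3, 3.0], [1.0, 1.5, 0.9, 1.4, 1.1], False),
}

scores, labels = [], []
for protein, mrna, is_target in candidates.values():
    # Association between E3 expression and the candidate's residual abundance.
    scores.append(abs(pearson(e3_expression, residuals(protein, mrna))))
    labels.append(is_target)

print(auc(scores, labels))
```

An AUC near 0.5 means that the putative targets are ranked no better than random genes, which is how the CUL4A value of 0.59 reported above should be read.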






