Cellular Consequences of Genetic variation

Monday, November 19, 2007

Linking out - Personalized medicine

Personalized medicine continues to climb the hype cycle. I have been getting most of the best news coverage on the subject from blogs.

- Bertalan Meskó reviews companies focused on personalized medicine (see part I and II)

- Attila Csordas and Deepak Singh cover the social aspects of personal health and the tie-in to 23andMe

- Gareth Palidwor reads into the details to speculate that the business model of 23andMe might be to sell the aggregated user data.

- Gene Sherpas puts on the brakes, describing the hype as Genomic Voyeurism

I am concerned that all the attention on the genomics side of personalized medicine will distort the relative importance of nature versus nurture. Everyone craves a peek at their own destiny and at their roots. These services hope to provide both by looking at our DNA. I don't think they can really do this reliably, but nothing stops them from luring people in.

Tuesday, November 13, 2007

Last call for Open Laboratory 2007

Bora has issued a last call for submissions to the Science Blogging anthology of 2007. As last year, the objective is to collect some of the best science blog posts of the year and compile them into a book to print on demand (deadline on December 20th 2007). Submissions can be sent using an online form and they will be reviewed by a panel that will compile the final list.
Anyone interested in participating can send in links to their favorite blog posts of the year and also volunteer to be part of the reviewing process (see instructions here).

Monday, November 12, 2007

4th year blog anniversary

It is hard to believe that it has been 4 years since I started blogging here. Not that I am a very prolific blogger, with only 328 blog posts in this time. These are not very evenly distributed, with more than 200 blog posts in the last two years. The style of blog posts also changed a lot, from a link blog with a few sentences to longer, more opinionated posts.

Having a glance at the blog posts it is easy to find some very weird ones :)
Your Identity Aura (2005)
Our Collective Mind (2005)
The Human Puppet (2005)
Social Network Dynamics in a Conference Setting (2006)
The Fortune Cookie Genome (2007)

There are a lot of serious ones too but I will leave that list for some other time.

Thanks to Nodalpoint and the Nodalpoint regulars (Greg, Neil, Alf and Chris) for introducing me to blogging some 6 years ago and to everyone else that joined in along the way with their blogs and/or comments. It sure makes blogging more enjoyable.

(Image Credit: Picture taken by mattnjuzz and published under CC by-nc-sa. Originally taken from Flickr)

Saturday, November 10, 2007

Predicting functional association using mRNA localization

About a month ago Lécuyer and colleagues published a paper in Cell describing an extensive study of mRNA localization in Drosophila embryos during development. The main conclusion of this study was that a very large fraction (71%) of the genes they analyzed (2314) had localization patterns during some stage of embryonic development. This includes both embryonic and sub-cellular localizations.

A lot of information was gathered in this analysis and it should serve as a resource for further studies. There is information for different developmental stages so it should also be possible to look at the dynamics of localization of the mRNAs. Another application of this data would be to use it as an information source to predict functional association between genes.

Protein localization information has been used in the past for prediction of protein-protein interactions (both physical and genetic interactions). Typically this is done by integrating localization with other data sources in a probabilistic analysis [Jansen R et al. 2003, Rhodes DR et al. 2005, Zhong W & Sternberg PW, 2006].

To test if mRNA localization could be used in the same way I took from this website the localization information gathered in the Cell paper and available genetic and protein interaction information for D.melanogaster genes/proteins (can be obtained for example in BioGRID among others). For this analysis I grouped physical and genetic interactions together to have a larger number of interactions to test. The underlying assumption is that both should imply some functional association of the gene pair.

The very first simple test is to have a look at all pairs of genes (with available localization information) and test how the likelihood that they interact depends on the number of cases where they were found to co-localize (see figure below). I discarded any gene for which no interaction was known.

As seen in the figure there is a significant correlation (r=0.63, N=21, p<0.01) between the likelihood of interaction and the number of co-localizations observed for the pair. At this point I did not exclude any localization term, but since images were annotated using a hierarchical structure these terms are in some cases very broad.
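The pair-counting test described above can be sketched in a few lines of Python. This is only a minimal illustration, not the code used for the analysis: the gene names, localization terms and interaction list are made-up toy data, and the function name is hypothetical. It counts shared localization terms for every gene pair, drops genes without any known interaction, and reports the fraction of interacting pairs for each co-localization count.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy annotations: gene -> set of localization terms.
annotations = {
    "geneA": {"apical", "posterior", "perinuclear"},
    "geneB": {"apical", "posterior"},
    "geneC": {"apical"},
    "geneD": {"posterior", "perinuclear"},
}

# Hypothetical known interactions (physical and genetic pooled),
# stored as unordered pairs.
interactions = {
    frozenset(("geneA", "geneB")),
    frozenset(("geneA", "geneD")),
}


def interaction_fraction_by_colocalization(annotations, interactions):
    """Map each co-localization count to the fraction of pairs that interact."""
    # Discard any gene for which no interaction is known.
    genes_with_interaction = set().union(*interactions) if interactions else set()
    genes = sorted(g for g in annotations if g in genes_with_interaction)

    totals = defaultdict(int)  # pairs seen per co-localization count
    hits = defaultdict(int)    # interacting pairs per co-localization count
    for a, b in combinations(genes, 2):
        shared = len(annotations[a] & annotations[b])
        totals[shared] += 1
        if frozenset((a, b)) in interactions:
            hits[shared] += 1
    return {n: hits[n] / totals[n] for n in sorted(totals)}


result = interaction_fraction_by_colocalization(annotations, interactions)
print(result)  # {1: 0.0, 2: 1.0} for the toy data above
```

With real data the per-count fractions would then be correlated with the co-localization count (or binned, when high counts are sparse).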

More specific patterns should be more informative so I removed very broad terms by checking the fraction of genes annotated to each term. I created two groups of narrower scope, one excluding all terms annotated to more than 50% of genes (denominated "localizations 50") and a second excluding all terms annotated to more than 30% of genes ("localizations 30"). In the figure below I binned gene pairs according to the number of co-localizations observed in the three groups of localization terms and for each bin calculated the fraction that interact.

As expected, more specific mRNA localization terms (localizations 30) are more informative for prediction of functional association since fewer terms are required to obtain the same or higher likelihood of interaction. The increased likelihood does not come at the cost of fewer pairs annotated. For example, there is a similar number of gene pairs in bin "10-14" of the more specific localization terms (localizations 30) as in the bin ">20" for all localization terms (see figure below).
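The term filtering behind "localizations 50" and "localizations 30" can be sketched as follows. Again this is an illustrative sketch under assumptions: the annotations are toy data and `filter_broad_terms` is a hypothetical helper name, not something from the original analysis. A term is dropped when the fraction of genes annotated to it exceeds the chosen cutoff.

```python
# Hypothetical toy annotations: gene -> set of localization terms.
toy_annotations = {
    "geneA": {"embryo", "apical", "pole cell"},
    "geneB": {"embryo", "apical"},
    "geneC": {"embryo"},
    "geneD": {"embryo"},
}


def filter_broad_terms(annotations, max_fraction):
    """Remove terms annotated to more than max_fraction of all genes."""
    n_genes = len(annotations)
    counts = {}
    for terms in annotations.values():
        for term in terms:
            counts[term] = counts.get(term, 0) + 1
    keep = {t for t, c in counts.items() if c / n_genes <= max_fraction}
    return {gene: terms & keep for gene, terms in annotations.items()}


localizations_50 = filter_broad_terms(toy_annotations, 0.5)
localizations_30 = filter_broad_terms(toy_annotations, 0.3)
```

In the toy data "embryo" is annotated to every gene and is removed by both cutoffs, "apical" (half of the genes) survives only the 50% cutoff, and "pole cell" (a quarter of the genes) survives both. The filtered annotations can then be fed into the same pair-counting step used for the full term set.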

It is important to keep in mind that mRNA localization alone is a very poor predictor of genetic or physical interaction. I took the number of co-localizations of each pair (using the terms in "localizations 30") and plotted a ROC curve to determine the area under the ROC curve (AROC or AUC). The AROC value calculated was 0.54, with a 95% confidence lower bound of 0.52 and a p value of 6E-7 for the null hypothesis that the true area is 0.5. So it is not random (that would be 0.5) but by itself it is a very poor predictor.
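For readers who want to reproduce this kind of number, the area under the ROC curve can be computed directly from its rank interpretation: it is the probability that a randomly chosen interacting pair has a higher score than a randomly chosen non-interacting pair, with ties counted as one half. The sketch below assumes the score is the co-localization count of a pair and uses made-up values; it is not the tool used for the analysis above, and it does not compute the confidence bound or p value.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical co-localization counts and interaction labels (1 = interacts).
toy_scores = [3, 2, 2, 1, 0]
toy_labels = [1, 0, 1, 0, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
```

The double loop is quadratic in the number of pairs, which is fine for a toy example; for all gene pairs in a real data set a rank-based formulation (Mann-Whitney U) would be the more practical choice.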

In summary:
1) the degree of mRNA co-localization significantly correlates with the likelihood of genetic or physical association.
2) less ubiquitous mRNA localization patterns should be more informative for interaction prediction
3) the degree of mRNA co-localization is by itself a poor predictor of interaction but it should be possible to use this information to improve statistical methods to predict genetic/physical interactions.

This was a quick analysis, not thoroughly tested and just meant to confirm that mRNA localization should be useful for genetic/physical interaction predictions. I am not going to pursue this, but if anyone is interested it could be worthwhile to see which terms have more predictive power, with the idea of integrating this information with other data sources or possibly directing future localization studies. Perhaps there is little point in tracking different developmental stages, or maybe embryonic localization patterns are not as informative as sub-cellular localizations for predicting functional association.

Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science. 2003 Oct 17;302(5644):449-53.
Rhodes DR, Tomlins SA, Varambally S, Mahavisno V, Barrette T, Kalyana-Sundaram S, Ghosh D, Pandey A, Chinnaiyan AM. Probabilistic model of the human protein-protein interaction network. Nat Biotechnol. 2005 Aug;23(8):951-9.
Zhong W, Sternberg PW. Genome-wide prediction of C. elegans genetic interactions. Science. 2006 Mar 10;311(5766):1481-4.

Thursday, November 08, 2007

What I don't like about BPR3

For those that have not heard about it before, BPR3 stands for Bloggers for Peer-Reviewed Research Reporting. From their website:

"Bloggers for Peer-Reviewed Research Reporting strives to identify serious academic blog posts about peer-reviewed research by offering an icon and an aggregation site where others can look to find the best academic blogging on the Net."

It is all great except that it already exists, and has for a long time before BPR3. You can go to the papers section in Postgenomic and select papers by the date they were published or blogged about, by how many bloggers mentioned the paper, or limit the search to a particular journal. I even used this earlier this year to suggest that the number of citations increases with the number of blog posts mentioning the paper.

In this case I think that unless they really aim to develop something that is better than what Postgenomic already offers, the added competition will only fragment an already poor market. The value of a tracking site like Postgenomic, Techmeme or what BPR3 is proposing to create increases with the user base in a non-linear way. This is what people usually refer to as network effects in social web applications. An increasing number of users makes the sites more useful, reinforcing the importance of the social application. I suspect Postgenomic is not closed in any way to discussions. The code is even available here for re-use. So, why can't BPR3 and Postgenomic work this out and have a single tracking database and presentation? Let's say that BPR3 could be a mirror for the Postgenomic papers section (why re-invent the wheel).

I am not in favor of any particular site (sorry Euan :). What I think would be useful would be:
1) common standards for everyone (publishers, bloggers, etc) to carry information on published literature (number of times a paper was read, ratings, comments, blog posts, e-notebook data, etc) attached to a single identifier (DOI sounds fine)
2) one independent tracking site with enough users to gain hub status such that everyone gains from high exposure to the science crowd.

Thursday, November 01, 2007

The right to equivalent response

(disclaimer: I worked for Molecular Systems Biology)

The last issue of PLoS Biology carries an editorial about Open Access written by Catriona J. MacCallum. It addresses the definition of Open Access and what the author considers an "insidious" trend of obscuring "the true meaning of open access by confusing it with free access".

I agree with the main point of the editorial, that we should keep in mind the definition of open access and that the capacity to re-use a published work should have more value to the readers.

However, it is very unfortunate that the very first example MacCallum picks on is the Molecular Systems Biology journal, for the simple fact that very recently they have changed their publishing policies to address exactly this issue. Authors can choose one of two CC licenses, deciding for themselves if they want to allow derivatives of their work or not. See post at MSB blog. As explained in the blog post, the discussions about the licenses actually started several months ago and I think the final implementation is a very balanced decision on their part.

Thomas Lemberger, editor at MSB, wrote a reply to the editorial that PLoS decided to publish as a response from the readers. These responses can only be seen if readers decide to click the link "Read Other Responses" on the right side of the online version.

I am obviously biased but for me this is not really giving the right to equivalent response. It would not have cost them much to issue a correction or publish the letter as correspondence where it would have the same visibility as the editorial. This would signal that they are indeed committed to collaborating with other publishers and journals that support open access (as stated in PLoS core principles).

Bio::Blogs #16, The one with a Halloween theme

The 16th edition of Bio::Blogs is now available at Freelancing science. Jump over there for a summary of what has been going on during this month in the bioinformatics-related blogs. If for nothing else, then just to have a look at the pumpkin. Thanks again to everyone that participated.
Paulo Nuin from Blind.Scientist has volunteered to host the 17th edition, which is scheduled to appear as usual on the 1st of December.

Thursday, October 25, 2007

Building an e-Science platform with Microsoft tools

(via Frank Gibson's Peanutbutter) Hugo Hiden, the technical director of the North-East Regional e-Science Centre (NEReSC) started a new blog where he will explore how to build an e-Science platform based on Microsoft technology. The initial post explains a little bit why he is doing this:
"The reason for this blog is, primarily, to document my experiences with writing a prototype e-Science research platform using Microsoft tools instead of the more traditional approach of fighting with Open Source. This way is easier, supposedly."
and also, what he aims to build:
"The task I have set myself is to recreate, at a basic level, the software being developed by the CARMEN project (http://www.carmen.org.uk). "

Let's see how it goes. Maybe they'll take suggestions later on :).

Sunday, October 21, 2007

Bio::Blogs #16 - call for submissions

The next edition of Bio::Blogs (bioinformatics blog journal) will be hosted at Freelancing science on the 1st of November. If you find anything this month that you think is interesting to add to this edition, send an email to bioblogs at gmail dot com before the end of the month. Anyone interested in hosting a future edition can also send an email to volunteer.

Friday, October 19, 2007

The Fortune Cookie Genome

*in an imaginary future*

Today is the day I get the sequencing results back. It is going to be interesting to finally have a glimpse of my very own genome. At the same time I am afraid of the potential disease associations they might find in there. In any case I would rather know in time to do something about it. That's it ... I exhale and open the main door to the building, walking up to the desk.

- Hi. I have an appointment with my genetic adviser.
- Oh yes, go up to the 3rd floor, they are expecting you.

I walk up a DNA shaped stairway and walk into the office of one of the attending specialists. He was the one convincing me of how useful it would be to purchase the GenomeSurvey(TM) package.

- I got your email. The results are in ?
- Yes, we have your genome fully sequenced and uploaded into your service of choice. I see you have picked Google Health as your storage provider as part of the package.
- Is there any bad news ? Will I have a serious disease soon ?
- I understand your concern. There is really nothing too serious, but I will come to that in a moment. You may log in with your Google account here and I can guide you through some of the results.

I log in to my health page and I am confronted with the usual simple white-blue Google interface. I notice the addition of a genome tab and let my adviser tell me more about it.

- As you can see, your genome has been uploaded to your account. It has also been submitted as a John Doe genome to the NCBI personal genomics database. You may choose later to make your identity known and/or associate any of your personal history information with it.
- What about the disease associations ?
- Yes. So you can click here on the associations report to have a full listing of the phenotypic associations. You have a very healthy genome, no serious rare diseases. In your case the most important finding is that you have a 2% increased likelihood of developing a heart condition when you are above 60 and a 1% increased likelihood of having Alzheimer's disease after 65.
- That's it ? 2% ? 1% ?
- Well, that is assuming no prior knowledge on your diet and other personal history as established in the large HapMap version 10. From now on you may input into the forms provided in Google Health all your diet and other personal information on a daily basis and as the information accumulates the service will automatically update the probabilities. As your adviser I should tell you that this information can be used by Google to provide you with better targeted advertisement in all other Google products.
- Right ... is this it ? Does the package include anything else ?
- Of course ! As I mentioned to you before, you can click here on the prescription tab to get informal advice on how best to deal with the associations that were found for you. You should always discuss these suggestions with your doctor before doing anything. By company policy I cannot read this information with you, since we are not liable for this. You can read it at home when you get there.
- Well , if there is nothing else I will go.
- Thank you again for choosing our GenomeSurvey(TM) package. I am happy to have served you and I hope that you feel more empowered about your own health. Be well.

I go home feeling a bit cheated but obviously happy to have no serious disorder on the horizon. I rush to my home computer to read the prescription that will help me prevent my heart condition and Alzheimer's. I click the GoogleDoctor(TM) button and a clip-like avatar jumps around on the screen. A computerized voice reads aloud the text appearing on the screen:

Dear Pedro. You can call me clipy ! I will be your assistant for any of your health needs. In order to decrease the likelihood for the negative phenotypes associated to your genome please consider abiding by the following rules:
- Do a lot of exercise
- Eat a healthy diet
- Find balance in your life

*in an imaginary present*

- Snap out of it, what does yours say ?
I look back to the small piece of paper in my hand and read:
- "You must find balance in your life", that's what it says.
- Well, these things are never wrong.

I drop the paper on my dish and finish eating the fortune cookie before leaving the Chinese restaurant with my friends.
- You won't believe what I thought of ...

Further reading
The Future of Personal Genomics (21 September 2007 Science)
How much information is there really in personal genomes and how much should patients know ? Extra points for citing a post from Eye on Dna in a Science Policy Forum.
The Science and Business of Genetic Ancestry Testing (10th October 2007 Science)
A discussion surrounding results of genetic ancestry tests and the commercialization of these tests.
Google Says Its Health Platform Is Due In Early 2008 (17 October InformationWeek)
Google is still trying to build a platform to host the health related information. Microsoft already launched a service called HealthVault (read about it from Deepak).
BMC Medical Genomics (17 October BMC blog)
BMC will launch a journal dedicated to Medical Genomics, covering articles "on functional genomics, genome structure, genome-scale population genetics, epigenomics, proteomics, systems analysis and pharmacogenomics in relation to human health and disease."
Do-it-yourself science (17 October Nature)
This editorial links up several news items, opinions and articles in the last issue of Nature to ask the question - How much involvement can patient advocates have in genetics? The most impressive article is the story of Hugh Rienhoff, a trained geneticist and biotechnologist who decided to personally research his daughter's disease (as in buying a PCR machine etc). (via Keith)
Common sense for our genomes (18 October Nature)
Steven E. Brenner explains the need for a Genome Commons. See discussion at bbgm.

Thursday, October 11, 2007

JournalFire

A new science related service called JournalFire has started. It was apparently created by a group of graduate students that are "frustrated with the current system of scientific discourse and publication". According to the initial blog post this service "provides a centralized location for you to share, discuss, and evaluate published journal articles. You, the scientists, are put in charge of determining what studies are significant and noteworthy."

I did not have a chance to test it since it is in private beta, but I have asked for an account. It looks like anyone with an .edu account should be able to access it already. It sounds promising but, as with many of these services, a lot depends on the capacity to attract a sufficiently large group of people to sustain interesting discussions. I will update the post if I get an account to test the service.
(I wonder if the people from OpenWetWare have anything to do with this)

Monday, October 01, 2007

Bio::Blogs #15

Welcome to the 15th edition of the bioinformatics blog journal Bio::Blogs.

I complained a while ago that there was very little expansion of the bioinformatics blogging community, but at least in the last couple of months it looks like this is changing. Although they did not necessarily start last month, here are three blogs that I only recently noticed: At the end of the day from Stephen Spiro (Spiro lab homepage), Paradoxus and Saaien Tist from Jan Aerts.

Not only are there more blogs, there are many more examples of bloggers posting original ideas and research. Most people agree that being open about research should foster collaboration but so far few people have really tried to do it. It is inspiring to read through these examples and try to imagine how we might be doing science in the next couple of years.
This month was also marked by the many conference reports that we had available to read and by the experiments of taking real life conferences into Second Life.

Keeping this short and to the point this edition of Bio::Blogs focuses on these conference reports and on the ongoing experiments of using blogs to post about original research. I hope this nudges more people to go ahead and give blogging and open science a try.

Conference Reports

Neil Saunders was at the ComBio2007 conference and posted his notes about it in a four part series (1,2,3,4).

Allyson from Systems Biology & Bioinformatics provided a very extensive coverage of Integrative Bioinformatics 2007. Read all about it in chronological order from parts 1 to 10 (1,2,3,4,5,6,7,8,9,10).

From my blog here are two blog posts on the FEBS workshop - "The Biology of Modular Protein Domains" (1,2). This was not really about bioinformatics but I hope it will be interesting from the perspective of what data is coming that requires good integration strategies.

I'll jump now from real life to virtual talks. Those creative people at Nature keep testing out the potential of the web to improve the interchange of knowledge. They kicked off a seminar series of digital talks in the Second Nature island within Second Life. The first talk by Philipp Holliger, entitled "New polymerases for old DNA", was about the engineering of new polymerases to amplify ancient DNA. Joanna Scott (working at Nature) has a very nice report on the talk in her blog.

Continuing on with virtual talks, in the past month there were another 3 sessions of the series SciFoo Lives On, organized by Jean-Claude Bradley and hosted also in Second Nature. JC Bradley covered the sessions on his blog: Sept 4 - Definitions in Open Science, Sept 10 - Communicating Science with Video, Sept 24 - Open Notebook Science Case Studies. Additional coverage by other bloggers can be found via the wiki page.

Blog articles

What are some of the most frustrating bottlenecks in bioinformatics research ? Where do we really spend most of our time ? Given that we work with digitized information it should in principle be mostly about the ideas. Thinking about interesting questions, crossing information and interpreting the results. At least for me this is typically not the case. What usually takes time is gathering all the necessary information in a way that can be analyzed. Three blog posts this month discuss this problem. Hari Jayaram and Neil Saunders posted about the problems they faced when attempting to do conceptually simple tasks. In response Deepak wrote a thoughtful post on how science databases should focus also on making the information easily accessible via appropriate APIs.

From online discussions to great examples of open science, we start off with Jeremiah Faith's post where he describes an idea to determine the effect of sequence level mutations on transcription, translation, and noise.

Michael Barton from Bioinformatics Zen created a new blog dedicated to posting about his research on gene expression in yeast. Jump over there to read the many blog posts that he already has there, to provide feedback and maybe find common ground for collaborations.

Also this month, RPM from Evolgen re-started his attempt to publish original research on the blog. He is trying to study the evolution of a duplicated gene in Drosophila. There are two posts covering the introduction to the problem (part 1, part 2).

The last post highlighted in this month's edition is from Benjamin M Good. He has been working on a tool called Entity Describer to add semantic controlled vocabularies to Connotea and he has posted the manuscript they will try to publish on his blog and in Nature Precedings (10101/npre.2007.945.2).

This is it for this month. As usual, if anyone is interested in serving as editor for any future edition, tell me by email.

ICSB 2007

I am attending the eighth International Conference on Systems Biology (ICSB 2007) in Long Beach. I typically prefer smaller conferences but this one is probably the best one to get an overview of the recent progress in systems biology. As expected the program has a broad scope and unlike last year's meeting there are no parallel sessions so I will have a chance to hear more from other fields. Any other bloggers attending ?

Saturday, September 29, 2007

Modular protein domains (an overdue wrap-up)

I did not even cover 1/3 of the Modular Protein Domains workshop in my previous blog post. I will not attempt to do it now after so much time. The organizers were clearly concerned about keeping the information within the participants so I will just post some of the general impressions that I took from the meeting.

Specificity profiling in high gear
There were several sessions dedicated to particular protein domains (SH3, SH2 and PDZ in particular) and for all of these there are several projects under way (or mostly completed) to determine the binding specificity of a large number of these domains (although in different species) using phage display, spotted peptides or other methods. We should project ahead and start planning what to do with this information. How can we combine it to predict pathways and pathway models with dynamical information? The work of Rune Linding is a very good start at this (see NetworKIN).
Given that the methods are set up I suspect that the emphasis might now shift to exploring the evolution of binding specificities and the impact of disease causing mutations (i.e. profiling binding specificities of domain variants).

Good integration of different methods
Compared to the same meeting two years ago I had the impression that there was a better integration of different approaches (biochemical, structural, computational, etc). A particularly good example was the work of Michael B. Yaffe. There were plenty of structural talks (probably a bit too many) but I found particularly interesting the work of Ivan Dikic, who presented extensive novel work on ubiquitin binding domains, and Charalampos Kalodimos, who presented his lab's work on potential functional roles of proline isomerization (Pubmed).
The computational part was well represented too and it was fun to see again Gary Bader and to get to know Philip Kim.

I hope to be there again in two years time to see how the field changed.

Bio::Blogs #15 - call for submission

Since there were no volunteers :) I will be hosting the 15th edition of Bio::Blogs here in the blog. I will be gathering some posts from around the web on bioinformatics and other science related topics from the last month and will post about it on the 1st of October. Suggestions are more than welcome. Please email any links to interesting blog posts to bioblogs at gmail dot com.

On a personal note, I have defended my PhD :). This mostly explains the low volume blogging.

The ephemeral journal II

(via Deepak) Earlier this month I posted about how re-grouping of content after publication could be used to foster the creation of more focused online scientific communities. My impression is that these "places" could more easily attract a group of people of similar interests that would more likely engage in discussions, in contrast to a place like PLoS ONE that covers way too many topics.

There are several names for these groupings (Nature/BMC gateways, Nature Reports, a topics page) and PLoS came out with another one - Hub. They launched a re-grouping of content focused on Clinical Trials that they call PLoS Hub for Clinical Trials. It is built on Topaz so it has everything that PLoS ONE has (comments, ratings, trackbacks,etc).

They mention on the home page of this Hub that in the future they also plan to "feature open-access articles from other journals plus user-generated content". I suspect that they could go even a bit further on this and give more control to the users for the creation of content for the Hubs and even to create new Hubs. One thing that I like in traditional journals, and that also creates a feeling of identity and community, is the more personal news and views and editorials. PLoS could commission/invite scientists/bloggers to help create this type of content for their Hubs. This would be something like a community blog centered on each Hub's research.

Once upon a time (before Digg if I remember right), we tried to do this in Nodalpoint. For a while we had a queue of papers from bioinformatics-related journals that we could vote on to upgrade them to the front page of the blog. At the time it did not work very well because of lack of users and participation, but in essence it was not very different from what the publishers are trying to do now. Maybe we could try it again :).

Wednesday, September 19, 2007

Vote for your favorite life science blogs

(via Science Hacker and Postgenomic) The Scientist wants to compile a list of life science blogs that people enjoy reading as a reference. It is really not a good question to ask since there are so many different fields and styles of writing.

Tuesday, September 18, 2007

More on open science

I am still catching up with a backlog of feeds and e-tocs but I just noticed that Benjamin Good posted his manuscript on E.D. in Nature Precedings. I went back to his post where he first presented the manuscript to have a look at the comments and there is a nice discussion going on there. It is a good example of the usefulness of posting our work online. There might still be too few knowledgeable people online to get very good feedback in every area, but this will tend to grow with time.

Michael Barton from Bioinformatics Zen started a new blog to use as an open science notebook about his own research.

I have a mini project in mind about the evolution of domain families that I will start describing and working on here in the blog soon.

Sunday, September 09, 2007

The biology of modular domains (day1 and morning of day2)

I am attending the 3rd (I think it is just the third) conference on modular protein domains. It is a small conference of just 80 people with a very nice environment for discussions. Given the nature of the conference I suspect that a lot of the talks will be about unpublished material so I will be light on the details since I have not personally asked people if I may post about their work.

On the first day of the conference on modular protein domains we had the opening lecture by Wendell Lim. It was a very light and interesting discussion of the evolution and engineering of signaling pathways. Lim started by discussing some interesting results coming from the sequencing of M. brevicollis, a unicellular choanoflagellate that is related to Metazoa and might provide some information about their evolution. It is a continuation of an analysis done by Nicole King and Sean B. Carroll that first identified a receptor tyrosine kinase in M. brevicollis, the first time one was identified outside of the Metazoa. The discussion was generally about the evolution of kinase signaling and how a system of what Lim called "readers" (phospho-binding domains), "writers" (kinases) and "erasers" (phosphatases) can arise in evolution.
The second part of his talk was about the efforts to understand the evolutionary capacity of signaling networks by trying to engineer new or altered pathways. In this case the focus was on how the dynamic responses of signaling networks can be shaped with only a few components and small changes to those components.

Morning session of the second day

Synthetic biology
The Synthetic Biology sessions started off with a talk by David Searls on "A linguistic view of modularity in macromolecules and networks" (that was not very related to synthetic biology but nevertheless interesting). Searls detailed his views on the analogies between linguistics and biology. Here is a recent review by Mario Gimona on this analogy. At the protein level we could think of sequence, structure, function and protein role as similar to the lexical, syntactic, semantic and pragmatic levels of linguistic analysis:

(Image reproduced with permission)

The general idea of building these bridges between topics is to be able to take existing methods and discussions from one side to the other (see review).

The second talk was by Kalle Saksela and again it had little to do with synthetic biology. Saksela's group is working on high-throughput interaction mapping for human SH3 domains against full proteins (human and viral proteins). They mentioned their progress in expressing and analyzing a subset of these interactions. He mentioned an interesting example where the Nck and Eps8L1 SH3 domain binding site in CD3epsilon overlapped with an ITAM motif such that the phosphorylation of the ITAM motif abolished binding by the SH3 domains. It is a nice example of signaling mediated by different types of peptide binding domains (see paper for details).

The third talk was by Rudolf Volkmer. He gave a short talk on a library of coiled coil proteins. The library contains many single mutant variants of the GCN4 leucine-zipper sequence. They then tested pairs of mutants for heterodimerization by SPOT assays. Aside from extending the knowledge of this domain family, the library can also be used now as a toolkit of binding domains for synthetic biology (the work is already published).

The final talk on this panel was from Samantha Sutton from the Drew Endy lab. This was more like what one would expect from a synthetic biology talk. Samantha Sutton is interested in developing what she calls Post Translational Devices, general abstract devices that can regulate the post-translational state of proteins in a predictable fashion. She has a page in OpenWetWare detailing her thoughts on this.

The second panel in the morning was about In silico computational methods.
Cesareni presented his group's ongoing efforts to experimentally determine human SH3 and SH2 interactions with spotted peptides. He then showed how this data can be used to search for examples where there is overlapping recognition by different domain types. The work is similar in methodology to the paper published by Christiane Landgraf and colleagues in PLoS Biology but now using two domain families and the human proteome.
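
As a rough illustration of what searching for overlapping recognition could look like computationally, here is a minimal Python sketch. It is not the method presented in the talk. It simply assumes that each interaction can be reduced to a residue range on a target protein plus the type of domain that binds it, and reports pairs of sites on the same protein that are bound by different domain types and share at least one residue. The `Site` record, the `overlapping_sites` function and the coordinates are all hypothetical and invented for this example.

```python
from collections import defaultdict
from itertools import combinations
from typing import NamedTuple


class Site(NamedTuple):
    """Hypothetical record for one domain-peptide interaction."""
    protein: str      # placeholder protein identifier
    start: int        # first residue of the bound peptide (inclusive)
    end: int          # last residue of the bound peptide (inclusive)
    domain_type: str  # e.g. "SH3" or "SH2"


def overlapping_sites(sites):
    """Return pairs of sites on the same protein, bound by different
    domain types, whose residue ranges share at least one position."""
    by_protein = defaultdict(list)
    for site in sites:
        by_protein[site.protein].append(site)

    pairs = []
    for protein_sites in by_protein.values():
        for a, b in combinations(protein_sites, 2):
            if a.domain_type == b.domain_type:
                continue  # only interested in different domain types
            # Two closed intervals overlap when each starts before the other ends.
            if a.start <= b.end and b.start <= a.end:
                pairs.append((a, b))
    return pairs


# Toy, invented coordinates purely for illustration (not real data).
toy_sites = [
    Site("protein_A", 10, 20, "SH3"),
    Site("protein_A", 18, 25, "SH2"),
    Site("protein_A", 40, 50, "SH2"),
    Site("protein_B", 5, 12, "SH3"),
]

for a, b in overlapping_sites(toy_sites):
    print(a.protein, a.domain_type, (a.start, a.end), b.domain_type, (b.start, b.end))
```

The pairwise comparison is quadratic in the number of sites per protein, which should be fine for a handful of sites per protein. A real analysis would also need to decide how much overlap is meaningful.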

Vernon Alvarez from AxCell Biosciences gave a talk about a proprietary database called ProChart (that I cannot find online) containing many domain-peptide interactions tested by the company. He was basically promoting the database to anyone interested in collaborations.

The third talk was by Norman Davey, author of SLIMDisc, a linear motif discovery method. He is trying to improve the method, mostly by improving the statistics.

I gave the second short talk of the session. It was on predicting the binding specificity of peptide binding domains using structural information. It is basically a continuation of some of the work I mentioned before here in the blog about the use of structures in systems biology but now applied to domain-peptide interactions.

Saturday, September 08, 2007

The Biology of Modular Protein Domains

From tomorrow on I will be in Austria for a small conference on the biology of protein domains. I might post some short notes about the meeting in the next few days. I'll get a chance to present some of the things I have been working on about the prediction of domain-peptide interactions from structural data.

Here is one of these modular protein domains, an SH3 domain, in complex with a peptide:

The very short summary of it is that it is possible to take the structure of one of these domains in complex with a peptide (e.g. SH3, phospho-binding domains, kinases, etc.) and predict its binding specificity. To some extent it is also possible to take a sequence, obtain a model (depending on structural coverage) and determine its specificity. I'll talk more about the details (hopefully) soon.