Monday, October 30, 2006

Networks in the sky: a new concept of modularity?

A recent paper by Batada and colleagues published in PLoS Biology tries to consolidate the available information on protein-protein interactions for S. cerevisiae. The authors have attempted to create a high-confidence set of interactions that they then further analyze. The main conclusion from the paper is that the highly connected proteins (usually referred to as protein hubs) do not avoid each other, as was previously put forward by other authors. From this observation they suggest that we should rethink our view of modularity in cellular networks. Cellular interaction networks should be viewed, not as altocumulus clouds, “i.e., cotton ball-like structures sparsely connected by thin wisps”, but instead as the “continuous dense aggregations of stratus clouds”.

Although I find it useful to constantly update our view of cellular networks through the consolidation of available data, I think some words of caution were left unsaid in this work.

The new consolidated protein interaction network was obtained mainly by adding a recent literature curation effort to the already available high-throughput interaction datasets obtained using yeast-two-hybrid and affinity methods. The majority of the interactions added are from affinity methods. This leads me to one of the points that I think is not usually mentioned in this type of effort: not all methods provide the same information. For example, I think that affinity methods mostly inform us that two proteins share the same complex. When a protein is tagged and used as a bait to capture prey proteins, the identified preys should belong to the same complex, but it is not obvious that there should be a direct interaction between the bait and each prey. However, this was the assumption used in this work. It is usually referred to as the spoke model (see figure 1).
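To make the spoke assumption concrete, here is a minimal Python sketch of how pull downs get turned into binary interactions. The function names and the toy complex (proteins A to D, purified twice with baits A and B) are invented for illustration; they do not come from the paper or from any real dataset.

```python
from collections import Counter
from itertools import combinations


def spoke_pairs(bait, preys):
    """Spoke model: assume the bait interacts directly with every prey."""
    return {frozenset((bait, prey)) for prey in preys if prey != bait}


def prey_prey_pairs(bait, preys):
    """Pairs among the preys themselves, which the spoke model leaves out."""
    others = sorted(set(preys) - {bait})
    return {frozenset(pair) for pair in combinations(others, 2)}


# Hypothetical complex of four proteins, purified twice with different baits.
pull_downs = [("A", ["B", "C", "D"]), ("B", ["A", "C", "D"])]

spokes = set()
for bait, preys in pull_downs:
    spokes |= spoke_pairs(bait, preys)

# Degree of each protein in the extrapolated binary network.
degree = Counter(protein for pair in spokes for protein in pair)
print(sorted(degree.items()))  # [('A', 3), ('B', 3), ('C', 2), ('D', 2)]
```

In this toy case the two baits come out as the most connected proteins, and connected to each other, simply because they were the ones chosen as baits.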

I have tried to evaluate how likely bait-prey interactions are to occur, compared to prey-prey interactions, using either structural information or yeast-two-hybrid interaction data (see table 1 and table 2).

Table 1 – Pull down experiments were taken from Gavin et al, 2002. For each individually reported pull down, potential bait-prey and prey-prey interactions were counted if the corresponding proteins had known PFAM domains. Using the database of domain structural interactions (iPFAM), I searched for plausible domain-domain interactions that could account for the protein interaction. Bait-prey interactions are roughly 2 times more likely than prey-prey interactions to be explainable by known domain-domain interactions currently stored in structural databases.
Table 2 – Pull down experiments were taken from Gavin et al, 2002. For each individually reported pull down, potential bait-prey and prey-prey interactions were counted. The overlap of these interactions with known yeast-two-hybrid interactions is shown. Bait-prey interactions are roughly 2 times more likely to be observed in a yeast-two-hybrid study than prey-prey interactions.
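
The comparison in both tables boils down to asking what fraction of a set of candidate pairs also shows up in a reference set. A rough sketch of that calculation is below; the helper names and all the pairs are toy data made up for illustration, not the Gavin et al. pull downs or real yeast-two-hybrid results.

```python
def fraction_supported(candidate_pairs, reference_pairs):
    """Fraction of candidate pairs that are also present in a reference set."""
    if not candidate_pairs:
        return 0.0
    return len(candidate_pairs & reference_pairs) / len(candidate_pairs)


def pairs(*names):
    """Build a set of unordered protein pairs from strings such as 'A-B'."""
    return {frozenset(name.split("-")) for name in names}


# Toy data, invented for illustration only: one pull down with bait A.
bait_prey = pairs("A-B", "A-C", "A-D", "A-E")
prey_prey = pairs("B-C", "B-D", "B-E", "C-D", "C-E", "D-E")
y2h = pairs("A-B", "A-C", "C-D")  # stand-in for a yeast-two-hybrid set

print(fraction_supported(bait_prey, y2h))  # 2 of the 4 bait-prey pairs
print(fraction_supported(prey_prey, y2h))  # 1 of the 6 prey-prey pairs
```

Using unordered pairs (frozenset) means A-B and B-A count as the same interaction, which matters when the reference set lists partners in a different order.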

What one could conclude from this is that bait-prey interactions are indeed more likely to occur than prey-prey interactions, but also that only a small percentage of the bait-prey interactions can be validated with a method that is more likely to measure direct interactions. Using both domain-domain structural information and yeast-two-hybrid studies, 20% of the bait-prey interactions can be accounted for. Although this value depends on our current knowledge of domain-domain interactions and the coverage of yeast-two-hybrid studies, it should at least be discussed. One problem with using the spoke model can be seen, for example, in figure 2. When multiple baits are used for the same complex, it is easy to create artificially interacting hubs when extrapolating binary interactions from affinity data.

In all fairness, in this study the authors only accepted an interaction as true if it was observed more than once, but they also counted multiple affinity observations as confirming a direct interaction. It would be useful to come up with better methods to extrapolate from complexes to binary interactions. I hope the differences observed in this work regarding hub-hub interactions are not mainly due to the proportion of interactions extrapolated from affinity methods.

One observation that the authors used to support their claims was that, apparently, the fraction of hub-hub interactions depends on the scale of the experiment (see figure 3A, taken from the authors' paper). According to this result, experiments reporting a higher number of interactions tend to have a lower fraction of hub-hub interactions. Hubs were defined in the figure legend as the 10% most connected proteins in the network, but in the results description as the 5% most connected proteins. One thing that the authors failed to show is how this fraction of hub-hub interactions depends on the size of the network alone. I have used the network given in the manuscript and randomly sampled 10% to 90% of the interactions (repeated 50 times) to plot the dependence of the fraction of hub-hub interactions on network size (see figure 3B). Hubs were defined as the 5% most connected proteins.
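
For anyone wanting to repeat this kind of sampling, here is a minimal Python sketch. It rests on assumptions the text above does not spell out: the fraction of hub-hub interactions is taken to be the share of all interactions in which both partners are hubs, and hubs are re-ranked by degree inside each sampled network. The function names and the four-edge toy network are hypothetical.

```python
import random
from collections import Counter


def hub_hub_fraction(edges, hub_share=0.05):
    """Fraction of interactions whose two partners are both hubs.

    Hubs are the top `hub_share` of proteins ranked by degree (at least one).
    """
    if not edges:
        return 0.0
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=degree.get, reverse=True)
    n_hubs = max(1, int(len(ranked) * hub_share))
    hubs = set(ranked[:n_hubs])
    both = sum(1 for a, b in edges if a in hubs and b in hubs)
    return both / len(edges)


def sampled_fractions(edges, sample_share, repeats=50, hub_share=0.05, seed=0):
    """Hub-hub fraction in `repeats` random subsets of the interactions."""
    rng = random.Random(seed)
    k = max(1, int(len(edges) * sample_share))
    return [hub_hub_fraction(rng.sample(edges, k), hub_share)
            for _ in range(repeats)]


# Toy network, far too small to say anything; it only shows the calls.
toy = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C")]
print(hub_hub_fraction(toy, hub_share=0.5))  # 0.25: one of four edges joins two hubs
fractions = sampled_fractions(toy, sample_share=0.5, hub_share=0.5)
print(sum(fractions) / len(fractions))
```

With a real edge list one would call `sampled_fractions` for each sample share from 0.1 to 0.9 and plot the mean against the share.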

What we can see is that the fraction of hub-hub interactions depends on the size of the network, decreasing for smaller network sizes. Can this explain the result observed by the authors? In figure 3A, taken from the authors’ paper, the whole network was binned by the number of interactions reported per paper. I have tried to calculate the size of different networks obtained from binning according to the scale of the experiment (see table 3).

I would say that, although I could not reproduce exactly the same bins reported in the paper, the trend is for network size to decrease when going from all interactions reported in small-scale studies to those reported in medium-scale experiments. Therefore I think the observed result is mostly explained by differences in network size. In fact, if one took all interactions reported in experiments observing at least 30 interactions, the fraction of hub-hub interactions would be even higher than in the network obtained from very small-scale experiments.

I think this manuscript highlights that we should constantly re-evaluate our views on cellular networks as more data is made available. Although the concerns raised here do not contradict the authors' conclusions, I think they should have been more carefully discussed. In particular, I think we require better methods to extrapolate binary interactions from affinity data, and it is important to mention that this extrapolation might lead to false positive interactions. Also, there is a strong effect of network size on the observed fraction of hub-hub interactions. This might explain both the observed increase of hub-hub interactions with increase in the coverage and the observed dependence of the fraction of hub-hub interactions on the scale of the experiment.

Thursday, October 26, 2006

I googled from Yahoo too

There are some Google people going nuts over at the official Google blog. They don't want us common mortals to use the word Google as a verb when searching through something other than their website. That is a very good way to attract some negative feelings.

While we're pleased that so many people think of us when they think of searching the web, let's face it, we do have a brand to protect, so we'd like to make clear that you should please only use "Google" when you’re actually referring to Google Inc. and our services.

So, let me say how much I like to google from Firefox, for example. I actually google almost everything in Firefox. When I don't google from Firefox I google from my taskbar. Right now both are set to actually google in Google, but a few more nice blog posts like this one and I might actually give Live Search another try.

Wednesday, October 25, 2006

Focus on Systems Biology: a User's Guide

Nature Cell Biology and Nature Reviews Molecular Cell Biology are jointly producing a Focus on Systems Biology: a User's Guide. It contains several reviews on topics related to data management, bioinformatics and modelling (probably subscription only).

PLoS ONE - Spread the Word

PLoS ONE will soon make its debut. They seem to be going strong with 210 submissions since the initial launch on the 4th of August. Last week alone they got 30 papers to review.

According to an email they sent to the mailing list, they now have 179 members on the editorial board, but they encourage scientists from areas that are not yet well covered on the PLoS ONE editorial board to work with the journal.

They have also added some goodies to download for anyone interested in spreading the word. Grab a flyer, print it and post it up at your workplace :)


Sunday, October 22, 2006

Bio::Blogs icon entry

Here is an icon made by Rick from My Biotech Life:

I really like how he managed to fit the "bioinformatics blog journal" on top of bio::blogs. He is right that it is a bit big for an icon but it looks really great :). Thanks for the effort.
I guess we can keep the challenge open until the end of the month and then choose one during the edition. If anyone wants to submit an entry by email send it to bioblogs _at_

Recently added feeds - more science bloggers

The blog of Jonathan Eisen, an evolutionary biologist and a Professor at U. C. Davis (Eisen lab at OpenWetWare).

The blog of Mark Gerstein, Professor of Biomedical Informatics at Yale (lab site).

(via mndoci) The Omics World, a blog about : "Genomics, life science technology , computational biology , structural biology and their inter-relationships"

Wednesday, October 18, 2006

Changing scientific publishing

There was a panel discussion about the future of scientific publishing at the Neuroscience 2006 conference. Sandra Aamodt of Nature's Action Potential blog, Jake Young from Pure Pedantry and Dave Munger from Cognitive Daily have blogged their thoughts on what was talked about.

There are two main points under discussion. One is how publishers can make the switch to an open access model (where all the content is available) in a sustainable way. My impression from what I read and from talking to other people is that a lot of publishers are experimenting, or will experiment, with open access options, and if the demand is high enough this is the direction they will take.

The other big discussion is how to transition to the web, taking advantage of tools that are not available in the print world. This issue is unfortunately much less explored. One of the reasons is that publishers are stuck on the first issue. Another is that the people in charge of editorial boards are taking some time to realise the potential of the internet. For them, blogs and wikis are something messy and chaotic that teens use.

Are there any science-related activities on blogs and wikis?
- according to the September statistics, there are about 1500 science-related blog posts per week coming from about 200 science blogs registered in postgenomic
- postgenomic has gathered comments on about 2500 papers
- there are about 1000 science-related blogs registered in Technorati
- there are more than 1500 scientists helping to build the OpenWetWare wiki

I did not try to get numbers from connotea and citeulike, but I am sure that there are a lot of papers being tagged and rated every day.
Given that the science community has just started to participate online in blogs and wikis, I guess the numbers will only increase.

One interesting detail: postgenomic keeps track of the most referenced journals (on indexed blogs) and the top three are Nature, Science and ArXiv. The third most blogged about "journal" is a repository of manuscripts that have not yet been peer reviewed.

There are all sorts of criticisms that one can make of these numbers. Technorati numbers are probably inflated by spam blogs and blogs that are not really science related, comments indexed by postgenomic can be anything from one line to a full review, etc. I just wanted to show that there is already a lot of science communication going on in blogs and wikis.

Tuesday, October 17, 2006

Bio::Blogs icon challenge

Bio::Blogs, the bioinformatics blog carnival/journal, is going for its 5th edition. It will be up on the 1st of November on Chris' blog. Maybe we could celebrate by trying to create an icon to represent Bio::Blogs. I quickly stole some ideas (I think from something I saw in Neil's blog :) to make this up:

It should not be difficult to make something better :) Does anyone want to try ?

Saturday, October 14, 2006

The 3rd EMBL Biennial Symposium

One of the nice things about being at EMBL is that we can sneak into the ongoing conferences. This one is entitled: "From functional genomics to systems biology" and today was the first day.

I will just highlight some of the talks that I found most interesting. One was by a group leader here at EMBL, Lars Steinmetz, who has been using tiling arrays (microarrays that try to cover the whole genome) in S. cerevisiae to look at the expression of non-coding regions in the genome. Although S. cerevisiae does not have the components for RNA interference, it does seem to have many non-coding RNAs that are expressed. Many seem to be antisense to coding genes, and the group could also identify some non-coding RNAs that oscillate in a cell-cycle dependent manner. They are also using these arrays to look at differences in expression between strains and to study recombination events.

The talk that I most enjoyed today was by Alexander van Oudenaarden. His group is studying small cellular circuits and cellular noise. He showed a lot of data on the galactose inducible promoter, detailed in this picture I took from their site:

He showed that the activity of the Gal4 promoter can be in two separable steady states and that this is history dependent. So yeast cells have a memory of their metabolic state. He also showed that this state is inherited by the daughter cells and that they can manipulate the system such that the circuit has a more or less stable memory. Moreover, not only is the state inherited, but even the likelihood that the cells will switch state is transmitted to the daughter cells. They studied this with flow cytometry to quantify single cell measurements and with microscopy to be able to follow the lineages. All of this was well integrated into predictive models of the system. Extra points for having a webpage on OpenWetWare :)

When will we be at a point where we can actually take high-throughput assays, combine them, make a model of the modules that we find and have this detailed understanding of how they work ?

Friday, October 13, 2006

Google Data Privacy - GDP

(via Konrad) I like this idea of having some way to tell Google what data to keep and for how long to keep it. Some one-point access to how much they really know about me :). I don't mind too much that they have this information, but I would like to be able to control it. They have even gotten really good at serving me ads. If it is an ad pointing to something I want to go check out, then it's a good ad.
I would go a step further then. I want to own that information. I want to be able to take it somewhere else and share it with other services that might work better if they know these things about me. Amazon would probably suggest more interesting things.

So, I also want a Google Data Privacy.

For more check out AttentionTrust.

Wednesday, October 11, 2006

Community consultation @ Nature Biotech

Open peer review at Nat Biotech ? At least this was the first time I noticed it:

Various scientific communities are engaged in producing data-reporting standards similar to the MIAME guidelines for microarray data. Some of these papers are under consideration for publication in Nature Biotechnology. To encourage broad participation in the standards-development process, we are making the papers freely available at and we urge you to participate and send us your comments and suggestions, which will be carefully considered by the authors, reviewers and editors.

Currently they are asking for comments on the proposal for: The Minimum Information required for reporting a Molecular Interaction Experiment (MIMIx)(PDF).

Sunday, October 08, 2006

Those funny buzzwords in research

I find it interesting to keep track of buzzwords in science. What exactly is a buzzword ? Here are some definitions from Wikipedia:
A buzzword (also known as a fashion word or vogue word) is an idiom, often a neologism, commonly used in managerial, technical, administrative, and sometimes political environments.
Buzzwords appear ubiquitously but their actual meanings often remain unclear.
Buzzwords are typically intended to create the impression of knowledge for a wide audience.

Some of the buzzwords in science are used to carve out sub fields of research, like systems biology, synthetic biology, comparative genomics, bioinformatics, metagenomics. What makes them more or less trendy ? The attention they are able to draw is (I guess) based on who is promoting them, the coverage they get in journals and, probably more importantly, the funding bodies' perception of their relative importance. Of course a well backed meme will be perceived as important and will receive more funding, and the trend cycle kicks in. So the creation of new buzzwords in science, and ultimately of new fields, is very much dependent on marketing. Nothing new there, right?

On the other hand, it is useful to be able to create some boxes around a couple of ideas and to build communities that can work together on a problem. These buzzwords help people to identify with each other as part of the same community. Being forced to define the problems that the X-omics or X-biology faces helps us to tackle them, to propose new methods and to apply for grants. So, I think buzzwords are necessary to identify a group of related problems and to build communities around them.

What got this whole rant started ? :) A review by Eugene V Koonin entitled: "Evolutionary systems biology: links between gene evolution and function". I recently posted about my interest in the effect of mutation in biological systems so I am interested in this meme. Do we really need to call it "evolutionary systems biology" ? :)

Anyway, some facts about it.
I think it was first proposed as a field in a paper in April 2005: "Genomes, phylogeny, and evolutionary systems biology". At this time there are 704 hits in Google (one more after they crawl this post) and 6 papers in PubMed. Three of these are just because the CNIO has an "Evolutionary systems biology Initiative" in the Structural and Computational Biology Program.

There are at least 8 groups or people supporting the meme in their research interests or even in the name of the group.

There is one poster with this buzzword in the title at ICSB-2006, which starts tomorrow.

What is it about ?

"To understand molecular evolution and gene regulation on the scale of complete genomes and biological systems."

"Evolutionary constraints on the trajectories are reflected at the molecular level, and can be probed by a number of techniques including biochemistry and comparative analysis of extant successful protein sequences. These constraints should also be reflected at higher levels of organization in biological systems, such as biochemical pathways. Integrating information across these levels of organization is critical for understanding the evolution of biological systems."

I'll probably check back on this in some time.

Friday, October 06, 2006

The igNobel prizes

The prizes for improbable research are back again. I think they are currently suffering from too much load on their servers so here is the link to Google News.

It is always good to have a bit of a laugh at how focused some scientific research can be. So, here are some of my favourites :)

BIOLOGY — Bart Knols and Ruurd de Jong, for showing that female malaria mosquitoes are attracted equally to the smell of Limburger cheese and to the smell of human feet.

MEDICINE — Francis Fesmire, for his medical case report “Termination of Intractable Hiccups with Digital Rectal Massage”; and Majed Odeh, Harry Bassan, and Arie Oliven for their subsequent medical case report.

PHYSICS — Basile Audoly and Sebastien Neukirch, for their insights into why dry spaghetti often breaks into more than two pieces when bent.

From the abstract:
"When thin brittle rods such as dry spaghetti pasta are bent beyond their limit curvature, they often break into more than two pieces, typically three or four. With the aim to understand these multiple breakings, we study the dynamics of a rod bent just below its limit curvature and suddenly released at one end. We find that the sudden relaxation of the curvature at the newly freed end leads to a burst of flexural waves, whose dynamics are described by a self-similar solution with no adjustable parameters. These flexural waves locally increase the curvature in the rod and we argue that this counter-intuitive mechanism is responsible for the fragmentation of brittle rods under bending. A simple experiment supporting the claim is presented."

ACOUSTICS — D. Lynn Halpern, Randolph Blake and James Hillenbrand for their experiments to learn why people dislike the sound of fingernails on a chalkboard.

(via the spotlight radio):
"The scientists performed experiments on willing people. They chose one of the most disliked sounds. Do you recognise it? Can you remember sitting in a classroom at school? The teacher would stand at the front. She would write on a blackboard. There was always someone who waited for the teacher to leave the room. They would run to the front of the room. And then, they put their fingers at the top of the black board. They moved their fingernails slowly down the board. Listening to this sound still makes you feel horrible!"

Wednesday, October 04, 2006

SlideShare - Share your presentations

SlideShare is an online tool to upload and share presentations, very much like what YouTube does with video. The presentations are uploaded with some metadata (tags, title and description) and the content is searchable. Once the presentation is uploaded you get a direct link to it and the possibility to embed it in webpages (like the blog). From the webpage you can also start a full-screen view that allows you to present the slides from any PC with a net connection and a browser. The only complaint so far is that it did not convert the animations I had in the slides, but they should be working on it.

I gave it a quick go with some slides of mine. Jokes on my poor design skills are not very welcome :).

Google widgets set free

Google announced that their widgets (or gadgets) can now be used on third party webpages. I thought of trying some parasitic computing by creating some modules that would store state. Whenever I saw a gadget on a page, it would query another gadget on some other page, retrieve or set some state, do something with it and set some state on a third widget somewhere else.
Unfortunately for some reason they explicitly say that recording state will invalidate your gadget:
* It cannot be inlined. For example, a syndicated gadget cannot modify the container page.
* It cannot store state. For example, a syndicated gadget cannot be a to-do list that stores personal list items for each user.
* Its functionality should not be dependent on each user specifying different user preferences.

I am not sure they work with blogs, but here goes a try.

Tuesday, October 03, 2006

(via Neil) Pansapiens and Chris, regular visitors at Nodalpoint, started two new bioinformatics-related blogs. As I mentioned before, Chris will host the next edition of Bio::Blogs.

Sunday, October 01, 2006


Another month has gone by and so we have another round up of some bioinformatics-related posts on Bio::Blogs. The 4th edition is up on Sandra Porter's blog, Discovering Biology in a Digital World. The best way to keep up with the carnival is to get the rss feed from the Bio::Blogs blog.
Anyone interested in participating can submit links to interesting posts to bioblogs at gmail com. Bio::Blogs is hosted every month by a different blog and the November 1st edition will be hosted by Chris. Any bioinformatics-related blog can host an edition by sending an email to the Bio::Blogs mail.