Cellular Consequences of Genetic variation

Monday, October 01, 2007

Bio::Blogs #15

Welcome to the 15th edition of the bioinformatics blog journal Bio::Blogs.

I complained a while ago that there was very little expansion of the bioinformatics blogging community, but at least in the last couple of months it looks like this is changing. Although they did not necessarily start last month, here are three blogs that I only recently noticed: At the end of the day from Stephen Spiro (Spiro lab homepage), Paradoxus and Saaien Tist from Jan Aerts.

Not only are there more blogs, there are many more examples of bloggers posting original ideas and research. Most people agree that being open about research should foster collaboration, but so far few people have really tried to do it. It is inspiring to read through these examples and to try to imagine how we might be doing science in the next couple of years.
This month was also marked by the many conference reports that we had available to read and by the experiments of taking real life conferences into Second Life.

Keeping this short and to the point this edition of Bio::Blogs focuses on these conference reports and on the ongoing experiments of using blogs to post about original research. I hope this nudges more people to go ahead and give blogging and open science a try.

Conference Reports

Neil Saunders was at the ComBio2007 conference and posted his notes about it in a four part series (1,2,3,4).

Allyson from Systems Biology & Bioinformatics provided a very extensive coverage of Integrative Bioinformatics 2007. Read all about it in chronological order from parts 1 to 10 (1,2,3,4,5,6,7,8,9,10).

From my blog here are two blog posts on the FEBS workshop - "The Biology of Modular Protein Domains" (1,2). This was not really about bioinformatics but I hope it will be interesting from the perspective of what data is coming that requires good integration strategies.

I'll jump now from real life to virtual talks. Those creative people at Nature keep testing out the potential of the web to improve interchange of knowledge. They kicked off a seminar series of digital talks in the Second Nature island within Second Life. The first talk by Philipp Holliger, entitled "New polymerases for old DNA", was about the engineering of new polymerases to amplify ancient DNA. Joanna Scott (working at Nature) has a very nice report on the talk in her blog.

Continuing on with virtual talks, in the past month there were another 3 sessions of the series SciFoo Lives On, organized by Jean-Claude Bradley and hosted also in Second Nature. JC Bradley covered the sessions on his blog: Sept 4 - Definitions in Open Science, Sept 10 - Communicating Science with Video, Sept 24 - Open Notebook Science Case Studies. Additional coverage by other bloggers can be found via the wiki page.

Blog articles

What are some of the most frustrating bottlenecks in bioinformatics research? Where do we really spend most of our time? Given that we work with digitized information, it should in principle be mostly about the ideas: thinking about interesting questions, cross-referencing information and interpreting the results. At least for me this is typically not the case. What usually takes time is gathering all the necessary information in a form that can be analyzed. Three blog posts this month discuss this problem. Hari Jayaram and Neil Saunders posted about the problems they faced when attempting to do conceptually simple tasks. In response, Deepak wrote a thoughtful post on how science databases should also focus on making the information easily accessible via appropriate APIs.

From online discussions to great examples of open science: we start off with Jeremiah Faith's post where he describes an idea to determine the effect of sequence level mutations on transcription, translation, and noise.

Michael Barton from Bioinformatics Zen created a new blog dedicated to posting about his research on gene expression in yeast. Jump over there to read the many blog posts that he already has there, to provide feedback and maybe find common ground for collaborations.

Also this month, RPM from Evolgen re-started his attempt to publish original research on the blog. He is trying to study the evolution of a duplicated gene in Drosophila. There are two posts covering the introduction to the problem (part 1, part 2).

The last post highlighted in this month's edition is from Benjamin M Good. He has been working on a tool called Entity Describer to add semantic controlled vocabularies to Connotea and he has posted the manuscript they will try to publish on his blog and in Nature Precedings (10101/npre.2007.945.2).

This is it for this month. As usual, if anyone is interested in serving as editor for any future edition, tell me by email.

ICSB 2007

I am attending the eighth International Conference on Systems Biology (ICSB 2007) in Long Beach. I typically prefer smaller conferences but this one is probably the best one to get an overview of the recent progress in systems biology. As expected the program has a broad scope and, unlike last year's meeting, there are no parallel sessions, so I will have a chance to hear more from other fields. Any other bloggers attending?

Saturday, September 29, 2007

Modular protein domains (an overdue wrap-up)

I did not even cover 1/3 of the Modular Protein Domains workshop in my previous blog post. I will not attempt to do it now after so much time. The organizers were clearly concerned about keeping the information within the participants so I will just post some of the general impressions that I took from the meeting.

Specificity profiling in high gear
There were several sessions dedicated to particular protein domains (SH3, SH2 and PDZ in particular) and for all of these there are several projects under way (or mostly completed) to determine the binding specificity of a large number of these domains (although in different species) using phage display, spotted peptides and other methods. We should project ahead and start planning what to do with this information, for example how to combine it to predict pathways and pathway models with dynamical information. The work of Rune Linding is a very good start at this (see NetworKIN).
Given that the methods are set up I suspect that the emphasis might shift now to exploring the evolution of binding specificities and the impact of disease causing mutations (i.e. profiling binding specificities of domain variants).
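
To make "binding specificity" a bit more concrete, here is a minimal Python sketch of one generic way to summarize a set of bound peptides: a position-specific scoring matrix (log-odds against a uniform background). This is only an illustration and not the method used by any of the groups at the meeting or by NetworKIN. The function names, the pseudocount and the example peptides are all invented for this sketch.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(peptides, pseudocount=1.0):
    """Build a log-odds scoring matrix from aligned, equal-length peptides.

    Returns one dict per peptide position, mapping amino acid -> log2 odds
    against a uniform background (an assumption made for simplicity).
    """
    if not peptides:
        raise ValueError("need at least one peptide")
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")

    background = 1.0 / len(AMINO_ACIDS)
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        column = {}
        for aa in AMINO_ACIDS:
            # Pseudocounts keep unseen residues from scoring minus infinity.
            freq = (counts[aa] + pseudocount) / total
            column[aa] = math.log2(freq / background)
        pssm.append(column)
    return pssm


def score_peptide(pssm, peptide):
    """Sum the per-position log-odds for a peptide of matching length."""
    if len(peptide) != len(pssm):
        raise ValueError("peptide length does not match the matrix")
    return sum(column[aa] for column, aa in zip(pssm, peptide))


# Hypothetical proline-rich peptides, made up purely for illustration.
example_binders = ["RPLPPLP", "RALPPLP", "RPLPALP", "RSLPPLP"]
example_pssm = build_pssm(example_binders)
print(score_peptide(example_pssm, "RPLPPLP"))
print(score_peptide(example_pssm, "GGGGGGG"))
```

A peptide resembling the binders scores higher than an unrelated one. Profiling a domain variant would then amount to building a second matrix from its binders and comparing the two.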

Good integration of different methods
Compared to the same meeting two years ago I had the impression that there was a better integration of different approaches (biochemical, structural, computational, etc). A particularly good example was the work of Michael B. Yaffe. There were plenty of structural talks (probably a bit too many) but I found particularly interesting the work of Ivan Dikic, who presented extensive novel work on ubiquitin binding domains, and Charalampos Kalodimos, who presented his lab's work on potential functional roles of proline isomerization (Pubmed).
The computational part was well represented too and it was fun to see again Gary Bader and to get to know Philip Kim.

I hope to be there again in two years' time to see how the field has changed.

Bio::Blogs #15 - call for submission

Since there were no volunteers :) I will be hosting the 15th edition of Bio::Blogs here in the blog. I will be gathering some posts from around the web on bioinformatics and other science related topics from the last month and will post about it on the 1st of October. Suggestions are more than welcome. Please email any links to interesting blog posts to bioblogs at gmail dot com.

On a personal note, I have defended my PhD :). This mostly explains the low volume blogging.

The ephemeral journal II

(via Deepak) Earlier this month I posted about how re-grouping of content after publication could be used to foster the creation of more focused online scientific communities. My impression is that these "places" could more easily attract a group of people of similar interests that would more likely engage in discussions, in contrast to a place like PLoS ONE that covers way too many topics.

There are several names for these groupings (Nature/BMC gateways, Nature Reports, a topics page) and PLoS came out with another one - Hub. They launched a re-grouping of content focused on Clinical Trials that they call PLoS Hub for Clinical Trials. It is built on Topaz so it has everything that PLoS ONE has (comments, ratings, trackbacks, etc).

They mention on the home page of this Hub that they plan in the future to also "feature open-access articles from other journals plus user-generated content". I suspect that they could go even a bit further on this and give more control to the users for the creation of content for the Hubs and even to create new Hubs. One thing that I like in traditional journals that also creates a feeling of identity and community is the more personal news and views and editorials. PLoS could commission/invite scientists/bloggers to help create this type of content for their Hubs. This would be something like a community blog centered on each Hub's research.

Once upon a time (before Digg if I remember right), we tried to do this in Nodalpoint. For a while we had a queue of papers from bioinformatics related journals that we could vote on to upgrade them to the front page of the blog. At the time it did not work very well because of lack of users and participation, but in essence it was not very different from what the publishers are trying to do now. Maybe we could try it again :).

Wednesday, September 19, 2007

Vote for your favorite life science blogs

(via Science Hacker and Postgenomic) The Scientist wants to compile a list of life science blogs that people enjoy reading as a reference. It is really not a good question to ask since there are so many different fields and styles of writing.

Tuesday, September 18, 2007

More on open science

I am still catching up with a backlog of feeds and e-tocs but I just noticed that Benjamin Good posted his manuscript on E.D. in Nature Precedings. I went back to his post where he first presented the manuscript to have a look at the comments and there is a nice discussion going on there. It is a good example of the usefulness of posting our work online. There may still be too few people knowledgeable about each particular topic for authors to get very good feedback in all areas, but this will tend to grow with time.

Michael Barton from Bioinformatics Zen started a new blog to use as an open science notebook about his own research.

I have a mini project in mind about the evolution of domain families that I will start describing and working on here in the blog soon.

Sunday, September 09, 2007

The biology of modular domains (day1 and morning of day2)

I am attending the 3rd (I think it is just the third) conference on modular protein domains. It is a small conference of just 80 people with a very nice environment for discussions. Given the nature of the conference I suspect that a lot of the talks will be about unpublished material so I will be light on the details since I have not personally asked people if I may post about their work.

On the first day of the conference on modular protein domains we had the opening lecture by Wendell Lim. It was a very light and interesting discussion of the evolution and engineering of signaling pathways. Lim started by discussing some interesting results coming from the sequencing of M. brevicollis, a unicellular choanoflagellate that is related to Metazoa and might provide some information about their evolution. It is a continuation of an analysis done by Nicole King and Sean B. Carroll that first identified a receptor tyrosine kinase in M. brevicollis, the first time one was identified outside of the Metazoa. The discussion was generally about the evolution of kinase signaling and how such a system of what Lim called "readers"~phospho-binding domains, "writers"~kinases and "erasers"~phosphatases can arise in evolution.
The second part of his talk was about the efforts to understand the evolutionary capacity of signaling networks by trying to engineer new or altered pathways. In this case the focus was on how with few components and small changes in these components it is possible to shape the dynamic responses of signaling networks.

Morning session of the second day

Synthetic biology
The Synthetic Biology sessions started off with a talk by David Searls on "A linguistic view of modularity in macromolecules and networks" (that was not very related to synthetic biology but nevertheless interesting). Searls detailed his views on the analogies between linguistics and biology. Here is a recent review by Mario Gimona on this analogy. At the protein level we could think of sequence, structure, function and protein role as similar to lexical, syntactic, semantic and pragmatic levels of linguistic analysis:

(Image reproduced with permission)

The general idea of building these bridges over topics is to be able to take existing methods and discussions from one side to the other (see review).

The second talk was by Kalle Saksela and again it had little to do with synthetic biology. Saksela's group is working on high-throughput interaction mapping for human SH3 domains against full proteins (human and viral proteins). They mentioned their progress in expressing and analyzing a subset of these interactions. He mentioned an interesting example where the Nck and Eps8L1 SH3 domain binding site in CD3epsilon overlapped with an ITAM motif such that the phosphorylation of the ITAM motif abolished binding by the SH3 domains. It is a nice example of signaling mediated by different types of peptide binding domains (see paper for details).

The third talk was by Rudolf Volkmer. He gave a short talk on a library of coiled coil proteins. The library contains many single mutant variants of the GCN4 leucine-zipper sequence. They then tested pairs of mutants for heterodimerization by SPOT assays. Aside from extending the knowledge of this domain family, the library can also be used now as a toolkit of binding domains for synthetic biology (the work is already published).

The final talk on this panel was from Samantha Sutton from the Drew Endy lab. This was more like what one would expect from a synthetic biology talk. Samantha Sutton is interested in developing what she calls Post Translational Devices, general abstract devices that can regulate the post translational state of proteins in a predictable fashion. She has a page in OpenWetWare detailing her thoughts on this.

The second panel in the morning was about In silico computational methods.
Cesareni presented their ongoing efforts to experimentally determine human SH3 and SH2 interactions with spotted peptides. He then showed how this data can be used to search for examples where there is overlapping recognition by different domain types. The work is similar in methodology to the paper published by Christiane Landgraf and colleagues in PLoS Biology but now using two domain families and the human proteome.

Vernon Alvarez from AxCell Biosciences, gave a talk about a proprietary database called ProChart (that I cannot find online) containing many domain-peptide interactions tested by the company. He was basically promoting the database for anyone interested in collaborations.

The third talk was by Norman Davey, author of SLIMDisc, a linear motif discovery method. He is trying to improve the method, mostly by improving the statistics.

I gave the second short talk of the session. It was on predicting binding specificity of peptide binding domains using structural information. It is basically a continuation of some of the work I mentioned before here in the blog about the use of structures in systems biology but now applied to domain-peptide interactions.

Saturday, September 08, 2007

The Biology of Modular Protein Domains

From tomorrow on I will be in Austria for a small conference on the biology of protein domains. I might post some short notes about the meeting in the next few days. I'll get a chance to present some of the things I have been working on about the prediction of domain-peptide interactions from structural data.

Here is one of these modular protein domains, an SH3 domain, in complex with a peptide:

The very short summary of it is that it is possible to take the structure of one of these domains in complex with a peptide (ex: SH3, phospho binding domains, kinases, etc) and predict their binding specificity. To some extent it is also possible to take a sequence, obtain a model (depends on structural coverage) and determine its specificity. I'll talk more about the details (hopefully) soon.

Tuesday, September 04, 2007

Scifoo Lives On: Definitions in Open Science

I am having a quick look at the session Definitions in Open Science, going on in Second Nature (I'm Duriel Akula in Second Life). The place looks very different from the first time I had a look around the island. It is full of posters and other interesting material. Here is a picture as some of the first people started gathering:

Live coverage of the event by Berci (also in the picture).

Wednesday, August 29, 2007

Bio::Blogs #14 - Update

The 14th edition of Bio::Blogs will be hosted by Ricardo at My Biotech Life. It will be made available on the 1st of September and submissions can be sent by email as mentioned in his blog post.

Update: The 14th edition is now posted at My Biotech Life. With all the deadlines I had this past month I left it almost until the end to organize a host. Thanks again to everyone that contributed on such short notice.

Is anyone interested in serving as host for the October edition ?

SciVee.tv background info

A while ago SciVee was announced via several blog posts. Here is a link to the first one I read by Deepak and a link to the cluster in Postgenomic.

I thought at first glance that this was a partnership between some small start-up and a content provider (PLoS). After browsing a couple of the videos I noticed that most are from papers authored by Philip E. Bourne. Given the connections to both PLoS and SDSC (two of the site's partners) I thought that this might be an academic effort after all.
A couple of searches tell us that abailey was responsible for a Scivee mailing list at SDSC that no longer exists. abailey apparently stands for Apryl Bailey, someone involved in the SDSC CI Channel, a "webcast video service and resource for the scientific communities" (from their about page).

Apryl Bailey also appears listed in the Scivee Team in one of the slides of a talk (PDF) that Philip Bourne gave in June this year. According to this recent news story it looks like the launch was actually premature and triggered by this talk:
"According to one founder, Philip Bourne of the University of California–San Diego (UCSD) and founding editor in chief of PLoS Computational Biology, he talked about the project at a scientific meeting and the buzz began prematurely."

It is an academic effort, probably related to this CI Channel mentioned above:
"The project began with some pilot pubcasts done at UCSD to test video formats and has involved the other PLoS editors. There are currently eight people on the SciVee team. The SDSC is providing the site hosting."

From one of the slides of the talk:
Developmental Phases
• Phase I (One Year) – Invite authors of papers published in PLoS journals to upload a video or podcast to SciVee.tv describing the motivation, key results and major conclusions of the published study. Establish linkage between literature and video – source of metadata etc. – September 2007

• Phase II (Years 2- 3) - Scrape PubMed on a daily basis and extend the invitation to authors of all papers in the life sciences; develop video authoring server; provide ratings and virtual community comment

• Phase III (Year 4- ) - Extend to other scientific disciplines

Saturday, August 11, 2007

The ephemeral journal

Recently I mentioned the start of yet another journal covering one of the topics I would place on the top of a hype cycle curve. This, together with the apparent ever increasing number of journals everywhere, got me thinking of the birth/death of science journals. The cost of starting up a new journal is so low that the turn-over can only be higher. Still, we don't typically see a lot of "journal death". They are meant to be respected and to build up a reputation among the public audience they serve.
It looks however inevitable that, with a limited attention capacity and an ever increasing number of journals, science hype cycles might have a strong influence on a journal's activities. If hyped up subjects sprout new journals quickly (e.g. stem cells, systems biology, synthetic biology), underperforming science memes will suffer from lack of attention. If I had a biomedical related science publishing house I would probably be thinking of launching a journal to cover metagenomics and another to cover personalized medicine.

Creating and destroying journals based on hype cycles sounds a bit exaggerated but at least there is no reason to think that a journal is here to stay. This can also happen in a more subtle way, through re-grouping of content after publication. Call it a gateway, a report, a topics page, a portal (harder to find), the idea is that there are several ways one can group published papers to serve a target audience. Digital works are not things, they can be in several places and we can slice and dice the views as we wish. One great thing about these views is that they are more likely to attract discussion since there is more likely a group of people around with similar interests. This would be even more so if the users had some power to control the content. Nature Reports allows users to submit papers and to vote on them but it is still too soon to tell if discussions in topic pages are more frequent than on a site like PLoS ONE.

Instead of subscribing to the high impact journals, and lower impact journals of our topics of interest, we would state our interests in the views/portals/gateways we select to participate in and hopefully the works would be distributed to target audiences as fitting. Things that are of very high perceived impact would be cross-posted to many more views than more specific works. The value could still be perceived either pre or post publication.
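
As a toy illustration of the idea, here is a small Python sketch in which each view is just a set of topic tags and a paper is cross-posted to every view it shares a tag with. Everything in it (the function name, the view names and the tags) is invented for the example; it does not describe how any publisher's gateways or hubs actually work.

```python
def route_to_views(paper_tags, views):
    """Return the names of all views sharing at least one tag with the paper.

    views maps a view name to the set of topic tags that view follows.
    """
    tags = set(paper_tags)
    return sorted(name for name, view_tags in views.items() if view_tags & tags)


# Hypothetical views and tags, for illustration only.
example_views = {
    "systems biology": {"signaling", "networks"},
    "synthetic biology": {"biobricks", "networks"},
    "clinical trials": {"trials"},
}

# A broad paper lands in several views, a narrow one in a single view.
print(route_to_views(["networks"], example_views))
print(route_to_views(["trials"], example_views))
```

In this picture a work of very high perceived impact is simply one whose tags overlap with many views.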

The main advantage for the publisher is many more pages with well targeted audiences. Some of these views could even be of interest to a very wide non scientific audience. All of these should improve advertisement revenue.

Quotes

Another interesting SciView interview is available at Blind.Scientist. Here is one quote from Alexei Drummond (Chief Scientist of Biomatters) that I liked:

"I think that bioinformatics has to become a field where people without programming skills can contribute substantially. I would argue that all of the programmers in bioinformatics should be working very hard to program themselves out of their jobs (and into more satisfying jobs)."

Science advances quickly and so do the computational needs. Can we ever do away with these one-off scripts if there are always new data types and innovative ways of analyzing them? I guess the ideas around workflows and such could lead to very visually oriented programming that anyone can do.

Thursday, August 09, 2007

First issue of IET Synthetic Biology

The first issue of (yet) another journal related to systems&synthetic biology is now online. IET Synthetic Biology will be freely available during this year. This issue covers several works from iGEM and the editorial is worth a read to have a look at the future direction of the journal.

In addition to conventional research and review articles, we see an important need for practical articles describing technical advances and innovative methods useful in synthetic biology. We will encourage submission of technical articles that might describe novel BioBrick components, construction techniques, characterisation of a new biological circuit, new software or a practical ‘hands-on’ guide to the construction of new instrumentation or a biological device.

In addition to the print journal, we are developing associated web resources. These will include a repository of online video resources, specialised review material and research tools for synthetic biology.

Some journals tracking similar fields:
Molecular Systems Biology
BMC Systems Biology
Systems and Synthetic Biology
HFSP Journal
IET Systems Biology

Tuesday, August 07, 2007

~~Two~~Three new bioinformatic related blogs

A quick post to link out to two new bioinformatic related blogs:

Freelancing science (by Paweł Szczęsny)
Open.nfo (by Keith)

I will be happy the day there are too many to track :).

Updated: It could be the official month of "start your own bioinformatics blog". The bio.struct blog is the third one so far.

Saturday, August 04, 2007

SciFoo starts ...

and I am not there :). No fun ! The Science Foo Camp 2007 has started at Googleplex and there is already some blog coverage. To have a look at what is going on at camp here is a tip from Andrew Walkingshaw:

* http://www.lexical.org.uk/planetscifoo/ - participants’ blogs
* http://flickr.com/photos/tags/scifoo/ - photos
* http://www.technorati.com/tags/scifoo/ - general blogosphere commentary

There are also some live Twitter feeds from Deepak and Nat Torkington.

To start off go have a look at pictures posted by Bora, you might recognize one or two of these bloggers.

Maybe next year we can try to organize a Science Barcamp :) Why should they have all the fun?

Friday, August 03, 2007

Bio::Blogs#13

A great edition of the monthly Bio::Blogs is up at Neil's blog. This month there are plenty of tutorials and a round up of blog coverage about the ISMB/ECCB 2007 conference.

PDF version for offline reading of the editorial and highlighted posts is here and here (Box.net copy).

If someone wants to give it a try at editing future editions of Bio::Blogs let me know.

Speaking of community projects, the list of webservers published in the last NAR webserver edition is in this Nodalpoint wiki webpage. If you try one of these services spend a minute noting down if it was even available, if it worked well, etc.

Wednesday, August 01, 2007

Microattribution

(via Peter Suber) An editorial in Nature Genetics discusses the need to establish microattribution systems:
"When requiring authors to deposit data in public databases, journals, databases and funders should ensure that quantitative credit for the use of every data entry will accrue to the relevant members of the data-producing and annotating teams. In an era in which consortia are producing more (and more useful) papers than individuals and small groups, the careers of individuals are as much in need of specific credit as those of the scientific visionaries and wranglers who hold the consortia together."

This sounds great. From the journals' point of view this would mean "encouraging" the authors to link to all resources used. This information would then need to be aggregated and made available to everyone. This and other measures would help to change the current credit system that tends to reward researchers for producing papers in high impact factor journals (which does not correlate with individual paper citations) instead of rewarding scientists for the usefulness of their research.
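
The aggregation step is simple to picture. Here is a minimal Python sketch, under the assumption that each paper lists the identifiers of the database entries it used and that each entry records who produced or annotated it. The function name, the data layout and the example names are hypothetical; the editorial does not specify any format.

```python
from collections import Counter


def tally_credit(paper_entries, entry_contributors):
    """Count, per contributor, how many (paper, data entry) uses they earned.

    paper_entries: paper id -> iterable of data entry ids cited by the paper
    entry_contributors: data entry id -> list of contributor names
    """
    credit = Counter()
    for entries in paper_entries.values():
        # Count each entry once per paper, even if it is listed twice.
        for entry in set(entries):
            for person in entry_contributors.get(entry, []):
                credit[person] += 1
    return credit


# Made-up example data.
example_papers = {"paper1": ["E1", "E2"], "paper2": ["E1", "E1"]}
example_contributors = {"E1": ["alice", "bob"], "E2": ["bob"]}
print(tally_credit(example_papers, example_contributors))
```

The hard part is of course not the counting but getting authors to link to the entries in the first place.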

Sunday, July 29, 2007

Bio::Blogs #13 call for submissions

Neil has kindly agreed to host the next edition of Bio::Blogs, due out on the first of August. Send in links to blog posts of bioinformatics/chemoinformatics/omics/open science related content to bioblogs at gmail and they will be re-directed to him.