I said I would organize the 20th edition of Bio::Blogs here on the 1st of April, but April Fools' and my current workload did not allow me to get Bio::Blogs up on time.
There were a couple of interesting discussions and blog posts in March worth noting. For example, Neil mentioned a post by Jennifer Rohn that initiated what could be one of the longest threads in Nature Network: "In which I utterly fail to conceptualize". It started off as a small anti-Excel rant but turned in the comments to 1st) a discussion of bioinformatic tools to use, 2nd) a discussion of wet versus dry mindset and how much one should devote to learning the other. Finally it ended up as an exchange about collaborations and how a social networking site like Nature Network could/should help scientists find collaborators. There was even a group started by Bob O'Hara to discuss this last issue further.
I commented on the thread already but can try to expand a bit on it here. Nature Network is positioned as a social networking site for scientists. So far the best that it has to offer has been the blog posts and forum discussions. This is not very different from a "typical" forum. It facilitates the exchange of ideas around scientific topics but NN could try to look at all the typical needs of scientists (lab books, grant managing, lab managing, collaborations, protocols, paper recommendations, etc.) and decide on a couple that they could work into the social network site. Ways to search for collaborators and maybe paper recommendation engines that take advantage of your network (network+connotea) are the most obvious and easiest to implement. Thinking long term, tools to help manage the lab could be an interesting addition.
Another interesting discussion started from a post by Cameron Neylon on a data model for electronic lab notebooks (part I, II, III). Read also Neil's post, and Gibson's reply to Cameron on FuGE.
How much of the day to day activities and results need to be structured? How heavy should this structure be to capture enough useful computer readable information? Although I find these questions and discussion interesting, I would guess that we are far from having this applied to any great extent. If most people are reluctant to try out new applications they will be even less willing to convey their day to day practices via a structured data model. I mentioned recently the experiment under way at FEBS letters journal to create structured abstracts during the publishing process. As part of the announcement the editors commissioned reviews on the topic. It is worth reading the review by Florian Leitner and Alfonso Valencia on computational annotation methods. They argue for the creation of semi-automated tools that take advantage of both the automatic methods and the curators (authors or others). The problems and solutions for annotation of scientific papers are shared with digital lab notebooks. I hope that more interest in this problem will lead to easy to use tools that suggest annotations for users under some controlled vocabularies.
Several people blogged about the 15 year old bug found in the BLOSUM matrices and the uncertainty in multiple sequence alignments. See posts by Neil, Kay, Lars and Mailund.
Both cases remind us of the importance of using tools critically. The flip side of this is that it is impossible to constantly question every single tool we use since this would slow our work down to a crawl.
On the topic of Open Science, in March the Open Science proposal drafted by Shirley Wu and Cameron Neylon for the Pacific Symposium on Biocomputing was accepted. It was accepted as a 3 hour workshop consisting of invited talks, demos and discussions. The call for participation is here along with the important deadlines for submissions (talk proposals due June 1st and poster abstracts due the 12th of September).
On a related note Michael Barton has set up a research stream (explained here). He is collecting updates on his work, tagged papers and graphs posted to Flickr into one feed that gives an immediate impression of what he is working on at the present time. This is really a great setup. Even for private use within a lab or across labs for collaboration this would give everyone involved the capacity to tap into the interesting feeds. I would probably not like to have everyone's feeds, and maybe a supervisor should have access to some filtered set of feeds or tags to get only the important updates, but this looks like a step in the right direction. In the same way, machines could also have research feeds that I could subscribe to in order to get updates on some data source.
Also in March, Deepak suggested we need more LEAP (Lightly Engineered Application Products) in science. He suggests that it is better to have one tool that does a job very well than one that does many somewhat well. I guess we have a few examples of this in science. Some of the most cited papers of all time are very well known cases of a tool that does one job well (ex: BLAST).
Finally, some meta-news on Bio::Blogs. I am currently way behind on many work commitments and I don't think I can keep up the (light) editorial work required for Bio::Blogs so I am considering stopping Bio::Blogs altogether. It has been almost two years and it has been fun and hopefully useful. The initial goal of trying to knit together the bioinformatic related blogs and offering some form of highlighting service is still required but I am not sure this is the best way going forward.
Still, if anyone wants to take over from here let me know by email (bioblogs at gmail.com).
Tuesday, April 08, 2008
Tuesday, April 01, 2008
(April fools update) Leveling the playing field – NIH to ban brain enhancing practices
Update - This post was part of an April 1st news but I am sure everyone got it :). Still the pressure in science is real and worth thinking about.
There has been quite a buildup of discussion surrounding the idea of brain enhancing drugs in the last couple of days. It started in early March with a New York Times piece “Brain Enhancement Is Wrong, Right?” and it has culminated with the recent announcement of the World Anti Brain Doping Authority (WABDA), a joint effort from the NIH and EU to initiate studies on the reach of brain enhancing practices in science today.
There are many points of view already expressed on the web, see for example:
·Chris Patil
·Bora
·Anna Kushnir
·Genome Technology
·Egghead
·Eye on DNA
·Bob O'Hara
·Martin Fenner
·Jennomics
My first reaction was one of pure skepticism. This must be some kind of joke, I thought, so I tried to probe a little bit around the UCSF campus to see if anyone had heard of this as well. One of my supervisors mentioned that about a year ago he had to fill out an NIH survey addressing the current problem of very high rejection rates for NIH grants. It looks like within this survey there was a section regarding the problems of competition in science and some of the questions brushed around the topic of brain enhancing practices. It could be that at the time NIH was trying to measure how far people would go under an extremely competitive environment.
This really got me thinking about how we are engaged in an environment that is not that far removed from highly competitive sports. How many stories have we heard about data forgery and scandalous retractions in the last couple of years? To what extent will people go to secure their place in science? To be recognized?
So maybe NIH is right in being proactive. Even if the issue is not as serious in science as it is in sports, unless there is an amazing influx of money or a considerable decrease of working scientists this might become an important problem. If nothing else we will get to know the current extent of these practices and it highlights yet again how far we deviated from course. The money society puts into scientific research is being wasted on overlapping competitive projects. Research agendas should be open and free for anyone to participate in. Maybe NIH should regulate that as well.
Monday, March 31, 2008
call for Bio::Blogs #20
The 20th edition of Bio::Blogs will be posted here by the end of tomorrow. This is very short notice but if anyone would like to contribute please send a few links of the most interesting things of the past month and I will put everything together (email bioblogs at gmail).
Friday, March 21, 2008
The structured abstract experiment at FEBS letters
The journal "FEBS letters" is starting a publishing experiment on structured abstracts. As described in the editorial the experiment is aimed at:
"integrating each manuscript with a structured summary precisely reporting, with database identifiers and predefined controlled vocabularies, the protein interactions reported in the manuscript."
The experiment will be a collaboration between FEBS letters and the interaction database MINT. It started at the beginning of this year and will last 6 months. It will try to evaluate the necessary tools and the authors' "degree of interest (and competence) to invest" in this annotation process.
It will be very interesting to see whether authors are willing to do this extra bit of work and how much this might facilitate the annotation efforts.
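To make the idea of a structured summary a bit more concrete, here is a minimal sketch of what one record per reported interaction could look like. This is only an illustration: the field names, the identifiers and the small vocabularies are hypothetical placeholders, not the format actually used by FEBS letters or MINT.

```python
import json
from dataclasses import asdict, dataclass

# Hypothetical controlled vocabularies; a real experiment would rely on
# curated, much larger term lists.
ALLOWED_INTERACTION_TYPES = {"physical association", "colocalization"}
ALLOWED_DETECTION_METHODS = {"two hybrid", "coimmunoprecipitation"}


@dataclass(frozen=True)
class InteractionStatement:
    """One protein interaction reported in a manuscript."""
    protein_a: str          # database identifier (placeholder values below)
    protein_b: str          # database identifier
    interaction_type: str   # must come from the controlled vocabulary
    detection_method: str   # must come from the controlled vocabulary

    def validate(self) -> None:
        """Reject free-text terms that are not in the vocabularies."""
        if self.interaction_type not in ALLOWED_INTERACTION_TYPES:
            raise ValueError(f"unknown interaction type: {self.interaction_type}")
        if self.detection_method not in ALLOWED_DETECTION_METHODS:
            raise ValueError(f"unknown detection method: {self.detection_method}")


def structured_summary(statements):
    """Validate every statement and serialize the summary as JSON."""
    for statement in statements:
        statement.validate()
    return json.dumps([asdict(s) for s in statements], indent=2)


if __name__ == "__main__":
    example = InteractionStatement("ID_A", "ID_B", "physical association", "two hybrid")
    print(structured_summary([example]))
```

The validation step is the point of the sketch: the extra work asked of authors is mostly picking identifiers and terms from fixed lists instead of writing free text.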
Saturday, March 08, 2008
Bio::Blogs #19 - Bioengineering
This month's edition of Bio::Blogs is now available at Duncan's blog and it is mostly focused on (bio)engineering. Click the link for a summary of interesting things that were blogged about in the past month.
I will be hosting issue number 20 here in the blog, without a clear topic. Possibly with some emphasis on data integration. Email your top picks of the month until the end of March to bioblogs at gmail.com
Sunday, March 02, 2008
Design, mutate and freeze
Drew Endy talked about engineering biology for Edge. Most of the emphasis is still on standardization of biological parts and the importance of simplifying the process of creating a biological function. Still it would be nice to hear from him some new ideas about establishing processes of engineering biology. His whole speech seems focused on creating the hacker culture in biology, transposing all the same concepts that would allow us to re-create within the biological sciences the explosive growth of tinkering and production that we saw for electronics and computer programming.
I agree with most of what he says, that we should: 1) focus on method development; 2) work on a registry of parts and 3) foster an "open source"/hacker culture in synthetic biology. In this text he did not mention for example the importance of modeling but it is implicit in the standardization of parts. Once you have a computer simulation of the process you wish to engineer, you should be able to reach into the parts list to implement it. The problem with this concept of standardized parts is the complexity that Drew Endy dislikes so much. There is still no way around it. We can take a part that has been very well defined in E. coli, plug it into a yeast plasmid and it might not work at all.
If we are still far away from the ideal plug and play maybe we could try to take advantage of what biology can do very well, to evolve to a suitable solution. I would argue that we should develop engineering protocols that could take advantage of the evolutionary process.
<insert rambling>
Let's say we want to implement a function and I know beforehand that I will not be able to get perfect parts to implement it. Can we design this function in a way that it will have a large funnel of attraction for the design properties that I am interested in? Are there biological parts that are more amenable to a directed evolutionary experiment to reach that design goal? How can I increase the mutation rate for a controlled period of time and only for the stretch of DNA that I want to evolve? Maybe it is possible to place the parts in a plasmid and have the replication of this plasmid be under a different polymerase that is more error prone?
</insert rambling>
If we could answer some of these questions (maybe we have already), we could design the function of interest (modeling), pull parts that would be close to the solution, mutate/select until the best design is achieved and then freeze it by reducing the generation of diversity in some way.
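As a toy illustration of the design, mutate/select and freeze loop described above, here is a small in silico sketch. Everything in it is an assumption made for illustration: the "function" is reduced to matching a hypothetical target sequence, fitness is just the number of matching positions, and "freezing" is modelled as setting the mutation rate to zero. It says nothing about how this would be done in a real cell.

```python
import random

ALPHABET = "ACGT"


def fitness(seq, target):
    """Toy fitness: number of positions matching the designed target."""
    return sum(a == b for a, b in zip(seq, target))


def mutate(seq, rate, rng):
    """Redraw each base at random with probability `rate`."""
    return "".join(
        rng.choice(ALPHABET) if rng.random() < rate else base for base in seq
    )


def evolve_then_freeze(start, target, rate=0.05, pop_size=50,
                       generations=200, seed=0):
    """Mutate/select from a close-enough starting part, then stop mutating."""
    rng = random.Random(seed)
    best = start
    for _ in range(generations):
        if fitness(best, target) == len(target):
            break  # design goal reached
        variants = [mutate(best, rate, rng) for _ in range(pop_size)]
        # Selection step: the parent stays in the pool, so fitness never drops.
        best = max(variants + [best], key=lambda s: fitness(s, target))
    # "Freeze": from here on the sequence is copied with rate 0.0.
    return mutate(best, 0.0, rng)


if __name__ == "__main__":
    target = "ACGTACGTACGTACGT"   # hypothetical design goal
    start = "ACGTACGAACGTTCGT"    # a part that is close, but not perfect
    evolved = evolve_then_freeze(start, target)
    print(fitness(start, target), "->", fitness(evolved, target))
```

The only design choice worth noting is that the mutation rate is an explicit parameter of the process: it is raised during the search and set to zero once the design is acceptable, which is the "freeze" step.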
Further reading:
Synthetic biology: promises and challenges
Molecular Systems Biology 3 Article number: 158 doi:10.1038/msb4100202
Tuesday, February 26, 2008
Jonathan Eisen@PLoS
PLoS has a new Academic Editor in Chief who blogs, works on evolution and has been at SciFoo twice. Jonathan A. Eisen explains his reasons for accepting the job in an editorial available online. Among other things, he states:
"Second, I want to work with the professional staff at PLoS Biology, the Academic Editors, and anyone else in the community who shares my desire to build new initiatives that will keep PLoS Biology as a top-tier journal. These would include ideas like producing issues dedicated to particular themes, actively recruiting excellent papers in fields where OA is not yet common, producing more outreach and educational material, and engaging bloggers and fully embracing the Web 2.0 world."
I actually would like to get a bit more involved with what they are doing at PLoS, in particular with what they might be discussing for PLoS ONE and the hubs. Maybe I can pester them later on during the year. For some reactions on the news and more information, here is the related Postgenomic cluster.
I wonder if we will ever see the AEIC of Science/Nature/Cell blogging :). The editorials are the closest article format to a blog post but they insist on a somewhat exaggerated formality. Just as an example here is a link to the 2007 archives of the (great) editorials of Frank Gannon from EMBO reports.
Friday, February 22, 2008
Call for Bio::Blogs#19
Duncan Hull has volunteered to host the next issue of Bio::Blogs (a bioinformatic related monthly blog journal). It will be out in the beginning of March on the O'Really? blog. The suggested theme for this month is the relationship between Biology and Engineering, inspired by the interview published on Edge.org "Engineering and Biology": A Talk with Drew Endy. Anyone can send links for this issue on this topic but also for other interesting bioinformatic posts to bioblogs at gmail.com
We could also try to format it automatically using FeedJournal as suggested by Neil.
Friday, February 08, 2008
Late Links: Bio::Blogs#18 + new blog
I have been away from the web for the last few weeks as I moved to San Francisco to start my first postdoc. I will be working at UCSF in the Lim Lab and the Krogan lab on the evolution of signaling in yeasts. I'll try to blog more about it later during the year. I am looking forward to getting to know the bay area and hopefully make the most of the great (and apparently relaxed) science & technology environment.
Early this month Michael Barton edited another great edition of Bio::Blogs mostly dedicated to open science. He also put together an essay on the subject that is worth reading and commenting on. The next edition of Bio::Blogs will probably come back here to Public Rambling on the 1st of March (unless there is another volunteer).
Also in these last few weeks Lars Juhl Jensen started blogging at Buried Treasure. I met Lars at EMBL while I was doing my PhD and he always had time to help me out when I had some work related question. As Roland Krause said, Lars is one of the most prolific researchers in computational biology I have ever met.
Saturday, January 26, 2008
Submissions for Bio::Blogs#18
I am slowly re-connecting to the online world again, trying to pick through the thousands of blog posts and other RSS feed alerts piled up in GReader. Way before I manage to do that (unless I press the read all button) the next edition of Bio::Blogs will be up at Bioinformatics Zen. Michael Barton has kindly agreed to host the 18th edition of Bio::Blogs with a particular emphasis on Open Science and Open Notebook Science. It is scheduled for February 1st and anyone can participate by sending a link of their submissions to bioblogs at gmail.com.
To get in the spirit of the upcoming edition and to inspire some related blog posts go check out his recent movie. What do you think? Will there be a significant increase of people sharing and collaborating online this year?
Sunday, December 23, 2007
Disconnecting for a while
I am disconnecting from blogging for longer than usual. There will not be a Bio::Blogs edition on the 1st of January but there will be one dedicated to Open Science on the 1st of February. Before I go, congratulations to the chemoinformatics related blogging group that got a paper out of their combined efforts. Also, have a look at the new blog from Jason Kelly called Free Genes that will focus on synthetic biology and open science issues.
I'll be back sometime in the end of January. Happy celebrations to everyone and a good start to the new year.
Wednesday, December 05, 2007
Open Science project on domain family expansion
Some domain families of similar function have expanded more than others during evolution. Different domain families might have significantly different constraints imposed by their fold that could explain these differences. This project aims to understand what properties determine these differences focusing in particular on peptide binding domains. Examples of constraints to explore include average cost of production or capacity to generate binding diversity for the domain family.
This project is also a test for using Google Code as a research project management system for open science (see here for project home). Wiki pages will be used to collect previous research and milestone discoveries during the project development and to write the final manuscript towards the end of the project. The issue tracking system can be used to organize the required project tasks and assign them to participants. The file repository can hold the datasets and code used to derive any result.

I plan to use the blog as a notebook for the project (tag: domainevolution) and the project home at Google Code as the repository and organization center. The next few posts regarding the project will be dedicated to explaining better why I am interested in the question and to developing further some of my expectations. Anyone interested in contributing is more than welcome to join in along the way. I should say that I am not in any hurry and that this is something for my 20% time ;).
Sunday, December 02, 2007
Merry Bio::Blogs everyone
Paulo Nuin hosted the 17th edition of Bio::Blogs. The number of submissions was very low so I suspect I am not the only one rushing to finish everything before going on holidays.
Should we skip the edition of the 1st of January or maybe postpone it for a few days? Anyone interested in hosting? I have been thinking of changing the format a little bit to try to increase the incentives for participating but I'll leave this for another post.
Tuesday, November 27, 2007
Bio::Blogs #17 - call for submissions
The 17th edition of Bio::Blogs will be hosted by Paulo Nuin at Blind.Scientist. Submissions of interesting bioinformatic related blog posts of this month can be sent, until the end of November, to the usual address (bioblogs at gmail dot com) or to nuin at genedrift dot org.
There is also still time to submit blog posts to the OpenLab 2007 compilation.
Monday, November 19, 2007
Linking Out - Open Science and a new blog
Cameron Neylon posted a request for collaboration in his blog:
...we are using the S. aureus Sortase enzyme to attach a range of molecules to proteins. We have found that this provides a clean, easy, and most importantly general method for attaching things to proteins.
(...)
We are confident that it is possible to get reasonable yields of these conjugates and that the method is robust and easy to apply. This is an exciting result with some potentially exciting applications. However to publish we need to generate some data on applications of these conjugates.
They are looking for collaborators interested in applying this method. Go check the blog posts if you are interested or know someone that works on something similar.
(via Open Access News) Liz Lyon, Associate Director of the UK Digital Curation Centre, posted an interesting presentation on Open Science: "Open Science and the Research Library: Roles, Challenges and Opportunities?".
(via Fungal Genomes) I found a new blog related to evolution called Thirst for Science with a lot of insightful posts.
Linking out - Personalized medicine
Personalized medicine continues to climb the hype cycle. I have been getting most of the best news coverage on the subject from blogs.
- Bertalan Meskó reviews companies focused on personalized medicine (see part I and II)
- Attila Csordas and Deepak Singh cover the social aspects of personal health and the tie-in to 23andMe
- Gareth Palidwor reads into the details to speculate that the business model of 23andMe might be to sell the aggregated user data.
- Gene Sherpas puts on the brakes, describing the hype as Genomic Voyeurism
I am concerned that all the attention on the genomics side of personalized medicine will distort the relative importance of nature versus nurture. Everyone craves a peek at their own destiny and at their roots. These services hope to provide both of these by looking at our DNA. I don't think they can really do this reliably but nothing stops them from luring people.
Tuesday, November 13, 2007
Last call for Open Laboratory 2007

Anyone interested in participating can send in links to their favorite blog posts of the year and also volunteer to be part of the reviewing process (see instructions here).
Monday, November 12, 2007
4th year blog anniversary
Having a glance at the blog posts it is easy to find some very weird ones :)
Your Identity Aura (2005)
Our Collective Mind (2005)
The Human Puppet (2005)
Social Network Dynamics in a Conference Setting (2006)
The Fortune Cookie Genome (2007)
There are a lot of serious ones too but I will leave that list for some other time.
Thanks to Nodalpoint and the Nodalpoint regulars (Greg, Neil, Alf and Chris) for introducing me to blogging some 6 years ago and to everyone else that joined in along the way with their blogs and/or comments. It sure makes blogging more enjoyable.
(Image Credit: Picture taken by mattnjuzz and published under CC by-nc-sa. Originally taken from Flickr)
Saturday, November 10, 2007
Predicting functional association using mRNA localization
About a month ago Lécuyer and colleagues published a paper in Cell describing an extensive study of mRNA localization in Drosophila embryos during development. The main conclusion of this study was that a very large fraction (71%) of the genes they analyzed (2314) had localization patterns during some stage of embryonic development. This includes both embryonic and sub-cellular localizations.
A lot of information was gathered in this analysis and it should serve as a resource for further studies. There is information for different developmental stages, so it should also be possible to look at the dynamics of mRNA localization. Another application of this data would be to use it as an information source to predict functional association between genes.
Protein localization information has been used in the past for prediction of protein-protein interactions (both physical and genetic interactions). Typically this is done by integrating localization with other data sources in a probabilistic analysis [Jansen R et al. 2003, Rhodes DR et al. 2005, Zhong W & Sternberg PW, 2006].
To test if mRNA localization could be used in the same way I took from this website the localization information gathered in the Cell paper, along with the available genetic and protein interaction information for D. melanogaster genes/proteins (which can be obtained, for example, from BioGRID among others). For this analysis I grouped physical and genetic interactions together to have a larger number of interactions to test. The underlying assumption is that both should imply some functional association of the gene pair.
The very first simple test is to look at all pairs of genes (with available localization information) and test how the likelihood that they interact depends on the number of cases where they were found to co-localize (see figure below). I discarded any gene for which no interaction was known.
As seen in the figure, there is a significant correlation (r=0.63, N=21, p<0.01) between the likelihood of interaction and the number of co-localizations observed for the pair. At this point I did not exclude any localization term, but since the images were annotated using a hierarchical structure these terms are in some cases very broad.
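For anyone who wants to repeat this first test, here is a minimal sketch of the pair counting in Python. It is not the code used for the figure: the input format (a dictionary of gene to set of localization terms, and a set of interacting pairs), the function name and the toy gene names are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def interaction_rate_by_colocalization(annotations, interactions):
    """Map each co-localization count to the fraction of gene pairs that interact.

    annotations: dict of gene -> set of localization terms (hypothetical format)
    interactions: set of frozenset({gene_a, gene_b}) for known interactions
    """
    # Discard genes with no known interaction, as described in the text.
    interacting_genes = {gene for pair in interactions for gene in pair}
    genes = sorted(g for g in annotations if g in interacting_genes)

    n_pairs = defaultdict(int)
    n_interacting = defaultdict(int)
    for a, b in combinations(genes, 2):
        # Number of localization terms the two genes have in common.
        shared = len(annotations[a] & annotations[b])
        n_pairs[shared] += 1
        if frozenset((a, b)) in interactions:
            n_interacting[shared] += 1
    return {k: n_interacting[k] / n_pairs[k] for k in sorted(n_pairs)}


# Toy data, invented for illustration only.
annotations = {
    "geneA": {"apical", "posterior", "pole_cell"},
    "geneB": {"apical", "posterior", "pole_cell"},
    "geneC": {"apical", "posterior"},
    "geneD": {"apical"},
    "geneE": {"yolk"},  # no known interaction, so it is discarded
}
interactions = {
    frozenset(("geneA", "geneB")),
    frozenset(("geneA", "geneC")),
    frozenset(("geneC", "geneD")),
}

rates = interaction_rate_by_colocalization(annotations, interactions)
print(rates)
```

On this toy input the fraction of interacting pairs rises from one third (1 shared term) to one half (2 shared terms) to 1.0 (3 shared terms). With real data the counts per value get sparse at the high end, which is one reason to bin them.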
More specific patterns should be more informative, so I removed very broad terms by checking the fraction of genes annotated to each term. I created two groups of narrower scope, one excluding all terms annotated to more than 50% of genes (denominated "localizations 50") and a second excluding all terms annotated to more than 30% of genes ("localizations 30"). In the figure below I binned gene pairs according to the number of co-localizations observed in each of the three groups of localization terms (all terms, localizations 50 and localizations 30) and for each bin calculated the fraction of pairs that interact.
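The term filtering described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the gene-to-terms dictionary, the function name and the term names are made up, and the thresholds are applied as "keep a term if it is annotated to at most this fraction of genes".

```python
def drop_broad_terms(annotations, max_fraction):
    """Remove localization terms annotated to more than max_fraction of genes.

    annotations: dict of gene -> set of localization terms (hypothetical format)
    """
    n_genes = len(annotations)
    # Count how many genes carry each term.
    term_counts = {}
    for terms in annotations.values():
        for term in terms:
            term_counts[term] = term_counts.get(term, 0) + 1
    # Keep only terms at or below the threshold.
    kept = {t for t, c in term_counts.items() if c / n_genes <= max_fraction}
    return {gene: terms & kept for gene, terms in annotations.items()}


# Toy data, invented for illustration only.
annotations = {
    "geneA": {"embryo_wide", "posterior", "pole_cell"},
    "geneB": {"embryo_wide", "posterior"},
    "geneC": {"embryo_wide"},
    "geneD": {"embryo_wide"},
}

localizations_50 = drop_broad_terms(annotations, 0.5)
localizations_30 = drop_broad_terms(annotations, 0.3)
print(localizations_50["geneA"], localizations_30["geneA"])
```

Here "embryo_wide" (all genes) is dropped in both groups, "posterior" (half of the genes) survives only the 50% cut, and "pole_cell" (a quarter of the genes) survives both.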

As expected, more specific mRNA localization terms (localizations 30) are more informative for prediction of functional association, since fewer shared terms are required to obtain the same or higher likelihood of interaction. The increased likelihood does not come at the cost of fewer annotated pairs. For example, there is a similar number of gene pairs in bin "10-14" of the more specific localization terms (localizations 30) as in bin ">20" for all localization terms (see figure below).
It is important to keep in mind that mRNA localization alone is a very poor predictor of genetic or physical interaction. I took the number of co-localizations of each pair (using the terms in "localizations 30") and plotted a ROC curve to determine the area under the ROC curve (AROC or AUC). The AROC value calculated was 0.54, with a 95% confidence lower bound of 0.52 and a p-value of 6E-7 for the hypothesis that the true area is 0.5. So it is not random (that would be 0.5), but by itself it is a very poor predictor.
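As a rough illustration of what the AROC measures, the sketch below computes it directly from its pairwise definition: the probability that a randomly chosen interacting pair has a higher co-localization count than a randomly chosen non-interacting pair, with ties counted as one half. This is not necessarily how the value above was computed, and the function name and toy numbers are assumptions.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve from its pairwise (Mann-Whitney) definition.

    scores: co-localization count per gene pair
    labels: True/1 if the pair interacts, False/0 otherwise
    """
    positives = [s for s, y in zip(scores, labels) if y]
    negatives = [s for s, y in zip(scores, labels) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one interacting and one non-interacting pair")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy data, invented for illustration only.
scores = [3, 2, 1, 2, 1, 1]
labels = [1, 1, 0, 0, 0, 1]

auc = roc_auc(scores, labels)
print(round(auc, 3))
```

The double loop is quadratic in the number of pairs, so for a full genome-scale set a rank-based calculation or a statistics library would be the practical choice. A value of 0.5 means the score carries no information, which is why 0.54 reads as "real but weak".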
In summary:
1) the degree of mRNA co-localization significantly correlates with the likelihood of genetic or physical association.
2) less ubiquitous mRNA localization patterns should be more informative for interaction prediction.
3) the degree of mRNA co-localization is by itself a poor predictor of interaction but it should be possible to use this information to improve statistical methods to predict genetic/physical interactions.
This was a quick analysis, not thoroughly tested and just meant to confirm that mRNA localization should be useful for genetic/physical interaction predictions. I am not going to pursue this, but if anyone is interested it could be worthwhile to see which terms have more predictive power, with the idea of integrating this information with other data sources or possibly directing future localization studies. Perhaps there is little point in tracking different developmental stages, or maybe embryonic localization patterns are not as informative as sub-cellular localizations for predicting functional association.
Jansen R, Yu H, Greenbaum D, Kluger Y, Krogan NJ, Chung S, Emili A, Snyder M, Greenblatt JF, Gerstein M. A Bayesian networks approach for predicting protein-protein interactions from genomic data. Science. 2003 Oct 17;302(5644):449-53.
Rhodes DR, Tomlins SA, Varambally S, Mahavisno V, Barrette T, Kalyana-Sundaram S, Ghosh D, Pandey A, Chinnaiyan AM. Probabilistic model of the human protein-protein interaction network. Nat Biotechnol. 2005 Aug;23(8):951-9.
Zhong W, Sternberg PW. Genome-wide prediction of C. elegans genetic interactions. Science. 2006 Mar 10;311(5766):1481-4.
Thursday, November 08, 2007
What I don't like about BPR3
For those who have not heard about it before, BPR3 stands for Bloggers for Peer-Reviewed Research Reporting. From their website:
"Bloggers for Peer-Reviewed Research Reporting strives to identify serious academic blog posts about peer-reviewed research by offering an icon and an aggregation site where others can look to find the best academic blogging on the Net."
It is all great, except that this already exists and has for a long time before BPR3. You can go to the papers section in Postgenomic and select papers by the date they were published or blogged about, by how many bloggers mentioned the paper, or limit the search to a particular journal. I even used this earlier this year to suggest that the number of citations increases with the number of blog posts mentioning the paper.
In this case I think that unless they really aim to develop something better than what Postgenomic already offers, the added competition will only fragment an already poor market. The value of a tracking site like Postgenomic, Techmeme or what BPR3 is proposing to create increases with the user base in a non-linear way. This is what people usually refer to as network effects in social web applications. An increasing number of users makes the sites more useful, reinforcing the importance of the social application. I suspect Postgenomic is not closed in any way to discussions. The code is even available here for re-use. So, why can't BPR3 and Postgenomic work this out and have a single tracking database and presentation ? Let's say that BPR3 could be a mirror for the Postgenomic papers section (why re-invent the wheel ?).
I am not in favor of any particular site (sorry Euan :). What I think would be useful is:
1) common standards for everyone (publishers, bloggers, etc) to carry information on published literature (number of times the paper was read, ratings, comments, blog posts, e-notebook data, etc) attached to a single identifier (DOI sounds fine)
2) one independent tracking site with enough users to gain hub status such that everyone gains from high exposure to the science crowd.
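To make the first point a bit more concrete, here is a small Python sketch of what a shared record keyed by DOI could look like and how records from different sources might be merged. Nothing here is an existing standard: the class, its field names and the example DOI are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class PaperActivity:
    """Hypothetical per-source record of activity around one paper."""
    doi: str
    read_count: int = 0
    ratings: list = field(default_factory=list)
    comments: list = field(default_factory=list)
    blog_posts: list = field(default_factory=list)


def merge_by_doi(records):
    """Combine records from several sources into one record per DOI."""
    merged = {}
    for rec in records:
        key = rec.doi.lower()  # DOI names are case-insensitive
        target = merged.setdefault(key, PaperActivity(doi=key))
        target.read_count += rec.read_count
        target.ratings.extend(rec.ratings)
        target.comments.extend(rec.comments)
        target.blog_posts.extend(rec.blog_posts)
    return merged


# Invented records from a publisher and from a blog tracker.
records = [
    PaperActivity(doi="10.1234/EXAMPLE.5678", read_count=120, ratings=[4]),
    PaperActivity(doi="10.1234/example.5678", read_count=30,
                  blog_posts=["blog-post-1", "blog-post-2"]),
]

merged = merge_by_doi(records)
print(merged["10.1234/example.5678"])
```

The point of the sketch is only that a single identifier lets a publisher's read counts and a tracker's blog posts end up in the same place without either side knowing about the other.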
Thursday, November 01, 2007
The right to equivalent response
(disclaimer: I worked for Molecular Systems Biology)
The last issue of PLoS Biology carries an editorial about Open Access written by Catriona J. MacCallum. It addresses the definition of Open Access and what the author considers an "insidious" trend of obscuring "the true meaning of open access by confusing it with free access".
I agree with the main point of the editorial, that we should keep in mind the definition of open access and that the capacity to re-use a published work should have more value to the readers.
However, it is very unfortunate that the very first example MacCallum picks on is the Molecular Systems Biology journal, for the simple fact that they very recently changed their publishing policies to address exactly this issue. Authors can choose one of two CC licenses, deciding for themselves whether to allow derivatives of their work. See the post at the MSB blog. As explained in the blog post, the discussions about the licenses actually started several months ago, and I think the final implementation is a very balanced decision on their part.
Thomas Lemberger, editor at MSB, wrote a reply to the editorial that PLoS decided to publish as a response from the readers. These responses can only be seen if readers click the link "Read Other Responses" on the right side of the online version.
I am obviously biased but for me this is not really giving the right to equivalent response. It would not have cost them much to issue a correction or publish the letter as correspondence where it would have the same visibility as the editorial. This would signal that they are indeed committed to collaborating with other publishers and journals that support open access (as stated in PLoS core principles).
Bio::Blogs #16, The one with a Halloween theme
The 16th edition of Bio::Blogs is now available at Freelancing science. Jump over there for a summary of what has been going on this month in the bioinformatics-related blogs, if for nothing else then just to have a look at the pumpkin. Thanks again to everyone who participated.
Paulo Nuin from Blind.Scientist has volunteered to host the 17th edition, which is scheduled to appear as usual on the 1st of December.
Thursday, October 25, 2007
Building an e-Science platform with Microsoft tools
(via Frank Gibson's Peanutbutter) Hugo Hiden, the technical director of the North-East Regional e-Science Centre (NEReSC) started a new blog where he will explore how to build an e-Science platform based on Microsoft technology. The initial post explains a little bit why he is doing this:
"The reason for this blog is, primarily, to document my experiences with writing a prototype e-Science research platform using Microsoft tools instead of the more traditional approach of fighting with Open Source. This way is easier, supposedly."
and also, what he aims to build:
"The task I have set myself is to recreate, at a basic level, the software being developed by the CARMEN project (http://www.carmen.org.uk). "
Let's see how it goes. Maybe they'll take suggestions later on :).
Sunday, October 21, 2007
Bio::Blogs #16 - call for submissions
The next edition of Bio::Blogs (bioinformatics blog journal) will be hosted at Freelancing science on the 1st of November. If you find anything this month that you think would be interesting to add to this edition, send an email to bioblogs at gmail. com by the end of the month. Anyone interested in hosting a future edition can also send an email to volunteer.
Friday, October 19, 2007
The Fortune Cookie Genome
*in an imaginary future*
Today is the day I get the sequencing results back. It is going to be interesting to finally have a glimpse of my very own genome. At the same time I am afraid of the potential disease associations they might find in there. In any case I would rather know in time to do something about it. That's it ... I exhale and open the main door to the building, walking up to the desk.
- Hi. I have an appointment with my genetic adviser.
- Oh yes, go up to the 3rd floor, they are expecting you.
I walk up a DNA-shaped stairway and into the office of one of the attending specialists. He was the one who convinced me of how useful it would be to purchase the GenomeSurvey(TM) package.
- I got your email. The results are in ?
- Yes, we have your genome fully sequenced and uploaded into your service of choice. I see you have picked Google Health as your storage provider as part of the package.
- Is there any bad news ? Will I have a serious disease soon ?
- I understand your concern. There is really nothing too serious, but I will come to that in a moment. You may log in with your Google account here and I can guide you through some of the results.
I log in to my health page and I am confronted with the usual simple white-blue Google interface. I notice the addition of a genome tab and let my adviser tell me more about it.
- As you can see, your genome has been uploaded to your account. It has also been submitted as a John Doe genome to the NCBI personal genomics database. You may choose later to make your identity known and/or associate any of your personal history information with it.
- What about the disease associations ?
- Yes. So you can click here on the associations report to have a full listing of the phenotypic associations. You have a very healthy genome, no serious rare diseases. In your case the most important finding is that you have a 2% increased likelihood of developing a heart condition when you are above 60 and a 1% increased likelihood of having Alzheimer's disease after 65.
- That's it ? 2% ? 1% ?
- Well, that is assuming no prior knowledge on your diet and other personal history as established in the large HapMap version 10. From now on you may input into the forms provided in Google Health all your diet and other personal information on a daily basis and as the information accumulates the service will automatically update the probabilities. As your adviser I should tell you that this information can be used by Google to provide you with better targeted advertisement in all other Google products.
- Right ... is this it ? Does the package include anything else ?
- Of course ! As I mentioned to you before, you can click here on the prescription tab to get informal advice on how best to deal with the associations that were found for you. You should always discuss these suggestions with your doctor before doing anything. By company policy I cannot read this information with you, since we are not liable for this. You can read it at home when you get there.
- Well , if there is nothing else I will go.
- Thank you again for choosing our GenomeSurvey(TM) package. I am happy to have served you and I hope that you feel more empowered about your own health. Be well.
I go home feeling a bit cheated but obviously happy to have no serious disorder on the horizon. I rush to my home computer to read the prescription that will help me prevent my heart condition and Alzheimer's. I click the GoogleDoctor(TM) button and a clip-like avatar jumps around on the screen. A computerized voice reads aloud the text appearing on the screen:
Dear Pedro. You can call me clipy ! I will be your assistant for any of your health needs. In order to decrease the likelihood for the negative phenotypes associated to your genome please consider abiding by the following rules:
- Do a lot of exercise
- Eat a healthy diet
- Find balance in your life
*in an imaginary present*
- Snap out of it, what does yours say ?
I look back to the small piece of paper in my hand and read:
- "You must find balance in your life", that's what it says.
- Well, these things are never wrong.
I drop the paper on my dish and finish eating the fortune cookie before leaving the Chinese restaurant with my friends.
- You won't believe what I thought of ...
Further reading
The Future of Personal Genomics (21 September 2007 Science)
How much information is there really in personal genomes and how much should patients know ? Extra points for citing a post from Eye on Dna in a Science Policy Forum.
The Science and Business of Genetic Ancestry Testing (10th October 2007 Science)
A discussion surrounding results of genetic ancestry tests and the commercialization of these tests.
Google Says Its Health Platform Is Due In Early 2008 (17 October InformationWeek)
Google is still trying to build a platform to host the health related information. Microsoft already launched a service called HealthVault (read about it from Deepak).
BMC Medical Genomics (17 October BMC blog)
BMC will launch a journal dedicated to Medical Genomics, covering articles "on functional genomics, genome structure, genome-scale population genetics, epigenomics, proteomics, systems analysis and pharmacogenomics in relation to human health and disease."
Do-it-yourself science (17 October Nature)
This editorial links up several news items, opinions and articles in the last issue of Nature to ask the question - How much involvement can patient advocates have in genetics? The most impressive article is the story of Hugh Rienhoff, a trained geneticist from the biotechnology world who decided to personally research his daughter's disease (as in buying a PCR machine etc). (via Keith)
Common sense for our genomes (18 October Nature)
Steven E. Brenner explains the need for a Genome Commons. See discussion at bbgm.
Thursday, October 11, 2007
JournalFire

A new science related service called JournalFire has started. It was apparently created by a group of graduate students that are "frustrated with the current system of scientific discourse and publication". According to the initial blog post this service "provides a centralized location for you to share, discuss, and evaluate published journal articles. You, the scientists, are put in charge of determining what studies are significant and noteworthy."
I did not have a chance to test it since it is in private beta, but I have asked for an account. It looks like anyone with an .edu account should be able to access it already. It sounds promising but, as with many of these services, a lot depends on the capacity to attract a sufficiently large group of people to sustain interesting discussions. I will update the post if I get an account to test the service.
(I wonder if the people from OpenWetWare have anything to do with this)
Monday, October 01, 2007
Bio::Blogs #15
Welcome to the 15th edition of the bioinformatics blog journal Bio::Blogs.
I complained a while ago that there was very little expansion of the bioinformatics blogging community but at least in the last couple of months it looks like this is changing. Although they did not necessarily start last month, here are three blogs that I only recently noticed: At the end of the day from Stephen Spiro (Spiro lab homepage), Paradoxus and Saaien Tist from Jan Aerts.
Not only are there more blogs, there are many more examples of bloggers posting original ideas and research. Most people agree that being open about research should foster collaboration but so far few people have really tried to do it. It is inspiring to read through these examples and try to imagine how we might be doing science in the next couple of years.
This month was also marked by the many conference reports that we had available to read and by the experiments of taking real life conferences into Second Life.
Keeping this short and to the point this edition of Bio::Blogs focuses on these conference reports and on the ongoing experiments of using blogs to post about original research. I hope this nudges more people to go ahead and give blogging and open science a try.
Conference Reports
Neil Saunders was at the ComBio2007 conference and posted his notes about it in a four part series (1,2,3,4).
Allyson from Systems Biology & Bioinformatics provided a very extensive coverage of Integrative Bioinformatics 2007. Read all about it in chronological order from parts 1 to 10 (1,2,3,4,5,6,7,8,9,10).
From my blog here are two blog posts on the FEBS workshop - "The Biology of Modular Protein Domains" (1,2). This was not really about bioinformatics but I hope it will be interesting from the perspective of what data is coming that requires good integration strategies.
I'll jump now from real life to virtual talks. Those creative people at Nature keep testing out the potential of the web to improve the interchange of knowledge. They kicked off a seminar series of digital talks on the Second Nature island within Second Life. The first talk, by Philipp Holliger, entitled "New polymerases for old DNA", was about the engineering of new polymerases to amplify ancient DNA. Joanna Scott (working at Nature) has a very nice report on the talk in her blog.
Continuing with virtual talks, in the past month there were another 3 sessions of the series SciFoo Lives On, organized by Jean-Claude Bradley and also hosted in Second Nature. JC Bradley covered the sessions on his blog: Sept 4 - Definitions in Open Science, Sept 10 - Communicating Science with Video, Sept 24 - Open Notebook Science Case Studies. Additional coverage by other bloggers can be found via the wiki page.
Blog articles
What are some of the most frustrating bottlenecks in bioinformatics research ? Where do we really spend most of our time ? Given that we work with digitized information it should in principle be mostly about the ideas. Thinking about interesting questions, crossing information and interpreting the results. At least for me this is typically not the case. What usually takes time is gathering all the necessary information in a way that can be analyzed. Three blog posts this month discuss this problem. Hari Jayaram and Neil Saunders posted about the problems they faced when attempting to do conceptually simple tasks. In response Deepak wrote a thoughtful post on how science databases should focus also on making the information easily accessible via appropriate APIs.
Moving from online discussions to great examples of open science, we start off with Jeremiah Faith's post where he describes an idea to determine the effect of sequence level mutations on transcription, translation, and noise.
Michael Barton from Bioinformatics Zen created a new blog dedicated to posting about his research on gene expression in yeast. Jump over there to read the many blog posts that he already has there, to provide feedback and maybe find common ground for collaborations.
Also this month, RPM from Evolgen re-started his attempt to publish original research on the blog. He is trying to study the evolution of a duplicated gene in Drosophila. There are two posts covering the introduction to the problem (part 1, part 2).
The last post highlighted in this month's edition is from Benjamin M Good. He has been working on a tool called Entity Describer to add semantic controlled vocabularies to Connotea, and he has posted the manuscript they will try to publish both on his blog and in Nature Precedings (10101/npre.2007.945.2).
This is it for this month. As usual, if anyone is interested in serving as editor for any future edition, tell me by email.
ICSB 2007
I am attending the eighth International Conference on Systems Biology (ICSB 2007) in Long Beach. I typically prefer smaller conferences but this one is probably the best one to get an overview of the recent progress in systems biology. As expected the program has a broad scope and, unlike last year's meeting, there are no parallel sessions so I will have a chance to hear more from other fields. Any other bloggers attending?
Saturday, September 29, 2007
Modular protein domains (an overdue wrap-up)
I did not even cover 1/3 of the Modular Protein Domains workshop in my previous blog post. I will not attempt to do it now after so much time. The organizers were clearly concerned about keeping the information within the group of participants so I will just post some of the general impressions that I took from the meeting.
Specificity profiling in high gear
There were several sessions dedicated to particular protein domains (SH3, SH2 and PDZ in particular) and for all of these there are several projects under way (or mostly completed) to determine the binding specificity of a large number of these domains (although in different species) using phage display, spotted peptides and other methods. We should project ahead and start planning what to do with this information: how to combine it to predict pathways and pathway models with dynamical information. The work of Rune Linding is a very good start at this (see NetworKIN).
Given that the methods are set up I suspect that the emphasis might now shift to exploring the evolution of binding specificities and the impact of disease causing mutations (i.e. profiling binding specificities of domain variants).
Good integration of different methods
Compared to the same meeting two years ago I had the impression that there was a better integration of different approaches (biochemical, structural, computational, etc). A particularly good example was the work of Michael B. Yaffe. There were plenty of structural talks (probably a few too many) but I found particularly interesting the work of Ivan Dikic, who presented extensive novel work on ubiquitin binding domains, and Charalampos Kalodimos, who presented his lab's work on potential functional roles of proline isomerization (Pubmed).
The computational part was well represented too and it was fun to see again Gary Bader and to get to know Philip Kim.
I hope to be there again in two years' time to see how the field has changed.
Bio::Blogs #15 - call for submission
Since there were no volunteers :) I will be hosting the 15th edition of Bio::Blogs here in the blog. I will be gathering some posts from around the web on bioinformatics and other science related topics from the last month and will post about it on the 1st of October. Suggestions are more than welcome. Please email any links to interesting blog posts to bioblogs at gmail dot com.
On a personal note, I have defended my PhD :). This mostly explains the low volume blogging.
The ephemeral journal II
(via Deepak) Earlier this month I posted about how re-grouping of content after publication could be used to foster the creation of more focused online scientific communities. My impression is that these "places" could more easily attract a group of people with similar interests who would be more likely to engage in discussions, in contrast to a place like PLoS ONE that covers way too many topics.
There are several names for these groupings (Nature/BMC gateways, Nature Reports, a topics page) and PLoS came out with another one - Hub. They launched a re-grouping of content focused on Clinical Trials that they call PLoS Hub for Clinical Trials. It is built on Topaz so it has everything that PLoS ONE has (comments, ratings, trackbacks, etc).
They mention on the home page of this Hub that in the future they also plan to "feature open-access articles from other journals plus user-generated content". I suspect that they could go even a bit further on this and give more control to the users for the creation of content for the Hubs and even to create new Hubs. One thing that I like in traditional journals that also creates a feeling of identity and community is the more personal news and views and editorials. PLoS could commission/invite scientists/bloggers to help create this type of content for their Hubs. This would be something like a community blog centered on each Hub's research.
Once upon a time (before Digg if I remember right), we tried to do this in Nodalpoint. For a while we had a queue of articles from bioinformatics-related journals that we could vote on to promote them to the front page of the blog. At the time it did not work very well because of lack of users and participation but in essence it was not very different from what the publishers are trying to do now. Maybe we could try it again :).
Wednesday, September 19, 2007
Vote for your favorite life science blogs
(via Science Hacker and Postgenomic) The Scientist wants to compile a list of life science blogs that people enjoy reading as a reference. It is really not a good question to ask since there are so many different fields and styles of writing.
Tuesday, September 18, 2007
More on open science
I am still catching up with a backlog of feeds and e-tocs but I just noticed that Benjamin Good posted his manuscript on E.D. in Nature Precedings. I went back to his post where he first presented the manuscript to have a look at the comments and there is a nice discussion going on there. It is a good example of the usefulness of posting our work online. There might still be too few people knowledgeable about particular topics to get very good feedback in all areas but this will tend to grow with time.
Michael Barton from Bioinformatics Zen started a new blog to use as an open science notebook about his own research.
I have a mini project in mind about the evolution of domain families that I will start describing and working on here in the blog soon.
Sunday, September 09, 2007
The biology of modular domains (day1 and morning of day2)
I am attending the 3rd (I think it is just the third) conference on modular protein domains. It is a small conference of just 80 people with a very nice environment for discussions. Given the nature of the conference I suspect that a lot of the talks will be about unpublished material so I will be light on the details since I have not personally asked people if I may post about their work.
On the first day of the conference on modular protein domains we had the opening lecture by Wendell Lim. It was a very light and interesting discussion of the evolution and engineering of signaling pathways. Lim started by discussing some interesting results coming from the sequencing of M. brevicollis, a unicellular choanoflagellate that is related to Metazoa and might provide some information about their evolution. It is a continuation of an analysis done by Nicole King and Sean B. Carroll that first identified a receptor tyrosine kinase in M. brevicollis, the first time one was identified outside of the Metazoa. The discussion was generally about the evolution of kinase signaling and how such a system of what Lim called "readers" (phospho-binding domains), "writers" (kinases) and "erasers" (phosphatases) can arise in evolution.
The second part of his talk was about the efforts to understand the evolutionary capacity of signaling networks by trying to engineer new or altered pathways. In this case the focus was on how with few components and small changes in these components it is possible to shape the dynamic responses of signaling networks.
Morning session of the second day
Synthetic biology
The Synthetic Biology sessions started off with a talk by David Searls on "A linguistic view of modularity in macromolecules and networks" (that was not very related to synthetic biology but nevertheless interesting). Searls detailed his views on the analogies between linguistics and biology. Here is a recent review by Mario Gimona on this analogy. At the protein level we could think of sequence, structure, function and protein role as similar to the lexical, syntactic, semantic and pragmatic levels of linguistic analysis.
The general idea of building these bridges over topics is to be able to take existing methods and discussions from one side to the other (see review).
The second talk was by Kalle Saksela and again it had little to do with synthetic biology. Saksela's group is working on high-throughput interaction mapping for human SH3 domains against full proteins (human and viral proteins). They mentioned their progress in expressing and analyzing a subset of these interactions. He mentioned an interesting example where the Nck and Eps8L1 SH3 domain binding site in CD3epsilon overlapped with an ITAM motif such that the phosphorylation of the ITAM motif abolished binding by the SH3 domains. It is a nice example of signaling mediated by different types of peptide binding domains (see paper for details).
The third talk was by Rudolf Volkmer. He gave a short talk on a library of coiled coil proteins. The library contains many single mutant variants of the GCN4 leucine-zipper sequence. They then tested pairs of mutants for heterodimerization by SPOT assays. Aside from extending the knowledge of this domain family, the library can now also be used as a toolkit of binding domains for synthetic biology (the work is already published).
The final talk on this panel was from Samantha Sutton from the Drew Endy lab. This was more like what one would expect from a synthetic biology talk. Samantha Sutton is interested in developing what she calls Post Translational Devices, general abstract devices that can regulate the post translational state of proteins in a predictable fashion. She has a page in OpenWetWare detailing her thoughts on this.
The second panel in the morning was about In silico computational methods.
Cesareni presented their ongoing efforts to experimentally determine human SH3 and SH2 interactions with spotted peptides. He then showed how this data can be used to search for examples where there is overlapping recognition by different domain types. The work is similar in methodology to the paper published by Christiane Landgraf and colleagues in PLoS Biology but now using two domain families and the human proteome.
Vernon Alvarez from AxCell Biosciences, gave a talk about a proprietary database called ProChart (that I cannot find online) containing many domain-peptide interactions tested by the company. He was basically promoting the database for anyone interested in collaborations.
The third talk was by Norman Davey, author of SLIMDisc, a linear motif discovery method. He is trying to improve the method, mostly by improving the statistics.
I gave the second short talk of the session. It was on predicting the binding specificity of peptide binding domains using structural information. It is basically a continuation of some of the work I mentioned before here in the blog about the use of structures in systems biology but now applied to domain-peptide interactions.
Saturday, September 08, 2007
The Biology of Modular Protein Domains
From tomorrow on I will be in Austria for a small conference on the biology of protein domains. I might post some short notes about the meeting in the next few days. I'll get a chance to present some of the things I have been working on about the prediction of domain-peptide interactions from structural data.
Here is one of these modular protein domains, an SH3 domain, in complex with a peptide:
The very short summary of it is that it is possible to take the structure of one of these domains in complex with a peptide (e.g. SH3, phospho binding domains, kinases, etc) and predict its binding specificity. To some extent it is also possible to take a sequence, obtain a model (depending on structural coverage) and determine its specificity. I'll talk more about the details (hopefully) soon.
Tuesday, September 04, 2007
Scifoo Lives On: Definitions in Open Science
I am having a quick look at the session Definitions in Open Science, going on in Second Nature (I'm Duriel Akula in Second Life). The place looks very different from the first time I had a look around the island. It is full of posters and other interesting material. Here is a picture taken as some of the first people started gathering:

Live coverage of the event by Berci (also in the picture).
Wednesday, August 29, 2007
Bio::Blogs #14 - Update
The 14th edition of Bio::Blogs will be hosted by Ricardo at My Biotech Life. It will be made available on the 1st of September and submissions can be sent by email as mentioned in his blog post.
Update: The 14th edition is now posted at My Biotech Life. With all the deadlines I had this past month I left it almost until the end to organize a host. Thanks again to everyone that contributed on such short notice.
Is anyone interested in serving as host for the October edition ?
SciVee.tv background info
A while ago SciVee was announced via several blog posts. Here is a link to the first one I read by Deepak and a link to the cluster in Postgenomic.
I thought at first glance that this was a partnership between some small start-up and a content provider (PLoS). After browsing a couple of the videos I noticed that most are from papers authored by Philip E. Bourne. Given the connections to both PLoS and SDSC (two of the site's partners) I thought that this might be an academic effort after all.
A couple of searches tell us that abailey was responsible for a SciVee mailing list at SDSC that no longer exists. abailey apparently stands for Apryl Bailey, someone involved in the SDSC CI Channel, a "webcast video service and resource for the scientific communities" (from their about page).
Apryl Bailey also appears listed in the SciVee team in one of the slides of a talk (PDF) that Philip Bourne gave in June this year. According to this recent news story, it looks like the launch was actually premature and triggered by this talk:
"According to one founder, Philip Bourne of the University of California–San Diego (UCSD) and founding editor in chief of PLoS Computational Biology, he talked about the project at a scientific meeting and the buzz began prematurely."
It is an academic effort, probably related to this CI Channel mentioned above:
"The project began with some pilot pubcasts done at UCSD to test video formats and has involved the other PLoS editors. There are currently eight people on the SciVee team. The SDSC is providing the site hosting."
From one of the slides of the talk:
Developmental Phases
• Phase I (One Year) – Invite authors of papers published in PLoS journals to upload a video or podcast to SciVee.tv describing the motivation, key results and major conclusions of the published study. Establish linkage between literature and video – source of metadata etc. – September 2007
• Phase II (Years 2-3) - Scrape PubMed on a daily basis and extend the invitation to authors of all papers in the life sciences; develop video authoring server; provide ratings and virtual community comment
• Phase III (Year 4- ) - Extend to other scientific disciplines
Saturday, August 11, 2007
The ephemeral journal
Recently I mentioned the start of yet another journal covering one of the topics I would place at the top of a hype cycle curve. This, together with the apparently ever-increasing number of journals everywhere, got me thinking about the birth and death of science journals. The cost of starting up a new journal is now so low that the turnover can only get higher. Still, we don't typically see a lot of "journal death". Journals are meant to be respected and to build up a reputation among the audience they serve.
It looks inevitable, however, that with limited attention capacity and an ever-increasing number of journals, science hype cycles might have a strong influence on a journal's activities. If hyped-up subjects sprout new journals quickly (e.g. stem cells, systems biology, synthetic biology), underperforming science memes will suffer from lack of attention. If I had a biomedical science publishing house I would probably be thinking of launching a journal to cover metagenomics and another to cover personalized medicine.
Creating and destroying journals based on hype cycles sounds a bit exaggerated, but at least there is no reason to think that a journal is here to stay. This can also happen in a more subtle way, through regrouping of content after publication. Call it a gateway, a report, a topics page or a portal (harder to find); the idea is that there are several ways one can group published papers to serve a target audience. Digital works are not things: they can be in several places and we can slice and dice the views as we wish. One great thing about these views is that they are more likely to attract discussion, since each is more likely to gather a group of people with similar interests. This would be even more so if the users had some power to control the content. Nature Reports allows users to submit papers and to vote on them, but it is still too soon to tell if discussions in topic pages are more frequent than on a site like PLoS ONE.
Instead of subscribing to the high impact journals, and lower impact journals of our topics of interest, we would state our interests in the views/portals/gateways we select to participate in and hopefully the works would be distributed to target audiences as fitting. Things that are of very high perceived impact would be cross-posted to many more views than more specific works. The value could still be perceived either pre or post publication.
The main advantage for the publisher is many more pages with well-targeted audiences. Some of these views could even be of interest to a very wide non-scientific audience. All of this should improve advertising revenue.
Quotes
Another interesting SciView interview is available at Blind.Scientist. Here is one quote from Alexei Drummond (Chief Scientist of Biomatters) that I liked:
"I think that bioinformatics has to become a field where people without programming skills can contribute substantially. I would argue that all of the programmers in bioinformatics should be working very hard to program themselves out of their jobs (and into more satisfying jobs)."
Science advances quickly and so do the computational needs. Can we ever do away with these one-off scripts if there are always new data types and innovative ways of analyzing them? I guess the ideas around workflows and such could lead to very visually oriented programming that anyone can do.
Thursday, August 09, 2007
First issue of IET Synthetic Biology
The first issue of (yet) another journal related to systems and synthetic biology is now online. IET Synthetic Biology will be freely available during this year. This issue covers several works from iGEM, and the editorial is worth a read for a look at the future direction of the journal.
In addition to conventional research and review articles, we see an important need for practical articles describing technical advances and innovative methods useful in synthetic biology. We will encourage submission of technical articles that might describe novel BioBrick components, construction techniques, characterisation of a new biological circuit, new software or a practical ‘hands-on’ guide to the construction of new instrumentation or a biological device.
In addition to the print journal, we are developing associated web resources. These will include a repository of online video resources, specialised review material and research tools for synthetic biology.
Some journals tracking similar fields:
Molecular Systems Biology
BMC Systems Biology
Systems and Synthetic Biology
HFSP Journal
IET Systems Biology
Tuesday, August 07, 2007
A quick post to link out to two new bioinformatic related blogs:
Freelancing science (by Paweł Szczęsny)
Open.nfo (by Keith)
I will be happy the day there are too many to track :).
Updated: It could be the official month of "start your own bioinformatics blog". The bio.struct blog is the third one so far.
Saturday, August 04, 2007
SciFoo starts ...
and I am not there :). No fun ! The Science Foo Camp 2007 has started at Googleplex and there is already some blog coverage. To have a look at what is going on at camp here is a tip from Andrew Walkingshaw:
* http://www.lexical.org.uk/planetscifoo/ - participants’ blogs
* http://flickr.com/photos/tags/scifoo/ - photos
* http://www.technorati.com/tags/scifoo/ - general blogosphere commentary
There are also some live Twitter feeds from Deepak and Nat Torkington.
To start off, go have a look at the pictures posted by Bora; you might recognize one or two of these bloggers.
Maybe next year we can try to organize a Science Barcamp :) Why should they have all the fun?
Friday, August 03, 2007
Bio::Blogs#13
A great edition of the monthly Bio::Blogs is up at Neil's blog. This month there are plenty of tutorials and a round up of blog coverage about the ISMB/ECCB 2007 conference.
PDF version for offline reading of the editorial and highlighted posts is here and here (Box.net copy).
If someone wants to give it a try at editing future editions of Bio::Blogs let me know.
Speaking of community projects, the list of webservers published in the last NAR webserver edition is on this Nodalpoint wiki page. If you try one of these services, spend a minute noting down whether it was even available, whether it worked well, etc.
Wednesday, August 01, 2007
Microattribution
(via Peter Suber) An editorial in Nature Genetics discusses the need to establish microattribution systems:
"When requiring authors to deposit data in public databases, journals, databases and funders should ensure that quantitative credit for the use of every data entry will accrue to the relevant members of the data-producing and annotating teams. In an era in which consortia are producing more (and more useful) papers than individuals and small groups, the careers of individuals are as much in need of specific credit as those of the scientific visionaries and wranglers who hold the consortia together."
This sounds great. From the journals' point of view this would mean "encouraging" authors to link to all resources used. This information would then need to be aggregated and made available to everyone. This and other measures would help to change the current credit system, which tends to reward researchers for publishing papers in high impact factor journals (a measure that does not correlate with individual paper citations) instead of rewarding scientists for the usefulness of their research.
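To make the aggregation step a bit more concrete, here is a minimal sketch in Python of how per-person credit could be tallied from the resource links in papers. Everything in it is an assumption made for illustration: the record layout, the entry identifiers, the names and the counting rule (one credit per paper per data entry) do not come from the editorial or from any existing system.

```python
from collections import Counter


def aggregate_credit(papers, contributors):
    """Tally, per person, how many papers used one of their data entries.

    papers: list of dicts with an "entries_used" list (hypothetical layout).
    contributors: dict mapping a data entry id to the people credited for it.
    """
    credit = Counter()
    for paper in papers:
        # Count each data entry once per paper, even if it is linked twice.
        for entry_id in set(paper["entries_used"]):
            for person in contributors.get(entry_id, []):
                credit[person] += 1
    return credit


# Hypothetical data, for illustration only.
contributors = {
    "entry:A": ["alice", "bob"],
    "entry:B": ["bob"],
}
papers = [
    {"id": "paper-1", "entries_used": ["entry:A", "entry:B"]},
    {"id": "paper-2", "entries_used": ["entry:A", "entry:A"]},
]

credit = aggregate_credit(papers, contributors)
for person, uses in sorted(credit.items()):
    print(person, uses)
```

The hard part in practice would probably not be the counting but getting authors to provide the links in a consistent, machine-readable form in the first place.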
Sunday, July 29, 2007
Bio::Blogs #13 call for submissions
Neil has kindly agreed to host the next edition of Bio::Blogs, due out on the first of August. Send in links to blog posts with bioinformatics/cheminformatics/omics/open science related content to bioblogs at gmail and they will be redirected to him.
Friday, July 27, 2007
Trade books vs Nature publishing
(via Richard Charkin's blog) Richard Charkin is the Chief Executive of Macmillan (Nature Publishing is a subsidiary of Macmillan). He posted his thoughts on why digital books are not as successful as the digital publishing going on at Nature.
I can't help noticing the second reason (my emphasis):
2. Scientific publishing has been intrinsically more profitable than trade book publishing. This allowed the major publishers and societies to invest the significant sums needed to create electronic delivery and storage platforms for scientific information. These platforms are a cornerstone for the creation of a new business and communication model.
and read it as "higher profit margins".
Google code for educators
(via the Google Blog) Google started a website to gather teaching materials for CS educators, covering some of the most recent technologies. Right now it has some material for AJAX Programming, Distributed Systems and Web Security. There are some video lectures and presentations. There is already some material on parallel programming (mostly related to their MapReduce) that should be of use to bioinformatics.
On a related topic, Tiago has started a multipart series on his blog about "Bioinformatics, multi-core CPUs and grid computing". The first and second parts are already available.
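For anyone who has not seen the MapReduce style of programming mentioned above, here is a small single-machine sketch in Python applied to a typical bioinformatics task, counting k-mers across a set of sequences. It only illustrates the map/reduce pattern; it is not Google's implementation and not taken from the teaching material, and the function names and toy sequences are made up for the example.

```python
from collections import Counter
from functools import reduce


def map_kmers(sequence, k=3):
    """Map step: count the k-mers found in a single sequence."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def reduce_counts(left, right):
    """Reduce step: merge two partial k-mer counts into one."""
    return left + right


def count_kmers(sequences, k=3):
    """Apply the map step to every sequence, then fold the partial results."""
    partial_counts = map(lambda seq: map_kmers(seq, k), sequences)
    return reduce(reduce_counts, partial_counts, Counter())


# Toy sequences, for illustration only.
totals = count_kmers(["ATGATG", "GATGA"], k=3)
for kmer, count in sorted(totals.items()):
    print(kmer, count)
```

Since each sequence is processed independently in the map step, that part could in principle be spread over several cores or machines, with only the much smaller partial counts being merged at the end.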
Tuesday, July 24, 2007
Slideshare adds voice
(via TechCrunch) Slideshare, a site to share presentations online, has added voice synchronization. We can now provide a link to an mp3 file and Slideshare provides some tools to sync the audio to the slides, such that each slide is linked to part of the audio track. More information and examples can be found in this FAQ page.
In related news, Bioscreencast now has a group on Facebook.
Saturday, July 14, 2007
Another Open lab book
(Via Open Reading Frame) Jeremiah Faith is giving open notebook science a try and compiling some tips. He joins Rosie Redfield (microbiology) and Jean-Claude Bradley (chemistry) in exposing most of their research online and leading the way in changing the mindset towards open science.
Jeremiah Faith also has an interesting idea about using conference money to pay for advertisement. He figures that well-targeted ads can get you more attention than a talk. I like the idea because it is thinking outside the box, but I think that the type of connection one can create with other people at a conference is not so easy to recreate online. Also, there might not be any need to spend money on advertisement if the blog keeps on topic and is interesting enough to get incoming links. The blog can be a good personal marketing tool.