Monday, June 12, 2006

Science blog carnivals

What is a blog carnival? In my opinion a blog carnival is just a meta-blog: a link aggregation supervised by an editor. Carnivals have been around for some time and there are already some conventions about what to expect from one. You can read this nice post on Science and Politics to get a better understanding of blog carnivals.

Here is a short summary I found on this FAQ:
Blog Carnivals typically collect together links pointing to blog articles on a particular topic. A Blog Carnival is like a magazine. It has a title, a topic, editors, contributors, and an audience. Editions of the carnival typically come out on a regular basis (e.g. every Monday, or on the first of the month). Each edition is a special blog article that consists of links to all the contributions that have been submitted, often with the editor's opinions or remarks.


There are of course science carnivals, and I would say that their numbers are increasing as more people join the science blogosphere. To my knowledge (please correct me :) the first scientific blog carnival was the Tangled Bank, which I think started on the 21st of April 2004 and is still up and running.

These carnivals could also be seen as a path of certification (as discussed in the previous post). The rotating editor reviews submissions and bundles some of them together. This should guarantee that the carnival has the best of what has been posted on the subject in the recent past. The authors gain the attention of anyone interested in the carnival, and the readers get supposedly good-quality posts on the subject. With time, and if there are more blog posts than carnivals, we will likely see some carnivals gaining reputation.

Maybe one day having one of your discovery posts appear in one of the carnivals will be the equivalent of today having a paper published in a top journal.

With that said, why don't we start a computational biology/bioinformatics carnival? :) There might not be enough people for it, but we could make it monthly or something like that. Any suggestions for a name?

Thursday, June 08, 2006

The peer review trial

The day after finding out about PLoS One I saw the announcement of the Nature peer review trial. For the next couple of months, any author submitting to Nature can opt to go through a parallel process of open peer review. Nature is also promoting discussion of the issue online in a forum where anyone can comment. You can also track the discussion going on around the web through Connotea under the tag "peer review trial", or under the "peer review" tag in Postgenomic.

I really enjoyed reading this opinion on "Rethinking Scholarly Communication", summarized in one of the Nature articles. Briefly, the authors first describe (following Roosendaal and Geurts) the functions required of any system of scholarly communication:
* Registration, which allows claims of precedence for a scholarly finding.
* Certification, which establishes the validity of a registered scholarly claim.
* Awareness, which allows actors in the scholarly system to remain aware of new claims and findings.
* Archiving, which preserves the scholarly record over time.
* Rewarding, which rewards actors for their performance in the communication system based on metrics derived from that system.

The authors then try to show that it is possible to build a science communication system where all these functions are not centered in the journal but are separated into different entities.

This would speed up science communication. There is currently a significant delay between submitting a communication and having it accessible to others, because all the functions are centered in the journals and the work is only made available after certification (peer review).

Separating registration from certification also has the potential benefit of enabling parallel certifications. Manuscripts deposited in pre-print servers can be evaluated by the traditional peer-review process in journals, but on top of this there is also the possibility of exploring other ways of certifying the work presented. The authors give the example of Citebase, but blog aggregation sites like Postgenomic could also provide independent measures of the interest in a communication.

More generally, and maybe going a bit off-topic, this reminded me of the correlation between modularity and complexity in biology. By dividing a process into separate and independent modules you allow for the exploration of novelty without compromising the system. The process is still free to go from start to end in the traditional way, but new subsystems can be created to compete with some of the modules.

For me this discussion is relevant for the whole scientific process, not just communication. New web technologies lower the costs of establishing collaborations and should therefore ease the recruitment of the resources required to tackle a problem. Because people are better at different tasks, it does make some sense to increase the modularity of the scientific process.


Monday, June 05, 2006

PLoS One

There is an article in Wired about open access in scientific publishing. It focuses on the efforts of the Public Library of Science (PLoS) to make content freely available by transferring the costs of publication to the authors. What actually caught my attention was this little paragraph:

The success of the top two PLoS journals has led to the birth of four more modest ones aimed at specific fields: clinical trials, computational biology, genetics, and pathogens. And this summer, Varmus and his colleagues will launch PLoS One, a paperless journal that will publish online any paper that evaluators deem “scientifically legitimate.” Each article will generate a thread for comment and review. Great papers will be recognized by the discussion they generate, and bad ones will fade away.

The emphasis is mine. I went snooping around for the upcoming PLoS One and found a page to subscribe to a mailing list. It has a curious banner with the subtitle "open access 2.0".



I found some links in the source code that got me to the prototype webpage. It sounds exactly like what a lot of people have been pushing for: rapid scientific communication, community peer review, continuous revision of the paper (they call these interactive papers) and open access. This will be hard to implement, but if successful it will do much to bring more transparency to the scientific process and increase cooperation between scientists.

There is also something about the name PLoS ONE. They are really betting a lot on this launch if they are calling it ONE. It implicitly states that ONE will be the flagship of PLoS, where any paper (not just biology) can be published.




Wednesday, May 31, 2006

Bringing democracy to the net

Democracy is most often thought of in opposition to totalitarianism; in this case I mean democracy in opposition to anarchy. Some people are raising their voices against the trend of collective intelligence/wisdom of the crowds that has been the hype of the net for the past few years. Wikipedia is the crown jewel of this trend of empowering people for the common good, and probably as a result of the project's visibility it has been the one to take the heat from the backlash.

Is Wikipedia dead, as Nicholas Carr suggests in his blog? His provocative title was flame bait, but he does call attention to some interesting things happening at Wikipedia. Wikipedia is not dead, it is just changing. It has to change to cope with the increase in visibility and vandalism, and to deal with situations where no real consensus is possible.
The system is evolving by restricting anonymous posting and allowing editors to apply temporary editing restrictions to some pages. It is becoming more bureaucratic in nature, with disputes and mechanisms to deal with the discord. What Nicholas Carr said is dead is the ideal that anyone can edit anything in Wikipedia, and I would say this is actually good news.

Following his post on the death of Wikipedia, Carr points to an essay by Jaron Lanier entitled Digital Maoism. It is a bit long but I highly recommend it.
Some quotes from the text:
"Every authentic example of collective intelligence that I am aware of also shows how that collective was guided or inspired by well-meaning individuals. These people focused the collective and in some cases also corrected for some of the common hive mind failure modes. The balancing of influence between people and collectives is the heart of the design of democracies, scientific communities, and many other long-standing projects. "


Sites like Wikipedia are important online experiments. They are trying to develop the tools that allow useful work to emerge from millions of very small contributions. I think this will have to go through some form of representative democracy. We still have to work out ways to establish the governing body in these internet systems: essentially, whom we decide to trust for a particular task or realm of knowledge. For this we will need better ways to define identity online and to establish trust relationships.

Further reading:
Wiki-truth

Friday, May 26, 2006

The Human Puppet (2)

In November I rambled about a possible sci-fi scenario: a person giving away their will to be directed by the masses on the internet. A vessel for the "collective intelligence". A voluntary and extreme reality show.

Well, there goes the sci-fi: you can participate in it in about 19 days. Via TechCrunch I found this site:

Kieran Vogel will make Internet television history when he becomes the first person to give total control of his life to the Internet.
(...)
Through an interactive media platform Kieran will live by the decisions the internet decides such as:

# What time he wakes up
# What he wears
# What he eats
# Who he dates
# What he watches


I get a visceral negative response to this. Although this is just a reality show and it is all going to happen inside a house, I think it will be important to keep in mind. In the future, technology will make the web even more pervasive than it is today, and there are scenarios along the lines of this human puppet idea that could have negative consequences.
I guess what I am thinking is that the same technologies that help us collaborate can also be used to control us (sounds a bit obvious). In the end the only difference is in how much the people involved want to (or can) exercise their will power.

Thursday, May 25, 2006

Using viral memes to request computer time

Every time we point the browser somewhere, dedicating our attention, some computer processing time is used to display the page. This includes a lot of client-side processing, like all the JavaScript in that nice-looking AJAX stuff. What if we could harvest some of this processing power to solve very small tasks, something like grid computing?
How would this work? There could be a video server that would allow me to put a video on my blog (like Google Video), or a simple game, or whatever people would enjoy spending a little time on. During this time a package would be downloaded from the same server, some processing done on the client side, and a result sent back. If people enjoy the video/game/whatever and it goes viral, then it spreads all over the blogs and anyone dedicating their attention to it is contributing computer power to solve some task. Maybe this could work as an alternative to advertising? Good content would be traded for computer power. For comparison, Sun is selling computer power in the US for 1 dollar an hour. Of course this type of very small scale grid processing would be worth much less.
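A rough sketch of that fetch/compute/submit cycle, in Python for readability (in practice it would be JavaScript running in the visitor's browser; the server, the work-unit format and the toy task here are all invented for illustration):

```python
import hashlib

def fetch_work_unit(task_id):
    # Stand-in for downloading a small work package from the
    # hypothetical content server; the format is made up.
    return {"task": task_id, "start": 1000, "end": 2000, "suffix": "00"}

def process_work_unit(unit):
    # The computation that would run while the visitor enjoys the
    # content: a toy search for numbers whose MD5 digest ends with
    # a given hex suffix.
    hits = [n for n in range(unit["start"], unit["end"])
            if hashlib.md5(str(n).encode()).hexdigest().endswith(unit["suffix"])]
    return {"task": unit["task"], "hits": hits}

def submit_result(result):
    # Stand-in for POSTing the result back to the server.
    return result["task"], len(result["hits"])

unit = fetch_work_unit(42)
result = process_work_unit(unit)
task, n_hits = submit_result(result)
```

The point is only that each page view carries a tiny, self-contained chunk of a larger search, so the total throughput scales with how widely the content spreads.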


Wednesday, May 24, 2006

Conference blogging and SB2.0

In case you missed the Synthetic Biology 2.0 meeting and want a quick summary of what happened there, you can take a look at some blogs. There were at least four bloggers at the conference. Oliver Morton (chief news and features editor of Nature) has a series of posts in his old blog. Rob Carlson described how he and Drew Endy used to call the field intentional biology. Alex Mallet from Drew Endy's lab has a quick summary of the meeting, and finally Mackenzie has by far the best coverage, with lots more to read, on his blog cis-action.

I hope they put the recorded talks up on the site, since I missed a lot of interesting things during the live webcast.

On the third day of the meeting (which was not available in the live webcast) there was a discussion about possible self-regulation of the field (as in the 1975 Asilomar meeting). According to an article in NewScientist, the attending researchers decided against self-regulation measures.


Saturday, May 20, 2006

Synthetic Biology & best practices

There is a Synthetic Biology conference going on in Berkeley (webcast here) and they are going to talk about best practices on one of the days. There is a document online outlining some of the subjects up for discussion. In reaction to this, a group of organizations published an open letter to the people attending the meeting.
From the text:
We are writing to express our deep concerns about the rapidly developing field of Synthetic Biology that is attempting to create novel life forms and artificial living systems. We believe that this potentially powerful technology is being developed without proper societal debate concerning socio-economic, security, health, environmental and human rights implications. We are alarmed that synthetic biologists meeting this weekend intend to vote on a scheme of voluntary self-regulation without consulting or involving broader social groups. We urge you to withdraw these self-governance proposals and participate in a process of open and inclusive oversight of this technology.

Forms of self-regulation are not incompatible with open discussion with the broader society, nor with state regulation. Do we even need regulation at this point?


The internet and the study of human intelligence

I started reading a book on machine learning methods last night and my mind floated away to the internet and artificial intelligence (yes, the book is a bit boring :).
Anyway, one thing I thought about was how the internet might become (or already is) a very good place to study (human) intelligence. Some people are very transparent on the net, and if anything the trend is for people to start sharing their lives, or at least their view of the world, earlier. So it is possible to get an idea of what someone is exposed to: what they read, the films they see, some of their life experiences, etc. In some sense you can access someone's input in life.
On the other hand, you can also read this person's opinions when they are presented with some content. Person X, with known past experiences Y, was exposed to Z and reacted in this way. With this information we could probably learn a lot about human thought processes.


A little bit of this a little bit of that ...

What do you get when you mix humans/sex/religion/evolution? A big media hype.
Also, given that a big portion of the scientists currently blogging work on evolution, you also get a lot of buzzing in the science blogosphere. No wonder this paper reached the top spot in Postgenomic.

This is a very good example of the usefulness of blogs and why we should really promote more science communication online. The paper was released as an advance online publication, and a few days later you can already read a lot of opinions about it: not just the blog entries but also all the comments on those posts. As a result we get not only the results and discussion from the paper but also the opinions of whoever decided to participate in the discussion.

Wednesday, May 17, 2006

Postgenomic greasemonkey script (2)

I have posted the Postgenomic script I mentioned in the previous post on the Nodalpoint wiki page. There are some instructions there on how to get it running. If you have problems or suggestions, leave a comment here or in the Nodalpoint forum. Right now it is only set to work with the Nature journals, but it could be made to work with more.


Saturday, May 13, 2006

Postgenomic script for Firefox

I am playing around with Greasemonkey to try to add links to Postgenomic on journal websites. The basic idea is to search the webpage you are viewing (a Nature website, for example) for papers that have been discussed in blogs and are tracked by Postgenomic. When one is found, a little image is added with a link to the Postgenomic page covering the paper.
The result is something like this (in the case of the table of contents):


Or like this when viewing the paper itself:


In another journal:


I am more comfortable with Perl, but anyway I think it works as a proof of principle. If Stew agrees I'll probably post the script on Nodalpoint for people to improve or just try out.
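The core matching logic is simple enough to sketch. Here it is in Python rather than the actual Greasemonkey JavaScript, with an invented set of tracked DOIs and an invented Postgenomic URL scheme, just to show the idea:

```python
import re

# Invented examples; the real script would ask Postgenomic which
# papers it is tracking.
TRACKED_DOIS = {"10.1038/nature04670", "10.1038/nbt1096"}

DOI_PATTERN = re.compile(r'10\.\d{4,}/[^\s"<>]+')

def annotate_page(html):
    # For every DOI found on the page that Postgenomic tracks,
    # append a small link to its coverage page (illustrative URL).
    def add_link(match):
        doi = match.group(0)
        if doi in TRACKED_DOIS:
            return doi + ' <a href="http://postgenomic.com/?doi=' + doi + '">[blogged]</a>'
        return doi
    return DOI_PATTERN.sub(add_link, html)
```

The real script does the same thing at page-load time, inserting an image element next to each matched paper instead of a text link.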

Thursday, May 11, 2006

Google Trends and Co-Op

There are some new Google services up and running and buzzing around the blogs today. I have only briefly taken a look at them.
Google Trends is like Google Finance for any search trend you want to analyze. Very useful for someone wanting to waste time instead of doing productive work ;). You can compare the search and news volume for different terms like:

It gets its data from all Google searches, so it does not really reflect trends within the scientific community.

The other new tool out yesterday is Google Co-op, the start of social search for Google. It looks as obscure as Google Base, so I can again try to make some weird connection to how researchers might use it :). It looks like Google Co-op is a way for users to further personalize their search. Users can subscribe to providers that offer their knowledge/guidance to shape some of the results they see in their searches. If you search, for example, for alzheimer's, you should see at the top of the results some refinements that you can make, for example looking only at treatment-related results. This is possible because a list of contributors has labeled a lot of content according to some rules.

Anyone can create a directory and start labeling content following an XML schema that describes the "context". So anyone, or (more likely) any group of people, can add metadata to content and have it available in Google. The obvious application for science would be to have metadata on scientific publications available. Getting Connotea and CiteULike data into a Google directory, for example, would be useful. These sites could go on developing their niche-specific tools, but we would benefit from having a lot of the tagging metadata available in Google.


Wednesday, May 10, 2006

Nature Protocols

Nature is clearly the most innovative of the publishing houses, in my view. A new web site called Nature Protocols is up in beta:

Nature Protocols is a new online web resource for laboratory protocols. The site, currently in beta phase, will contain high quality, peer-reviewed protocols commissioned by the Nature Protocols Editorial team and will also publish content posted onto the site by the community

They accept different types of content:
* Peer-reviewed protocols
* Protocols related to primary research papers in Nature journals
* Company Protocols and Application notes
* Non peer-reviewed (Community) protocols

There are already several protocol websites out there, so what is the point? For Nature, I guess it is obvious. Just like most portal websites, they are creating a very good place to put ads. I am sure that all these protocols will have links to related products and plenty of ads. The second advantage for Nature is the stickiness of the service: more people will come back to the website to look for protocols and stumble onto Nature content, increasing visibility for the journals and their impact.

A little detail is that, as they say above, the protocols from papers published in the Nature journals will be made available on the website. On one hand this sounds great, because the methods sections in the papers are usually so small (due to publication restrictions) that they are most of the time incredibly hard to decipher (and usually pushed into supplementary materials). On the other hand, this will further increase the tendency to hide away from the paper the really important parts of the research, the results and how these were obtained (methods), and to show only the subjective interpretations of the authors.
This reminds me of a recent editorial by Gregory A Petsko in Genome Biology (sub only). Here is how he states the problem :) - "The tendency to marginalize the methods is threatening to turn papers in journals like Nature and Science into glorified press releases."

For scientists this will be a very useful resource. Nature has a lot of appeal and will be able to quickly create a lot of really good content by inviting experienced scientists to write up their protocols, full of tips and tricks accumulated over years of experience. This is the easy part for science portals: the content comes free. If somebody went to Yahoo and told them that scientists actually pay scientific journals to please, please show their created content, they would probably laugh :). Yahoo/MSN and other web portals have to pay people to create the content on their sites.

web2.0@EMBL

The EMBL Centre for Computational Biology has announced a series of talks on novel concepts and easy-to-use web tools for biologists. So far there are four scheduled talks:

Session 1 - Using new web concepts for more efficient research - an introduction for the less-techy crowd
Time/place: Tue, May 16th, 2006; 14:30; Small Operon

This one, I think, will introduce the concepts around what is called web 2.0 and the potential impact these might have for researchers. I am really curious to see how big the "less-techy crowd" will really be :).

The following sessions are a bit more specific, dealing with particular problems we might have in our activities and how some of the recent web technologies can help us deal with them.

Session 2 - Information overflow? Stay tuned with a click (May 23rd, 2006; 14:30;)
Session 3 - Tags: simply organize and share links and references with keywords (May 30th, 2006; 14:30)
Session 4 - Stop emailing huge files: How to jointly edit manuscripts and share data (June 6th, 2006; 14:30;)
All in the Small Operon, here at EMBL Heidelberg.

I commend the efforts of the EMBL CCB and I hope that a lot of people turn up. Let's see if the open collaborative ideas come up in the discussions. If you are in the neighborhood and are interested, come on by and help with the discussion (map).


Tuesday, April 25, 2006

Engineering a scientific culture

In a commentary in Cell, Gerald Rubin describes Janelia Farm, the new research campus of the Howard Hughes Medical Institute. If you cannot access the commentary, there is a lot of information available on the website such as this flash presentation (oozing with PR talk).

In summary (as I understood it), the objective is to create a collaborative working environment where scientists can explore risky, long-term projects without having to worry about applying for grants and publishing on a very regular basis.
Group leaders at Janelia Farm will:
- have small groups (two to six people)
- not be able to apply for outside funding
- still work at the bench

Unless you are really interested in managing resources and all the hassle of applying for grants, this sounds very appealing.

Also, there is no limit on how long a group leader can stay at Janelia Farm, as long as they pass a review process every 5 years. This is unlike, for example, here at EMBL, where most people are forced to move on after 9 years (with a review process after 5 years).

Since the main objective of Janelia Farm is to work on long-term projects that can have significant impact, the review process will not focus on publications but on more subjective criteria like:
"(1) the ability to define and the willingness to tackle difficult and important problems; (2) originality, creativity, and diligence in the pursuit of solutions to those problems; and (3) contributions to the overall intellectual life of the campus by offering constructive criticism, mentoring, technical advice, and in some cases, collaborations with their colleagues and visiting scientists"

Sounds like a researcher's paradise :) Do the science, we will do the rest for you.
It will be interesting to see in a few years if they manage to create such an environment. The lack of very objective criteria and the absence of a limit on the stay at the campus might lead to some corruption.

Friday, April 21, 2006

Posting data on your blog

From Postgenomic I read this blog post on science blogs at Science and Politics. Bora Zivkovic describes in his post the different types of science blogging, with several examples. The most interesting part for me was his discussion of posting hypotheses and unpublished data. I was very happy to see that he already has some posts with his own unpublished data and that the discussion about science communication online is coming up in different communities.

His answer to the scooping problem:
But, putting data on a blog is a fast way of getting the data out with a date/time stamp on it. It is a way to scoop the competition. Once the data are published in a real Journal, you can refer back to your blog post and, by doing that, establish your primacy.

There are some problems with this. For example, people hosting their own blogs could try to forge the dates, so it would be best to have a third party time-stamping the data. Postgenomic would be great for this: there could be another section in the aggregator to track posts with data. Some journals will probably complain about prior publication and decline to publish something already seen in a blog.
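As a sketch of what third-party time-stamping could look like: the blogger sends only a digest of the data, and the third party records it together with the time it was seen. (The record format here is invented; a real service would also sign its records.)

```python
import hashlib
import datetime

def fingerprint(data):
    # A digest identifies the dataset without revealing it, and
    # cannot be produced after the fact without the same data.
    return hashlib.sha256(data).hexdigest()

def timestamp_record(data, now=None):
    # What the third party would archive: digest plus arrival time.
    now = now or datetime.datetime.now(datetime.timezone.utc)
    return {"sha256": fingerprint(data), "seen_utc": now.isoformat()}

record = timestamp_record(b"gene,expression\nYFG1,2.31\n")
```

Establishing primacy later only requires showing the original data and pointing to the archived digest and date.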

The problems with current publishing systems, and the agonizing feeling of seeing your hard work published by other people, will probably help drive some change in science communication. Blogging data would make science communication more real-time and transparent, hopefully reducing the amount of wasted resources and the frustrations with overlapping research.

This is a topic I come back to once in a while, so I have mentioned this here before. The stream-like format of the blog makes it hard to keep posting all the relevant links on the topic, so I think from now on I will just link to the last post on the topic, to at least form a connected chain.

Tuesday, April 11, 2006

Stable scientific databases

The explosion of scientific data coming from high-throughput experimental methods has led to the creation of several new databases of biological information (protein structures, genomes, metabolic networks and kinetic rates, expression data, protein interactions, etc). Given that funding is generally awarded for a limited time and for defined projects, it is possible to obtain money to start a database project, but it is very difficult to obtain a stable source of funding to sustain a useful database. I have mentioned this before, more than once, when talking about the funding problems of BIND.
In this issue of The Scientist there is a short white paper entitled "Save our Data!". It details the recommendations of the Plant Genome Database Working Group on the problems currently faced by life science databases.

I emphasize here four points they make (keeping their numbering):
2. Develop a funding mechanism that would support biological databases for longer cycle times than under current mechanisms.
3. Foster curation as a career path.
6. Separate the technical infrastructure from the human infrastructure. Many automated computational tasks do not require specialized species- or clade-specific knowledge.
7. Standardize data formats and user interfaces.


The first and last points were also discussed in a recent editorial in Nature Biotech.

What was a bit of a surprise for me was their 3rd point, on fostering curation as a career path. Is it really necessary to have professional curators? I am a bit divided between a more conservative approach to data curation, with a team of professional curators, and a wisdom-of-the-crowds approach where tools are given to the communities and they solve the curation problems. I think it would be more efficient to find ways to have the people producing the data curate it automatically into the databases. For this to happen it has to be really easy and immediate to do. I still think that journals are the only ones capable of enforcing this process.

Their 6th point is surely important even if the curation effort is pushed back to the people producing the data. It is important to make the process of curating the data as automatic and easy as possible.

Friday, April 07, 2006

Retracted scientific work still gets cited

Science has a news focus on scientific misconduct. One particular study tracked citations of papers that had already been retracted and found that scientists keep citing retracted papers.
Some editors contacted by Science said that they do not have the resources to look up every citation in every paper to help purge the literature of citations to retracted work. In my opinion this is not such a complicated problem. If journals agreed to submit all retractions to a central repository, then citations could very easily be checked against the database and removed. Even with such an automatic system, scientists should have the responsibility of keeping up with the work being retracted in their fields.
Since retractions are publicly announced by the journals, PubMed already has some of this information available. If you search PubMed for "retraction" in the title, you can see several of these announcements (not all are retractions). In some cases, when you search for the title of a retracted paper, you can see in PubMed a link to the retraction, but this is not always the case. All that is needed is for publishing houses to agree on a single format for publishing retractions, and for repositories to make sure all retractions are appended to the original entries for the publication.
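With such a central repository in place, the automated check is essentially a set lookup. A minimal sketch (the registry and the identifiers here are invented; a real one would key on PubMed IDs or DOIs):

```python
# Hypothetical central registry of retracted papers.
RETRACTED = {"pmid:15001234", "pmid:16009876"}

def flag_retracted(cited_ids):
    # Screen a manuscript's reference list against the registry and
    # return the citations that point to retracted work.
    return sorted(set(cited_ids) & RETRACTED)

flags = flag_retracted(["pmid:15001234", "pmid:14998877", "pmid:16112233"])
```

Journals could run exactly this check at submission time, which is why the missing piece is the shared registry and format, not the technology.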

Tuesday, April 04, 2006

Viral marketing gone wrong

The social internet has emerged as ideal ground for marketing. People enjoy spreading news, and on the internet meme spreading sometimes resembles a viral infection propagating through the network.
Some companies, like Google, have built their success on this type of word-of-mouth marketing. If you can get a good fraction of the social internet attached to your products in such a way that they want to tell their friends all about them, you don't have to spend money on marketing campaigns.
The important point is that a fraction of people must be engaged by the meme: they must find it so cool and interesting that they just have to go and tell their friends and infect them with the enthusiasm. How do you do this? That's the hard part, I guess.
So the marketing geniuses at Chevrolet decided to try their hand at viral marketing. To get people engaged, they decided to have the masses build the ads. We usually like what we build and want to show it to our friends, so the idea actually does not sound so bad, right?! :) Well, this would have been a fantastic marketing idea if most people actually had good things to say about the product.

Here is an example of the videos coming out from the campaign:


I worried before that this type of marketing could be a negative consequence of science communication online, but these examples show that directing attention alone is not enough: people will judge what they find and are free to criticize.

Monday, April 03, 2006

The Human interactome project

Marc Vidal has a letter in The Scientist urging scientists and funding agencies to increase efforts to map all human protein interactions. He suggests that different labs work on different parts of the huge search space (around 22000^2 pairs, excluding splice variants) and, of course, that funding agencies give out more money to support the effort. He makes an interesting point when he compares funding for genome projects with interactome mapping. I also think that interactome mapping should be viewed in the same way as genome sequencing, and that the money invested would certainly result in significant progress in basic and medical research.
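To make that scale concrete: 22000^2 counts ordered pairs, and the number of distinct unordered pairs to actually test is n(n-1)/2, still in the hundreds of millions:

```python
n = 22000                  # approximate number of human protein-coding genes
pairs = n * (n - 1) // 2   # distinct unordered pairs (self-interactions excluded)
# on the order of 2.4e8 pairwise tests
```

Either way, the search space is far too large for any single lab, which is the argument for dividing it up.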
The only thing I would add to my own wish list is that some groups would start comparative projects at the same time. Even if it takes longer to complete the human interactome it would be much more informative to have of map of the ortholog proteins in a sufficiently close species to compare with (like mouse). Alternatively some funding could go specifically to comparative projects studying for example the interactomes of different yeasts (it is easy to guess that I would really really like to have this data for analysis :).


Friday, March 31, 2006

Get ready for the 1st of April storm

Tomorrow is April Fools' Day and there is a long tradition in the media of putting out jokes on this day. Some years ago this was, for me, almost unnoticeable. I knew that the newscasts on the different TV channels would have at least one spoof story, and maybe I would notice the joke in one or two newspapers if I actually read one that day. These days I get almost everything from the internet, and no longer just from a handful of sources: it comes from tons of media sites, blogs and aggregators. So every year that I am more connected, I notice more how the 1st of April is the day when everybody goes nuts on the web. This year it even started early, as you can see from this goldfish story in The Economist. Maybe Spitshine's post on quitting blogging was also an example of an early April fool ;).

Tuesday, March 21, 2006

Wiki-Science

From Postgenomic (now on Seed Media Group servers) I picked up this post with some speculations on the future of science. It is a bit long but interesting. It was written by the former editor of Wired magazine, so it is naturally biased toward speculation about technological change.

My favorite prediction is what he called Wiki-Science:

"Wiki-Science - The average number of authors per paper continues to rise. With massive collaborations, the numbers will boom. Experiments involving thousands of investigators collaborating on a "paper" will commonplace. The paper is ongoing, and never finished. It becomes a trail of edits and experiments posted in real time - an ever evolving "document." Contributions are not assigned. Tools for tracking credit and contributions will be vital. Responsibilities for errors will be hard to pin down. Wiki-science will often be the first word on a new area. Some researchers will specialize in refining ideas first proposed by wiki-science."

I am trying to write a paper right now, and just last week the thought crossed my mind of simply doing it online in Nodalpoint's wiki pages and inviting some people to help/evaluate/review. However, I am not sure that my boss would agree with the idea, and honestly I am a bit afraid of losing the chance of publishing this work as a first author. Maybe when I get this off my hands I'll try to start an open project on a particular example of network evolution.

Links on topic:
Nodalpoint - projects ; collaborative research post
Science 2.0
Looking back two years ago - M$ vs GOOG

I was reading a story today about Bill Gates' keynote lecture at the Mix'06 conference, and I remembered posting something on the blog when I first saw a story about Microsoft moving into the search market. This is one of the funny things about having a blog: I can go back to what I was reading and thinking at some point in the past. So, judging from that previous post, Microsoft started reacting to the rise of Google more than two years ago. In retrospect it was really hard to predict the impact of web2.0 and the free software/ads model. Judging by Gates' speech, only now has Microsoft completely turned in this direction, so I guess it takes some time to turn such a big boat. They managed before (see the browser wars) to steer the company into the internet era and maintain dominance; let's see how they keep up this time with Google, Yahoo, Amazon, etc.

Looking back on some of the posts from that time, I realize how much my blogging habits have changed. In the beginning I used the blog more like a link repository with short comments, while currently I tend to blog more about my opinion on a topic. I'll check again some years from now, if I don't quit in the meantime :).

Wednesday, March 08, 2006

Comparative Interactomics on the rise

I am sorry for the buzzwords, but I just wanted to make a point about the exaggerated trend. Following the post on Notes from the Biomass, I picked up the paper by Gandhi et al in Nature Genetics. The authors analyzed the human interactome from the Human Protein Reference Database, comparing it to protein interaction networks from other species. Honestly, I was a bit surprised to see so few new ideas in the paper, and I agree with the post in Notes that they should have cited some previous work. For example, the paper by Cesareni et al in FEBS Letters includes a similar analysis between S. cerevisiae and D. melanogaster. Also, the people working on PathBlast have shown that it may be more informative to look for conserved sub-networks instead of the overlap between binary interactions. I am personally very interested in network evolution and I was hoping the authors would elaborate a bit more on the subject. As usual, they just attribute the small overlap to low coverage. Is it so obvious that species that diverged 900My to 1By ago should have such similar networks ?

As was the case with comparative genomics, the ability to compare cellular interaction networks of different species should be far more informative than looking at individual maps. Unfortunately, it is still not as easy to map a cellular interaction network as it is to get a genome.
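To make the idea of comparing networks across species concrete, here is a minimal sketch of the binary-overlap ("interolog") comparison discussed above. Everything in it is invented for illustration: the interactions, the ortholog pairs, and the function name are all toy assumptions, not real data or any published method.

```python
# Toy sketch of comparing two interaction networks through an ortholog map.
# All interactions and ortholog pairs below are made up for illustration.

def conserved_interactions(net_a, net_b, orthologs):
    """Return interactions from species A whose ortholog-mapped pair
    is also an interaction in species B (a conserved 'interolog')."""
    net_b_edges = {frozenset(edge) for edge in net_b}
    conserved = set()
    for p1, p2 in net_a:
        if p1 in orthologs and p2 in orthologs:
            mapped = frozenset((orthologs[p1], orthologs[p2]))
            if mapped in net_b_edges:
                conserved.add(frozenset((p1, p2)))
    return conserved

# Hypothetical yeast interactions and (invented) human ortholog assignments
yeast_net = [("CDC28", "CLN2"), ("CDC28", "CLB5"), ("ACT1", "COF1")]
human_net = [("CDK1", "CCNB1"), ("ACTB", "CFL1")]
orthologs = {"CDC28": "CDK1", "CLB5": "CCNB1", "ACT1": "ACTB", "COF1": "CFL1"}

hits = conserved_interactions(yeast_net, human_net, orthologs)
print(len(hits))  # 2: two of the three yeast interactions map onto human ones
```

In real comparisons the hard part is exactly what the papers above argue about: low coverage of the maps makes the overlap small whether or not the networks are truly diverged.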

Just out of curiosity, I think the first time the buzzwords "comparative interactomics" were used in a journal was in a News and Views by Uetz and Pankratz in 2004. Since then, I think two papers have picked up on the term, as you can see in this PubMed search (which might change with time).

Monday, March 06, 2006

Marketing and science

I just spent 48 minutes watching this video of Seth Godin speaking at Google about marketing. He talks a lot about how important it is to put out products that have a story, that compel people to go and tell their friends. This type of network marketing is usually referred to as viral marketing (related to memetics). It is a really nice talk (he knows how to sell his ideas :) and it got me thinking about marketing in science.

The number of journals and publications keeps growing at a fast pace. Out of curiosity, I took from PubMed the number of publications published every year for the past decade, and we can clearly see that the trend for the near future is, if anything, further acceleration in the rate of publication.
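For anyone curious, per-year counts like these can be pulled from PubMed programmatically via the NCBI E-utilities. The sketch below builds an ESearch URL with `rettype=count` and a `[pdat]` date term and parses the `<Count>` field from the reply; the parameters are as I understand the E-utilities to work, so check the official documentation before relying on them:

```python
import re
import urllib.parse

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def count_url(year):
    """Build an ESearch URL asking only for the number of PubMed
    records whose publication date falls in the given year."""
    params = {"db": "pubmed", "term": f"{year}[pdat]", "rettype": "count"}
    return EUTILS + "?" + urllib.parse.urlencode(params)

def parse_count(xml_text):
    """Extract the <Count> value from an ESearch XML reply."""
    match = re.search(r"<Count>(\d+)</Count>", xml_text)
    return int(match.group(1)) if match else None

# Fetching for real would be urllib.request.urlopen(count_url(2005)).read();
# here we just parse a sample reply of the expected shape.
sample = "<eSearchResult><Count>612345</Count></eSearchResult>"
print(count_url(2005))
print(parse_count(sample))  # 612345
```

Looping `count_url` over a range of years (with a polite delay between requests) reproduces the per-year series mentioned above.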

The other important point is that the internet is probably changing the impact an individual paper can have, irrespective of where it is published. With the internet it is easier than before for word-of-mouth (meaning emails, blogs, forums, etc.) to raise awareness of good or controversial work.
So what I am getting at is that, on one hand, the internet will likely help individual publications get the attention they deserve, but on the other hand it will increase the importance of marketing in science. Before, to get attention, your work needed to be published in journals that were available in the libraries; now, and I suspect increasingly so in the future, you will need people talking about your work so that it rises above the thousands of publications published every year. It is too soon to say which I prefer.

Tuesday, February 28, 2006


Track your comments with coComment

I am finally giving the coComment service a try and will be experimenting with it for a while here on the blog. coComment aims to help us track the conversations we have on other blogs by aggregating comments and checking for replies. You decide whether or not to track a comment before submitting it, and the tracked comments appear on your homepage at coComment. Alternatively, you can read them via RSS feed or with a coComment box on your own blog. The box can be customized a lot, so you could do a much better job than I did of integrating it into your blog :).

You can find a lot more about this in these two posts on TechCrunch

Wednesday, February 15, 2006

Postgenomic

I have been wishing for some time that someone would come up with a science.memeorandum, and now there is one: Postgenomic. The site, created by Stew (still in beta), aims to aggregate discussions going on in the life science blogs about papers, conferences and general science news. This adds a needed feedback loop to the science blogosphere and will therefore, in my opinion, increase the quality of discussion.
This site could, for example, become an excellent repository for comments on papers. Instead of adding a comment on a paper on the journal website, you can now just blog about it and the content gets aggregated on Postgenomic. I am not sure, but I think we could make a Greasemonkey script that checks the current web page for a DOI, sees if there are reviews about it on Postgenomic, and adds a little link somewhere.
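The DOI-spotting half of such a script is easy to prototype. The actual Greasemonkey userscript would of course be JavaScript; the sketch below just shows the matching logic in Python, and the Postgenomic lookup URL is purely hypothetical (I have not checked whether any such endpoint exists):

```python
import re

# DOIs start with "10." followed by a registrant code, a slash and a suffix
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s"<>]+')

def find_dois(page_text):
    """Return all DOI-looking strings found in a page's text."""
    return DOI_PATTERN.findall(page_text)

def postgenomic_link(doi):
    """Hypothetical lookup URL -- the real endpoint would have to be
    checked against whatever interface Postgenomic actually exposes."""
    return f"http://postgenomic.com/paper.php?doi={doi}"

html = "Full text at doi:10.1038/ng1747 (Nature Genetics)."
dois = find_dois(html)
print(dois)  # ['10.1038/ng1747']
```

The userscript version would run `find_dois` over `document.body.textContent` and inject a small link next to each hit.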

Some more links about it:
Nodalpoint
Notes and more Notes

Tuesday, February 14, 2006

The search wars turn ugly

What will convince you to change your search engine ? So far it has been all about who gives the best results and who indexes the largest number of pages. Now it looks like the number two (Yahoo!) and number three (MSN) search engines are considering paying you to switch. How does MSNSearchAndWin sound ? I also thought it was some kind of joke, but you can try it yourself.
To be fair, someone also mentioned that Google is thinking of paying Dell to have Google software pre-installed on new computers.

I would prefer it to be about innovation and not just about who has more cash to give to users. It even sounds a bit ridiculous: not only is it free, they are thinking of paying us to use it. Very competitive market.

Monday, February 13, 2006

BIND in the news

There is another editorial in the latest issue of Nature Biotech about database funding. It focuses on BIND, explaining the growth and later decline (due to lack of funding) of this well-known interaction database. Last December, BIND and other Blueprint Initiative intellectual property was bought by Unleashed Informatics but, as far as I can understand, this deal merely keeps the database available on the site; there will be no further updating for now. Knowing that both BIND and Unleashed were created within the Blueprint Initiative led by principal investigator Christopher Hogue (also Chief Scientific Officer of Unleashed Informatics), the deal was probably just symbolic and a way to increase the value of the company.

According to the Nature Biotech article, BIND used up "$17.3 million in federal and Ontario government funding and another $7.8 million from the private sector" to create its value. Without the details, it looks strange that so much value, mostly built with public money, ends up in a private company. Unleashed had to agree to keep access to the existing data free for all, and I guess it will use BIND to attract possible buyers to their tools.

Christopher Hogue posted a pessimistic comment here some time ago about the future of databases in general. The editorial in Nature Biotech argues that two important steps would allow for more permanent databases. The first would be for the major funding agencies to accept and discuss the need for longer-lived databases. The second would be to create mechanisms to decide which databases should be recognized as mature standards.

I thought that, with examples like PubMed, the sequence databases and the PDB, the need for long-lived databases would be obvious by now to the funding bodies. The second step is a bit more tricky. Creating a minimal and stable standard for a type of data is a complicated process, and it is not obvious at what point a database supports such a community of researchers that it would make sense to give it maintenance funding.


Some thoughts from Neil and Spitshine

A similar discussion in Nodalpoint

Monday, February 06, 2006

Become a fonero and change the world

Today I read about FON, a global community of people who share wi-fi access. They just made the news because they announced support from several well-known companies (Google, Skype, Sequoia Capital, and Index Ventures) that will surely catapult FON into the sky. The basic idea is to turn any wifi router into a hotspot and have people share their internet connection, by installing some software on their routers or buying pre-configured wireless routers from the company. You can only use other people's FON hotspots if you are paying an ISP at home, so this is also good for the internet service providers. You can try to make money with your FON hotspot (they call these users Bills) or you can be more utopian and give away your internet connection for free (and be called a Linus). If you do not have a FON account you are called an alien; you can still connect to a FON hotspot, but you will have to pay, just like at any hotspot (and the ISPs get some money from this as well).
At first glance it looks like an all-win scenario, but only time will tell. It is certainly one case where the more people join, the better the service will become, and if this gets off the ground then once you pay for a connection at home you have one almost everywhere.

This is one of those simple utopian ideas with enough practical sense to make an impact so I think I will give it a try :).

Monday, January 30, 2006

I usually don't do this but ..

This is a really good blonde joke. Gotta love infectious silly memes.

Sunday, January 29, 2006

BioPerl has a new site

If you use BioPerl, go have a look at the redesigned site. From the full announcement at OBF:

"I am pleased to announce the release of a new website for BioPerl. The site is based on the mediawiki software that was developed for the wikipedia project. We intend the site to be a place for community input on documentation and design for the BioPerl project. There is also a fair amount of documentation started surrounding bioinformatics tools and techniques applicable to using BioPerl and some of the authors who created these resources."

Friday, January 27, 2006

Meta bloguing

I changed a couple of things in the blog template. If anybody reads this with an aggregator and all previous posts appear as updated, please let me know.
I added a new section on the right bar where I plan to keep some previous posts that might be interesting to discuss. I had this change in mind after reading this post in Notes from the Biomass about blogging. It is true that blogging platforms don't make it easy to revisit ideas. I'll try to find other ways to do this.

I also updated the blogroll with some links: Neil's blog and Yakafokon on bioinformatics, some tech blogs I particularly like, and the blog of a Portuguese friend of mine.
Our Collective Mind II

Some time ago I posted an unusually short text about collective intelligence. I think it was motivated by the web2.0 explosion: all the blogging, the social websites, and the layer of other services tracking these human activities in real time. The developments of the last 2-3 years were not so much a question of technical innovation, since most of the tools were already developed; it was mostly a massification effect. A lot more people started to participate online instead of just browsing. This participation is very easy to track, and we have automatic services that can, for example, tell us what people are currently talking about. One can think of these services as a form of self-awareness. If you go to tech.memeorandum you can see a computer algorithm tracking the currently most talked-about subjects in technology and organizing them into conversations. This does not mean that the web can understand what is being talked about, but it is, in a sense, self-aware.

I read today a (very long) post by Nova Spivack on this subject of self-awareness, in which he proposes that we should build this on a large scale. Although I agree that these types of services are very useful, I am not sure that one should purposely try to build some form of collective intelligence on such abstract terms. The idea of having everything collected under the same service feels too restrictive and not very functional. I would prefer a diversity-and-selection approach: just let the web decide. There is a big market for web services right now and I don't see it fading any time soon. Therefore, if collective intelligence is possible and useful, services will rapidly be built on top of each other to produce it.

If you have any interest in the topic and endorse his opinion, write a post and trackback to him.

Wednesday, January 18, 2006

Power law distributions

Almost every time a lot of hype is built around an idea, there is a general backlash against that very same idea. In technology this happens regularly, maybe due to a snowball effect that leads to abuse. Initially a new concept is proposed that leads to useful new products, and this in turn increases interest and funding (venture capital, etc). In response, several people copy the concept or merely tag their work with the same buzz to attract interest. Soon enough everyone is doing the same thing and the new concept reaches world fame. At this point it is hard to find actually good products based on the initial idea among all the noise. For a recent tech example, just think of the web2.0 meme. Every startup nowadays releases its projects in beta with some sort of tagging/social/mash-up thing. The backlash is already happening for web2.0.

What about the title ?
I had already mentioned a review article about power-law distributions in which the author voiced some concern over the exaggerated conclusions researchers draw from observing these distributions in complex networks. Is the backlash coming for this hype?

Recently, Oliveira and Barabasi published yet another paper on the ubiquity of power laws. This time it was about the correspondence patterns of Darwin and Einstein, where they claim that the time delays of the replies follow a power law. This work is similar to earlier work by Barabasi on email correspondence. Soon after, a comment was published in Nature suggesting that the data is a better fit for a lognormal distribution, and this generated some discussion on the web. There are also claims that similar previous work using the same data was not properly cited.

The best summary of the whole issue comes in my opinion from Michael Mitzenmacher:
"While the rebuttal suggests the data is a better fit for the lognormal distribution, I am not a big believer in the fit-the-data approach to distinguish these distributions. The Barabasi paper actually suggested a model, which is nice, (...) anyone can come up with a power law model. The challenge is figuring out how to show your model is actually right."

Other papers have recently also raised questions about the quality of the data underlying some of these studies. Is life all log-normal after all :) ?
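The fit-the-data approach that Mitzenmacher is skeptical of is easy to illustrate. Below is a rough sketch, on toy data and using textbook maximum-likelihood estimators (the power-law MLE in the Clauset/Newman style), that fits both a power law and a lognormal to the same sample and compares log-likelihoods. On data actually drawn from a lognormal, the lognormal wins, which is exactly why a good fit alone proves little about mechanism:

```python
import math
import random

def powerlaw_loglik(data, xmin):
    """Continuous power law p(x) ~ x^-alpha for x >= xmin, with the
    standard MLE alpha = 1 + n / sum(ln(x/xmin)). Returns (alpha, ll)."""
    logs = [math.log(x / xmin) for x in data]
    n = len(data)
    alpha = 1 + n / sum(logs)
    ll = n * math.log(alpha - 1) - n * math.log(xmin) - alpha * sum(logs)
    return alpha, ll

def lognormal_loglik(data):
    """Lognormal fitted by its MLE: mean and std of log(x).
    Returns (mu, sigma, ll)."""
    logs = [math.log(x) for x in data]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((l - mu) ** 2 for l in logs) / n)
    ll = (-sum(logs) - n * math.log(sigma)
          - n / 2 * math.log(2 * math.pi) - n / 2)
    return mu, sigma, ll

random.seed(42)
sample = [random.lognormvariate(0.0, 1.0) for _ in range(2000)]

alpha, ll_pow = powerlaw_loglik(sample, xmin=min(sample))
mu, sigma, ll_ln = lognormal_loglik(sample)
print(ll_ln > ll_pow)  # True: the lognormal fits its own data better
```

The deeper point stands: either family can "fit" heavy-tailed data over a limited range, so a generative model plus validation, not a likelihood contest, is what should carry the argument.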

What I actually want to discuss is the hype. Going back to the beginning of the post: how can we keep science from generating such hype around particular memes? People like Barabasi are capable of captivating the imagination of a broad audience and help bring society closer to science, but usually at some cost. I think this is tied to science funding. What gets funded is what is perceived as the cutting edge, the trendy subjects. Trendy things get a lot of funding and more visibility, until the whole thing crashes down under the weight of all the noise in the field.

In a brilliant paper (the one about the radio :) Lazebnik recalls some advice from David Papermaster:
"David said that every field he witnessed during his decades in biological research developed quite similarly. At the first stage, a small number of scientists would somewhat leisurely discuss a problem that would appear esoteric to others (...) Then, an unexpected observation (...) makes many realize that the previously mysterious process can be dissected with available tools and, importantly, that this effort may result in a miracle drug. At once, the field is converted into a Klondike gold rush with all the characteristic dynamics, mentality, and morals. (...) The assumed proximity of this imaginary nugget easily attracts both financial and human resources, which results in a rapid expansion of the field. The understanding of the biological process increases accordingly and results in crystal clear models that often explain everything and point at targets for future miracle drugs.(...) At some point, David said, the field reaches a stage at which models, that seemed so complete, fall apart, predictions that were considered so obvious are found to be wrong, and attempts to develop wonder drugs largely fail. (...) In other words, the field hits the wall, even though the intensity of research remains unabated for a while, resulting in thousands of publications, many of which are contradictory or largely descriptive."

Is this necessary ? Is there something about the way science is made that leads to this ? Can we change it?

Thursday, January 12, 2006

European Research Council (ERC)

For those of you who don't usually read about European research policy: the European Research Council is a projected European structure being designed to support basic research. It is now clear that the ERC will be formed, but it is still unknown how much money the EU budget will reserve for it. Recently the Scientific Council of the future ERC was nominated, and the chairman is none other than Fotis Kafatos, the former EMBL director. Kafatos' term as EMBL director ended in May last year, and his nomination as chairman of the ERC will, in my opinion, strengthen the research council and hopefully help it attract the funding required.

For further reading:
Kafatos named Chairman of ERC Council (EMBL announcement)
Chairman explains Europe's research council (interview for Nature)
Election of Chairman of Scientific Council (press release hidden among several other)

Saturday, December 10, 2005

Linking out (blogs@nature && workflows)

Rolf Apweiler called bloggers exhibitionists in a recent news special in Nature: "I have my doubts that blogging reduces information overload, but blogging will survive as it appeals to all the exhibitionists." I hope this simplistic opinion is supported by more reasoning that was left out of the news piece for lack of space. Blogging appeals because of the easy creation of content; it makes it easier for people to have a voice. What gets people's attention is how good (or bad) the content is, not particular connections or any other bias. This makes blogs one of the most democratic content media I am aware of (compare with newspapers, radio, TV). Discussion in Notes from the Biomass

Check out some interesting posts on workflows at Hublog and Flags and Lollipops
Back to roots

I like bioinformatics because it is so useful for pointing out the next useful experiments and for helping to extract knowledge from your data. This is why I think it is possible and useful to do experimental work alongside computational work.
I have spent the last week back at the bench doing some biochemistry. I usually don't do much bench work, although I have a biochemistry degree. It is not easy at the moment to keep up with my computational work while doing the lab work, but I want, before the end of my PhD, to find a way to keep doing both at the same time. I should divide my time between the two mindsets, but I am not sure of the best way.
Any ideas ?

Wednesday, November 30, 2005

Firefox 1.5

A quick post to promote the release of a new version of Firefox. If you already have it, go get it here. If you don't have it yet, give it a try: it takes one or two minutes to install and has nice advantages compared to some other popular browsers (just one example off the top of my head ... it is better than Internet Explorer :) ).
There are going to be some potentially funny films to see at Spreadfirefox.com.

I am still playing around with it, but the first surprise was immediate: it is much quicker to move between tabs. You can now re-order tabs by drag and drop. New features are listed here.

Monday, November 28, 2005

Meta Blogging

If for some strange reason you are searching for blogs to read, allow me to make a suggestion. Via the NYTimes I found this site called TravelBlog, for people blogging while traveling. From the site: "Travel Blog is a collection of travel journals, diaries, stories and photos from all around the world, ordinary people doing extraordinary things. For travellers, this site includes lots of features that help keep family and friends back home up to date with your adventure."

I would not put any of these among my usual reads, but maybe I will check back on this page before my next long holidays ... umm ... sometime after I finish my PhD.

Sunday, November 27, 2005

SyntheticBiology@Nature.com

This week Nature has a special issue on <buzz>Synthetic Biology</buzz>. I currently have a kind of love/hate relationship with trends in biology. It is easy to track the trends (in the recent past: genomics, proteomics, bioinformatics, systems biology, nanotechnology, synthetic biology) and it is somehow fascinating to follow them and watch them propagate. It holds for me a fascination similar to seeing a meme propagate on the web. Someone will one day write a thesis on how a kid was able to put up a webpage like this one and make a truckload of money selling pixels, just because he ignited the curiosity of people on a global scale.
There is always a reason behind each rising trend in biology, but they are clearly too short-lived to deliver on their expectations, so what is the point ? Why do these waves of buzz exist in research ? The mentality of engineering in biology is not new, so why the recent interest in synthetic biology ?
I am too young to know whether it has always been like this, but I am inclined to think that this is just the product of increasing competition for resources (grant applications). Every once in a while scientists have to re-invent the pressing reasons why society has to invest in them: the big projects that will galvanize the masses, the next genome project.

I personally like the engineering approach to biology. Much of the work done in the lab where I am doing my PhD is engineering-oriented. Synthetic biology (or whatever it was called in the past and will be called in the future) could deliver things like cheap energy (biological solar panels), cheaper chemicals (optimized production systems), cheap food (GMOs or some even weirder tissue cultures), clean water, improved immune systems, etc. A quick look at the two reviews in this week's issue of Nature will tell you that we are still far from all of this.

The review by David Sprinzak and Michael Elowitz tries to cover broadly what has been achieved in engineering biological systems over the last couple of years (references range from 2000 to 2005). Apart from the reference to a paper on engineering a mevalonate pathway in Escherichia coli, most of the work done in the field so far is preliminary. People have been trying to assemble simple systems, and they end up learning new things along the way.

The second review is authored by Drew Endy and is basically synthetic biology evangelism :). Drew Endy has been one of the loudest voices in support of this field and in calling for standardization and open exchange of information and materials (some notes from the biomass). The only new thing he says in this review that I had not heard from him before is a short paragraph on evolution. We are used to engineering things that do not replicate (cars, computers, TV sets, etc), and the field will have to start thinking about the consequences of evolution in the systems it tinkers with. Are the systems sustainable ? Will they change within their useful lifetime ?

There is one accompanying research paper reporting on a chimeric light-sensing protein that is de-phosphorylated in the presence of red light. The bacteria produce lacZ in the dark, and production decreases with increasing amounts of red light. You can make funny pictures with these bacteria, but as for the real scientific value of the discovery I can link to two comments on Slashdot. Maybe that is exaggerated: making chimeric protein receptors that work can be tricky, and it is very nice that something started by college students can end up in a Nature paper.

Last but not least, there is a comic ! The fantastic "Adventures in Synthetic Biology". Ok, here is where I draw the line :) Who is this for ? Since when do teens read Nature ? How would they have access to this ? I like comics, I do ... but this is clearly not properly targeted.

Monday, November 21, 2005

BIND database runs out of funding

I only noticed today that BIND has run out of funding. They say so on the home page, and there are links to several papers on the issue of sustainable database funding (as of 16 November 2005).

From the frontpage of BIND:
"Finally, I would like to reiterate my conviction that public databases are essential requirements for the future of life sciences research. The question arises will these be free or will they require a subscription. Should BIND/Blueprint be sustained as a public-funded open-access database and service provider? "

I am not actually sure what would be a good way out for BIND. They could try to charge for institutional access, like Faculty1000 or ISI. The other possibility would be to try to secure support from a place like the NCBI or the EBI. The problem is that several other databases do the same thing (MINT, DIP, GRID, IntAct, etc), so why should we pay for this service ? Why don't the protein-interaction databases merge, for example ? I know they agreed to share data in the same format, so maybe there is not enough room for so many different databases developing new tools. The question is then probably more about the curation effort. Who should pay for curation ? The users of the databases ? The major institutions ? The journals (they could at least force authors to submit interaction data directly) ?

There is also a link to the blog of Christopher Hogue, called BioImplement, where he expresses his views on the problem.

Saturday, November 19, 2005

Google Base simple tricks

I was playing with GBase, just browsing for content, and I noticed that when you search for content that already has a lot of entries, you can restrict the outcome of the search very much like you would in a structured database. For example, when you look for jobs you notice that at the top you have "Refine your search"; if you click, say, "job type" and select "permanent", you get something like all jobs where job type is permanent. It is all up there in the URL, so it is very simple to mess around until you can guess what most of those things are doing.

From this:
http://base.google.com/base/search?q=jobs&a_r=1&nd=0&scoring=r&us=0&a_n194=job+type&a_y194=1&a_s194=0&a_o194=0&a_v194=permanent&a_v194=

You really just need:
http://base.google.com/base/search?a_n194=job+type&a_y194=1&a_o194=0&a_v194=permanent
to get the same effect. Basically this gets all entries where "job type" equals "permanent". The 194 is not even important, as long as the number is the same in all of the variables.
So this also gives the same:
http://base.google.com/base/search?a_n1=job+type&a_y1=1&a_o1=0&a_v1=permanent
a_n[identifier]=NAME
a_v[identifier]=VALUE
a_y[identifier]= ? (I think it is a boolean of some sort)
a_o[identifier]= how to evaluate the value 0=equal 1=less than 2=greater than

You can chain constructions like this to get an AND query, but so far I have not found an equivalent for OR. This is almost good enough to work with.

So all protein sequences from S.cerevisiae would be:
http://base.google.com/base/search?a_n1=sequence+type&a_y1=1&a_o1=0&a_v1=protein&a_n2=species&a_y2=1&a_o2=0&a_v2=s.cerevisiae
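If you want to script queries like the ones above, the parameter scheme can be wrapped in a tiny helper. This is just a sketch in Python; the a_y/a_o semantics are the guesses made above, and Google could change these undocumented parameters at any time:

```python
from urllib.parse import urlencode


def gbase_search_url(filters):
    """Build a Google Base search URL from (attribute, value) pairs.

    Each pair becomes one numbered a_n/a_y/a_o/a_v group; groups are
    combined with AND. a_o=0 means 'equal' (1 = less than,
    2 = greater than, as observed above), and a_y=1 is the boolean
    flag the refine-search links always set.
    """
    params = []
    for i, (name, value) in enumerate(filters, start=1):
        params += [
            (f"a_n{i}", name),
            (f"a_y{i}", 1),
            (f"a_o{i}", 0),
            (f"a_v{i}", value),
        ]
    return "http://base.google.com/base/search?" + urlencode(params)


# Same query as the URL above: protein sequences from S. cerevisiae
url = gbase_search_url([("sequence type", "protein"),
                        ("species", "s.cerevisiae")])
```

The identifier numbers just have to match within each group, which is why the helper can simply count up from 1.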

Thursday, November 17, 2005

Google Base and Bioinformatics II

The Google Base service is officially open, in beta (as usual). It is mostly disappointing because you cannot really do anything with it (read the previous post). You can load tons of data very rapidly, although they take a long time to process bulk uploads. Maybe this will speed up in the future. The problem is that once your structured data is in Google Base you cannot do anything with it apart from searching and looking at it in the browser. I uploaded a couple of protein sequences just for fun. I called the item type "biological sequence" and gave it very simple attributes like sequence, id and type. The upload failed because I did not have a title, so I added a title and just copied the id field. Not very exciting, right?

I guess you can scrape the data off it automatically, but that is not very nice. This, for example, gets the object ids for the biological sequences I uploaded:

use strict;
use warnings;
use LWP::UserAgent;

my $url = "http://base.google.com/base/search?q=biological+sequence";
my $ua  = LWP::UserAgent->new();
my $res = $ua->get($url);
die "Request failed: ", $res->status_line, "\n" unless $res->is_success;

# pull the object ids (oid=...) out of the links in the result page
my %data;
for my $line (split /\n/, $res->content) {
    $data{$1} = $2 if $line =~ /oid=([0-9]+)">(\S+)</;
}
print "$_\n" for keys %data;

With the object ids you can then do the same to get the sequences.

Anyway, everybody is half expecting that one day Google will release an API to do this properly. So, coming back to scientific research, is this useful for anything? Even with a proper API this is just a database. It will make it easy for people to rapidly set up a database, and maybe Google can make a simple template webpage service to display the content of the structured database. It would be a nice add-on to Blogger, for example: you could get a tile to put in your blog with an easy way to display the content of your structured database.

For virtual online collaborative research (aka science 2.0 :)?) this is potentially useful because you get a free tool to set up a database for a given project. Apart from that I don't see potential applications but, like the name says, it is just the base for something.

Monday, November 14, 2005

The Human Puppet

One of the current trends on our changing internet is the phenomenon of "collective intelligence" (web 2.0 buzz), where the rise and ease of individual participation can result in amazing collective efforts. The usual example of collective intelligence is the success of Wikipedia, but more examples are sure to follow.
This sets the grounds for a possibly strange scenario, in a kind of sci-fi "what if" game. What if a human being decided that he or she did not want to decide anymore? (funny paradox :) - "I'll be a vessel for the collective intelligence of the web, I'll be the human puppet." Taken to the extreme, this someone would walk around with a webcam and easy tools to constantly interact with the web. The ultimate Big Brother, but voluntary. The masses on the web would constantly discuss and decide the life of the puppet. This someone would benefit from the knowledge and experience of a huge group of people and could, in theory, really stand on the shoulders of giants.

Of course this is an extreme scenario that might not come to pass, but sci-fi is useful for thinking through the possible consequences of a trend. Lighter versions of this scenario probably occur already in the blogosphere, when people talk online about their daily lives and receive counsel from anonymous people.

Would someone ever give up their individuality to be directed by a collective intelligence ? Would a group of people be attracted by the chance of directing someone's life ?

Thursday, November 10, 2005

In the latest issue of Current Biology there is a short two-page interview (sub-only) with Ronald Plasterk, current director of the Hubrecht Laboratory in Utrecht.
He had some very funny things to say about systems biology:
"The fundamental misconception of systems biology advocates is that one could create a virtual cell, and use big computers to model life and make discoveries. None of these modellers ever predicted that small microRNAs would play a role. One makes discoveries by watching, working, checking. They want to be Darwin, but do not want to waste years on the Beagle. They want sex but no love, icing but no cake. Scientific pornography."

I had a great laugh with this one :). However, I happen to work in a lab that is making software to do exactly this, and I disagree with the analogy. Of course you cannot use a model to discover something about biological mechanisms we know nothing about, but modeling approaches can certainly help guide experimental work. If your model fails to explain an observation, you can use the model to guide your next experiment. You then go on perfecting your model based on the results, and so on. These cycles are not much different from what biologists have been doing intuitively, but I think few people would disagree that formalizing this process with the help of computational tools is a good idea.

Sunday, November 06, 2005

The internet strategies of scientific journals

After a post on Nodalpoint about Nature's podcast I was left thinking a bit about how the well-known science journals have responded to the growth of internet usage and to changes in the available technologies.
I took a quick look at the publishing houses behind Nature (Nature Publishing Group), Cell (Cell Press), Science (AAAS), PLoS and the BMC journals. There are many more publishers, but these are sufficient to make the point.
What is the first impression? Only a fraction of these have a portal attitude (mostly Nature and the BMC journals), with content on the front page and gateways to specialized content. The rest have almost no real content apart from links to their respective journals.
What if we dig further? Well, they all have an RSS feed for their content. Funnily enough, almost all of them have a jobs listing (except PLoS), and almost all have a list of most-accessed articles (except Science).
Only Science and Nature produce news content for the general public, which helps attract people other than researchers to their sites. The equivalent at BMC would be the content of The Scientist that they carry on the site, and at PLoS the synopses that come with all papers.
How many allow comments? Only the two most recent publishers (BMC and PLoS), plus Science with its online comments; PLoS is a bit more formal about it.
Then it comes down to particular content and services. BMC has several potentially interesting services, like the Peoples Archive, images MD and Primers in Biology. Then there is Nature, with Connotea, the Nature podcast, Nature products and Nature events.

So what is the point? In the tech world it was first all about portals and creating content to keep people coming back. Nowadays it seems to be more about free services, and very few of these publishers are following that trend. Good services build brand and attract viewers.
The simple conclusion is that only Nature and BMC are building up their sites and playing with new services the way a tech company would, and although the impact right now is minimal, these sites will have a head start when researchers begin using more online services.

Thursday, November 03, 2005

Recent reads - two useful applications of bioinformatics

Is bioinformatics actually producing any useful tools or discovering anything new? I would like to think so :). Here is a table from The Scientist showing the top ten cited papers of the last two years, the last ten years and of all time. Blast and Clustal are among the top ten cited papers of the last ten years, and MFold is within the top ten cited papers of the last two years.

Keeping with the spirit of practical applications of computational biology, here are two recent papers I read.

One is about the computational design of ribozymes. The authors computationally designed ribozymes that perform different logical functions. For example, they were able to design an AND ribozyme that self-cleaves only in the presence of two particular oligos, and they experimentally validated the designs in vitro. These ribozymes can be combined into more complicated circuits and could ultimately be used inside cells to interfere with networks in a rational manner, or maybe to act as sensors, etc. They do not discuss how applicable these results are for in vivo studies, since ion content, pH and many other factors cannot be controlled in the same way.
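As a toy abstraction (nothing to do with the actual sequence designs in the paper), you can picture these logic-gate ribozymes as composable boolean functions, where "cleaved" stands for the output signal:

```python
def and_ribozyme(oligo_a, oligo_b):
    """Toy model: self-cleaves (True) only when both input oligos are present."""
    return oligo_a and oligo_b


def or_ribozyme(oligo_a, oligo_b):
    """Toy model of an OR gate: cleaves when either input oligo is present."""
    return oligo_a or oligo_b


def circuit(a, b, c):
    """Combining gates into a slightly bigger circuit: (a AND b) OR c."""
    return or_ribozyme(and_ribozyme(a, b), c)
```

The point of the composition is the one made above: once you can design individual gates, wiring them together gives arbitrarily more complicated circuits.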

Another interesting paper is about predicting the specificity of protein-DNA binding using structural models. The authors developed a model for the free energy of protein-DNA interactions. With this model they could calculate the binding energy for structures of proteins bound to DNA, and for any such complex after changing the bases at the DNA positions in contact with the protein. This results in a position specific scoring matrix that tells us the preferred nucleotide at each position for a particular DNA-binding protein domain.
The protein-DNA interaction module is incorporated into the ROSETTA package. The authors provide all the experimental datasets used in the supplementary material, which other people can use to compare with other methods. The lab I am working in has a similar software package called Fold-X.
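Once you have a position specific scoring matrix, using it is straightforward. Here is a minimal sketch, with a completely made-up matrix for a hypothetical 4-base site (not output from ROSETTA or Fold-X), scoring a candidate site as the sum of log-odds against a uniform background:

```python
import math

# Hypothetical PSSM: one dict of base frequencies per position of the site.
pssm = [
    {"A": 0.90, "C": 0.02, "G": 0.05, "T": 0.03},
    {"A": 0.10, "C": 0.70, "G": 0.10, "T": 0.10},
    {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    {"A": 0.05, "C": 0.05, "G": 0.85, "T": 0.05},
]


def log_odds_score(site, pssm, background=0.25):
    """Sum log2(frequency / background) for the observed base at each position.

    High scores mean the site matches the domain's preferences;
    scores near or below zero mean it looks like background.
    """
    return sum(math.log2(pssm[i][base] / background)
               for i, base in enumerate(site))
```

Scanning a genome sequence for binding sites then just means sliding a window along it and keeping the high-scoring positions.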

Assuming that the structural coverage of biological parts keeps growing at its current rate, these structure-based methods will become even more useful, since one can in principle apply them by modeling the domain of interest by homology.

Tuesday, November 01, 2005

Our collective mind

As I sit here quietly blogging my thoughts away, you are there listening. One click away and I share this with the world. Millions of clicks sharing their feelings, showing what they are seeing, calling out for attention, collectively understanding the world. Amazing conversations are being automatically tracked around the whole world, and we can participate. People are thinking that one day we will see emergent properties in the web. Something like it becoming alive. What do you mean... one day? One click more and another neuron fires, another pulse in the live wires connecting us all. We are just waking up.