Wednesday, May 31, 2006

Bringing democracy to the net

Democracy is most often thought of as standing in opposition to totalitarianism. In this case I mean democracy in opposition to anarchy. Some people are raising their voices against the trend of collective intelligence/wisdom of the crowds that has been the hype of the net for the past few years. Wikipedia is the crown jewel of this trend of empowering people for the common good, and probably as a result of the project's visibility it has been the one to take the heat from the backlash.

Is Wikipedia dead, as Nicholas Carr suggests in his blog? His provocative title was flame bait, but he does call attention to some interesting things happening at Wikipedia. Wikipedia is not dead; it is just changing. It has to change to cope with the increase in visibility and vandalism, and to deal with situations where no real consensus is possible.
The system is evolving by restricting anonymous posting and allowing editors to apply temporary editing restrictions to some pages. It is becoming more bureaucratic in nature, with mechanisms to deal with disputes and discord. What Nicholas Carr said is dead is the ideal that anyone can edit anything in Wikipedia, and I would say this is actually good news.

Following his post on the death of Wikipedia, Carr points to an essay by Jaron Lanier entitled Digital Maoism. It is a bit long, but I highly recommend it.
Some quotes from the text:
"Every authentic example of collective intelligence that I am aware of also shows how that collective was guided or inspired by well-meaning individuals. These people focused the collective and in some cases also corrected for some of the common hive mind failure modes. The balancing of influence between people and collectives is the heart of the design of democracies, scientific communities, and many other long-standing projects. "


Sites like Wikipedia are important online experiments. They are trying to develop the tools that allow useful work to come out of millions of very small contributions. I think this will have to go through some form of representative democracy. We still have to work on ways to establish the governing body in these internet systems; essentially, deciding in whom we place our trust for a particular task or realm of knowledge. For this we will need better ways to define identity online and to establish trust relationships.

Further reading:
Wiki-truth

Friday, May 26, 2006

The Human Puppet (2)

In November I rambled about a possible sci-fi scenario: a person giving up their will to be directed by the masses on the internet. A vessel for the "collective intelligence". A voluntary and extreme reality show.

Well, there goes the sci-fi: you can participate in it in about 19 days. Via TechCrunch I found this site:

Kieran Vogel will make Internet television history when he becomes the first person to give total control of his life to the Internet.
(...)
Through an interactive media platform Kieran will live by the decisions the internet decides such as:

# What time he wakes up
# What he wears
# What he eats
# Who he dates
# What he watches


I get a visceral negative response to this. Although this is just a reality show and it is all going to happen inside a house, I think it will be important to keep in mind. In the future, technology will make the web even more pervasive than today, and there are scenarios along the lines of this human puppet idea that could have negative consequences.
I guess what I am thinking is that the same technologies that help us collaborate can also be used to control (sounds a bit obvious). In the end the only difference is in how much the people involved want to (or can) exercise their willpower.

Thursday, May 25, 2006

Using viral memes to request computer time

Every time we direct the browser somewhere, dedicating our attention, some computer processing time is used to display the page. This includes a lot of client-side processing, like all the JavaScript behind that nice-looking AJAX stuff. What if we could harvest some of this processing power to solve very small tasks, something like grid computing?
How would this work? There could be a video server that would allow me to put a video on my blog (like Google Video), or a simple game, or whatever people would enjoy spending a little time on. During this time a package would be downloaded from the same server, some processing done on the client side, and a result sent back. If the video/game/whatever goes viral, it spreads all over the blogs, and every person dedicating their attention to it contributes computer power to solve some task. Maybe this could work as an alternative to advertising? Good content would be traded for computer power. For comparison, Sun is selling computer power in the US for 1 dollar an hour. Of course this type of very small-scale grid processing would be worth much less.
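A rough sketch of what the client side could look like. Everything here is made up for illustration: the work-unit format, the toy divisor-search task, and the idea that the downloaded "package" defines the task while the visitor is entertained.

```javascript
// Sketch of the client-side piece: a work unit describes a small,
// self-contained task. Here it is a search for divisors of a target
// number in a given range, the kind of embarrassingly parallel job
// a grid would hand out in small pieces.
function processWorkUnit(unit) {
  const factors = [];
  for (let d = unit.start; d <= unit.end; d++) {
    if (unit.target % d === 0) factors.push(d);
  }
  return { id: unit.id, factors };
}

// In a real page this would run alongside the video or game:
// fetch a unit from the server, compute, POST the result back.
// (Endpoints are hypothetical and omitted here.)
const result = processWorkUnit({ id: "unit-1", target: 91, start: 2, end: 100 });
console.log(result.factors); // [7, 13, 91]
```

The interesting design question is sizing the units so that the computation finishes within the attention span of a typical visitor.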


Wednesday, May 24, 2006

Conference blogging and SB2.0

In case you missed the Synthetic Biology 2.0 meeting and want a quick summary of what happened there, you can take a look at some blogs. There were at least four bloggers at the conference. Oliver Morton (chief news and features editor of Nature) has a series of posts in his old blog. Rob Carlson described how he and Drew Endy were calling the field intentional biology. Alex Mallet from Drew Endy's lab has a quick summary of the meeting, and finally Mackenzie's cis-action has by far the best coverage, with lots more to read.

I hope they put the recorded talks up on the site, since I missed a lot of interesting things during the live webcast.

On the third day of the meeting (which was not available in the live webcast) there was a discussion about possible self-regulation in the field (as at the 1975 Asilomar meeting). According to an article in NewScientist, the attending researchers decided against self-regulation measures.


Saturday, May 20, 2006

Synthetic Biology & best practices

There is a Synthetic Biology conference going on in Berkeley (webcast here) and they are going to talk about best practices on one of the days. There is a document online with an outline of some of the subjects up for discussion. In reaction to this, a group of organizations published an open letter to the people attending the meeting.
From the text:
We are writing to express our deep concerns about the rapidly developing field of Synthetic Biology that is attempting to create novel life forms and artificial living systems. We believe that this potentially powerful technology is being developed without proper societal debate concerning socio-economic, security, health, environmental and human rights implications. We are alarmed that synthetic biologists meeting this weekend intend to vote on a scheme of voluntary self-regulation without consulting or involving broader social groups. We urge you to withdraw these self-governance proposals and participate in a process of open and inclusive oversight of this technology.

Forms of self-regulation are not incompatible with open discussion with the broader society, nor with state regulation. Do we even need regulation at this point?


The internet and the study of human intelligence

I started reading a book on machine learning methods last night and my mind floated away to thinking about the internet and artificial intelligence (yes, the book is a bit boring :).
Anyway, one thing I thought about was how the internet might become (or already is) a very good place to study (human) intelligence. Some people are very transparent on the net, and if anything the trend is for people to start sharing their lives, or at least their view of the world, earlier. So it is possible to get an idea of what someone is exposed to: what they read, the films they see, some of their life experiences, etc. In some sense you can access someone's input in life.
On the other hand you can also read this person's opinions when presented with some content. Person X with known past experiences Y was exposed to Z and reacted in this way. With this information we could probably learn a lot about human thought processes.


A little bit of this a little bit of that ...

What do you get when you mix humans/sex/religion/evolution? A big media hype.
Also, given that a big portion of the scientists currently blogging work on evolution, you also get a lot of buzzing in the science blogosphere. No wonder this paper reached the top spot on Postgenomic.

This one is a very good example of the usefulness of blogs and of why we should really promote more science communication online. The paper was released as an advance online publication, and just days later you can already read a lot of opinions about it. It is not just the blog entries but also all the comments on these blog posts. As a result we get not only the results and discussion from the paper but also the opinions of whoever decided to participate in the discussion.

Wednesday, May 17, 2006

Postgenomic greasemonkey script (2)

I have posted the Postgenomic script I mentioned in the previous post on the Nodalpoint wiki page. There are some instructions there on how to get it running. If you have problems or suggestions, leave a comment here or in the forum at Nodalpoint. Right now it is only set to work with the Nature journals, but it should be easy to extend to more.


Saturday, May 13, 2006

Postgenomics script for Firefox

I am playing around with Greasemonkey to try to add links to Postgenomic to journal websites. The basic idea is to search the webpage you are viewing (a Nature website, for example) for papers that have been talked about in blogs and are tracked by Postgenomic. When one is found, a little picture is added with a link to the Postgenomic page talking about the paper.
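The core of such a script is just scanning the page for paper identifiers and building lookup links. A minimal sketch: the DOI regex is simplified, the Postgenomic URL pattern is a guess for illustration, and the real userscript would additionally walk the DOM to insert the little picture next to each hit.

```javascript
// Find candidate DOIs in a page's HTML. This pattern is a common
// simplification and will not catch every valid DOI form.
function findDois(html) {
  const doiPattern = /10\.\d{4,9}\/[^\s"'<>]+/g;
  return Array.from(new Set(html.match(doiPattern) || []));
}

// Build a link to the Postgenomic entry for a paper.
// The URL pattern here is hypothetical, for illustration only.
function postgenomicLink(doi) {
  return "http://postgenomic.com/paper.php?doi=" + encodeURIComponent(doi);
}

// In the Greasemonkey version the input would be document.body.innerHTML.
const sample = '<a href="http://dx.doi.org/10.1038/nature04789">a paper</a>';
console.log(findDois(sample)); // ["10.1038/nature04789"]
```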
The result is something like this (the original post showed three screenshots: a journal table of contents, the page of a paper itself, and a page from another journal).
I am more comfortable with Perl, but I think it works as a proof of principle anyway. If Stew agrees I'll probably post the script on Nodalpoint for people to improve or just try out.

Thursday, May 11, 2006

Google Trends and Co-Op

There are some new Google services up and running and buzzing around the blogs today. I have only briefly taken a look at them.
Google Trends is like Google Finance for any search trend that you want to analyze. Very useful for someone wanting to waste time instead of doing productive work ;). You can compare the search and news volume for different terms (the original post embedded an example chart).

It gets its data from all Google searches, so it does not really reflect the trends within the scientific community.

The other new tool, out yesterday, is Google Co-op, the start of social search for Google. It looks as obscure as Google Base, so I can again try to make some weird connection to how researchers might use it :). It looks like Google Co-op is a way for users to further personalize their search. Users can subscribe to providers that offer their knowledge/guidance to shape some of the results you see in your search. If you search, for example, for Alzheimer's, you should see at the top of the results some refinements you can make; for example, you can look only at treatment-related results. This is possible because a list of contributors has labeled a lot of content according to some rules.

Anyone can create a directory and start labeling content following an XML schema that describes the "context". So anyone, or (more likely) any group of people, can add metadata to content and have it available in Google. The obvious application for science would be to have metadata on scientific publications available. Getting Connotea and CiteULike data into a Google directory, for example, would be useful. These sites can still go on developing their niche-specific tools, but we could benefit from having a lot of the tagging metadata available in Google.
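For illustration, a Co-op annotation file looks something like this (reconstructed from memory of the documentation, so the label names are made up and the details may differ; check the Co-op docs for the exact schema):

```xml
<Annotations>
  <!-- Label all Nature content so a search refinement can target it -->
  <Annotation about="www.nature.com/*">
    <Label name="scientific_publication"/>
  </Annotation>
</Annotations>
```

A tagging site could generate one `Annotation` entry per bookmarked paper, with the community's tags as labels.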


Wednesday, May 10, 2006

Nature Protocols

Nature is clearly still the most innovative of the publishing houses, in my view. A new website, called Nature Protocols, is up in beta phase:

Nature Protocols is a new online web resource for laboratory protocols. The site, currently in beta phase, will contain high quality, peer-reviewed protocols commissioned by the Nature Protocols Editorial team and will also publish content posted onto the site by the community

They accept different types of content:
* Peer-reviewed protocols
* Protocols related to primary research papers in Nature journals
* Company Protocols and Application notes
* Non peer-reviewed (Community) protocols

There are already several protocol websites out there, so what is the point? For Nature I guess it is obvious. Just like most portal websites, they are creating a very good place to put ads. I am sure that all these protocols will have links to related products and plenty of ads. The second advantage for Nature is the stickiness of the service: more people will come back to the website to look for protocols and stumble onto Nature content, increasing visibility for the journals and their impact.

A little detail is that, as they say above, the protocols from papers published in the Nature journals will be made available on the website. On one hand this sounds great, because the methods sections in papers are usually so small (due to publication restrictions) that they are most of the time incredibly hard to decipher (and usually pushed into supplementary materials). On the other hand, this will further increase the tendency to hide away from the paper the really important parts of the research, the results and how these were obtained (methods), and to show only the subjective interpretations of the authors.
This reminds me of a recent editorial by Gregory A Petsko in Genome Biology (sub only). Here is how he states the problem :) - "The tendency to marginalize the methods is threatening to turn papers in journals like Nature and Science into glorified press releases."

For scientists this will be a very useful resource. Nature has a lot of appeal and will be able to quickly create a lot of really good content by inviting experienced scientists to write up their protocols, full of tips and tricks accumulated over years of experience. This is the easy part for science portals: the content comes free. If somebody went to Yahoo and told them that scientists actually pay scientific journals to please, please show their content, they would probably laugh :). Yahoo/MSN and other web portals have to pay people to create the content on their sites.

web2.0@EMBL

The EMBL Centre for Computational Biology has announced a series of talks on novel concepts and easy-to-use web tools for biologists. So far four sessions are scheduled:

Session 1 - Using new web concepts for more efficient research - an introduction for the less-techy crowd
Time/place: Tue, May 16th, 2006; 14:30; Small Operon

This one, I think, will introduce the concepts around what is called web2.0 and the potential impact these might have for researchers. I am really curious to see how big the "less-techy crowd" will really be :).

The following sessions are a bit more specific, dealing with particular problems we might have in our activities and how some of the recent web technologies can help us deal with them.

Session 2 - Information overflow? Stay tuned with a click (May 23rd, 2006; 14:30)
Session 3 - Tags: simply organize and share links and references with keywords (May 30th, 2006; 14:30)
Session 4 - Stop emailing huge files: How to jointly edit manuscripts and share data (June 6th, 2006; 14:30)
All in the Small Operon, here at EMBL Heidelberg.

I commend the efforts of the EMBL CCB and I hope that a lot of people turn up. Let's see if the open collaborative ideas come up in the discussions. If you are in the neighborhood and are interested, come on by and help with the discussion (map).


Tuesday, April 25, 2006

Engineering a scientific culture

In a commentary in Cell, Gerald Rubin describes Janelia Farm, the new research campus of the Howard Hughes Medical Institute. If you cannot access the commentary, there is a lot of information available on the website such as this flash presentation (oozing with PR talk).

In summary (as I understood it), the objective is to create a collaborative working environment where scientists can explore risky, long-term projects without having to worry about applying for grants and publishing on a very regular basis.
Group leaders at Janelia Farm will:
- have small groups (two to six)
- not be able to apply for outside funding
- still work at the bench

Unless you are really interested in managing resources and all the hassle of applying for grants, this sounds very appealing.

Also, there is no limit on how long a group leader can stay at Janelia Farm, as long as they pass a review every 5 years. This is unlike, for example, here at EMBL, where most people are forced to move on after 9 years (with a review after 5 years).

Since the main objective of Janelia Farm is to work on long-term projects that can have significant impact, the review process will not focus on publications but on more subjective criteria like:
"(1) the ability to define and the willingness to tackle difficult and important problems; (2) originality, creativity, and diligence in the pursuit of solutions to those problems; and (3) contributions to the overall intellectual life of the campus by offering constructive criticism, mentoring, technical advice, and in some cases, collaborations with their colleagues and visiting scientists"

Sounds like a researcher's paradise :) - do the science, we will do the rest for you.
It will be interesting to see in a few years if they manage to create such an environment. The lack of very objective criteria and the absence of a limit on the stay at the campus might lead to some corruption.

Friday, April 21, 2006

Posting data on your blog

From Postgenomic I read this post on science blogs at Science and Politics. Bora Zivkovic describes the different types of science blogging, with several examples. The most interesting part for me was his discussion of posting hypotheses and unpublished data. I was very happy to see that he already has some posts with his own unpublished data and that the discussion about science communication online is coming up in different communities.

His answer to the scoop problem:
But, putting data on a blog is a fast way of getting the data out with a date/time stamp on it. It is a way to scoop the competition. Once the data are published in a real Journal, you can refer back to your blog post and, by doing that, establish your primacy.

There are some problems with this. For example, people hosting their own blogs could forge the dates, so it would be best to have a third party time-stamping the data. Postgenomic would be great for this; there could be another section in the aggregator to track posts with data. Some journals will probably complain about prior publication and decline to publish something already seen in a blog.

The problems with current publishing systems and the agonizing feeling of seeing your hard work published by other people will probably help drive some change in science communication. Blogging data would make science communication more real-time and transparent, hopefully reducing the number of wasted resources and frustrations with overlapping research.

This is a topic I come back to once in a while, so I have mentioned this here before. The stream-like format of the blog makes it hard to keep posting all the relevant links on a topic, so I think from now on I will just link to the last post on the topic, to at least form a connected chain.

Tuesday, April 11, 2006

Stable scientific databases

The explosion of scientific data coming from high-throughput experimental methods has led to the creation of several new databases for biological information (protein structures, genomes, metabolic networks and kinetic rates, expression data, protein interactions, etc). Given that funding is generally granted for a limited time and for defined projects, it is possible to obtain money to start a database project, but it is very difficult to obtain a stable source of funding to sustain a useful database. I have mentioned this before more than once when talking about the funding problems of BIND.
In this issue of The Scientist there is a short white paper entitled "Save our Data!". It details the recommendations of The Plant Genome Database Working Group on the problems currently faced by life science databases.

I emphasize here four points they make:
2. Develop a funding mechanism that would support biological databases for longer cycle times than under current mechanisms.
3. Foster curation as a career path.
6. Separate the technical infrastructure from the human infrastructure. Many automated computational tasks do not require specialized species- or clade-specific knowledge.
7. Standardize data formats and user interfaces.


The first and last points were also discussed in a recent editorial in Nature Biotech.

What was a bit of a surprise for me is their 3rd point, on fostering curation as a career path. Is it really necessary to have professional curators? I am a bit divided between a more conservative approach to data curation, with a team of professional curators, and a wisdom-of-the-crowds approach where tools are given to the communities and they solve the curation problems. I think it would be more efficient to find ways to have the people producing the data curate it into the databases themselves. For this to happen, it has to be really easy and immediate to do. I still think that journals are the only ones capable of enforcing this process.

The 6th point they make is surely important, even if the curation efforts are pushed back to the people producing the data. It is important to make the process of curating the data as automatic and easy as possible.

Friday, April 07, 2006

Retracted scientific work still gets cited

Science has a news focus on scientific misconduct. One study tracked the citations of papers that had already been retracted and found that scientists keep citing them.
Some editors contacted by Science said that they do not have the resources to look up every citation in every paper to help purge the literature of citations to retracted work. In my opinion this is not such a complicated problem. If journals agreed to submit all retractions to a central repository, then citations could very easily be checked against that database and removed. Even with such an automatic system, scientists should have the responsibility of keeping up with the works being retracted in their fields.
Since retractions are publicly announced by the journals, PubMed already has some of this information available. If you search PubMed for "retraction" in the title you can see several of these announcements (not all are retractions). In some cases, when you search for the title of a retracted paper, PubMed shows a link to the retraction, but this is not always the case. All that is needed is for publishing houses to agree on a single format for publishing retractions, and for repositories to make sure all retractions are appended to the original entries for the same publication.
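Once such a central repository exists, the check itself is trivial: flagging citations to retracted work reduces to a set lookup. A sketch, with made-up PMIDs:

```javascript
// Hypothetical central list of retracted papers, keyed by PMID.
const retractedPmids = new Set(["12345678", "23456789"]);

// Given the PMIDs cited in a manuscript, return the ones that point
// to retracted work. A journal could run every submitted reference
// list through a check like this before publication.
function flagRetractedCitations(citedPmids) {
  return citedPmids.filter((pmid) => retractedPmids.has(pmid));
}

const flagged = flagRetractedCitations(["11111111", "23456789", "99999999"]);
console.log(flagged); // ["23456789"]
```

The hard part is not the lookup but getting all publishers to feed one repository in a single format, which is exactly the agreement argued for above.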

Tuesday, April 04, 2006

Viral marketing gone wrong

The social internet has emerged as an ideal ground for marketing. People enjoy spreading news, and on the internet meme spreading sometimes resembles a viral infection propagating through the network.
Some companies, like Google, have built their success on this type of word-of-mouth marketing. If you can get a good fraction of the social internet attached to your products in such a way that they want to tell their friends all about them, you don't have to spend money on marketing campaigns.
The important point here is that a fraction of people must be engaged by the meme; they must find it so cool and interesting that they just have to go and tell their friends and infect them with the enthusiasm. How do you do this? That's the hard part, I guess.
So, the marketing geniuses at Chevrolet decided to try their hand at viral marketing. To get people engaged, they decided to have the masses build the ads. We usually like what we build and we want to show it to our friends, so the idea actually does not sound so bad, right?! :) Well, this would have been a fantastic marketing idea, if most people actually had good things to say about the product.

Here is an example of the videos coming out of the campaign (video embedded in the original post).
I worried before that this type of marketing could be a negative consequence of science communication online, but these examples show that directing attention alone is not enough: people will judge what they find and are free to criticize.

Monday, April 03, 2006

The Human interactome project

Marc Vidal has a letter in The Scientist urging scientists and funding agencies to increase efforts to map all human protein interactions. He suggests that different labs work on different parts of the huge search space (around 22000^2 pairs, excluding splice variants) and, of course, that funding agencies give out more money to support the effort. He makes an interesting point when he compares funding for genome projects with interactome mapping. I also think that interactome mapping should be viewed in the same way as genome sequencing, and that the money invested would certainly result in significant progress in basic and medical research.
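Just to put a number on that search space, for roughly 22,000 protein-coding genes:

```javascript
// Size of the human interactome search space for ~22,000 genes,
// ignoring splice variants as in the letter.
const n = 22000;
const orderedPairs = n * n;             // full matrix of ordered pairs
const uniquePairs = (n * (n + 1)) / 2;  // A-B same as B-A, self-interactions included
console.log(orderedPairs); // 484000000
console.log(uniquePairs);  // 242011000
```

So even counting each pair only once, there are over 240 million pairwise tests, which is why splitting the space across labs makes sense.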
The only thing I would add to my own wish list is that some groups would start comparative projects at the same time. Even if it takes longer to complete the human interactome, it would be much more informative to also have a map of the orthologous proteins in a sufficiently close species (like mouse) to compare with. Alternatively, some funding could go specifically to comparative projects studying, for example, the interactomes of different yeasts (it is easy to guess that I would really, really like to have this data for analysis :).


Friday, March 31, 2006

Get ready for the 1st of April storm

Tomorrow is April Fools' Day, and there is a long tradition in the media of putting out jokes on this day. Some years ago this was, for me, almost unnoticeable. I knew that the newscasts on the different TV channels would have at least one spoof story, and maybe I would notice the joke in one or two newspapers if I actually read one that day. These days I get almost everything from the internet, and it no longer comes from just a handful of sources; it comes from tons of media sites, blogs and aggregators. So every year that I am more connected, I notice more how the 1st of April is the day when everybody goes nuts on the web. This year it even started early, as you can see from this goldfish story in The Economist. Maybe spishine's post on quitting blogging was also an example of an early April fools ;).

Tuesday, March 21, 2006

Wiki-Science

From Postgenomic (now on Seed Media Group servers) I picked up this post with some speculations on the future of science. It is a bit long but interesting. It was written by the former editor of Wired magazine, so it is naturally biased toward speculations on technology change.

My favorite prediction is what he called Wiki-Science:

"Wiki-Science - The average number of authors per paper continues to rise. With massive collaborations, the numbers will boom. Experiments involving thousands of investigators collaborating on a "paper" will be commonplace. The paper is ongoing, and never finished. It becomes a trail of edits and experiments posted in real time - an ever evolving "document." Contributions are not assigned. Tools for tracking credit and contributions will be vital. Responsibilities for errors will be hard to pin down. Wiki-science will often be the first word on a new area. Some researchers will specialize in refining ideas first proposed by wiki-science."

I am trying to write a paper right now, and just last week the thought crossed my mind of doing it online in Nodalpoint's wiki pages and inviting some people to help/evaluate/review. However, I am not sure that my boss would agree with the idea, and honestly I am a bit afraid of losing the chance of publishing this work as a first author. Maybe when I get this off my hands I'll try to start an open project on a particular example of network evolution.

Links on topic:
Nodalpoint - projects ; collaborative research post
Science 2.0