Wednesday, January 18, 2006

Power law distributions

Almost every time a lot of hype is built around an idea, there is a general backlash against that very same idea. In technology this happens regularly, and it is maybe due to a snowball effect that leads to abuse. Initially a new concept is proposed that leads to useful new products, and this in turn increases interest and funding (venture capital, etc). In response, several people copy the concept or merely tag their work with the same buzzword to attract interest. Soon enough everyone is doing the same thing and the new concept reaches world fame. At this point it is hard to find actual good products based on the initial idea among all the noise. For a recent tech example, just think of the web2.0 meme. Every startup nowadays releases its projects in beta with some sort of tagging/social/mash-up thing. The backlash is already happening for web2.0.

What about the title?
I had already mentioned a review article about power-law distributions. The author voiced some concern over the exaggerated conclusions researchers are making about the observation of these distributions in complex networks. Is the backlash coming for this hype?

Recently Oliveira and Barabasi published yet another paper on the ubiquity of power laws. This time it was about the correspondence patterns of Darwin and Einstein, where they claim that the time delays for the replies follow a power law. This work is similar to earlier work by Barabasi about email correspondence. Shortly after, a comment was published in Nature suggesting that the data is a better fit for the lognormal distribution, and this generated some discussion on the web. There are also some claims that similar previous work using the same data was not properly cited.

The best summary of the whole issue comes, in my opinion, from Michael Mitzenmacher:
"While the rebuttal suggests the data is a better fit for the lognormal distribution, I am not a big believer in the fit-the-data approach to distinguish these distributions. The Barabasi paper actually suggested a model, which is nice, (...) anyone can come up with a power law model. The challenge is figuring out how to show your model is actually right."
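To make the "fit-the-data approach" a bit more concrete, here is a minimal sketch of what such a comparison could look like: fit both distributions by maximum likelihood and compare the log-likelihoods. Everything in it is an assumption of mine for illustration (the function names, the synthetic "delays", and the choice of taking the smallest observation as the power-law cutoff xmin); it is not the method used in the paper or in the Nature comment.

```python
import math
import random


def fit_power_law(xs):
    """MLE fit of a continuous power law p(x) = ((a-1)/xmin) * (x/xmin)**(-a), x >= xmin.

    Assumption for this sketch: xmin is simply the smallest observation.
    Returns (alpha, log_likelihood).
    """
    xmin = min(xs)
    n = len(xs)
    s = sum(math.log(x / xmin) for x in xs)
    alpha = 1.0 + n / s
    loglik = n * math.log((alpha - 1.0) / xmin) - alpha * s
    return alpha, loglik


def fit_lognormal(xs):
    """MLE fit of a lognormal distribution. Returns (mu, sigma, log_likelihood)."""
    logs = [math.log(x) for x in xs]
    n = len(logs)
    mu = sum(logs) / n
    sigma = math.sqrt(sum((v - mu) ** 2 for v in logs) / n)
    # At the MLE, the quadratic term of the log-likelihood equals n / 2.
    loglik = -sum(logs) - n * math.log(sigma) - 0.5 * n * math.log(2 * math.pi) - 0.5 * n
    return mu, sigma, loglik


def sample_power_law(n, alpha, xmin, rng):
    """Draw n power-law samples by inverse-transform sampling."""
    return [xmin * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical reply delays, drawn here from a lognormal on purpose.
    delays = [rng.lognormvariate(0.0, 1.0) for _ in range(2000)]
    alpha, ll_power = fit_power_law(delays)
    mu, sigma, ll_lognormal = fit_lognormal(delays)
    print("power law:  alpha=%.2f  loglik=%.1f" % (alpha, ll_power))
    print("lognormal:  mu=%.2f sigma=%.2f  loglik=%.1f" % (mu, sigma, ll_lognormal))
```

The higher log-likelihood only says which curve describes this particular sample better. It says nothing about whether the mechanism behind the model is right, which is exactly the point of the quote.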

Other papers have also recently questioned the quality of the data underlying some of these studies. Is life all lognormal after all? :)

What I actually want to discuss is the hype. Going back to the beginning of the post, how can we keep science from generating such hype around particular memes? People like Barabasi are capable of captivating the imagination of a broad audience and help bring society closer to science, but usually at some cost. I think this is tied to science funding. What gets funded is what is perceived as the cutting edge, the trendy subjects. Trendy things get a lot of funding and more visibility until the whole thing crashes down under the weight of all the noise in the field.

In a brilliant paper (the one about a radio :), Lazebnik recalls some advice from David Papermaster:
"David said that every field he witnessed during his decades in biological research developed quite similarly. At the first stage, a small number of scientists would somewhat leisurely discuss a problem that would appear esoteric to others (...) Then, an unexpected observation (...) makes many realize that the previously mysterious process can be dissected with available tools and, importantly, that this effort may result in a miracle drug. At once, the field is converted into a Klondike gold rush with all the characteristic dynamics, mentality, and morals. (...) The assumed proximity of this imaginary nugget easily attracts both financial and human resources, which results in a rapid expansion of the field. The understanding of the biological process increases accordingly and results in crystal clear models that often explain everything and point at targets for future miracle drugs.(...) At some point, David said, the field reaches a stage at which models, that seemed so complete, fall apart, predictions that were considered so obvious are found to be wrong, and attempts to develop wonder drugs largely fail. (...) In other words, the field hits the wall, even though the intensity of research remains unabated for a while, resulting in thousands of publications, many of which are contradictory or largely descriptive."

Is this necessary? Is there something about the way science is done that leads to this? Can we change it?