Web 2.0

A new look at who writes Wikipedia?

In his article Who Writes Wikipedia?, Aaron Swartz takes a new look at the oft-repeated claim by Jimmy Wales, and thus by almost everybody else, that Wikipedia’s content is mainly contributed by a small core group of about 500 people.

Swartz, at Stanford, cleverly unpicks the claim that “the most active 2%, which is 1400 people, have done 73.4% of all the edits” by looking at the actual contributions to some randomly chosen articles and analysing them not by number of edits, but by the content of those edits. In his words:

To investigate more formally, I purchased some time on a computer cluster and downloaded a copy of the Wikipedia archives. I wrote a little program to go through each edit and count how much of it remained in the latest version. Instead of counting edits, as Wales did, I counted the number of letters a user actually contributed to the present article.

If you just count edits, it appears the biggest contributors to the Alan Alda article (7 of the top 10) are registered users who (all but 2) have made thousands of edits to the site. Indeed, #4 has made over 7,000 edits while #7 has over 25,000. In other words, if you use Wales’s methods, you get Wales’s results: most of the content seems to be written by heavy editors.

But when you count letters, the picture dramatically changes: few of the contributors (2 out of the top 10) are even registered and most (6 out of the top 10) have made less than 25 edits to the entire site. In fact, #9 has made exactly one edit — this one! With the more reasonable metric — indeed, the one Wales himself said he planned to use in the next revision of his study — the result completely reverses.

I don’t have the resources to run this calculation across all of Wikipedia (there are over 60 million edits!), but I ran it on several more randomly-selected articles and the results were much the same. For example, the largest portion of the Anaconda article was written by a user who only made 2 edits to it (and only 100 on the entire site). By contrast, the largest number of edits were made by a user who appears to have contributed no text to the final article (the edits were all deleting things and moving things around).
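
For the curious, here is a rough sketch of what such a letter-counting program might look like. It is not Swartz's actual code, which the article does not show. It assumes an article's history is already available as an oldest-first list of (author, text) pairs, and it uses Python's difflib to guess which characters survive each edit. The function name surviving_characters and the toy revisions are made up for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher


def surviving_characters(revisions):
    """Return (edit_counts, char_counts) for one article's history.

    `revisions` is a list of (author, text) pairs, oldest first
    (an assumed input format). `char_counts` credits each character
    of the final text to the author of the revision that introduced it.
    """
    edit_counts = Counter()
    text = ""
    owners = []  # owners[i] is the author credited with text[i]

    for author, new_text in revisions:
        edit_counts[author] += 1
        matcher = SequenceMatcher(None, text, new_text, autojunk=False)
        new_owners = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged characters keep their earlier attribution.
                new_owners.extend(owners[i1:i2])
            else:
                # Inserted or replacement characters go to this author;
                # for a pure deletion j1 == j2, so nothing is added.
                new_owners.extend([author] * (j2 - j1))
        text, owners = new_text, new_owners

    return edit_counts, Counter(owners)


# Toy history (invented): one anonymous writer, one frequent tidier.
history = [
    ("anonymous", "Anacondas are large snakes found in South America."),
    ("tidier", "Anacondas are large snakes found in South America"),
    ("tidier", "Anacondas are large snakes found in South America."),
]
edits, letters = surviving_characters(history)
print("by edits:  ", edits.most_common())
print("by letters:", letters.most_common())
```

Ranking by edits and ranking by letters can come out very differently even on a toy history like this one, which is the whole point of Swartz's argument. A character-level diff is only one way to define "how much remained". Text that is moved around, for instance, gets credited to whoever moved it, so a serious version would need something smarter.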

[UPDATE – more studies on this from Wikimania 2006 (via Ross Mayfield)]