By Alan Haskvitz

The following is from László Kozma, a grad-student at the Helsinki University of Technology. He has created a most entertaining site. It enables the viewer to see who is editing Wikipedia and from where. Here is the site he created:

The section below is from his Frequently Asked Questions section. “WikipediaVision is a visualization of edits to the English Wikipedia, shown at almost the same time as they happen. The idea came after seeing flickrvision and twittervision, both created by David Troy. WikipediaVision, however, was designed and implemented by me alone.

For each Wikipedia edit I display the title of the article, the summary of the edit (if the person who made it gave one), a link to the changes that were made to the article, the geographical location of the Wikipedia user, and the time the edit happened.

How do you know about who edits what?

There is a public page on Wikipedia that is automatically updated and lists all the recent edits, along with the user name or IP address of whoever made each one.

How do you find out the geographical locations?

There are open APIs for translating IP addresses to their corresponding geographical locations. I have used the Google Maps API, and GoNew’s IP to country service. A big thanks to all of them.

Are the locations always accurate?

No. They are only as accurate as the services above. Some locations can be mistaken; for some addresses only the country is found, and for others no location can be found at all.

Are all the edits displayed?

No. First, edits on Wikipedia happen faster than anyone could comfortably read them, so I have to skip some. Second, a good part of the edits are made by registered users, whose IP addresses are protected by Wikipedia, so I can only display anonymous edits. Third, edits whose IP address could not be located are skipped. Fourth, edits that are similar or identical to recent edits are often skipped. This still leaves more than enough to visualize.
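The skipping rules in this answer can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not WikipediaVision's actual code: the record keys "user" and "title", the `locate` callable, and the `max_count` cap are all hypothetical, and "similar or identical" is approximated here by a repeated article title.

```python
import ipaddress


def is_anonymous(user: str) -> bool:
    """Return True if the editor name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(user)
    except ValueError:
        return False
    return True


def select_edits(edits, locate, max_count):
    """Pick the edits worth displaying from one batch of recent changes.

    edits     -- iterable of dicts with hypothetical keys "user" and "title"
    locate    -- callable mapping an IP string to a location, or None if unknown
    max_count -- cap on how many edits to show from this batch
    """
    shown = []
    seen_titles = set()
    for edit in edits:
        if len(shown) >= max_count:
            break  # more edits than can be comfortably read
        if not is_anonymous(edit["user"]):
            continue  # registered user: no IP address available
        if edit["title"] in seen_titles:
            continue  # assumed stand-in for "similar to a recent edit"
        location = locate(edit["user"])
        if location is None:
            continue  # IP address could not be located
        seen_titles.add(edit["title"])
        shown.append({**edit, "location": location})
    return shown
```

Passing the geolocation step in as a callable keeps the filter independent of whichever lookup service is used.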

Is this at least a statistically representative sample of Wikipedia edits (whatever that means)?

No. There are many biases introduced. We only see anonymous edits. We only see edits from IP addresses that could be located. If the location found is very generic (such as European Union), then it is not visualized at all. Hopefully WikipediaVision still captures a general sense of what people are thinking about all over the world.

You say that the visualization is almost real-time. So why do you show stuff several minutes old?

From a technical perspective it wouldn’t be too difficult to show data with only a few seconds of delay. I have to pre-fetch a few edits at once, however, because I am constrained from several sides: my monthly bandwidth with my hosting provider is very low, so I try to avoid overly frequent page requests from users; I am using online geolocation and have a limit on the number of IPs I can geolocate every hour; and I don’t want to make requests to Wikipedia too often, so as not to put unnecessary burden on their servers and risk being banned from the site. In my opinion, introducing a delay of a few minutes doesn’t hurt the experience too much.
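One way to stay within an hourly geolocation limit like the one described here is to wrap the lookup in a small budget with a cache. The sketch below is an assumption-laden illustration, not the site's implementation: the class name `BudgetedLocator`, the `lookup` callable, the fixed one-hour window, and the cache are all hypothetical choices.

```python
import time


class BudgetedLocator:
    """Wrap a geolocation lookup with a cache and an hourly request cap."""

    def __init__(self, lookup, max_per_hour, clock=time.monotonic):
        self.lookup = lookup            # hypothetical callable: IP -> location or None
        self.max_per_hour = max_per_hour
        self.clock = clock              # injectable so the logic can be tested
        self.cache = {}
        self.window_start = clock()
        self.used = 0

    def locate(self, ip):
        if ip in self.cache:
            return self.cache[ip]       # repeat IPs cost no request
        now = self.clock()
        if now - self.window_start >= 3600:
            self.window_start = now     # a new hour: reset the budget
            self.used = 0
        if self.used >= self.max_per_hour:
            return None                 # budget spent: treat the IP as not locatable
        self.used += 1
        result = self.lookup(ip)
        self.cache[ip] = result
        return result
```

When the budget runs out, `locate` returns None, so edits from new IP addresses are simply skipped until the next hour begins, which matches the FAQ's rule that unlocated edits are not shown.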

Can you do a similar visualization for my site?

Contact me and maybe we can work out some solution. If your site is comparable to Wikipedia in how important and interesting it is for the general public, I might consider doing it for fun. Otherwise, I might have to say no, unless you make an offer I cannot refuse. Perhaps in the future I’ll make an open API to let people visualize their own stuff or release the whole thing under an open source license. Considering the amount of free time I have, that could take some time.

Who are you, anyway?

I am László Kozma, a grad student at the Helsinki University of Technology. If you want more info about me or my projects, or want to contact me, check out my page.”