Lecture by Brion Vibber, CTO of Wikipedia

I recently attended a lecture by Brion Vibber, chief technology officer of Wikipedia. The Orlando Java User's Group hosted Brion at its monthly meeting, where he gave a great overview of Wikipedia's technology infrastructure.

No big surprise: Wikipedia runs on a typical LAMP stack (Linux, Apache, MySQL, PHP). They use tools like Squid and memcached to help manage the traffic, but in the end, it's LAMP. Brion explained that while the main servers for Wikipedia are located in Florida, they also have two donated server farms in Seoul and Amsterdam.

Brion mentioned that the great thing about the donated server farms is that they don't have to pay for bandwidth from those servers, so they've put a lot of effort into caching their content as efficiently as possible.
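
As a rough illustration of the kind of caching a tool like memcached enables, here is a minimal cache-aside sketch in Python. It is not Wikipedia's code: the `TTLCache` class is an in-memory stand-in for a real cache server, and `fetch_page` and the `render` callback are hypothetical names made up for this example.

```python
import time


class TTLCache:
    """In-memory stand-in for a cache server such as memcached (illustrative only)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.store = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self.store.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            # Entry is stale: drop it and report a miss.
            del self.store[key]
            return None
        return value

    def set(self, key, value):
        self.store[key] = (time.monotonic() + self.ttl, value)


def fetch_page(title, cache, render):
    """Cache-aside lookup: serve from the cache, or render once and remember the result."""
    page = cache.get(title)
    if page is None:
        page = render(title)  # the expensive path (database query plus templating)
        cache.set(title, page)
    return page
```

The point of the pattern is that only the first request for a page pays the full rendering cost; repeat requests within the time-to-live are answered from memory.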

Perhaps the most impressive (or surprising) fact that Brion mentioned was that the entire text content of the English-language Wikipedia was only about 250 GB. He said some things that led me to believe that the text is stored compressed in MySQL and that this figure includes the complete revision history as well. He made a point of mentioning that this figure doesn't include the countless photos and videos that are attached to the various entries.
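
To get a feel for why compressed text takes up so little room, here is a small Python sketch using the standard `zlib` module. The helper names `pack_revision` and `unpack_revision` are hypothetical, the sample text is invented, and nothing here is meant to describe how Wikipedia actually stores its revisions.

```python
import zlib


def pack_revision(text):
    """Compress one revision's text into bytes suitable for a binary column."""
    return zlib.compress(text.encode("utf-8"), 9)


def unpack_revision(blob):
    """Reverse pack_revision and return the original text."""
    return zlib.decompress(blob).decode("utf-8")


# Invented, highly repetitive sample text; real articles compress less dramatically.
article = "Wikipedia is a free online encyclopedia. " * 200
blob = pack_revision(article)

print(len(article.encode("utf-8")), "bytes raw")
print(len(blob), "bytes compressed")
```

Natural-language text is repetitive enough that general-purpose compression shrinks it considerably, which makes a figure in the hundreds of gigabytes for text plus revision history easier to believe.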

Overall, it was a good talk, very reminiscent of a Drupal performance seminar I attended at the Yahoo! Drupalcon earlier this year. Lots of great information about caching and redundancy.

Submitted by michael on Mon, 10/01/2007 - 10:14am