Every day I wake up happy that one of my favorite websites, Wikipedia, is free, open, and well-supported. My wiki-timeline summarized:
Today, I am still a student of Wikipedia. I use it dozens of times a day, and you'll see it cited throughout anything I write. I also continue to proudly host and administrate MediaWiki, though my extension-writing skills have languished. But this is mostly a story about building on Wikipedia: the story of Hatnote.
Hatnote is an ongoing umbrella project organized around Wikipedia as a social and data platform. Or, as we like to put it, finding new perspectives on wiki life.
I realized, while everyone I met from the foundation was talented and well-intentioned, they simply did not have the resources to push the innovation envelope on Wikipedia. This isn't meant as a controversy-inducing criticism of the WMF; Wikipedia and other Wikimedia projects have always relied much more on the community for regulation and development. To this point, look at the numbers:
Compare this to other top 100 websites worldwide. I hate to put it in such economic terms, but Wikipedia's utility per capita, even counting community members, is off the charts. Besides, when it comes to innovation, is there even such a thing as "enough"?
The reality is that the WMF is a nonprofit formed to steward these sites. They keep the servers up, keep the sites usable, and keep it all above board legally and financially. They're rightfully more focused on increasing accessibility, with campaigns and features like Wikipedia Zero and VisualEditor. Take all of that, add in community organization of chapters and various events, and it's not so surprising that Wikipedia doesn't keep up flashy appearances next to for-profit Silicon Valley neighbors.
So, Stephen and I formed Hatnote to do what we could to promote Wikipedia among the Internet's established power users. To add new types of interaction for new generations of Wikipedia users, to help people remember that Wikipedia is more than just the first, best result on every search site, and to keep it all free.
Most Hatnote projects revolve around editing and other interactive Wikipedia activities. With top.hatnote.com, we turned that around and sought to offer clean and simple insight into the reading habits of Wikipedia's biggest user group: its readers.
Updated daily, the Top 100 is a chart of the most-visited articles on Wikipedia. Nearly 20 billion times per month, around 500 million people read articles in over 200 languages. The Top 100’s daily statistics offer a window into where Wikipedia readers are focusing their attention. It also makes for a great way to discover great chapters of Wikipedia one wouldn’t normally read or edit.
Clear ordering, images, sparklines, and plain-language statistics make the data approachable for casual readers. Structured data feeds, including JSON and RSS, keep the site relevant for developers and power users. Socialites of all skill levels share discoveries via Twitter button integrations on individual tiles, or automatically through an IFTTT recipe based on the RSS feed.
Personally, writing this a couple months after launch, I still visit top.hatnote.com first thing in the morning, often before I get out of bed.
IFTTT (IF This Then That) is a web service for connecting and automating the sites that make up our online ecosystem. Wikipedia didn't have a channel, so with a bit of help from some friends, we built one.
My feelings toward IFTTT are mixed, but because the walls of the Internet corporate gardens keep growing taller, I do use it. And if sites like Facebook and Buzzfeed have a channel, then Wikipedia deserves one, too.
Last I checked there are tens of thousands of daily users on the Wikipedia IFTTT channel, resulting in millions of hits to the web application that services user Recipes. That application runs on Wikimedia Labs and the code is on GitHub.
If you're intrigued and would like to give it a spin, I've written up two guides:
We hit some decent milestones. Over a million requests per day and IFTTT's Top Chef #89 ain't bad:
Wikipedia's community is unlike any other online. Something about the system's radical user inclusionism, combined with a mission to realize the original intent of the Internet, an interconnected knowledgebase that anyone can edit, has attracted people from all walks and corners.
But even a unique community's expectations will evolve as people join from surrounding communities. Toward that end, Wikipedia Social Search adds some familiar functions back to Wikipedia: #hashtags and @mentions. Now if you make an edit to Wikipedia with hashtags or mentions, we parse them out and index them (with this batch job). Perfect for editathons and tracking ad hoc organized editing. This feature also makes an appearance in the IFTTT channel, with the hashtags trigger.
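To make the parsing step concrete, here is a minimal sketch of pulling hashtags and mentions out of an edit's comment text. It is not the actual batch job: the patterns, the `extract_tags` name, and the assumption that tags live in the edit summary are all mine, chosen only for illustration.

```python
import re

# Hypothetical patterns; the real indexer's rules may well differ.
# The lookbehind skips things like "a@b.org", and requiring a leading
# letter skips section anchors like "#1".
HASHTAG_RE = re.compile(r"(?<!\w)#([A-Za-z]\w*)")
MENTION_RE = re.compile(r"(?<!\w)@(\w+)")


def _unique(items):
    """De-duplicate while preserving first-seen order."""
    seen = []
    for item in items:
        if item not in seen:
            seen.append(item)
    return seen


def extract_tags(edit_summary):
    """Return (hashtags, mentions) found in an edit summary.

    Hashtags are lowercased so "#Editathon" and "#editathon" index together;
    mentions are kept as written, since user names can be case-sensitive.
    """
    hashtags = _unique(tag.lower() for tag in HASHTAG_RE.findall(edit_summary))
    mentions = _unique(MENTION_RE.findall(edit_summary))
    return hashtags, mentions
```

Calling `extract_tags("Added sources #editathon with @Sarah")` would give one hashtag and one mention, which is roughly the shape of record an index would need per edit.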
As much as one might like Wikipedia, it moves so quickly that it can be hard to track when major editing events occur. Email digests are a common solution to this problem, and are more relevant than ever. Many social networks have email to thank for retaining active users, who might otherwise forget they have an account (looking at you, LinkedIn and Twitter).
The Weeklypedia is an aptly-named weekly summary of the most edited Wikipedia articles, available in 15+ languages. Skimming an issue only takes a couple minutes and can yield surprising results. The data used to generate The Weeklypedia is also available. Monitoring is achieved with cronfed.
Listen to Wikipedia is Hatnote's most popular project. And while the mobile site worked on the phones we tested, the sheer number of user emails and messages we got proves that native apps have a certain je ne sais quoi for some people.
A play on the common "See Also" heading in Wikipedia articles, See, Also is a virtual gallery of Wikipedia-derived interactions and visualizations. After the success of Recent Changes Map and Listen to Wikipedia, Stephen and I figured we should leverage some of that Hatnote fame to get exposure for other wiki-based projects that we found inspiring. The architecture of See, Also partially inspired chert, the application that renders this page.
"Strangely melodic" and "oddly mesmerizing". If you haven't seen and heard it yet, Listen to Wikipedia is a real-time auralization of Wikipedia growing, one edit at a time.
The site is literally self-explanatory. With around 2 million unique users since 2013, it's been a joy to build and run. In addition to extensive news and blog coverage, Stephen and I have made appearances everywhere from major media outlets like NPR, BBC radio, and French TV to the halls and walls of libraries and museums. It also won a Kantar Information is Beautiful award, in the Interactive Visualization category, and was used to conduct multiple yoga and meditation sessions.
Listen to Wikipedia has remarkable staying power. It has tens of thousands of regular monthly users, and new people discover it every day. To accommodate that, L2W has an uptime over 99% of that of its upstream services. It gets its data from Hatnote's websocket streaming from Wikimon.
Recent Changes Map is a real-time visualization of Wikipedia edits by their city of origin.
Around 10% of edits to Wikipedia are made by unregistered users. No other major site so faithfully puts trust in humanity to build and rebuild more often than destroy. Millions of articles later, Wikipedia stands as a testament.
Registered Wikipedia users are known only by their user alias, so Recent Changes Map uses the IPs of unregistered users to establish an approximate location. The results amazed. So many unlikely pairings: South American interest in American Idol, North American interest in Japanese animation and wrestling, Commonwealth countries' interest in each other, and much less predictable results surprised the Internet and generated quite a buzz.
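The first step, separating IP-attributed edits from registered ones, can be sketched with nothing but the standard library. This is an illustration under assumptions, not the Recent Changes Map code: the function names and the `"user"` key on each change record are hypothetical, and the geolocation lookup itself (which needs an IP-to-location database) is left out.

```python
import ipaddress


def is_unregistered(user):
    """True if a user name parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(user)
    except ValueError:
        return False
    return True


def editor_ips(changes):
    """Collect the IPs of unregistered editors from a list of change records.

    Each record is assumed to be a dict with a "user" key (an assumption
    made for this sketch).
    """
    return [c["user"] for c in changes if is_unregistered(c.get("user", ""))]
```

Only the addresses returned by `editor_ips` would be candidates for a location lookup; edits by registered aliases are simply skipped.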
Wikipedia Open Metrics Platform, or WOMP, was created as a console to fetch, extract, and organize data, using Wapiti, Hatnote's Wikipedia API client. The idea was to build an application you didn't have to be a programmer to use. WOMP was created for and inspired by Adrianne Wadewitz's research into Wikipedia's community dynamics. Development took a break when her data was fetched, and with her passing, is on indefinite hiatus. I hope I get back to working on it someday.
Wikipedia's querying API is one of the richest and most complex available. And what Wikipedia lacks in semantic content, it buries even further with complicated and inconsistent access patterns.
Wapiti is an experimental client which rationalizes these functional-if-confusing APIs into a Python interface with a highly consistent and recombinable API. Wapiti mostly works, but has been in the backseat for a while due to more pressing projects.
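For a taste of the raw access patterns a client has to smooth over, here is a small sketch that builds a recent-changes request against the MediaWiki action API by hand. It does not use or reproduce Wapiti's interface; `recent_changes_url` and its arguments are hypothetical names for this example, and no request is actually sent.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"


def recent_changes_url(limit=50, continue_params=None):
    """Build a URL for one page of recent changes.

    continue_params is the "continue" object from a previous JSON response,
    merged into the next request to fetch the following page.
    """
    params = {
        "action": "query",
        "list": "recentchanges",
        # Pipe-separated lists are how the API takes multi-valued parameters.
        "rcprop": "title|user|comment|timestamp",
        "rclimit": limit,
        "format": "json",
    }
    params.update(continue_params or {})
    return API_URL + "?" + urlencode(params)
```

Every list module has its own parameter prefix (`rc` here) and its own continuation key, which is the kind of inconsistency a wrapper library exists to hide.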
Wikipedia grows quickly, almost 0.1% per week. With 700 new articles per day, older pages can barely keep up. One way Wikipedia experiences growing pains is this:
Let's say there's a "Mars" article, talking about the planet, and all the astronomy articles link there. When Wikipedia's definition of "Mars" grows to encompass the Roman god and chocolate bar, the "Mars" article is replaced with a "disambiguation" page, like this one. Now it might not be clear from reading the article linking to "Mars" which Mars is intended. Imagine that problem, but in the context of names like "John Smith" and so forth.
This is a hard problem for Wikipedians, and we decided to tackle it by gamification[1]. Suffice it to say, Disambiguity was a very challenging, very fun, and very niche game to both play and build. It was featured at Wikimedia's 2012 Maker Faire booth. Stephen has the story, in photos.
[1] We were young! It was 2012!
The project that started it all. Originally proposed and implemented in a two-day Wikimedia Foundation hackathon, the creatively named Qualityvis aimed to solve one of the hardest problems on Wikipedia: finding something to edit.
Originally based on hand-picked heuristics, Qualityvis grew into a full-scale Big Data + machine learning project. We extracted hundreds of dimensions for hundreds of thousands of pages and revisions to establish a baseline quality evolution gradient. We looked at:
Qualityvis won us second place at the WMF hackathon, and ended up on indefinite hiatus, as we stalwartly marched into the bog of automation and repeatability, before being swept into wide-appeal projects like Listen to Wikipedia and RCMap.
Qualityvis taught me a lot about machine learning, the quality of open source (especially Node.js and certain unnamed Python libraries), and project management.
Hatnote has a lot of projects under its collective belt and it's taken a lot of work from a lot of people. Hatnote credits roll:
I hope I'm not missing anyone. Going on three years and a dozen projects, keeping track is tough! But thanks to everyone who's ever helped, whether through code, filing issues, or simple promotion. I really appreciate it.