October 13, 2006

More from UWO

Bill Turkel has a fantastic post about the ways people search for history online. Using search data released by AOL and some statistical methods, Bill has been able to tell us a lot about how ordinary Internet users think about history and what topics interest them most. Clearly this is very important stuff for Found History, and I hope he takes it further. I’d be particularly interested in how the history searches of AOL users compare to those of Google and Yahoo! users, but I suppose (thankfully) that Google and Yahoo! have more respect for their users’ privacy and that this won’t happen anytime soon.

One thing Bill notices is how many searches for “history” relate not to the study of the past, but to the web browser’s record of visited pages and how to delete it. Though Bill’s methods are statistical and mine are anecdotal, this is something I have noticed as well. I do a lot of searching around the web for the pieces of found history I post in this blog, and I often find myself sifting through lots of web pages and blog posts about clearing Internet Explorer’s history files on my way to finding a truly historical nugget.

This suggests a converse research question to the one Bill has asked of his data set. It would be interesting to compare the kinds of history people are searching for with the kinds of history they’re posting about. I suppose you could do this by pulling three months’ worth of feeds for blog posts containing the word “history” (easily done through Bloglines or blogsearch.google.com) and running some similar text mining operations on them. Analyzing how “history” is used in titles could be particularly enlightening in that titles and search terms share a similar descriptive intent. And you could easily ask the same kinds of information distance questions of both.
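To make the title idea a little more concrete, here is a rough sketch of the simplest version: counting which words keep company with “history” in post titles. It is not Bill’s method, only a plain word count under my own assumptions. The function name `title_terms`, the short stopword list, and the sample titles are all made up for illustration; real titles would come from whatever feeds you had pulled.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list; a real study would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "my", "how", "on", "for", "is"}

def title_terms(titles, keyword="history"):
    """Count the words that appear alongside `keyword` in a list of titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and split it into simple word tokens.
        words = re.findall(r"[a-z']+", title.lower())
        if keyword not in words:
            continue  # skip titles that never mention the keyword
        counts.update(w for w in words if w != keyword and w not in STOPWORDS)
    return counts

# Invented sample titles, standing in for three months of pulled feeds.
sample_titles = [
    "A Brief History of the Bicycle",
    "How to Clear Your Browser History",
    "Clear Internet Explorer History",
    "Family History and the Census",
]

counts = title_terms(sample_titles)
for word, n in counts.most_common(5):
    print(word, n)
```

Running the same count over a list of search queries would give two tallies to set side by side, which is about as far as a sketch like this goes.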

Obviously this has me thinking. Many thanks to Bill.
