Who's the Biggest Jerk on the Internet?
Well, there are two huge server-farms running in undisclosed locations,
and a team of mathematicians from the former Soviet Union ... no, no, sorry ...
A Java program uses the Google Search API to search the internet for phrases such as "jerk, name". One tricky part is enumerating all the forms of a person's name (Bush, George Bush, President Bush, President George Bush, Dubya, etc.). Another is avoiding overlapping searches: if we searched for both "X is an idiot" and "George X is an idiot", the phrase "George X is an idiot" on a web page would give two hits. And we have to avoid spurious matches; who knew that the "nut rush" was a plant, not a comment about Rush Limbaugh? And "nut bush" and "Bush-Nut" are a plant and a company; I decided to give up on searching for "nut".

There are some inaccuracies. The program can't distinguish between references to Bush Senior and Bush Junior if they just say "Bush". Does "that @*!^=~ Clinton" refer to Bill or Hillary? The program assumes a reference to "Osama" is to "Osama Bin Laden", and I didn't put in the N different ways to spell "Bin Laden". It doesn't give extra weight to finding the phrase in a web page title or heading, which might be a reasonable thing to do. And Google Search doesn't always return an exact hit count; sometimes it's an estimate.

The counts are undercounts: the program searches only for phrases like "jerk X" and "X is a jerk", so something like "X is a huge jerk" is not counted. The program also can't find images and videos that say who's a jerk. The "winner" in that area seems to be George Bush; for example: on about.com, on typepad.com, on branchez-vous.com, on shirtsweb.com, on bant-shirts.com, on photobucket.com, on thetalentshow.org, on huffingtonpost.com.

I'll run the program every week or so and paste the results into the web page. The web page doesn't have a "live" connection to the data (something like AJAX, for example).
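To make the name-enumeration and overlap problems described above a bit more concrete, here is a minimal Java sketch. It is not the actual program: the class name `JerkPhrases`, its methods, and the rule "drop any phrase that contains another phrase as a whole-word run" are all assumptions made for illustration, and the call to the search API is left out entirely.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not the original program: builds search phrases for
// one person and removes phrases that would double-count the same page text.
class JerkPhrases {

    // Enumerate the forms of a name: last, first+last, title+last,
    // title+first+last, plus any nicknames (e.g. "Dubya").
    static Set<String> nameForms(String title, String first, String last,
                                 List<String> nicknames) {
        Set<String> forms = new LinkedHashSet<>();
        forms.add(last);
        forms.add(first + " " + last);
        forms.add(title + " " + last);
        forms.add(title + " " + first + " " + last);
        forms.addAll(nicknames);
        return forms;
    }

    // Build the two phrase shapes mentioned in the text: "jerk X" and "X is a jerk".
    static List<String> phrases(Set<String> forms, String insult) {
        List<String> all = new ArrayList<>();
        for (String name : forms) {
            all.add(insult + " " + name);
            all.add(name + " is a " + insult);
        }
        return all;
    }

    // Assumed overlap rule: if phrase p contains another phrase q as a run of
    // whole words, every page matching p also matches q, so searching both
    // would count that page twice. Keep q and drop p.
    static List<String> dropOverlapping(List<String> phrases) {
        List<String> kept = new ArrayList<>();
        for (String p : phrases) {
            boolean containsOther = false;
            for (String q : phrases) {
                if (!p.equals(q) && (" " + p + " ").contains(" " + q + " ")) {
                    containsOther = true;
                    break;
                }
            }
            if (!containsOther) {
                kept.add(p);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Set<String> forms =
            nameForms("President", "George", "Bush", Arrays.asList("Dubya"));
        for (String phrase : dropOverlapping(phrases(forms, "jerk"))) {
            System.out.println("\"" + phrase + "\"");
        }
    }
}
```

With these inputs, "George Bush is a jerk" is dropped because any page containing it already matches "Bush is a jerk", while "jerk George Bush" is kept because it does not contain "jerk Bush" as an exact phrase. The comparison here is case-sensitive, which a real version would probably want to relax.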
To learn a bit about the Google Search API, to refresh my knowledge of JavaScript and Java and HTML, and to make some money from displaying advertisements.