Researchers Track Web-Wide Words

BERKELEY — Computer scientist Jon Kleinberg is taking a virtual stroll down the information superhighway, surfing cyberspace for verbal megatrends.

Did you wince?

Those hopelessly passé terms were passably hip just a few years back. Then, because of overuse or a fickle public, they fell out of fashion.

Kleinberg's research is actually scientific. He uses algorithms to identify sudden jumps in the use of words, offering a glimpse into the mechanics of language evolution -- what makes a word hot, or not.

"It's a fun tool to aim at things and see what happens," said Kleinberg, a Cornell University associate professor.

Search engines that scour Web pages for specific words work fairly well, although users still have to weed out a lot of outdated and irrelevant results. Kleinberg's software is different. It looks at data without being given a keyword and reports back on significant topics.

For instance, the program scanned State of the Union addresses going back to 1790 (they are all online) and produced a list of "word bursts," or words that jumped in frequency.

The program found "depression," "banks" and "recovery" on presidential lips in the 1930s. In the late '40s and '50s, "atomic" was the explosive catchword.
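The article does not spell out how the software decides that a word has "burst," so the following is only a rough sketch of the general idea, not Kleinberg's actual algorithm. It assumes a simple heuristic: a word is flagged for a period when its share of that period's words is at least a chosen multiple of its share everywhere else. The function name `word_bursts`, its `ratio` and `min_count` parameters, and the toy "speeches" (made-up strings, not real presidential text) are all hypothetical.

```python
import re
from collections import Counter


def word_bursts(docs_by_period, ratio=2.0, min_count=2):
    """Map each period to the words that are unusually frequent in it.

    A word counts as a burst when its share of the period's words is at
    least `ratio` times its (smoothed) share of all other periods combined.
    This is an illustrative heuristic, not the method used in the research.
    """
    # Word counts per period, using a crude lowercase tokenizer.
    counts = {
        period: Counter(re.findall(r"[a-z']+", text.lower()))
        for period, text in docs_by_period.items()
    }
    total = Counter()
    for period_counts in counts.values():
        total.update(period_counts)
    grand_total = sum(total.values())

    bursts = {}
    for period, period_counts in counts.items():
        n = sum(period_counts.values())
        rest_n = grand_total - n
        hot = []
        for word, k in period_counts.items():
            if k < min_count:
                continue  # ignore words seen only once in the period
            rest_k = total[word] - k
            # Add-one smoothing avoids dividing by zero for words that
            # never appear outside this period.
            rest_rate = (rest_k + 1) / (rest_n + 1)
            if (k / n) / rest_rate >= ratio:
                hot.append(word)
        bursts[period] = sorted(hot)
    return bursts


# Toy input: invented text standing in for decades of speeches.
speeches = {
    "1920s": "trade and farms and trade and roads",
    "1930s": "banks and recovery and banks and depression and recovery and depression",
    "1950s": "atomic power and atomic peace and trade",
}
print(word_bursts(speeches))
```

On this toy input, common words such as "and" are ignored because they are equally frequent everywhere, while the words concentrated in one decade stand out. A real system would need far more text and a sturdier statistical model than a single ratio threshold.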

The speech scan was a test to show that the software could come up with results that correlate to the real world.

The program is intended to look at data about which the searcher has no clue -- say a mountain of unread e-mail or documents -- and divulge a list of what topics were hot and when they started to heat up.
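To give a feel for what "when they started to heat up" could mean in practice, here is a minimal sketch under stated assumptions. It takes a list of counts per period (say, weekly mentions of a topic in a mailbox) and reports the first period that rises well above the recent past. The function `burst_onset`, its `window` and `k` parameters, and the sample numbers are hypothetical, and the moving-average threshold is a stand-in technique, not the one the software is reported to use.

```python
from statistics import mean, pstdev


def burst_onset(counts, window=3, k=2.0):
    """Return the index of the first period that looks like a burst.

    A period qualifies when its count exceeds the mean of the previous
    `window` periods by more than `k` standard deviations. The standard
    deviation is floored at 1.0 so a perfectly flat history does not make
    every tiny increase look like a burst. Returns None if nothing qualifies.
    """
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        threshold = mean(history) + k * max(pstdev(history), 1.0)
        if counts[i] > threshold:
            return i
    return None


# Invented weekly mention counts for one topic.
weekly_mentions = [2, 3, 2, 3, 2, 9, 14, 12]
print(burst_onset(weekly_mentions))
```

Run on every word or topic in a collection, a check like this would yield both a list of hot topics and a rough start date for each, which is the kind of summary the article describes.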

So far, the software detects trends in retrospect. Kleinberg is making it more predictive.

Prabhakar Raghavan, chief technology officer of Sunnyvale, Calif.-based software company Verity Inc., has used Kleinberg's software to analyze Web logs, online journals commonly known as "blogs."

Seeking emerging trends among cutting-edge bloggers, Raghavan looked for bursts of references and links to other people's Web sites. Raghavan found that the software successfully identified such bursts, a skill that could ultimately help advertisers target their sales pitches.

To Web word watcher Paul McFedries, the burst software sounds like a great idea. He's been using "wetware" -- his wits -- to trawl Internet databases for new uses of language.

McFedries posts the results on his site, the Word Spy.

