Olivier Travers

Free cash flow for the win
March 11, 2004
What Is Latent Semantic Indexing?

Since I read this phrase for the first time today, here's what it means:

"Latent semantic indexing adds an important step to the document indexing process. In addition to recording which keywords a document contains, the method examines the document collection as a whole, to see which other documents contain some of those same words. LSI considers documents that have many words in common to be semantically close, and ones with few words in common to be semantically distant. This simple method correlates surprisingly well with how a human being, looking at content, might classify a document collection. Although the LSI algorithm doesn't understand anything about what the words mean, the patterns it notices can make it seem astonishingly intelligent."
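
To make that concrete, here's a minimal Python sketch of just the word-overlap idea from the quote: count the words in each document and score pairs of documents by how much vocabulary they share (cosine similarity over word counts). It isn't the full method. LSI proper goes a step further and applies a singular value decomposition to the term-document matrix, which is left out here. The toy documents and the helper names (`term_counts`, `cosine_similarity`, `closest`) are made up for illustration.

```python
import math
from collections import Counter


def term_counts(text):
    """Lowercase the text, split on whitespace, and count each word."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 to 1.0)."""
    shared = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in shared)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical toy collection, invented for this example.
docs = {
    "engines": "search engines index web pages by keywords",
    "ranking": "search engines rank web pages for keywords",
    "cooking": "simmer the sauce and stir in fresh basil",
}

vectors = {name: term_counts(text) for name, text in docs.items()}


def closest(name):
    """Return the other document with the highest word-overlap score."""
    others = [n for n in vectors if n != name]
    return max(others, key=lambda n: cosine_similarity(vectors[name], vectors[n]))
```

Calling `closest("engines")` picks the other search-engine document rather than the cooking one, purely from shared words and without any notion of what the words mean.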

While artificial intelligence seems to remain within the realm of science fiction, brute force approaches yield actual results.


Category(s): search engines