Latent semantic analysis

Latent semantic analysis (LSA) is a technique in information retrieval invented in 1990 [1]. It is sometimes called latent semantic indexing (LSI).

LSA is a preprocessing step, used before the classification or search of documents. The purpose of LSA is to make documents easier to classify and search. LSA is meant to solve two fundamental problems in natural language processing: synonymy and polysemy. In synonymy, different writers use different words to describe the same idea. Thus, a person issuing a query in a search engine may use a different word than appears in a document, and may not retrieve the document. In polysemy, the same word can have multiple meanings, so a searcher can get unwanted documents with the alternate meanings.

LSA starts with a document-term matrix, a sparse matrix whose rows correspond to documents and whose columns correspond to terms (typically stemmed words that appear in the documents). The values of the matrix are typically tf-idf weights: each is proportional to the number of times the term appears in the document, with rare terms upweighted to reflect their relative importance.
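
As a rough illustration of such a matrix, the following Python sketch builds tf-idf weights for three made-up documents. The documents, the function name tfidf_matrix, the whitespace tokenizer (no stemming), and the particular weighting (raw count times log(N / document frequency)) are all assumptions chosen for brevity; many other tf-idf variants exist.

```python
import math
from collections import Counter

def tfidf_matrix(docs):
    """Return (vocab, matrix): one row per document, one column per term."""
    # Assumption: lowercase whitespace tokenization, no stemming.
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    # Inverse document frequency: rarer terms receive larger weights.
    idf = {term: math.log(n_docs / df[term]) for term in vocab}
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] * idf[term] for term in vocab])
    return vocab, matrix

# Hypothetical toy corpus.
docs = [
    "the car is driven on the road",
    "the truck is driven on the highway",
    "the car is parked",
]
vocab, matrix = tfidf_matrix(docs)
for row in matrix:
    print([round(w, 2) for w in row])
```

In this toy corpus, "the" occurs in every document and so receives weight zero, while "road", which occurs in only one document, is weighted more heavily than "car", which occurs in two.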

LSA then finds a low-rank approximation to the document-term matrix, through the use of singular value decomposition (SVD). In LSA, this SVD is truncated, so that each document and term is represented by a vector of much lower dimensionality than the total number of words in the vocabulary. Thus, when a query is issued by a user, it gets mapped into this low-dimensional space, and gets compared to documents in that same space.
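
The truncation and query mapping can be sketched with NumPy as follows. This is a minimal illustration, not a reference implementation: the toy count matrix, the choice k = 2, and the mapping of queries by projection onto the top k right singular vectors are assumptions (other query-folding conventions also rescale by the singular values).

```python
import numpy as np

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical document-term count matrix (rows: documents, columns: terms).
terms = ["car", "automobile", "engine", "flower", "petal"]
A = np.array([
    [2, 0, 1, 0, 0],   # doc 0 uses "car" and "engine"
    [0, 2, 1, 0, 0],   # doc 1 uses "automobile" and "engine"
    [0, 0, 0, 2, 1],   # doc 2 uses "flower" and "petal"
    [0, 0, 0, 1, 2],   # doc 3 uses "flower" and "petal"
], dtype=float)

k = 2  # number of latent dimensions kept (assumed for this toy example)
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V_k = Vt[:k, :].T          # terms x k: top-k right singular vectors

# Documents in the k-dimensional space (equal to U[:, :k] * s[:k]).
doc_vectors = A @ V_k

# A query containing only the word "automobile", mapped the same way.
query = np.array([0, 1, 0, 0, 0], dtype=float)
query_vector = query @ V_k

raw_sims = [cosine(query, row) for row in A]
lsa_sims = [cosine(query_vector, d) for d in doc_vectors]
print("raw:", np.round(raw_sims, 3))
print("lsa:", np.round(lsa_sims, 3))
```

In the original term space the query has zero similarity to doc 0, which never mentions "automobile". In the two-dimensional space, "car", "automobile" and "engine" load on the same latent dimension, so doc 0 and doc 1 both match the query closely while the flower documents do not.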

Because LSA represents terms and documents in a low-dimensional space, it is forced to capture patterns of term co-occurrence, which approximate meaning, rather than simply recording which terms occur. Thus, documents and terms with similar meanings lie close together in the low-dimensional space. This can mitigate polysemy (a query with more than one word disambiguates the intended sense in the low-dimensional space) and synonymy (synonymous words map to nearby points in the low-dimensional space).

LSA has been criticized because its implicit probabilistic model does not match the observed data. LSA assumes that words and documents form a joint Gaussian model. However, a Gaussian model can generate negative values, whereas a document cannot contain a negative number of words. A newer alternative, probabilistic latent semantic analysis, is based on a multinomial model and is reported to give better results than standard LSA. Nevertheless, LSA remains a standard algorithm in information retrieval.
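
One way to see the mismatch, assuming the rank-k reconstruction is read as the model's estimate of the term counts, is that the reconstruction of a non-negative count matrix can contain negative entries. The small matrix below is a made-up example chosen so the effect is easy to verify.

```python
import numpy as np

# Hypothetical non-negative document-term count matrix.
A = np.array([
    [2, 1, 0],
    [1, 2, 1],
    [0, 1, 2],
], dtype=float)

k = 2  # keep the two largest singular values
U, s, Vt = np.linalg.svd(A)
A_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # best rank-2 approximation

print(np.round(A_k, 3))
print("smallest entry:", round(float(A_k.min()), 4))  # about -0.1464
```

The original matrix has a zero count in its corner entries, but the rank-2 approximation assigns them a value of roughly -0.146, which has no interpretation as a word count.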

12-03-2008 10:22:39
The contents of this article are licensed from www.wikipedia.org under the GNU Free Documentation License.