Science Fair Project Encyclopedia
Document classification is a problem in information science. The task is to assign a document to one or more categories based on its contents. Document classification tasks can be divided into two sorts: supervised document classification, in which some external mechanism (such as human feedback) provides information on the correct classification of documents, and unsupervised document classification, in which the classification must be done entirely without reference to external information.
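The contrast between the two sorts can be illustrated with a minimal sketch. The choice of techniques here is an assumption made for illustration only: the supervised part uses a small multinomial naive Bayes model with add-one smoothing, and the unsupervised part greedily groups documents by Jaccard word overlap. All function names, the sample texts and the 0.25 threshold are hypothetical, and the whitespace tokenizer is deliberately crude.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately crude tokenizer)."""
    return text.lower().split()


def train_naive_bayes(labeled_docs):
    """Supervised: learn per-category word counts from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # category -> word -> count
    doc_counts = Counter()              # category -> number of training documents
    vocabulary = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        word_counts[label].update(tokens)
        doc_counts[label] += 1
        vocabulary.update(tokens)
    return word_counts, doc_counts, vocabulary


def classify(text, model):
    """Return the category with the highest log-probability score."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label in doc_counts:
        score = math.log(doc_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokenize(text):
            # Add-one smoothing so an unseen word does not zero out the score.
            count = word_counts[label][token] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def jaccard(a, b):
    """Share of words two token sets have in common."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_overlap(docs, threshold=0.25):
    """Unsupervised: group documents by word overlap, using no labels at all.

    Each document joins the first group whose first member it resembles
    closely enough; otherwise it starts a new group.
    """
    groups = []  # list of lists of document indices
    for i, text in enumerate(docs):
        tokens = set(tokenize(text))
        for group in groups:
            representative = set(tokenize(docs[group[0]]))
            if jaccard(tokens, representative) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


# Hypothetical labeled examples standing in for human feedback.
TRAINING = [
    ("the match ended with a late goal", "sports"),
    ("the team won the league title", "sports"),
    ("the election results were announced", "politics"),
    ("parliament passed the new budget", "politics"),
]

# Hypothetical unlabeled documents for the unsupervised case.
UNLABELED = [
    "goal scored in the match",
    "late goal wins the match",
    "budget vote in parliament",
    "parliament delays budget vote",
]
```

With this sample data, `classify("the team scored a goal", train_naive_bayes(TRAINING))` selects the `sports` category because the labels supplied the category names in advance, whereas `group_by_overlap(UNLABELED)` only returns groups of document indices and leaves the meaning of each group to be interpreted afterwards.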
Document classification techniques include, among others, approaches based on natural language processing.
The contents of this article are licensed from www.wikipedia.org under the GNU Free Documentation License.