
# Pattern recognition

For the William Gibson novel, see: Pattern Recognition (novel).

Pattern recognition (also known as classification or pattern classification) is a field within the area of machine learning and can be defined as "the act of taking in raw data and taking an action based on the category of the data" [1]. As such, it is a collection of methods for supervised learning.

Typical applications are automatic speech recognition, classification of text into several categories (e.g. spam/non-spam email messages), the automatic recognition of handwritten postal codes on postal envelopes, and the automatic recognition of images of human faces. The last two examples belong to the subtopic image analysis, the part of pattern recognition that deals with digital images as input to pattern recognition systems.


## Pattern recognition techniques

Pattern recognition is typically an intermediate step in a longer process. These steps generally are acquisition of the data (image, sound, text, etc.) to be classified, preprocessing to remove noise or normalize the data in some way (image processing, stemming text, etc.), computing features, classification and finally post-processing based upon the recognized class and the confidence level.
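The stages above can be sketched end to end. The following toy spam filter is a minimal illustration, not a real system: the messages, keyword vocabulary, and threshold are all invented for the example.

```python
# Pipeline stages: acquisition -> preprocessing -> feature computation
# -> classification -> post-processing on the confidence.

SPAM_WORDS = {"winner", "free", "prize"}  # hypothetical feature vocabulary

def preprocess(text):
    # Normalize: lowercase and drop non-alphanumeric noise.
    return "".join(c for c in text.lower() if c.isalnum() or c.isspace())

def spam_score(text):
    # Single feature: fraction of tokens that are spam keywords.
    tokens = text.split()
    if not tokens:
        return 0.0
    return sum(t in SPAM_WORDS for t in tokens) / len(tokens)

def classify(score, threshold=0.2):
    # Classification step, plus a crude confidence for post-processing.
    label = "spam" if score >= threshold else "ham"
    confidence = abs(score - threshold)
    return label, confidence

raw = "WINNER!! You get a FREE prize"          # acquired raw data
label, conf = classify(spam_score(preprocess(raw)))
print(label)  # -> spam
```

In practice each stage is far richer (signal processing, stemming, learned features), but the division of labour is the same.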

Pattern recognition itself is primarily concerned with the classification step. In some cases, such as neural networks, feature selection and extraction may also be partially or fully automated.

While there are many methods for classification, each of them solves one of three related mathematical problems.

The first is to find a map of a feature space (which is typically a multi-dimensional vector space) to a set of labels. This is equivalent to partitioning the feature space into regions, then assigning a label to each region. Such algorithms (e.g., the nearest neighbour algorithm) typically do not yield confidence or class probabilities, unless post-processing is applied.
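The nearest neighbour algorithm mentioned above can be written in a few lines: the "map" from feature space to labels is implicit in the stored training points, and the training data here is invented for illustration.

```python
import math

# 1-nearest-neighbour classifier: a new point receives the label of the
# closest training example, which implicitly partitions the feature space
# into labelled regions. No probabilities are produced, only a hard label.
train = [((1.0, 1.0), "A"), ((1.2, 0.8), "A"),
         ((5.0, 5.0), "B"), ((4.8, 5.2), "B")]

def nearest_neighbour(x):
    # Euclidean distance to every stored example; take the closest.
    return min(train, key=lambda pt: math.dist(x, pt[0]))[1]

print(nearest_neighbour((1.1, 0.9)))  # -> A (falls in the "A" region)
```

Note that the output is a bare label, matching the remark that such algorithms yield no confidence unless post-processing is added.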

The second problem is to consider classification as an estimation problem, where the goal is to estimate a function of the form

$P({\rm class}|{\vec x}) = f\left(\vec x;\vec \theta\right)$

where the feature vector input is $\vec x$, and the function f is typically parameterized by some parameters $\vec \theta$. In the Bayesian approach to this problem, instead of choosing a single parameter vector $\vec \theta$, the result is integrated over all possible parameter vectors, each weighted by how likely it is given the training data D:

$P({\rm class}|{\vec x}) = \int f\left(\vec x;\vec \theta\right)P(\vec \theta|D) d\vec \theta$
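The integral above can be approximated numerically. The sketch below uses a one-dimensional parameter $\theta$, a logistic form for f, a flat prior on a discrete grid of $\theta$ values, and a tiny invented training set D; all of these modelling choices are assumptions made for illustration.

```python
import math

def f(x, theta):
    # f(x; theta): probability of the positive class, a logistic curve.
    return 1.0 / (1.0 + math.exp(-theta * x))

D = [(2.0, 1), (1.5, 1), (-1.0, 0)]       # invented (feature, class) pairs
thetas = [0.1 * k for k in range(1, 51)]  # grid standing in for the integral

def likelihood(theta):
    # P(D | theta): product of per-example class probabilities.
    p = 1.0
    for x, y in D:
        px = f(x, theta)
        p *= px if y == 1 else (1.0 - px)
    return p

# Posterior P(theta | D) on the grid (flat prior, normalized weights).
weights = [likelihood(t) for t in thetas]
Z = sum(weights)
posterior = [w / Z for w in weights]

# Predictive class probability: weighted average of f over the posterior.
x_new = 1.0
p_class = sum(f(x_new, t) * pt for t, pt in zip(thetas, posterior))
print(p_class)
```

Averaging over the posterior, rather than committing to one best $\vec \theta$, is exactly what distinguishes the Bayesian treatment from a point estimate.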

The third problem is related to the second, but the problem is to estimate the class-conditional probabilities $P(\vec x|{\rm class})$ and then use Bayes' rule to produce the class probability as in the second problem.
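A common instance of this generative approach fits a density to each class and then applies Bayes' rule. The sketch below uses one-dimensional Gaussian class-conditionals fitted to an invented dataset; the class names and numbers are purely illustrative.

```python
import math

# Invented one-dimensional training data, grouped by class.
data = {"A": [1.0, 1.2, 0.8], "B": [5.0, 4.8, 5.2]}

def fit_gaussian(xs):
    # Estimate mean and variance of P(x | class) for one class.
    mu = sum(xs) / len(xs)
    var = sum((x - mu) ** 2 for x in xs) / len(xs)
    return mu, var

params = {c: fit_gaussian(xs) for c, xs in data.items()}
n_total = sum(len(xs) for xs in data.values())
prior = {c: len(xs) / n_total for c, xs in data.items()}  # P(class)

def gauss_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def class_posterior(x):
    # Bayes' rule: P(class | x) proportional to P(x | class) * P(class).
    scores = {c: gauss_pdf(x, *params[c]) * prior[c] for c in params}
    Z = sum(scores.values())
    return {c: s / Z for c, s in scores.items()}

post = class_posterior(1.1)
print(max(post, key=post.get))  # -> A
```

Unlike the discriminative form of the second problem, this route models how each class generates data, so the same fitted densities can also be used for tasks such as outlier detection.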

Examples of classification algorithms include: