Science Fair Project Encyclopedia
Polymorphism (computer science)
In computer science, polymorphism is the idea of allowing the same code to be used with different classes of data (which, in typed languages, correspond to different types), resulting in more general and abstract implementations.
The concept of polymorphism applies to functions as well as types. A function that can evaluate to and be applied to values of different types is known as a polymorphic function. A datatype that contains elements of an unspecified type is known as a polymorphic datatype.
There are two fundamentally different kinds of polymorphism. If the range of actual types that can be used is finite and the combinations must be specified individually prior to use, it is called ad-hoc polymorphism. If all code is written without mention of any specific type and thus can be used transparently with any number of new types, it is called parametric polymorphism.
Programming using the latter kind is called generic programming, particularly in the object-oriented community. However, in many statically typed functional programming languages the notion of parametric polymorphism is so deeply ingrained that most programmers simply take it for granted.
Polymorphism gained most of its momentum when object-oriented programming became a buzzword.
Using parametric polymorphism, a function or datatype can be written generically so that it can deal equally well with objects of various types. For example, a function
append that joins two lists can be constructed so that it does not depend on one particular type of list: it can append lists of integers, lists of real numbers, lists of strings, and so on. Let a denote the type of elements in the lists. Then
append can be typed [a] × [a] → [a], where [a] denotes a list of elements of type a. We say that
append is parameterized by a. (Note that since there is only one type parameter, the function cannot be applied to just any pair of lists: they must consist of the same type of elements.)
Parametric polymorphism was the first kind of polymorphism developed, first identified by Christopher Strachey in 1967. It was also the first kind to appear in an actual programming language, ML in 1976. It exists today in Standard ML, OCaml, Haskell, and others. Some argue that C++ templates should be considered an example of parametric polymorphism, though instead of actually reusing generic code they expand into separate specific code for each instantiation (which can result in code bloat).
Parametric polymorphism is a way to make a language more expressive while still maintaining full static type-safety. It is thus irrelevant in dynamically typed languages, since by definition they lack static type-safety. However, any dynamically typed function f that takes n arguments can be given a static type using parametric polymorphism: f : p1 × ... × pn → r, where p1, ..., pn and r are type parameters. Of course, this type is completely uninformative and thus essentially useless.
ML-style polymorphism (also called let-polymorphism) is the most common form of parametric polymorphism in general use today. It is similar to impredicative polymorphism but slightly less powerful: type applications may only take actual concrete types as arguments, whereas in impredicative polymorphism a type application can be passed arbitrary type schemas as arguments.
Some languages employ the idea of subtypes to restrict the range of types that can be used in a particular case of parametric polymorphism. In these languages, subtyping polymorphism (sometimes referred to as dynamic polymorphism) allows a function to be written to take an object of a certain type T, but also work correctly if passed an object that belongs to a type S that is a subtype of T (according to the Liskov substitution principle). This type relation is sometimes written S <: T. Conversely, T is said to be a supertype of S—written T :> S.
For example, if Number, Rational, and Integer are types such that Number :> Rational and Rational :> Integer, a function written to take a Number will work equally well when passed an Integer or Rational as when passed a Number. The actual type of the object can be hidden from clients into a black box, and accessed via object identity.
In fact, if the
Number type is abstract, it may not even be possible to get your hands on an object whose most-derived type is
Number (see abstract data type, abstract class). This particular kind of type hierarchy is known—especially in the context of the Scheme programming language—as a numerical tower, and usually contains a lot more types.
Object-oriented programming environments such as C++ and GObject implement subtyping polymorphism using subclassing (also known as inheritance). In C++, each class contains what is called a virtual table—a table of functions that implement the polymorphic part of the class interface—and each object contains a pointer to the "vtable" of its class, which is then consulted whenever a polymorphic method is called. This mechanism is an example of
- late binding, because virtual function calls are not bound until the time of invocation, and
- single dispatch (i.e., single-argument polymorphism), because virtual function calls are bound simply by looking through the vtable provided by the first argument (the
this object), so the runtime types of the other arguments are completely irrelevant.
Ad-hoc polymorphism usually refers to simple overloading (see function overloading), but sometimes automatic type conversion, known as coercion, is also considered to be a kind of ad-hoc polymorphism (see the example section below). Common to these two types is the fact that the programmer has to specify exactly what types are to be usable with the polymorphic function.
The name refers to the manner in which this kind of polymorphism is typically introduced: “Oh, hey, let’s make the
+ operator work on strings, too!” Some argue that ad-hoc polymorphism is not polymorphism in a meaningful computer science sense at all—that it is just syntactic sugar for calling
append_integer, append_string, etc., manually. One way to see it is that
- to the user, there appears to be only one function, but one that takes different types of input and is thus type polymorphic; on the other hand,
- to the author, there are several functions that need to be written—one for each type of input—so there’s essentially no polymorphism.
Overloading allows multiple functions taking different types to be defined with the same name; the compiler or interpreter automatically calls the right one. This way, functions appending lists of integers, lists of strings, lists of real numbers, and so on could be written, and all be called append—and the right append function would be called based on the type of lists being appended. This differs from parametric polymorphism, in which the function would need to be written generically to work with any kind of list. Using overloading, it is possible to have a function perform two completely different things based on the type of input passed to it; this is not possible with parametric polymorphism.
This type of polymorphism is common in object-oriented programming languages, many of which allow operators to be overloaded in a manner similar to functions (see operator overloading). It is also used extensively in the purely functional programming language Haskell. Many languages lacking ad-hoc polymorphism suffer from long-winded names such as
print_string, print_integer, etc. (see Objective Caml).
An advantage that is sometimes gained from overloading is the appearance of specialization, e.g. a function with the same name can be implemented in multiple different ways, each optimized for the particular data types that it operates on. This can provide a convenient interface for code that needs to be specialized to multiple situations for performance reasons.
The type system of Haskell includes a construct called the type class that provides a powerful form of ad-hoc polymorphism. A type class is defined by giving a set of operations (or "methods") that must be implemented for every type in the class. For example, the predefined class
Ord contains types that are ordered, that is, types whose elements may be compared using
<=. A function to sort a list (using
<= for comparison) is given the type
Ord a => ([a] -> [a]). That is, it is a function that can take a list of elements of type
a and return a list of the same type, provided that
a is in the class
Ord. The parentheses in the type are unnecessary but make its meaning clearer: this type captures the fact that the sorting function can sort lists of many different element types (and is therefore polymorphic), but the elements of a list to be sorted cannot be just anything: it must be possible to compare them.
A programmer can make any type
t a member of a given class
C using an instance declaration that defines implementations of all of
C's methods for the particular type
t. For instance, if a programmer defines a complex new data type, she may then make her new type an instance of
Ord by providing a function to compare values of that type in whatever way she considers appropriate. Once she has done this, she may use a sorting function of the type just given to sort lists of elements of her type. Programmers may also define new type classes of their own.
Note that type classes are rather different from classes in object-oriented languages; in particular,
Ord is not a type, so there is no such thing as a value of type
Ord. Thus the Haskell approach to a generic sorting function as outlined here is quite different from the subtyping-based approach often seen in object-oriented programming. Type classes are in fact much more closely related to parametric polymorphism (note that the type of the sorting function would be the parametrically polymorphic type
[a] -> [a] if it were not for the type class constraint "
Ord a =>"); however, Haskell programmers tend to consider them a form of ad-hoc polymorphism, probably because their most pervasive use is for the overloaded arithmetic and comparison operators.
Due to a concept known as coercion, a function can become polymorphic without being initially designed for it. Let f be a function that takes an argument of type T, and S be a type that can be automatically converted to T. Then f can be said to be polymorphic with respect to S and T.
Some languages (e.g., C, Java) provide a fixed set of conversion rules, while others (e.g., C++) allow new conversion rules to be defined by the programmer. While calling C “polymorphic” is perhaps stretching it, the facility for automatic conversion (i.e., casting operators) found in C++ adds a whole new class of polymorphism to the language.
This example aims to illustrate the three different kinds of polymorphism described in this article. Though overloading an originally arithmetic operator to do a wide variety of things in this way may not be the most clear-cut example, it allows some subtle points to be made. In practice, the different types of polymorphism are not generally mixed up as much as they are here.
Imagine, if you will, an operator
+ that may be used in the following ways:
1 + 2 → 3
3.14 + 0.0015 → 3.1415
1 + 3.7 → 4.7
[1, 2, 3] + [4, 5, 6] → [1, 2, 3, 4, 5, 6]
[true, false] + [false, true] → [true, false, false, true]
"foo" + "bar" → "foobar"
To handle these six function calls, four different pieces of code are needed—or three, if strings are considered to be lists of characters:
- In the first case, integer addition must be invoked.
- In the second and third cases, floating-point addition must be invoked.
- In the fourth and fifth cases, list concatenation must be invoked.
- In the last case, string concatenation must be invoked, unless this too is handled as list concatenation (as in, e.g., Haskell).
Thus, the name
+ actually refers to three or four completely different functions. This is an example of overloading.
As we’ve seen, there’s one function for adding two integers and one for adding two floating-point numbers in this hypothetical programming environment, but note that there is no function for adding an integer to a floating-point number. The reason why we can still do this is that when the compiler/interpreter finds a function call
f(a1, a2, ...) that no existing function named
f can handle, it starts to look for ways to convert the arguments into different types in order to make the call conform to the signature of one of the functions named
f. This is called coercion. Both coercion and overloading are kinds of ad-hoc polymorphism.
In our case, since any integer can be converted into a floating-point number without loss of precision,
1 is converted into
1.0 and floating-point addition is invoked. There was really only one reasonable outcome in this case, because a floating-point number cannot generally be converted into an integer, so integer addition could not have been used; but significantly more complex, subtle, and ambiguous situations can occur in, e.g., C++.
Finally, the reason why we can concatenate lists of integers, lists of booleans, and lists of characters alike is that the function for list concatenation was written without any regard to the type of elements stored in the lists. This is an example of parametric polymorphism. If you wanted to, you could make up a thousand different new types of lists, and the generic list concatenation function would happily accept instances of them all, without requiring any augmentation.
It can be argued, however, that this polymorphism is not really a property of the function per se; that if the function is polymorphic, it is due to the fact that the list datatype is polymorphic. This is true—to an extent, at least—but it is important to note that the function could just as well have been defined to take as a second argument an element to append to the list, instead of another list to concatenate to the first. If this were the case, the function would indisputably be parametrically polymorphic, because it could then not know anything about its second argument, except that the type of the element should match the type of the elements of the list.
Polymorphism also refers to polymorphic code, computer code that mutates each time it is executed. This is sometimes used by computer viruses, computer worms, and shellcode to hide the presence of their decryption engines.
The contents of this article are licensed from www.wikipedia.org under the GNU Free Documentation License.