Science Fair Project Encyclopedia
Name and identifier
An identifier is a string that names an entity, such as a program, device, or system, so that other entities can refer to ("call") that entity. In programming languages, identifiers are lexical units that name a language object, such as a variable, array, record, label, or procedure.
- Identifiers are placed within labels. Labels are attached to, or are part of, or remain associated with, the information identified. If a label becomes disassociated from its information, the information may not be accessible.
In telecommunications and data processing systems, an identifier is one or more characters used to identify, name, or characterize the nature, properties, or contents of a set of data elements.
In computer programming, an identifier naming convention is a standardized scheme for naming identifiers (variables, functions, procedures, and any other items that might need a name). The various naming conventions are intended to solve several major problems, and each problem has several competing solutions. At times the choice of naming convention becomes highly controversial, with partisans of each convention holding theirs to be the best and the others to be much inferior.
As most programming languages do not allow spaces in identifiers, some system must be devised when a programmer wishes to use a name containing multiple words. There are several in widespread use; each has a significant following, though sometimes one dominates amongst users of a particular programming language. There are also some programmers who eschew multiple-word names entirely, and so use none of these systems (see the section below on the amount of information in identifiers).
One approach is to replace spaces with another character. The two characters commonly used for this purpose are the hyphen ('-') and the underscore ('_'), so the two-word name "two words" would be represented as "two-words" or "two_words". The hyphen is arguably the easier to type and the more readable of the two, and is used by nearly all programmers of Lisp, Scheme, and other languages that permit hyphens in identifiers. Many other languages, however, reserve the hyphen for use as the subtraction operator and so do not permit it in identifiers; some programmers of these languages use underscores instead. Underscores are somewhat harder to type because of their location on most keyboards, so this solution has not been universally adopted, but it is in fairly widespread use among programmers of C, Perl, and many scripting languages.
An alternative approach, developed mostly as a substitute for the underscore in languages that do not permit hyphens, is to omit the space and indicate word boundaries using capitalization, rendering "two words" as either "twoWords" or "TwoWords". This is called CamelCase, among other names.
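The clash between the hyphen and the subtraction operator can be observed directly. The following is a minimal sketch in Python, chosen here only for illustration as one language in which the hyphen is the subtraction operator; the helper name describe is hypothetical and not part of any library.

```python
import ast


def describe(name: str) -> str:
    """Report how Python reads `name` (hypothetical helper for illustration)."""
    if name.isidentifier():
        return "identifier"
    try:
        tree = ast.parse(name, mode="eval")
    except SyntaxError:
        return "syntax error"
    # A hyphen between two names is parsed as a binary subtraction.
    if isinstance(tree.body, ast.BinOp) and isinstance(tree.body.op, ast.Sub):
        return "subtraction"
    return "other expression"


print(describe("two_words"))  # identifier
print(describe("two-words"))  # subtraction
print(describe("two words"))  # syntax error
```

In a language such as Lisp or Scheme, where subtraction is not written as an infix hyphen, the second name would instead be accepted as a single identifier.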
There are several methods of writing multi-word identifier names in computer languages that tokenize on whitespace. For example, a variable called "my favorite variable" could be written as:
- myFavoriteVariable (lower camel case)
- MyFavoriteVariable (upper camel case)
- my_favorite_variable (underscored)
- my-favorite-variable (dashed)
- MY_FAVORITE_VARIABLE (all caps)
- myfvvbl (abbreviated; some programmers prefer names of this kind)
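The mechanical conventions in the list above can be produced from a plain phrase. The following Python sketch is one possible way to do so; the function name to_conventions and the dictionary keys are assumptions made for illustration, and the abbreviated form is omitted because it follows no fixed rule.

```python
def to_conventions(phrase: str) -> dict:
    """Render a space-separated phrase in several naming conventions."""
    words = phrase.lower().split()
    if not words:
        raise ValueError("phrase must contain at least one word")
    capitalized = [word.capitalize() for word in words]
    return {
        "lower camel case": words[0] + "".join(capitalized[1:]),
        "upper camel case": "".join(capitalized),
        "underscored": "_".join(words),
        "dashed": "-".join(words),
        "all caps": "_".join(words).upper(),
    }


for convention, name in to_conventions("my favorite variable").items():
    print(f"{convention}: {name}")
```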
Information in identifiers
There is significant disagreement over how much information to put in identifiers. The disagreement was driven initially by technical constraints, as some early programming languages limited the length of identifiers. Thus in the standard C library (C was initially one of those languages), one finds atoi as the name of a function that converts ASCII strings to integers. In Lisp, one would be more likely to encounter the same function named ascii-to-integer or similar. The use of shorter identifiers has outlived those technical restrictions, partly as heritage (it remains more common in the languages that once had the restrictions) and partly for ease of use: shorter identifiers are simply easier to type, especially when the identifier is used frequently. Those who prefer longer identifiers argue that the difficulty of typing them is outweighed by the ease of reading code that is descriptive rather than peppered with impenetrable acronyms and abbreviations.
Beyond the question of how long a descriptive identifier should be, there are also several systems for encoding specific technical aspects of an identifier in its name. Perhaps the best known is Hungarian notation, which encodes the type of a variable in its name. Several more minor conventions are widespread; one example is the convention of naming variables in C and C++ with an initial lowercase letter, and naming user-defined datatypes with an initial capital letter.
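The trade-off can be illustrated by writing the same routine in both styles. The following Python sketch is a simplified stand-in, not the C library function: it assumes its input consists only of ASCII decimal digits, with no sign or leading whitespace, and both function bodies are written here purely for comparison.

```python
def atoi(s):
    # Terse style, in the spirit of the C library name.
    n = 0
    for c in s:
        n = n * 10 + (ord(c) - ord("0"))
    return n


def ascii_to_integer(digit_string):
    # Descriptive style: the same logic with longer names.
    integer_value = 0
    for digit_character in digit_string:
        digit_value = ord(digit_character) - ord("0")
        integer_value = integer_value * 10 + digit_value
    return integer_value


print(atoi("1234"))              # 1234
print(ascii_to_integer("1234"))  # 1234
```

The two functions behave identically; they differ only in how much each name tells the reader.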
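As a rough illustration, the following Python sketch builds names in the manner of Hungarian notation. The prefix table is hypothetical, since actual prefix sets differ from one codebase to another, and the helper hungarian_name is likewise an assumption made for this example. The Cat class shows the capitalization convention for user-defined types, applied here to Python rather than C or C++.

```python
# Hypothetical prefix table; real Hungarian notation schemes vary.
PREFIXES = {str: "str", int: "n", float: "f", bool: "b"}


def hungarian_name(base_name: str, value) -> str:
    """Prefix `base_name` with a tag for the exact type of `value`.

    Raises KeyError for types missing from the hypothetical table.
    """
    prefix = PREFIXES[type(value)]
    return prefix + base_name[0].upper() + base_name[1:]


class Cat:       # user-defined type: initial capital letter
    pass


cat = Cat()      # variable: initial lowercase letter

print(hungarian_name("name", "Tom"))   # strName
print(hungarian_name("count", 3))      # nCount
print(hungarian_name("ready", True))   # bReady
```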
Variable names
A variable is referred to by a name, which is used to access its contents. A variable's name can contain letters and digits, but there are restrictions to avoid problems in lexical analysis. For example, a variable in C might be called height or numberOfCats or cow_name. In some languages or programming practices, the name of a variable indicates what kind of values it may hold. For instance, in Fortran, the first letter of a variable's name determines whether, by default, it is created as an integer or a floating-point variable (names beginning with I through N are integers). In BASIC, the suffix $ on a variable name indicates that its value is a string. Perl uses the prefixes $, @, %, and & to indicate scalars, arrays, hashes, and subroutines. In Common Lisp, variables' names are not strings but symbols, a special data type.
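These name-based rules are simple enough to express as lookups. The following Python sketch models them under stated assumptions: the function names are hypothetical, the Fortran rule is the default implicit typing (first letter I through N gives an integer, any other letter a real), and the BASIC and Perl checks cover only the markers mentioned above.

```python
PERL_SIGILS = {"$": "scalar", "@": "array", "%": "hash", "&": "subroutine"}


def fortran_implicit_type(name: str) -> str:
    """Default implicit typing: I through N means integer, otherwise real."""
    first_letter = name[0].upper()
    return "integer" if "I" <= first_letter <= "N" else "real"


def is_basic_string(name: str) -> bool:
    """A trailing $ marks a string variable in BASIC."""
    return name.endswith("$")


def perl_kind(name: str) -> str:
    """Classify a Perl name by its leading sigil, if it has one."""
    return PERL_SIGILS.get(name[0], "unknown")


print(fortran_implicit_type("index"))   # integer
print(fortran_implicit_type("height"))  # real
print(is_basic_string("name$"))         # True
print(perl_kind("@cats"))               # array
```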
The content of this article is licensed from www.wikipedia.org under the GNU Free Documentation License.