The compile time typing problem is the problem of assigning types to
terms at compile-time (or "read time" - any time after the program is read but before its "main" function is executed). A term is essentially any expression within a program, including variables, parameters, data members of aggregates, and intermediate results from calculations or function calls. Each term in a language must be(*) assigned a specific type.
- "Must be" here is technically untrue: for StaticTypeSafety, it is sufficient to prove that the term will be compatible with all operations performed upon it, which doesn't inherently require assigning a type-descriptor to the term (see TypeSafe). Even for optimization of representation, it is sufficient to identify the set of operations that might be performed upon a representation along with their estimated (or profiled) frequencies and probabilities and temporal constraints, and arrange the bits for some choice of access time, storage cost, and lazy evaluation. Neither of these require that a type descriptor (e.g. a type name or type structure) be assigned to the terms. Assuming one must do so can introduce some unnecessary complexity into the TypeSystem. However, CompileTimeTypingProblem remains a valid problem for languages that choose (for reasons that may be outside of StaticTypeSafety) to assign type descriptors to terms.
One immediate question is "what is a type"? There are many different useful definitions (including
ThereAreNoTypes); here we treat types as sets; a type is a set of values and an object is an
instance of the type if it is a member of the set containing all such instances. This definition assumes immutable objects; though it does extend to objects with mutable state as well.
Note that this definition implies that an object can be an instance of a (usually) infinite number of types. Most programming languages limit the number of distinct types allowed in a program; languages with
nominal typing only recognize types that are explicitly declared by the programmer. Languages with
structural typing face more difficulty; especially if types can be mutated at runtime. (See
NominativeAndStructuralTyping). We will assume for the purposes of this discussion that the set of types in a program is a finite set. A type is
maximal with respect to some property if a) the property holds for the type; and b) there are no subtypes of the type (in the finite set which we consider) that also have the property.
A term is
well-typed if all possible values of the term (including all states of any mutable objects which might be bound to the term) are instances of the associated type. A program is well-typed if all terms in the program are well-typed; a program which is not well-typed may have typing errors. It's possible that such a program may be well-behaved, but many languages reject any program that is not well-typed. Others (including both
CeeLanguage and
CeePlusPlus) don't, or allow ill-typed programs to be written with explicit typecasts. In remaining languages, normally if a program is well-typed certain run-time errors are guaranteed not to happen, i.e., the type system is sound (see
StaticTypeSafety). Note that Java is not fully sound.
In languages with a universal supertype (a
TopType, called "Object" in most such languages), a trivial well-typing for any program is to assign the type Object to
all terms in the program; as all possible values/objects are instances of the universal supertype, it is easy to see that all terms are well-typed. Languages with
DynamicTyping do exactly this, however being well-typed gives no guarantee of safety about runtime behavior. One can also say (as usually done) that typechecking is deferred to runtime in such languages; the
RunTimeTypingProblem is the subject of another page. (And much easier to implement).
A term is "ideally typed" if a) it is well-typed; and b) the associated type is a maximal type for the term. Ideally typed terms are useful in that they enable numerous optimizations - data member lookup can be optimized;
DynamicDispatch can be turned into
StaticDispatch, functions can be inlined, or the term can be subject to
TypeErasure. On the other hand, these optimizations can cause
UndefinedBehavior if the typing assumptions ever become invalid. In programs written as separate modules (and especially those
deployed as separate modules), this can happen quite often. One instance of this problem is known as the
FragileBinaryInterfaceProblem.
A program is "ideally typed" if all terms are ideally typed - in other words, there is no
type subsumption. Some
FunctionalProgrammingLanguages with
StaticTyping (
HaskellLanguage,
CamlLanguage, but
not ObjectiveCaml) produce programs with ideal typings.
ObjectOrientedProgrammingLanguages, as a rule, do not - subsumption is a key component of
ObjectOrientedProgramming.
How are the types of terms determined? Generally, by one of three ways:
Note that some languages may allow different methods to be used - ML uses
TypeInference by default; but the user can attach type declarations if they want.
Very nicely done. Whoever created this page should sign it with pride, it's better than most.
Thanks. --
ScottJohnson
CategoryLanguageTyping