Strong type checking is gold
Normal type checking is silver
But casting is brass
The strong type system allows you to imbue typedefs with flexible type-checking properties and can perform dimensional analysis. For example, consider the law of universal gravitation:
$F = G \frac{m_1 m_2}{r^2}$
The following code attempts to implement this, but contains a mistake:
typedef double Meter, Second, Velocity, Acceleration;
typedef double Kilogram, Newton;
typedef double Area, Volume;
typedef double GravitationalConstant;

const GravitationalConstant G = 6.67e-11;

Newton attraction(Kilogram mass1, Kilogram mass2, Meter distance) {
    return (mass1 * mass2) / (distance * distance);
}
A compiler is not interested in (and has no obligation to warn you about) the dimensional mismatch here caused by
forgetting to multiply by G. Running this example through lint with the appropriate strong type options produces
the following messages:
strong type mismatch: assigning '(Kilogram*Kilogram)/(Meter*Meter)' to 'Newton'
did you mean to multiply by a factor of type 'GravitationalConstant'?
If you are curious what options were used to get these units into the strong type system, see Full Source for the Gravitation Example.
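For comparison, here is a sketch of the function with the missing factor restored. The only change is the multiplication by G. It has not been run through lint, so whether the result reduces all the way to Newton under the options in the full source is an assumption.

```c
/* Corrected sketch: multiplying by G supplies the missing dimension. */
typedef double Meter, Second, Velocity, Acceleration;
typedef double Kilogram, Newton;
typedef double Area, Volume;
typedef double GravitationalConstant;

const GravitationalConstant G = 6.67e-11;

Newton attraction(Kilogram mass1, Kilogram mass2, Meter distance) {
    /* F = G * m1 * m2 / r^2 */
    return G * (mass1 * mass2) / (distance * distance);
}
```

With two unit masses one meter apart, the function returns the value of G itself.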
The primary option used to interact with the strong type system is
-strong(flags[,name ...])
This option identifies each name as a strong type with properties specified by flags. Presumably there is a later typedef defining any such name to be a type. If no name is provided, the specified flags will be taken as the default for types without explicit -strong options. Flags are uppercase letters that indicate some aspect of a type’s behavior, and they can be modified by following them with softeners. Softeners are represented using lowercase letters and must immediately follow the flag they are modifying.
| Flag | Meaning |
|------|---------|
| A | Check strong types on Assignment. Issue a warning upon assignment (where assignment refers to using the assignment operator, returning a value, passing an argument or initializing a variable). A may be followed by one or more softening modifiers, such as the c and z that appear in the examples in this chapter. |
| X | Check strong types on eXtraction. This flag issues warnings in the same contexts as the A flag, but checks on behalf of the value being assigned. The softeners for A cannot be used with X. |
| J | Check strong types when Joining two operands of a binary operator. J may be followed by one or more modifiers, such as the c, d, n and a that appear in the examples in this chapter. |
| B | Designate a major boolean type. Only one boolean type may exist whether it comes from the B or b flag. The result of all boolean operators will be a value compatible with this type. Contexts that expect a boolean value will require their operands to be of the major boolean type. |
| b | Designate a minor boolean type. Only one boolean type may exist whether it comes from the B or b flag. The result of all boolean operators will be a value compatible with this type. This flag places no requirement on the values used in contexts that expect a boolean, in contrast to B. |
| l | Designate a type as inherently compatible with library functions. This includes assignment from library function return values and as library function arguments. |
| f | Indicates bit-fields of length one are not automatically boolean (by default they are). This is a modifier that can only accompany one of the boolean flags (either B or b above). |
Description
-index( flags, ixtype, sitype [, sitype ...] )
This option is supplementary to and can be used in conjunction with the -strong option. It specifies that ixtype is
the exclusive index type to be used with arrays of (or pointers to) the Strongly Indexed type sitype (or sitype’s if
more than one is provided). Please note: both the ixtype and the sitype are assumed to be names of types
subsequently defined by a typedef declaration. flags can be
| c | allow Constants as well as ixtype, to be used as indices. |
| d | allow array Dimensions to be specified without using an ixtype. |
Examples of -index
For example:
//lint -strong( AzJcX, Count, Temperature )
//lint -index( d, Count, Temperature )
// Only Count can index a Temperature

typedef float Temperature;
typedef int Count;
Temperature t[100];             // OK because of d flag
Temperature *pt = t;            // pointers are also checked

// ... within a function
Count i;

t[0] = t[1];                    // Warnings, no c flag
for( i = 0; i < 100; i++ )
    t[i] = 0.0;                 // OK, i is a Count
pt[1] = 2.0;                    // Warning
i = pt - t;                     // OK, pt-t is a Count
In the above, Temperature is said to be strongly indexed and Count is said to be a strong index.
If the d flag were not provided, then the array dimension should be cast to the proper type as for example:
Temperature t[ (Count) 100 ];
However, this is a little cumbersome. It is better to define the array dimension in terms of a manifest constant, as in:
#define MAX_T (Count) 100
Temperature t[MAX_T];
This has the advantage that the same MAX_T can be used in the for statement to govern the range of the
for.
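As a sketch of that idea, the fragment below uses one manifest constant for both the array dimension and the loop bound. The function name clear_temperatures is hypothetical, and the -index option is written without the d flag on the assumption that the cast in MAX_T then supplies the required index type. It has not been run through lint.

```c
//lint -strong( AzJcX, Count, Temperature )
//lint -index( , Count, Temperature )
typedef float Temperature;
typedef int Count;

/* Parenthesized so the cast binds as intended wherever MAX_T is used. */
#define MAX_T ((Count) 100)

static Temperature t[MAX_T];

/* Hypothetical helper: zero every reading. The loop bound and the
   index variable are both of type Count. */
void clear_temperatures(void)
{
    Count i;
    for (i = 0; i < MAX_T; i++)
        t[i] = 0.0f;
}
```

Changing the array size then requires touching only the definition of MAX_T.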
Note that pointers to the Strongly Indexed type (such as pt above) are also checked when used in array notation.
Indeed, whenever a value is added to a pointer that is pointing to a strongly indexed type, the value added is
checked to make sure that it has the proper strong index.
Moreover, when strongly indexed pointers are subtracted, the resulting type is considered to be the common Strong Index. Thus, in the example,
i = pt - t;
no warning resulted.
It is common to have parallel arrays (arrays with identical dimensions but different types) processed with similar indices. The -index option is set up to conveniently support this. For example, if Pressure and Voltage were types of arrays similar to the array t of Temperature one might write:
//lint -index( , Count, Temperature, Pressure, Voltage )
...
Temperature t[MAX_T];
Pressure p[MAX_T];
Voltage v[MAX_T];
...
Multidimensional Arrays
The indices into multidimensional arrays can also be checked. Just make sure the intermediate type is an explicit typedef type. An example is Row in the code below:
/* Types to define and access a 25x80 Screen.
   a Screen is 25 Row's
   a Row is 80 Att_Char's */

/*lint -index( d, Row_Ix, Row )
       -index( d, Col_Ix, Att_Char ) */

typedef unsigned short Att_Char;
typedef Att_Char Row[80];
typedef Row Screen[25];
typedef int Row_Ix;             /* Row Index */
typedef int Col_Ix;             /* Column Index */

#define BLANK (Att_Char) (0x700 + ' ')

Screen scr;
Row_Ix row;
Col_Ix col;

void main()
{
    int i = 0;

    scr[ row ][ col ] = BLANK;  /* OK */
    scr[ i ][ col ] = BLANK;    /* Warning */
    scr[col][row] = BLANK;      /* Two Warnings */
}
In the above, we have defined a Screen to be an array of Row’s. Using an intermediate type does not change the configuration of the array in memory. Other than for type-checking, it is the same as if we had written:
typedef Att_Char Screen[25][80];
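The layout claim can be checked with a small sketch. FlatScreen and same_layout are hypothetical names introduced only for this comparison; the function compares the total size and the byte offset of one cell under both definitions.

```c
#include <stddef.h>

typedef unsigned short Att_Char;
typedef Att_Char Row[80];
typedef Row Screen[25];                 /* built from the intermediate type Row */
typedef Att_Char FlatScreen[25][80];    /* hypothetical name for the direct form */

/* Returns 1 when both definitions have the same size and place the
   cell (row, col) at the same byte offset from the start of the array. */
int same_layout(size_t row, size_t col)
{
    static Screen a;
    static FlatScreen b;
    size_t off_a = (size_t)((char *)&a[row][col] - (char *)a);
    size_t off_b = (size_t)((char *)&b[row][col] - (char *)b);
    return sizeof(Screen) == sizeof(FlatScreen) && off_a == off_b;
}
```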
Unlike other binary operators that expect their operands to agree in strong type, multiplication and division often
can and should handle different types in what is commonly referred to as dimensional analysis. But not all strong
types are the same in this regard. The strong type system recognizes three different kinds of treatment with regard
to multiplication and division.
A dimension is a strong type such that when two expressions are multiplied or divided and each type is a dimension,
then the resulting type will also be a dimension whose name will be a compound string representing the product or
quotient of the operands (reduced to lowest terms). The modulus operator % will have a resultant type equal to the
type of the numerator.
For example:
//lint -strong( AJdX, Sec )
typedef double Sec;
Sec x, y;
...
x = x * y;      // warning: '(Sec*Sec)' is assigned to 'Sec'
y = 3.6 / x;    // warning: '1/Sec' is assigned to 'Sec'
Flags ’AJdX’ contain the Join phrase ’Jd’ designating that Sec is a dimension. Strictly speaking the ’d’ is not
necessary because the normal default is to make any strong type dimensional. However, there is a flag option -fdd
(turn off the Dimension by Default flag), which will reverse this default behavior, so it is probably wise to place the
’d’ in explicitly.
Dimensional types are treated in greater detail later.
A dimensionally neutral type is a strong type such that when multiplied or divided by a dimension will act as a non-strong type.
For example:
//lint -strong( AJdX, Sec )
typedef double Sec;
//lint -strong( AJnX, Cycles )
typedef double Cycles;
Cycles n;
Sec t;
...
t = n * t;      // OK, Cycles are neutral
t = t / n;      // still OK.
n = n / t;      // warning: '1/Sec' assigned to 'Cycles'
The n softener of the J flag, as in the AJnX sequence above, designates that type Cycles is dimensionally neutral and
will drop away when combined multiplicatively with the dimension Sec, as shown in the first two
assignments. However, Cycles acts as a strong type in every other regard. An illustration of this is
the last line in this example, which produces a warning that the type ’1/Sec’ is being assigned to
Cycles.
Thus, Cycles is playing the role that it traditionally plays in Physics and Engineering. It contains no physical units and when multiplied or divided by a dimension does not change the dimensionality of the result.
An antidimensional type is a strong type that when multiplied or divided is expected to be combined with the same
type, or one that is compatible through the usual strong type hierarchies. It functions in this regard much like
addition and subtraction.
For example:
//lint -strong( AJaX, Integer )
typedef int Integer;
Integer k;
int n;
...
k = k * k;      // OK
k = n * k;      // warning: Integer joined with non-Integer
The sequence Ja in the above indicates that Integer is antidimensional.
The strong type mechanism can support the traditional dimensional analysis exploited by physicists, chemists and engineers. When strong types are added, subtracted, compared or assigned, the strong types need merely match up with each other. However, multiplication and division can join arbitrary dimensional types and the result is often a new type. Consider forming the velocity from a distance and a time:
//lint -strong( AcJcX, Met, Sec, Velocity = Met/Sec )
typedef double Met, Sec, Velocity;
Velocity speed( Met d, Sec t )
{
    Velocity v;
    v = d / t;              // ok
    v = 1 / t;              // warning
    v = (3.5/t) * d;        // ok
    v=(1/(t*t))*d*t;        // ok
    return v;               // ok
}
In this example, the 4th argument to the -strong option:
Velocity = Met/Sec
relates strong type Velocity to strong types Met and Sec. This particular suboption actually creates two strong types: Velocity and Met/Sec and relates the two types by making Met/Sec the parent type of Velocity. This relationship can be seen in the output obtained from the option -vh (or the compact form -vh- ). As an example the results of the -vh option for the above example are:
- Met
- Sec
- Met/Sec
  |
  + - Velocity
- 1/Sec
- (Sec*Sec)
- 1/(Sec*Sec)
- Met/(Sec*Sec)
The division of Met by Sec (within the option) can be produced in many equivalent ways. E.g.
Velocity = (1/Sec) * Met
Velocity = ((1/Sec) * (Met))
Velocity = (Met/(Sec*Sec)) * Sec
are all equivalent. All of these dimensional expressions are reduced to the canonical form Met/Sec, which was the form given in the original option. Note that parentheses can be used freely and in some cases must be used to obtain the correct results. E.g.
Acceleration = Met/Sec*Sec     // wrong
Acceleration = Met/(Sec*Sec)   // correct
We follow C syntactic rules where the operators bind left to right and the example labeled ’wrong’ results, after
cancellation, in just Met.
Briefly and for the record the canonical form produced is:
(F1*F2*...*Fn)/(G1*G2*...*Gm)
where each Fi and each Gi is a simple, single-identifier strong type, the identifiers are sorted, and n >= 0 and m >= 0. If n is
less than 2 the upper parentheses are dropped, and if m is less than 2 the lower parentheses are dropped.
If n is 0 the numerator is reduced to 1, and if m is 0 the entire denominator, including the /, is
dropped.
Returning to our original example (the function speed), when the statement:
v = d/t;
is encountered and an attempt is made to evaluate d/t, the dimensional nature of the types of the two arguments is
noted and the names of these types are combined by the division operator to produce "Met/Sec". This uses essentially
the same algorithms and canonicalization as the compound type analysis with a -strong option. The resulting type
is assigned to Velocity without complaint because of the previously described parental relationship that exists
between these two strong types.
In the next statement
v = (3.5/t) * d;
the division results in the creation of a new strong type (1/Sec), which when multiplied by Met will become
Met/Sec. The created type will have properties AJcdX and the underlying type will be the type that a compiler
would compute.
Let’s say you have a paper 400 lines long and the printing requires 60 lines/page. How many full pages will we require? The answer is
400 lines / (60 lines/page) = 6 pages
How many lines are left over? The answer is
400 lines % (60 lines/page) = 40 lines
Thus, unlike division, the % operator yields a dimension that equates to the dimension of the numerator (in this
case, lines) while ignoring the dimension of the 2nd operand.
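The paper example can be sketched in C as follows. The type names Lines, Pages and LinesPerPage, the two function names, and the -strong option line are assumptions modeled on the earlier Velocity example; the option has not been run through lint.

```c
//lint -strong( AcJdX, Lines, Pages, LinesPerPage = Lines/Pages )
typedef int Lines, Pages, LinesPerPage;

/* Lines / (Lines/Pages) yields Pages; integer division keeps full pages only. */
Pages full_pages(Lines total, LinesPerPage per_page)
{
    return total / per_page;
}

/* The % operator keeps the dimension of its first operand, here Lines. */
Lines leftover_lines(Lines total, LinesPerPage per_page)
{
    return total % per_page;
}
```

For a 400-line paper at 60 lines per page, full_pages gives 6 and leftover_lines gives 40.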
A simple example in the use of Dimensional strong types is that of providing a fail-safe method of converting from one system of units to another. Such conversions can quite often be accomplished by a single numeric factor. Such conversion factors should have dimensions attached to prevent mistakes. E.g.
// Centimeters to/from Inches
//lint -strong( AJdX, In, Cm, CmPerIn = Cm/In )
typedef double In, Cm, CmPerIn;
CmPerIn cpi = (CmPerIn) 2.54;   // conversion factor

void demo( In in, Cm cm )
{
    ...
    in = cm / cpi;              // convert cm to in
    ...
    cm = in * cpi;              // convert in to cm
    ...
}
In this example we are defining a conversion factor, cpi, that will allow us to convert inches to centimeters (by
multiplication) and convert centimeters to inches (via division). Without strong types, conversion
factors can be misused. Do I multiply or divide? Using strong types you can be assured of getting it
right.
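A compilable variant of the same idea wraps each direction in a small function. The names to_inches and to_centimeters are hypothetical; the comments show the dimensional reduction that the strong type system is expected to perform.

```c
//lint -strong( AJdX, In, Cm, CmPerIn = Cm/In )
typedef double In, Cm, CmPerIn;

static const CmPerIn cpi = (CmPerIn) 2.54;   /* conversion factor */

/* Cm / (Cm/In) reduces to In. */
In to_inches(Cm cm)
{
    return cm / cpi;
}

/* In * (Cm/In) reduces to Cm. */
Cm to_centimeters(In in)
{
    return in * cpi;
}
```

Swapping the operator in either function would produce a compound type that matches neither In nor Cm, which is the mismatch the strong type check is meant to report.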
Obviously not all conversions fall into the category of being described by a conversion factor. Conversions between Celsius and Fahrenheit, for example, require an expression and this typically means defining a pair of functions as in the following:
//lint -strong( AJdX, Fahr, Celsius )
typedef double Fahr, Celsius;
Celsius toCelsius( Fahr t )
    { return (t-(Fahr)32.) * (Celsius)5. / (Fahr)9.; }
Fahr toFahr( Celsius t )
    { return (Fahr)32. + t * (Fahr)9. / (Celsius)5.; }
The function call overhead is probably not significant, but if it is, you may declare the functions to be inline in C++.
Some C systems support inline functions, but in any case, you can use macros.
Now let us suppose a confused programmer had written:
Fahr f;
Celsius c;
...
f = toCelsius (c);      // Type Violations
Then there would be two strong type violations: passing the Celsius value c to a Fahr parameter is bad, as is assigning the returned Celsius value to f.
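For contrast, a sketch of the intended usage follows. The caller boiling_point_celsius is hypothetical; it passes a Fahr value to toCelsius and stores the result in a Celsius variable, so the argument and result types line up.

```c
//lint -strong( AJdX, Fahr, Celsius )
typedef double Fahr, Celsius;

Celsius toCelsius( Fahr t )
    { return (t-(Fahr)32.) * (Celsius)5. / (Fahr)9.; }
Fahr toFahr( Celsius t )
    { return (Fahr)32. + t * (Fahr)9. / (Celsius)5.; }

/* Hypothetical caller: a Fahr argument goes in, a Celsius result comes out. */
Celsius boiling_point_celsius(void)
{
    Fahr f = (Fahr) 212.;
    Celsius c;

    c = toCelsius(f);
    return c;
}
```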
Although the examples of dimensional analysis offered above refer to floating point quantities, the same principles apply to integer arithmetic. E.g.
#include <stdio.h>
#include <limits.h>
//lint -strong( AcJdX, Bytes, Bits )
//lint -strong( AcJdX, BitsPerByte = Bits / Bytes )
typedef size_t Bytes, Bits, BitsPerByte;
BitsPerByte bits_per_byte = CHAR_BIT;
Bytes size_int = sizeof(int);
Bits length_int = size_int * bits_per_byte;
In this example Bits is the length of an object in bits and Bytes is the length of an object in bytes. bits_per_byte
becomes a conversion factor to translate from one unit to the other. The example shows the use of that conversion
factor to compute the number of bits in an integer.
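The same conversion can be packaged as a pair of functions, sketched here with the hypothetical names to_bits and to_bytes. The comments give the expected dimensional reduction; the code has not been run through lint.

```c
#include <stddef.h>
#include <limits.h>

//lint -strong( AcJdX, Bytes, Bits )
//lint -strong( AcJdX, BitsPerByte = Bits / Bytes )
typedef size_t Bytes, Bits, BitsPerByte;

static const BitsPerByte bits_per_byte = CHAR_BIT;

/* Bytes * (Bits/Bytes) reduces to Bits. */
Bits to_bits(Bytes n)
{
    return n * bits_per_byte;
}

/* Bits / (Bits/Bytes) reduces to Bytes; any partial byte is discarded. */
Bytes to_bytes(Bits n)
{
    return n / bits_per_byte;
}
```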
Let’s say that you wanted to strengthen the integrity and robustness of a program by making sure that all shifts were by quantities that were typed Bits. For example you could define a function shift_left with the intention that this function have a monopoly on shifting unsigned types to the left. This could take the form:
inline unsigned shift_left( unsigned u, Bits b ) { return u << b; }
A simple grep for "<<" can be used to ensure that no other shift lefts exist in your program. Note that the example
deals only with unsigned but if there were other types that you wanted to shift left, such as unsigned long, you
can use the C++ overload facility.
Using C you may also employ the shift_left function. However you may not have inline available and you may be concerned about speed. To obtain the required speed you can employ a macro as in:
#define Shift_Left(u,b) ((u) << (b))
But you will note that there is now no checking to ensure that the number of bits shifted are of the proper type. One approach is to use conditional compilation:
#ifdef _lint
#define Shift_Left(u,b) shift_left(u,b)
#else
#define Shift_Left(u,b) ((u) << (b))
#endif
This will work adequately in C. If the quantity being shifted is anything other than plain unsigned, you will need to
duplicate this pattern for each type.
A probably better approach is to define a macro that can check the type, such as the macro Compatible defined below:
#ifdef _lint
#define Compatible(e,type) (*(type*)__Compatible = (e),(e))
static char __Compatible[100];
//lint -esym(528,__Compatible)
//lint -esym(551,__Compatible)
//lint -esym(843,__Compatible)
#else
#define Compatible(e,type) (e)
#endif
You could then define the original Shift_Left macro as:
#define Shift_Left(u,b) ((u) << Compatible(b,Bits))
Compatible(e,type) works as follows. Under normal circumstances (i.e. when compiling) it is equivalent to the
expression e. When linting it is also equivalent to e except that there is a side effect of assigning to
some obscure array that has been artfully configured into resembling a data object of type type. A
complaint will be issued if the expression e would draw a complaint when assigned to an object of type
type.
In this way you can be assured that the shift amount is always assignment compatible with Bits. Note that there is
no longer a need for the twin Shift_Left definitions. And Compatible can be used in many other places to assure
that objects are typed according to program requirements.
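Putting the pieces together, the sketch below shows the Compatible macro, the Shift_Left macro built on it, and one caller. The function set_bit is hypothetical, and Bits is assumed to be the size_t typedef from the earlier example. When compiled normally, Compatible(b,Bits) expands to just (b).

```c
#include <stddef.h>

typedef size_t Bits;

#ifdef _lint
#define Compatible(e,type) (*(type*)__Compatible = (e),(e))
static char __Compatible[100];
//lint -esym(528,__Compatible)
//lint -esym(551,__Compatible)
//lint -esym(843,__Compatible)
#else
#define Compatible(e,type) (e)
#endif

/* The shift amount is checked against Bits only when linting. */
#define Shift_Left(u,b) ((u) << Compatible(b,Bits))

/* Hypothetical caller: set one bit of a word at the given position. */
unsigned set_bit(unsigned word, Bits position)
{
    return word | Shift_Left(1u, position);
}
```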
For simplicity, we have focused on shifting left. Obviously, similar comments can be made for shifting
right.
Consider a Flags type, which supports the setting and testing of individual bits within a word. An application might need several different such types. For example, one might write:
typedef unsigned Flags1;
typedef unsigned Flags2;
typedef unsigned Flags3;
#define A_FLAG (Flags1) 1
#define B_FLAG (Flags2) 1
#define C_FLAG (Flags3) 1
Then, with strong typing, an A_FLAG can be used with only a Flags1 type, a B_FLAG can be used with only a
Flags2 type, and a C_FLAG can be used with only a Flags3 type. This, of course, is just an example. Normally there
would be many more constants of each Flags type.
What frequently happens, however, is that some generic routines exist to deal with Flags in general. For example,
you may have a stack facility that will contain routines to push and pop Flags. You might have a routine to
print Flags (given some table that is provided as an argument to give string descriptions of individual
bits).
Although you could cast the Flags types to and from another more generic type, the practice is not to be
recommended, except as a last resort. Not only is a cast unsightly, it is hazardous since it suspends type-checking
completely.
The solution is to use a type hierarchy. Define a generic type called Flags and define all the other Flags in terms of it:
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
typedef Flags Flags3;
In this case Flags1 can be combined freely with Flags, but not with Flags2 or with Flags3.
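As an illustration of a generic routine written against the parent type, the sketch below counts the set bits of any Flags value. The names count_set and demo_count are hypothetical, and the code has not been run through lint. The point is only that values of the child types are passed to a Flags parameter without a cast.

```c
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;

#define A_FLAG ((Flags1) 1)
#define B_FLAG ((Flags2) 1)

/* Generic routine: counts the bits that are set in any Flags value. */
unsigned count_set(Flags f)
{
    unsigned n = 0;

    while (f != 0) {
        n += f & 1u;    /* add the lowest bit */
        f >>= 1;
    }
    return n;
}

/* Hypothetical caller: both child types are passed to the parent type. */
unsigned demo_count(void)
{
    Flags1 f1 = A_FLAG;
    Flags2 f2 = B_FLAG;

    return count_set(f1) + count_set(f2);
}
```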
Hierarchy depends on the state of the fhs (Hierarchy of Strong types) flag, which is normally ON. If you turn it off with the
-fhs
option the natural hierarchy is not formed.
We say that Flags is a parent type to each of Flags1, Flags2 and Flags3, which are its children. Being a parent to
a child type is similar to being a base type to a derived type in an object-oriented system with one difference. A
parent is normally interchangeable with each of its children; a parent can be assigned to a child and a child can be
assigned to a parent. But a base type cannot normally be assigned to a derived type. But even this property
can be obtained via the -father option (see Section 7.5.4 Restricting Down Assignments (-father)).
A generic Flags type can be useful for all sorts of things, such as a generic zero value, as the following example shows:
//lint -strong(AJX)
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
#define FZERO (Flags) 0
#define F_ONE (Flags) 1

void m()
{
    Flags1 f1 = FZERO;      // OK
    Flags2 f2;

    f2 = f1;                // Warn
    if(f1 & f2)             // Warn because of J flag
        f2 = f2 | F_ONE;    // OK
    f2 = F_ONE | f2;        // OK Flag2 = Flag2
    f2 = F_ONE | f1;        // Warn Flag2 = Flag1
}
Note that the result of a binary operator takes the most restrictive type in the type hierarchy (i.e., the child
rather than the parent). Thus, in the last example above, when a Flags is OR’ed with a Flags1 the result is a Flags1,
which clashes with the Flags2.
Type hierarchies can be an arbitrary number of levels deep.
There is evidence that type hierarchies are being built by programmers even in the absence of strong type-checking. For example, the header for Microsoft’s Windows SDK, windows.h, contains:
...
typedef unsigned int WORD;
typedef WORD ATOM;
typedef WORD HANDLE;
typedef HANDLE HWND;
typedef HANDLE GLOBALHANDLE;
typedef HANDLE LOCALHANDLE;
typedef HANDLE HSTR;
typedef HANDLE HICON;
typedef HANDLE HDC;
typedef HANDLE HMENU;
typedef HANDLE HPEN;
typedef HANDLE HFONT;
typedef HANDLE HBRUSH;
typedef HANDLE HBITMAP;
typedef HANDLE HCURSOR;
typedef HANDLE HRGN;
typedef HANDLE HPALETTE;
...
The strong type hierarchy tree that is naturally constructed via typedef declaration has a limitation. All the types
in a single tree must be the same underlying type. The -parent option can be used to supplement (or completely
replace) the strong type hierarchy established via typedef declarations.
An option of the form:
-parent( Parent, Child [, Child] ... )
where Parent and Child are type names defined via typedef will create a link in the strong type hierarchy between
the Parent and each of the Child types. The Parent is considered to be equivalent to each Child for the purpose of
Strong type matching. The types need not be the same underlying type and normal checking between the types is
unchanged.
A link that would form a loop in the tree is not permitted.
For example, given the options:
-parent(Flags1,Small)
-strong(AJX)
and the following code:
typedef unsigned Flags;
typedef Flags Flags1;
typedef Flags Flags2;
typedef unsigned char Small;
then the following type hierarchy is established:
        Flags
       /     \
  Flags1     Flags2
     |
   Small
If an object of type Small is assigned to a variable of type Flags1 or Flags, no strong type violation will be
reported. Conversely, if an object of type Flags or Flags1 is assigned to type Small, no strong type violation will be
reported but a loss of precision message will still be issued (unless otherwise inhibited) because normal type
checking is not suspended.
If the -fhs option is set (turning off the hierarchy of strong types flag) a typedef will not add a hierarchical link.
The only links that will be formed will be via the -parent option.
The option
-father( Parent, Child [, Child] ... )
is similar to the -parent option: it has all the effects of -parent, with the additional property of
making each of the links from Child to Parent one-way. That is, assignment from Parent to Child triggers a
warning. You may think of -father as a strict version of -parent.
The rationale for this option is shown in the following example.
typedef int FIndex;
typedef FIndex Index;
Here Index is a special Index into an array. FIndex is a Flag or an Index. If negative, FIndex is taken to be a
special flag and otherwise can take on any of the values of Index. By defining Index in terms of FIndex we are
implying that FIndex is the parent of Index. The reader not accustomed to OOP may think that we have the
derivation backwards, that the simpler typedef, Index, should be the parent. But Index is the more specific type;
every Index is an FIndex but not conversely. Whereas it is expected that we can assign from Index to FIndex it
could be dangerous to do the inverse.
Since we do not want down assignments we give the option
-father( FIndex, Index )
in addition to the strong options, say
-strong( AcJcX, FIndex, Index )
Then
FIndex n = -1;
Index i = 3;
i = n;      /* Warning */
n = i;      /* OK */
The safe way to convert a FIndex to Index is via a function call as in
Index F_to_I( FIndex fi )
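One possible body for such a function is sketched below. The helper is_flag and the use of assert to reject flag values are assumptions made for illustration; the text specifies only the function's signature. The cast is confined to this one function, after the flag case has been ruled out.

```c
#include <assert.h>

typedef int FIndex;
typedef FIndex Index;

/* Hypothetical helper: nonzero when fi holds a special flag, not an index. */
int is_flag(FIndex fi)
{
    return fi < 0;
}

/* Convert only after ruling out the flag case. Rejecting flags with
   assert is an assumption; a program might report an error instead. */
Index F_to_I(FIndex fi)
{
    assert(!is_flag(fi));
    return (Index) fi;
}
```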
To obtain a visual picture of the hierarchy tree, use the letter ’h’ in connection with the -v option. For example, using the option +vhm for the example in Section 7.5.3 Adding to the Natural Hierarchy you will capture the following hierarchy tree.
--Flags
   |
   +--Flags1
   |   |
   |   |__Small
   |
   |__Flags2
To get a more compressed tree (vertically) you may follow the ’h’ with a ’-’. This results in a tree where every other line is removed. For example, if you had used the option +vh-m the same tree would appear as:
--Flags
   |--Flags1
   |   |__Small
   |__Flags2
//lint -strong(JAc, Meter, Kilogram, Second)
//lint -strong(JAc, Area = Meter * Meter)
//lint -strong(JAc, Volume = Meter * Meter * Meter)
//lint -strong(JAc, Velocity = Meter / Second)
//lint -strong(JAc, Acceleration = Meter / (Second * Second))
//lint -strong(JAc, Newton = Kilogram * Acceleration)
/*lint -strong(JAc, GravitationalConstant =
               Newton * Area / (Kilogram * Kilogram)
              )
*/
typedef double Meter, Second, Velocity, Acceleration;
typedef double Kilogram, Newton;
typedef double Area, Volume;
typedef double GravitationalConstant;

const GravitationalConstant G = 6.67e-11;

Newton attraction(Kilogram mass1, Kilogram mass2, Meter distance) {
    return (mass1 * mass2) / (distance * distance);
}
An expression is strongly typed if:
Every strong type is reduced to a canonical form internally. Dimensional strong types may be specified using any valid C expression containing:
binary * / operators
identifiers (including the special identifier 1 for numerators)
balanced parentheses
Output (in messages and the output of -vh ) will always be presented in the canonical form where all terms are reduced, consecutively multiplied operands are sorted lexicographically, multiplicative expressions as operands to division are parenthesized, a missing numerator is replaced with a 1, and a denominator of 1 is omitted (and its dividend not parenthesized).
The primary message numbers related to Strong Types are: 18, 138, 463, 632, 633, 634, 635, 636, 637, 638, 639, 640, and 697. Setting a strong boolean type will affect the behavior of messages involving boolean contexts that are otherwise unrelated to strong types.