Reference Manual for PC-lint® Plus

10 Metrics

10.1 Introduction

Metrics provide access to valuable information about your program and can be used to enforce coding guidelines or measure code quality. A wide variety of statistics are available for a range of subjects including functions, classes, and files.

The two primary forms of output associated with metrics are reports and violations. A metric report provides the values of all metrics, or a nominated subset, in a standard format at the end of analysis. A metric check can be registered which will trigger violation messages when an entity does not meet the conditions of the check.

10.1.1 Terminology

A metric subject represents the abstract notion of all entities of a certain kind. For example, file and function are metric subjects. A defined list of metrics are associated with each subject, but values of collected metrics are a property of instances of that subject. Each individual file or function (instance) present in the program will collect values for metrics associated with the file or function subjects respectively.

The value of each metric is either a number, an entity, or a collection of entities. A numeric metric is classified as either a measure or a count. A defined quantity such as Halstead volume or cyclomatic complexity is a measure while a metric such as “number of forward goto statements” representing how many entities matching some criteria exist is a count. A collection is an array of instances of a particular metric subject, for example file.functions is a collection of metric data for each function defined within a particular file. Collections can be used with built-in operations including count, sum, and filter. See 10.3 Metric Expressions for more details on these operations.
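For instance, collection operations can be combined in custom-metric assignments (covered in 10.5 Custom Metrics). In the following sketch, the metric names long_functions and num_long_functions and the threshold of 50 lines are invented for illustration:

```
// Collect the functions of each file having more than 50 lines,
// then count them (assumed names; see 10.5 Custom Metrics).
+metric(file.long_functions = filter(file.functions, function.num_lines > 50))
+metric(file.num_long_functions = count(file.long_functions))
```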

See 10.6 Built-in Metrics for a list of metric subjects and their metrics.

10.1.2 Options

The +metric_report option can be used to generate customized reports. The +metric option can check for violations of metric-based conditions, nominate metrics for inclusion in the report, and create new custom metrics. All options relating to metrics must appear prior to the first module.

10.2 Metric Report

The +metric_report option is used to request a metric report at the end of analysis. By default, all metrics are included in the report. If the +metric option has been used to nominate any metrics for explicit inclusion in the report then only nominated metrics will appear. This behavior can be overridden with the optional scope sub-option of +metric_report, which accepts either all or nominated as an argument and conclusively determines whether all metrics or only nominated metrics are included (even if no metrics have been nominated).

The +metric_report option accepts the following optional sub-options:

scope: either all or nominated; conclusively determines whether all metrics or only nominated metrics appear in the report.
format: the format of the report; one of csv (the default), json, or xml.
filename: the file to which the report is written; by default the report is emitted to standard error.

For example, the option +metric_report(scope=all, format=xml, filename=path/file.xml) will request a report containing all metrics to be written to the file path/file.xml in XML format.

10.2.1 Metric Nomination

Metrics are nominated using a +metric option that simply names a metric using its subject and metric name. For example, +metric(file.num_lines) nominates the values of file.num_lines to appear on the report for all files.

The metric report includes function and class definitions as well as templates and specializations thereof, but it does not include instantiations.

10.2.2 Report Fields

The metric report provides the subject, entity_name, id (metric name), value, and location for each instance of an included metric. In the CSV format, fields appear in the listed order and column names are not included. In the JSON and XML formats, the indicated names are used and the entries appear enclosed in an outer metrics object. The location field uses the format file:line:col and is filled only for entries with the function subject.
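As an illustration of the field order, a CSV entry for a hypothetical function f with two return statements, defined in a file test.cpp, might read (the values are invented for this sketch):

```
function,f,num_return_stmts,2,test.cpp:1:5
```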

10.3 Metric Expressions

Metrics use a C-like expression grammar with function calls (to built-in functions), simple assignment, parenthetical grouping, identifiers, numeric literals with optional decimal places, the dot operator, and arithmetic, logical, and comparison operators. Built-in functions are listed below. The assignment operator is available only in the context of defining a custom metric. Identifiers denote the name of a metric or a metric subject, and they may contain letters, underscores, and non-leading digits. Numeric literals are always decimal and consist of a whole number part optionally followed by a period and a fractional part.

The dot operator takes a metric subject (or an instance thereof) on the left and a metric name on the right; this forms a metric designator which refers to that metric of that subject, analogous to structure member access in C. An expression is considered to “evaluate to true” if the result is a non-zero numeric value. Evaluation errors may yield an empty value, and evaluation of a sub-expression to an empty value will typically cause evaluation of the enclosing expression to fail in turn. When a metric subject appears in an expression, it generally leads to implicit iteration in the enclosing context: the expression is evaluated multiple times, once with each instance of that subject substituted in place of the abstract subject.
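As a brief sketch, the designator function.num_labels below refers to the num_labels metric of the function subject; because the abstract subject function appears, the check is evaluated once per function instance through implicit iteration (the threshold of 2 is arbitrary):

```
// Flag any function containing more than two labels.
+metric(function.num_labels > 2)
```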

10.3.1 Built-in Functions

The following built-in functions operate on collections:

count(collection): returns the number of instances in the collection.
filter(collection, condition): returns the sub-collection of instances for which the condition evaluates to true.
sum(collection, expression): evaluates the expression for each instance in the collection and returns the sum of the results.
average(collection, expression): evaluates the expression for each instance in the collection and returns the average of the results.

10.4 Metric Checking

If the expression provided to a +metric option is a comparison (possibly within a logical operator) then it introduces a new metric check. The expression must contain at least one metric designator, and the subject (left component) of all metric designators in the top-level expression must match. Multiple metric names for the same subject may appear. The provided expression will be evaluated at the end of analysis for each instance of that subject (e.g. for the subject file the expression would be evaluated separately using the metric values for each file). The check is considered to be violated if the expression evaluates to true, in which case a message in the info category will be issued reporting the violation. By default this message will be 888, although this is configurable using the msgno sub-option.

As an example, the option +metric(function.num_return_stmts > 1) will cause a message to be emitted when a function contains multiple return statements. If a function f with two return statements is defined in the presence of this option then PC-lint Plus will emit:

info 888: number of return statements (2) in function 'f' is greater than 1

Multiple metric designators involving the same subject may appear. For example, +metric(file.num_comment_lines / file.num_lines < 0.10) will cause a message to be emitted if fewer than 10% of lines in a file are comment lines (see Definitions below for the precise specification of terms like “comment line”).

When registering a metric check, the +metric option accepts the following sub-options:

msgno: the message number to use for violation messages (888 by default).
append: text to append to the message text of each violation.

For example, violations reported in response to the option +metric(file.num_lines >= 100 && file.num_comment_lines < 10, msgno=8042, append=[example]) will be emitted with message number 8042 and [example] appended to the message text.

Metric checks are generally performed during or after global wrap-up and can incorporate information from multiple modules. In unit checkout mode, metric checks for functions, classes, and modules are performed locally at module wrap-up. Metric data for each module is independent when performing metric checks in unit checkout mode.

10.4.1 Metric Violation Messages

Metric violation messages are generated automatically based on the provided expression and the circumstances of the individual violation. The previous example demonstrates that a violation involving the expression function.num_return_stmts > 1 can produce a message rendered as number of return statements (2) in function 'f' is greater than 1 (where 2 is the number of return statements present in f).

When using a more complex expression, the message text may change not only in value parameterization but also in form. For example:

+metric( 
 (function.num_backward_goto_stmts > 0) || 
 (function.num_forward_goto_stmts > 0 && (function.num_labels > 1 || !function.isExternC)) 
 ,msgno=8001 
)

This check enforces that goto statements may only be used for the purposes of jumping to a single cleanup label at the end of an extern "C" function. Different violations of this requirement produce different messages. For example:

extern "C" void f() { 
   goto clean; 
clean: 
   return; 
}

does not violate the check.

void w() { 
begin: 
   goto begin; 
}

violates the check because w contains a backward goto statement, which is never permitted. This emits:

info 8001: number of backward goto statements (1) in function 'w' is greater than 0

In the alternative example:

void h() { 
   goto clean; 
clean: 
   return; 
}

the message is again reported but the text is now:

info 8001: number of forward goto statements (1) in function 'h' is greater than 0 
         and not isExternC of function 'h'

which highlights a different portion of the requirement.

10.4.2 Integration with Queries

Metric checks for a class or function can utilize Queries to access AST information. Single argument Query Functions that match the metric subject can be accessed by name in the same manner as a built-in metric. For example, the previously discussed option:

+metric( 
 (function.num_backward_goto_stmts > 0) || 
 (function.num_forward_goto_stmts > 0 && (function.num_labels > 1 || !function.isExternC)) 
)

creates a metric check which will report general usage of goto while allowing extern "C" functions to follow a permitted pattern of exception handling using a cleanup label. This uses the query function isExternC to determine whether the function is declared as extern "C".

10.5 Custom Metrics

If the expression provided to a +metric option is an assignment then the left operand must be a metric designator and the right operand must be an expression which may include other metric designators referring to different metrics about the same subject as the assignment target.
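For example, a numeric custom metric can be derived from two built-in metrics of the same subject; the metric name code_lines here is invented for illustration:

```
// Approximate the number of non-comment lines in each file
// (assumed metric name 'code_lines').
+metric(file.code_lines = file.num_lines - file.num_comment_lines)
```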

Custom metrics may be defined in terms of other custom metrics, but evaluating a custom metric recursively on the same instance will cause an error. Note that unevaluated recursion is permitted; the computation of a custom metric may involve evaluation of the same custom metric of a different instance if it is defined in a way that leads to a base case. See the 10.5.1.2 Number of inheritance edges in class hierarchy example below for an application of recursion.

Metric designators referring to custom metrics can be nominated or used in checks like built-in metrics. A custom metric cannot redefine a built-in metric nor a previously defined custom metric.
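As a sketch of this, a custom metric (here given the invented name goto_total) can be defined and then used in a check just as a built-in metric would be:

```
// Total goto statements per function (assumed name 'goto_total').
+metric(function.goto_total = function.num_forward_goto_stmts + function.num_backward_goto_stmts)
// Report any use of goto with message number 8002.
+metric(function.goto_total > 0, msgno=8002)
```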

Assignments may not appear (and by extension, custom metrics may not be created) within a larger expression of any +metric option. A custom metric can only be created when a +metric option specifies exactly one top-level assignment.

The sub-option nominate, taking no arguments, may be specified to additionally nominate the newly created metric in a manner equivalent to an explicit +metric nomination option referring to the metric designator being assigned to.
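For example, the following sketch (with the invented metric name comment_ratio) both creates a custom metric and nominates it for the report in a single option:

```
// Define and nominate the fraction of comment lines in each file.
+metric(file.comment_ratio = file.num_comment_lines / file.num_lines, nominate)
```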

10.5.1 Examples

10.5.1.1 Percentage of functions in a file with multiple return statements

return_ratio.lnt:

// Create a new custom metric called 'funcs_with_multiple_returns' for any 'file'. 
// This custom metric is a collection and consists of all functions from the built-in 
// 'file.functions' where the condition 'function.num_return_stmts > 1' is true. 
+metric(file.funcs_with_multiple_returns = filter(file.functions, function.num_return_stmts > 1)) 
// Create a new custom metric called 'multiple_return_ratio' for any 'file'. 
// This custom metric is numeric and is the ratio of the sizes of the 
// 'funcs_with_multiple_returns' and 'functions' collections. The size of a collection is found 
// using the 'count' function. 
+metric(file.multiple_return_ratio = count(file.funcs_with_multiple_returns) / count(file.functions)) 
// Nominate 'multiple_return_ratio' to appear on the report. 
+metric(file.multiple_return_ratio) 
// Request a metric report. It will be emitted to standard error at the end of analysis. 
// Because a metric was nominated, only nominated metrics will be displayed. 
// Only 'multiple_return_ratio' has been nominated. The default format is CSV. 
+metric_report

test.cpp:

int f1(int a) { return a + 1; } 
int f2(int a) { return a + 2; } 
int f3(int a) { 
   if (a == 5) { 
      return 3; 
   } else { 
      return 300; 
   } 
} 
int f4(int a) { return a + 4; }

Example report output:

file,test.cpp,multiple_return_ratio,0.250

10.5.1.2 Number of inheritance edges in class hierarchy

For example, consider the class hierarchy:

struct X { }; 
struct Y : X { }; 
struct Z { }; 
struct A : Z, Y { }; 
struct B { }; 
struct C : A, B { }; 
struct D : C { };

To compute the total number of inheritance edges in the hierarchy for each class, a custom metric can be created:

+metric( 
   class.hierarchy_edges = 
      count(class.immediate_bases) + 
      sum(class.immediate_bases, class.hierarchy_edges) 
)

This option recursively defines the custom metric class.hierarchy_edges as the number of immediate base classes plus the sum of hierarchy_edges over those immediate bases. The recursion eventually ends when a class has no immediate bases, which yields a sum of 0 without evaluating hierarchy_edges. For example, D has a single immediate base, C, giving a hierarchy_edges value of 1 + 5 = 6. This produces the report:

class,A,hierarchy_edges,3 
class,B,hierarchy_edges,0 
class,C,hierarchy_edges,5 
class,D,hierarchy_edges,6 
class,X,hierarchy_edges,0 
class,Y,hierarchy_edges,1 
class,Z,hierarchy_edges,0

10.5.1.3 Ratio of number of lines in a member function to the average number of lines in member functions of the containing class

This example creates a custom metric, method_length_ratio, representing the ratio of the number of lines in a member function to the average number of lines in member functions of the containing class. The first option creates a custom metric called avg_method_lines which calculates the average number of lines across methods in a class. The second option uses the avg_method_lines metric of the enclosing class to define method_length_ratio for each function.

+metric(class.avg_method_lines = average(class.non_static_member_functions, function.num_lines)) 
+metric(function.method_length_ratio = function.num_lines / function.parent_class.avg_method_lines)
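To have the derived ratio appear on a report, it could subsequently be nominated and a report requested; a minimal sketch:

```
// Nominate the custom metric and request a report
// (emitted to standard error in CSV format by default).
+metric(function.method_length_ratio)
+metric_report
```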

10.6 Built-in Metrics

10.6.1 Definitions

Halstead metrics, from Halstead’s “software science”, are defined by equations based on the classification of program tokens as operators or operands.

Note that in C++, the standard grammar term function-try-block refers to a construct of the form:

void f() try { } catch (...) { }

where the compound statement typically encompassing a function body is replaced with a top-level try-catch.

10.6.2 Project (project)

10.6.3 Translation Unit (translation_unit)

10.6.4 File (file)

10.6.5 Function (function)

10.6.6 Class (class)

10.6.7 Dynamic Metrics

The project subject provides dynamic metrics of the form num_message_N_emitted, where N is a message number emitted at least once during analysis. For example, if two instances of message 9001 are emitted, then the dynamic metric project.num_message_9001_emitted will have a value of 2 on the report. Dynamic metrics are not available in metric checks.

10.7 Sample Derived Metrics

The flexibility of metric expressions provides wide latitude to derive ratios, averages, and entirely new metrics from the data listed in the previous section. The definitions of common derived metrics are provided as a sample of what can be accomplished with metric options.

10.7.1 Class (class)