Calculation Set Definition Specification

Overview

A RIOS Calculation Set Definition is a standard means to represent a set of calculations that can be applied to the data collected in an Assessment Document for a given Instrument Definition.

Format

Calculation Set Definitions are stored and exchanged as JSON (JavaScript Object Notation) objects. The structure of these objects must adhere to the rules set forth in this document. When stored in files, these files must be UTF-8 encoded.

Structure

Root Object

The Root Object of a Calculation Set Definition consists of a few properties:

instrument
Type:Instrument Reference Object
Constraints:Required
Description:This property specifies which Instrument Definition the calculations defined within this definition apply to.
meta
Type:Metadata Collection Object
Description:This property allows arbitrary information about this Calculation Set to be stored within the definition. This property is optional.
calculations
Type:Array of Calculation Object
Constraints:Required; Must contain at least one Calculation Object
Description:This property contains the set of calculations that should be applied to an Assessment Document. The ordering of calculations in this array is important, as they are always executed in the specified order.
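
As an illustration of the Root Object, the following Python sketch builds a minimal definition and serializes it to JSON. Only the property names come from this specification; the instrument URI, version, author, and the single calculation are hypothetical values chosen for the example.

```python
import json

# Hypothetical values throughout; only the property names are taken
# from this specification.
calculation_set = {
    "instrument": {
        "id": "urn:example-instrument",
        "version": "1.0",
    },
    "meta": {"author": "John Smith"},
    "calculations": [
        {
            "id": "double_foo",
            "type": "integer",
            "method": "python",
            "options": {"expression": "assessment['foo'] * 2"},
        },
    ],
}

# Definitions stored in files must be UTF-8 encoded, so encode explicitly.
encoded = json.dumps(calculation_set, indent=2).encode("utf-8")
print(encoded.decode("utf-8"))
```

Because the calculations property is a JSON array, the execution order survives serialization unchanged.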

Instrument Reference Object

An Instrument Reference Object is the means by which a Calculation Set Definition references the exact Instrument (and version of that Instrument) that its calculations apply to.

id
Type:String
Constraints:Required; Must be a URI as described in RFC3986
Description:This property is a reference to the id property on the root object of an Instrument Definition. It is meant to specify the exact Instrument this Calculation Set is augmenting.
version
Type:String
Constraints:Required
Description:This property is a reference to the version property on the root object of an Instrument Definition. It is meant to specify the exact revision of the Instrument this Calculation Set is augmenting.

Calculation Object

Calculation Objects are the core of what makes up a Calculation Set Definition. They describe the values that should be calculated for an Assessment Document. These objects consist of several properties:

id
Type:String
Constraints:Required; Must be an Identifier
Description:This property uniquely identifies the calculation so that its value can be referred to in subsequent documents or calculations. It must be unique among all calculation IDs in this definition and all field IDs in the referenced Instrument Definition.
description
Type:String
Description:This property allows the Calculation Set author to explain what the calculation is, what it’s being used for, or any other helpful information. This property is optional and is not intended to ever be shown to an end-user.
identifiable
Type:Boolean
Description:Indicates whether or not the value generated by this calculation will (or can) contain information that can be used to identify the subject or respondent. This is typically used to flag calculations that would contain information that could be classified as “Protected Health Information” (HIPAA PHI), “Personally Identifiable Information” (NIST PII), “Personal Data” (EU Data Protection Directive), etc. This property is optional, and, if not specified, is assumed to be false.
type
Type:Enumerated String
Constraints:Required; Must be one of the Instrument Definition data types listed below
Description:This property identifies the type of data that this calculation will produce.
PossibleValues:
  • float
  • integer
  • text
  • boolean
  • date
  • time
  • dateTime
method
Type:Enumerated String
Constraints:Required
Description:This property identifies the method that will be used to perform the calculation.
PossibleValues:
  • python: The calculation will be in the form of a single-line Python v2.7 expression, or the name of a Python callable that can be imported and executed.
  • htsql: The calculation will be in the form of an HTSQL v2 expression.
options
Type:Object
Constraints:The contents of the Object depend on the method specified in the method property. See the Calculation Methods section for information on which options are needed for which methods.
Description:This property allows the calculation author to provide the necessary information to the calculation engine in order to perform the operation.
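
The constraints above can be checked mechanically. The following is a minimal sketch, not a complete validator: the function name check_calculation and the wording of its messages are hypothetical, and it covers only the required properties, the two enumerations, and the type of identifiable.

```python
# Enumerations as listed in this specification.
TYPES = {"float", "integer", "text", "boolean", "date", "time", "dateTime"}
METHODS = {"python", "htsql"}


def check_calculation(calc):
    """Return a list of problems found in one Calculation Object (a dict)."""
    problems = []
    for name in ("id", "type", "method"):
        if name not in calc:
            problems.append("missing required property: " + name)
    if "type" in calc and calc["type"] not in TYPES:
        problems.append("unknown type: %r" % (calc["type"],))
    if "method" in calc and calc["method"] not in METHODS:
        problems.append("unknown method: %r" % (calc["method"],))
    # identifiable is optional and assumed to be false when absent.
    if not isinstance(calc.get("identifiable", False), bool):
        problems.append("identifiable must be a boolean")
    return problems
```

An empty list means that none of the checked constraints were violated; the method-specific contents of options would need a separate check.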

Identifier

Identifiers are strings that adhere to the following restrictions:

  • Consists of 2 or more of the following characters:
    • Lowercase latin alphabetic characters (“a” through “z”; Unicode 0061 through 007A)
    • Latin numeric digits (“0” through “9”; Unicode 0030 through 0039)
    • Underscore characters (“_”; Unicode 005F)
  • The first character is an alphabetic character.
  • The last character is not an underscore.
  • Does not contain consecutive underscore characters.

Example Identifiers:

  • page1
  • grp_a
  • ref_1_2_alpha
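
The restrictions above can be expressed as a single regular expression. This is one possible encoding, offered as a sketch; the names IDENTIFIER and is_identifier are hypothetical.

```python
import re

# First character: a lowercase letter.
# Middle characters: letters, digits, or an underscore that is not
# immediately followed by another underscore.
# Last character: a letter or digit (so never an underscore).
# The pattern needs at least two characters to match.
IDENTIFIER = re.compile(r"[a-z](?:[a-z0-9]|_(?!_))*[a-z0-9]")


def is_identifier(value):
    """Return True if value satisfies the Identifier restrictions."""
    return IDENTIFIER.fullmatch(value) is not None
```

For instance, is_identifier("ref_1_2_alpha") holds, while "a", "grp_", "grp__a", "1page", and "Page1" are all rejected.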

Metadata Collection Object

A Metadata Collection Object consists of one or more properties that allow you to attach arbitrary, implementation-specific, or other such data to structures within a Calculation Set Definition.

For consistency’s and interoperability’s sake, some common data elements are defined below, but note that the Metadata Collection Object has no required or predefined properties, and can therefore contain any (legal JSON) property names and value data types. Software that consumes Calculation Set Definitions must ignore any property whose name it does not recognize or support.

author
Type:String
Example:John Smith
Description:A string that describes the entity that created this definition.
copyright
Type:String
Example:2009, Smith Instrumentation
Description:A string that describes who owns the copyright to the Instrument, Calculations, or Scores implemented by this definition.
homepage
Type:String
Example:http://www.example.com
Description:A URL (as described by RFC1738) to a web page that has more information about this Instrument or Calculation.
generator
Type:String
Example:SurveyBuilder/1.0
Description:A string that indicates what application produced the Calculation Set Definition. This should be formatted similarly to HTTP Product Token strings as specified in RFC2616.

Calculation Methods

python

The python method provides two approaches to specify the calculation, both being implemented using the Python v2.7 language. The approach used is based on which properties are passed into the accompanying options object.

Expressions

The first approach is through an explicitly defined expression. In the options object that accompanies the calculation definition, there must be a property named expression that contains a single-line Python expression. The value that results from the evaluation of this expression is what will be stored as the result of the calculation.

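This expression will be evaluated within a scope that includes:

  • assessment, a dictionary containing the Assessment values (described in Assessment Variable).
  • calculations, a dictionary containing the previous calculation values (described in Previous Calculation Variable).
  • The math and re modules, which the examples below use.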

Given an Instrument that defines two fields, “foo” and “bar”, the following are some examples of what expressions could look like:

assessment['foo'] * 2

assessment['foo'] + math.log(assessment['foo'])

'GOOD' if re.match(r'^[a-z]{3}$', assessment['bar']) else 'BAD'
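
To show how such an expression might be evaluated, here is a minimal sketch built on Python's eval. The function name evaluate_expression is hypothetical, and the set of names placed in scope (assessment, calculations, math, re) is an assumption based on the examples above rather than a definitive list. The sketch runs on a current Python interpreter, whereas this specification targets Python v2.7.

```python
import math
import re


def evaluate_expression(expression, assessment, calculations):
    """Evaluate a single-line Python expression against the given values."""
    # Assumed scope: the two dictionaries plus the modules used by the
    # examples in this section.
    scope = {
        "assessment": assessment,
        "calculations": calculations,
        "math": math,
        "re": re,
    }
    # eval executes arbitrary code, so this is only appropriate for
    # definitions that come from a trusted author.
    return eval(expression, scope)


assessment = {"foo": 21, "bar": "abc"}
doubled = evaluate_expression("assessment['foo'] * 2", assessment, {})
graded = evaluate_expression(
    "'GOOD' if re.match(r'^[a-z]{3}$', assessment['bar']) else 'BAD'",
    assessment,
    {},
)
```

With these inputs, doubled is 42 and graded is 'GOOD'.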

Callables

The second approach is through specifying a callable object by name. In the options object that accompanies the calculation definition, there must be a property named callable that contains the dot-separated, fully-qualified path to the callable. The value that this callable returns is what will be stored as the result of the calculation.

When executed, the callable object will receive the following arguments:

assessment
A dictionary containing the Assessment values (described in Assessment Variable).
calculations
A dictionary containing the previous calculation values (described in Previous Calculation Variable).

If the callable property had the value “mymodule.my_calculation”, it could be implemented as follows:

# mymodule.py

def my_calculation(assessment, calculations):
    return assessment['foo'] * 2

Or,

# mymodule.py

class Calculator(object):
    def __call__(self, assessment, calculations):
        return assessment['foo'] * 2

my_calculation = Calculator()
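
On the engine side, the dot-separated path has to be turned into an object that can be called. The following sketch shows one way to do that with the standard importlib module; the function name resolve_callable is hypothetical, and this specification does not prescribe how an implementation performs the import.

```python
import importlib


def resolve_callable(path):
    """Import and return the callable named by a dot-separated path."""
    # Split "package.module.name" into the module path and the attribute.
    module_name, _, attribute = path.rpartition(".")
    if not module_name:
        raise ValueError("expected a dot-separated path, got %r" % (path,))
    target = getattr(importlib.import_module(module_name), attribute)
    if not callable(target):
        raise TypeError("%r is not callable" % (path,))
    return target
```

Given the earlier example, resolve_callable("mymodule.my_calculation") would return either the function or the Calculator instance, and the engine would then call it with the assessment and calculations dictionaries.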

Assessment Variable

In both execution approaches, a variable named assessment is made available that contains the values from the Assessment. This variable is a dictionary whose keys correspond to the field identifiers from the Instrument. All field identifiers will be present as keys, even if there is no value (e.g., None) recorded for the field.

The values for these keys will be coerced to the appropriate Python types according to the following table:

Instrument Type    Python Type
integer            int
float              float
text               unicode
boolean            bool
date               datetime.date
time               datetime.time
dateTime           datetime.datetime
enumeration        unicode
enumerationSet     list of unicode
recordList         list of dictionaries whose keys are the sub-field identifiers
matrix             dictionary whose keys are the row identifiers, and whose values are dictionaries whose keys are the column identifiers
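
A coercion step for the simple types in the table above might look like the sketch below. It rests on several assumptions that are not stated in this document: that dates, times, and dateTimes arrive as strings in the forms YYYY-MM-DD, HH:MM:SS, and YYYY-MM-DDTHH:MM:SS, and that the names COERCIONS and coerce are free to choose. It is written for a current Python interpreter, where str plays the role that unicode plays in Python v2.7. The recordList and matrix types are left out because they depend on the types of their sub-fields.

```python
import datetime

# One converter per simple Instrument type. The string formats used for
# date, time, and dateTime are assumptions made for this sketch.
COERCIONS = {
    "integer": int,
    "float": float,
    "text": str,
    "boolean": bool,
    "date": lambda v: datetime.datetime.strptime(v, "%Y-%m-%d").date(),
    "time": lambda v: datetime.datetime.strptime(v, "%H:%M:%S").time(),
    "dateTime": lambda v: datetime.datetime.strptime(v, "%Y-%m-%dT%H:%M:%S"),
    "enumeration": str,
    "enumerationSet": lambda v: [str(item) for item in v],
}


def coerce(instrument_type, value):
    """Convert a raw Assessment value to the Python type for its field."""
    # Fields without a recorded value stay None.
    if value is None:
        return None
    return COERCIONS[instrument_type](value)
```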

Previous Calculation Variable

In both execution approaches, a variable named calculations is made available that contains the values that resulted from previous calculations performed during this execution. Calculations within a given Calculation Set are executed in the order they’re listed in the definition. The resulting values are then passed to each subsequent calculation.

For example, imagine a Calculation Set definition where three calculations are defined in the following order: “foo”, “bar”, “baz”. When the “foo” calculation is executed, the calculations dictionary will be empty. When the “bar” calculation is executed, the calculations dictionary will have a single key, “foo”, with the results of the “foo” calculation. When the “baz” calculation is executed, the calculations dictionary will have two keys, “foo” and “bar”, containing their respective calculation results.
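
The accumulation described above can be sketched as a simple loop. The names run_calculations and record_keys are hypothetical, and the evaluate argument stands in for whatever the engine uses to execute a single calculation.

```python
def run_calculations(definitions, assessment, evaluate):
    """Execute calculations in order, passing earlier results forward."""
    calculations = {}
    for definition in definitions:
        # Hand over a copy so that a calculation cannot alter the
        # results that were recorded before it ran.
        result = evaluate(definition, assessment, dict(calculations))
        calculations[definition["id"]] = result
    return calculations


# Demonstration: record which keys each calculation could see.
seen = {}


def record_keys(definition, assessment, calculations):
    seen[definition["id"]] = sorted(calculations)
    return len(calculations)


results = run_calculations(
    [{"id": "foo"}, {"id": "bar"}, {"id": "baz"}], {}, record_keys
)
```

After the run, seen shows that “foo” received an empty dictionary, “bar” received only “foo”, and “baz” received both “foo” and “bar”.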

htsql

The htsql method allows calculations to be written as HTSQL v2 expressions. The expression to execute must be specified in an expression property on the accompanying options object.

Given an instrument that defines two fields, “foo” and “bar”, the following are some examples of what expressions could look like:

$foo * 2

trunc($foo) + 42

if($bar > 10, 'GOOD', 'BAD')

{$foo + $bar}

/{($bar - $foo) / $foo}

If the value returned by the HTSQL expression is scalar, that value is what is kept as the result. If the value returned is a Record, then the value in the first column of that Record is kept as the result. If the value returned is a list of Records, then the value in the first column of the first Record is kept as the result.
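
The reduction rule above can be sketched in Python without involving HTSQL itself. In this sketch a Record is modelled as a tuple and a list of Records as a list of tuples; both representations, the function name reduce_result, and the choice to return None for an empty list or Record are assumptions made for illustration.

```python
def reduce_result(value):
    """Reduce an HTSQL-style result to the single value that is kept."""
    # A list of Records: continue with the first Record.
    if isinstance(value, list):
        if not value:
            return None
        value = value[0]
    # A Record: keep the value in its first column.
    if isinstance(value, tuple):
        return value[0] if value else None
    # Anything else is treated as a scalar and kept as-is.
    return value
```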

Assessment Parameters

Assessment values for simple-typed fields will be available to your expression as parameters that can be accessed using reference syntax (e.g., by prefixing the name with $, so the “foo” field would be accessed as $foo).

To access the values of matrix cells, you’ll need to concatenate the ID of the matrix field with the ID of the row and the ID of the column with underscores. For example, $matrixfield_firstrow_somecolumn.
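
The naming rule for matrix cells can be sketched as a flattening step. The function name matrix_parameters is hypothetical, and the input is assumed to be a dictionary of rows whose values are dictionaries of columns.

```python
def matrix_parameters(field_id, matrix_value):
    """Flatten a matrix value into one parameter per cell."""
    parameters = {}
    for row_id, columns in matrix_value.items():
        for column_id, cell in columns.items():
            # Join field, row, and column IDs with underscores.
            name = "%s_%s_%s" % (field_id, row_id, column_id)
            parameters[name] = cell
    return parameters


params = matrix_parameters("matrixfield", {"firstrow": {"somecolumn": 5}})
```

Here params contains a single entry named matrixfield_firstrow_somecolumn, which an expression would reference as $matrixfield_firstrow_somecolumn.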

Due to a limitation of the mechanics of HTSQL, the values for the subfields in recordList questions will not be available for use by your expressions.

The values for these parameters will be coerced to the appropriate HTSQL types according to the following table:

Instrument Type    HTSQL Type
integer            integer
float              float
text               untyped
boolean            boolean
date               date
time               time
dateTime           datetime
enumeration        untyped
enumerationSet     record of untyped

Previous Calculation Parameters

Much like the Assessment values, the values that resulted from previous calculations performed during this execution will be available as referenceable ($-prefixed) parameters. Calculations within a given Calculation Set are executed in the order they’re listed in the definition. The resulting values are then passed to each subsequent calculation.

Calculation Results

The results of the calculations in a Calculation Set will be stored in the document-level meta structure of the Assessment Document under the property named calculations. This property will be an object whose keys are the identifiers of the calculations, and whose values are the results of those calculations. All calculation identifiers must be present in the object, even those whose calculations resulted in a null/None.
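
Storing the results could be sketched as follows. The function name store_results is hypothetical, and the Assessment Document is represented as a plain dictionary; the only structure relied upon is the document-level meta object and its calculations property.

```python
def store_results(assessment_document, definitions, results):
    """Write calculation results into the document-level meta structure."""
    # Create the meta object if the document does not have one yet.
    meta = assessment_document.setdefault("meta", {})
    # Every calculation identifier must appear, so missing results are
    # recorded as None (serialized as null in JSON).
    meta["calculations"] = {
        definition["id"]: results.get(definition["id"])
        for definition in definitions
    }
    return assessment_document


document = store_results({}, [{"id": "foo"}, {"id": "bar"}], {"foo": 42})
```

In this example, document["meta"]["calculations"] holds 42 for “foo” and None for “bar”.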