.. _classifier:

# Classifier functions

## What are they?

Classifier functions, also known as scaffold or decision tree functions, give
users a way to define their own RiskScape functions using a simpler syntax than
Java or Python.

At their simplest, they can let you define a simple loss formula based on your
model's inputs.  At the more complex end, they can allow you to define a
classification tree to classify your assets and then use that classification to
select a different function to build a more complex loss result.

## Quick-start

This guide will show you how to define the simplest possible classifier
function, test it via the RiskScape CLI, and then expand it into a more complex
function and use it in a model to produce losses for your assets.

### Hello, World

Below is a simple example of a valid scaffold function that always returns
a greeting.

```text
id: hello-world

argument-types:
  input:
    name: text

return-type: text

filter: strLength(input.name) > 3
  filter: input.name = 'John'
    function: 'Giday'

  filter: strLength(input.name) = 4
    function: 'Greetings'

  default:
    # This greeting applies if none of the filters at the same indent level (or deeper) match
    function: 'Hello'

default:
  function: 'Hi'
```

Save it in your `functions` directory as `hello-world.txt`.

Then create a file called `names.csv` containing the following content.

```text
name
Bob
John
Mary
Ann-Marie
Jo
```

RiskScape can then be run with this command:

```text
riskscape model run table -p function=hello-world -p relation=names.csv -o - -p include-asset-in-output=true
```
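
To check your understanding of the tree before running the command, the sketch below mirrors the `hello-world` filters as plain Python branches. It is only an illustration of the rules described in this guide, not RiskScape's implementation, and it assumes that `strLength` counts characters in the same way as Python's `len()`. The name `hello_world` is hypothetical.

```python
def hello_world(name: str) -> str:
    """Plain-Python mirror of the hello-world classifier tree."""
    if len(name) > 3:          # filter: strLength(input.name) > 3
        if name == "John":     #   filter: input.name = 'John'
            return "Giday"
        if len(name) == 4:     #   filter: strLength(input.name) = 4
            return "Greetings"
        return "Hello"         #   nested default
    return "Hi"                # top-level default


# The names from names.csv
for name in ["Bob", "John", "Mary", "Ann-Marie", "Jo"]:
    print(f"{name}: {hello_world(name)}")
```

Under these assumptions, `Bob` and `Jo` fall through to the top-level default, while `Ann-Marie` passes the first filter but matches neither child filter, so it gets the nested default.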

### Function anatomy

A classifier function works like a decision tree: *filters*
decide which branches of the tree are applicable, and *functions* are
the leaves, which do any required computation by applying an expression
to the input.

A *pre*-leaf (if present) is evaluated before searching the tree
for the leaf to apply. Any attributes set by a pre-leaf may be
used in *filters* or *functions* in later leaves.

The body of the function is the tree of filters and functions defined
between the pre- and post- sections. A filter indented beneath another
filter is called a child filter. Child filters are evaluated only if the
parent filter evaluates to true. If a filter has a function nested beneath
it and no child filters, then that function is evaluated when the filter
evaluates to true. Together, these filters and functions make up the
classifying part of the function.

A *post*-leaf (if present) is evaluated after the body has been evaluated.

.. uml::

  start
  if (pre-leaf exists) then (yes)
    :Evaluate the pre-leaf;
  endif
  :Walk tree to find leaf with all *filters* true;
  if (tree leaf not found) then (yes)
    :Select default leaf instead;
  endif
  :Evaluate the tree leaf;
  if (post-leaf exists) then (yes)
    :Evaluate the post-leaf;
  endif
  :Return result;
  stop

.. note::

  When evaluating, a leaf may access the input and any output of an earlier
  leaf.

  - pre-leaf, may only evaluate function inputs
  - tree leaf, may evaluate function inputs and any pre-leaf outputs
  - post-leaf, may evaluate function inputs, pre- and tree leaf outputs
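
The evaluation order and the visibility rules above can be modelled with a short Python sketch. This is an illustrative model only, not RiskScape code: the `evaluate` helper, the `Scope` and `Leaf` aliases and the attribute names (`cost`, `depth`, `ratio`, `loss`) are all hypothetical, every leaf is assumed to produce a struct, and the tree leaf is assumed to have been selected already.

```python
from typing import Any, Callable, Dict, Optional

Scope = Dict[str, Any]
Leaf = Callable[[Scope], Scope]


def evaluate(inputs: Scope, tree_leaf: Leaf,
             pre: Optional[Leaf] = None,
             post: Optional[Leaf] = None) -> Scope:
    """Evaluate pre-leaf, tree leaf and post-leaf in order."""
    scope = dict(inputs)
    if pre is not None:
        scope.update(pre(scope))   # pre-leaf sees the function inputs only
    result = tree_leaf(scope)      # tree leaf sees inputs and pre-leaf outputs
    scope.update(result)
    if post is not None:
        result = post(scope)       # post-leaf sees inputs, pre and tree outputs
    return result


result = evaluate(
    inputs={"cost": 200.0, "depth": 1.5},
    pre=lambda s: {"ratio": min(s["depth"] / 3.0, 1.0)},
    tree_leaf=lambda s: {"loss": s["cost"] * s["ratio"]},
    post=lambda s: {"loss": s["loss"], "rounded": round(s["loss"])},
)
print(result)
```

Each leaf only ever reads from the scope built by the leaves before it, which is why a pre-leaf cannot refer to tree leaf or post-leaf outputs.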

#### `id`

Sets the identifier for this function.  This function can be referred to by this
identifier in expressions.

#### `argument-types`

Argument-types specify what the function expects for inputs, i.e.

- number of inputs
- name for input
- type of input

The general format is:

```text
argument-types:
  name: type-expression
```

The argument's name is used to access the input from *filter* and *function* expressions.
For example, to pass one of the built-in RiskScape :ref:`types` (a string)
and call it `arg-1`, use:
```text
argument-types:
  arg-1: text
```

If the argument is a type you've defined yourself (i.e. in `types.ini`), then use
the `lookup()` function and pass it the name of your type, e.g.

```text
  arg-2: lookup('some-type-id')
```

To declare a Struct argument, use a multi-line type expression. For example:
```text
  arg-3:
    name: text
    age: integer
    ...
```

Note that when you define a function for a model, its arguments *must* be Struct(s).

#### `return-type`

The return type, while optional, allows you to explicitly state what your function returns.
This is set in a similar way to individual argument-types, e.g.

```text
return-type: lookup('some-type-id')

# or, to return a struct:

return-type:
  damage: floating
  category: text
```

If no return type is defined, then the function will return the result of either the body or the `post` section, depending on whether `post` is defined.

#### `filter`

The *filter* is a test to determine if this branch/leaf should be applied.

If a *filter* condition is true, then processing will step to the right of
the tree looking for either:
- *function* in which case it becomes the tree leaf
- the next *filter* at a greater indentation level.

If a *filter* condition is false, then processing will:
- look down the tree for a *filter* or *default* at the same indentation level
- if there is none, step to the left and down the tree, looking for a *filter* or
  *default* at the higher level

*Filters* are written using RiskScape language expressions that return a boolean type
(i.e. true or false).

.. note::

  Any *filter* may have a *default*.  If a top level *default*
  filter is not defined, the return type of the function will be :ref:`nullable <type-nullable>`.
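
The search described above can be modelled as a small recursive walk. The sketch below is one reading of these rules in Python, not RiskScape's actual implementation: the `Branch` and `Filter` classes, the `find_leaf` and `classify` helpers and the `material`/`storeys` attributes are hypothetical. It assumes that when a parent filter is true but none of its children yields a leaf, the search carries on at the parent's level.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Optional

Scope = Dict[str, Any]


@dataclass
class Branch:
    """One indentation level: sibling filters plus an optional default."""
    filters: List["Filter"] = field(default_factory=list)
    default: Optional[Callable[[Scope], Any]] = None


@dataclass
class Filter:
    test: Callable[[Scope], bool]
    function: Optional[Callable[[Scope], Any]] = None  # leaf, if set
    children: Optional[Branch] = None                  # deeper level, if set


def find_leaf(branch: Branch, scope: Scope):
    for f in branch.filters:
        if not f.test(scope):
            continue                             # look down at the same level
        if f.function is not None:
            return f.function                    # this becomes the tree leaf
        if f.children is not None:
            leaf = find_leaf(f.children, scope)  # step right to child filters
            if leaf is not None:
                return leaf
    return branch.default                        # None if no default here


def classify(tree: Branch, scope: Scope) -> Any:
    leaf = find_leaf(tree, scope)
    return None if leaf is None else leaf(scope)


# A tree with a nested default but no top-level default
tree = Branch(filters=[
    Filter(
        test=lambda s: s["material"] == "timber",
        children=Branch(
            filters=[Filter(test=lambda s: s["storeys"] > 2,
                            function=lambda s: "tall-timber")],
            default=lambda s: "timber",
        ),
    ),
])

print(classify(tree, {"material": "timber", "storeys": 3}))
print(classify(tree, {"material": "timber", "storeys": 1}))
print(classify(tree, {"material": "brick", "storeys": 1}))
```

The last call finds no leaf at all because the tree has no top-level default, which is the situation in which the function's return type becomes nullable.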


#### `function`

The `function` (or `default`) allows you to apply an expression to the function's arguments to produce a result.

*Functions* are written using the :ref:`expressions`.
There are two forms of *function* expressions - a single-line form and a multi-line struct form.

The single-line form looks like:
```text
# This function returns text
function: input.name + ' ' + hazard.name
```

The multi-line form looks like:
```text
function:
  # This function returns a struct containing 'name' and 'cost'
  name: input.name + ' (' + input.region + ')'
  cost: '$' + str(input.cost_dollars) + '.00'
```

##### Calling other functions

Other functions can be called using the [RiskScape expression language](../expressions.md#functions), for example:

```text
function: some_damage_function(hazard) * asset.cost
```

.. _old_classifiers:

## Converting older classifier functions

Earlier versions of RiskScape (pre-2020) used a slightly different format for classifier functions.  Here is a short guide to the differences and upgrading your functions.

### Declaring types

Types used to be declared with an extra level of nesting like so:

```text
argument-types:
  arg1:
    id: some-other-type
  arg2:
    expression: text
  arg3: struct(id: integer, name: text)
```

This can be converted to the new format like so:

```text
argument-types:
  arg1: lookup('some-other-type')
  arg2: text
  arg3: struct(id: integer, name: text)
```
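
If you have many functions to upgrade, the mapping above is mechanical enough to script. The Python sketch below is a hypothetical helper, not a RiskScape tool. It assumes the `argument-types` section has already been parsed into a dictionary (the parsing step is not shown), and it treats any mapping whose only key is `id` or `expression` as an old-style declaration. A new-style struct with a single member named `id` would be misread, so review the output by hand.

```python
def convert_type(old):
    """Convert one old-style type declaration to the new format."""
    if isinstance(old, dict) and set(old) == {"id"}:
        return f"lookup('{old['id']}')"   # id: some-type -> lookup('some-type')
    if isinstance(old, dict) and set(old) == {"expression"}:
        return old["expression"]          # expression: text -> text
    return old                            # already in the new format


old_types = {
    "arg1": {"id": "some-other-type"},
    "arg2": {"expression": "text"},
    "arg3": "struct(id: integer, name: text)",
}
new_types = {name: convert_type(decl) for name, decl in old_types.items()}

for name, decl in new_types.items():
    print(f"  {name}: {decl}")
```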

Alternatively, the third argument can be declared in multi-line form:

```text
  arg3:
    id: integer
    name: text
```

This form is more consistent with how multi-line functions are written.

### Return value behaviour

In the previous version, the result could include any struct members set in the pre-, body or post- sections.
This has changed: only the body values or the post values are returned.
The return type is now optional, and RiskScape will build a return type for you if it is omitted.

Consider this example:

```text
pre:
  a: 1

default:
  b: 2

post:
  c: 3
```

In the previous version of classifier functions, `a`, `b` & `c` would be included in the returned value if `a`, `b` & `c` were present in the declared return type.  In the current version, only `c` will be returned.  If you wanted to return `a` and `b` as well, the `post` section would need to change to:


```text
post:
  a: a
  b: b
  c: 3
```
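
The difference between the two behaviours can be summarised with a small Python model. This is a sketch of the rules as described in this section rather than RiskScape code. The helper names `old_result` and `new_result` are hypothetical, and each section is represented simply as a dictionary of the values it sets.

```python
def old_result(pre, body, post, declared):
    """Old behaviour: merge every section, keep members in the declared type."""
    merged = {**pre, **body, **post}
    return {k: v for k, v in merged.items() if k in declared}


def new_result(body, post=None):
    """Current behaviour: the post values if post is defined, else the body."""
    return post if post is not None else body


pre, body, post = {"a": 1}, {"b": 2}, {"c": 3}

print(old_result(pre, body, post, declared={"a", "b", "c"}))
print(new_result(body, post))

# Equivalent of rewriting the post section as `a: a`, `b: b`, `c: 3`
upgraded_post = {"a": pre["a"], "b": body["b"], "c": 3}
print(new_result(body, upgraded_post))
```

In this model the upgraded `post` section reproduces the old result by copying the earlier values forward explicitly.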
