.. _cpython-impl:

# CPython

.. tip::
    *CPython* is what most people consider regular Python.
    If you have used Python before on your computer, then you were most likely using CPython.
    We refer to it as CPython in the documentation for clarity.

Python is a programming language. The Python language has several different *implementations*, or variants.
Two Python implementations are supported by RiskScape:
- :ref:`jython-impl`: the *default* Python support in RiskScape.
- [CPython](https://en.wikipedia.org/wiki/CPython): the Python that you typically use outside of RiskScape.

As soon as you try to `import` Python modules into your RiskScape function,
you will notice that Jython behaves quite differently to regular Python.
To make life easier, RiskScape lets you write RiskScape functions that use the exact same
Python installation that you use normally.

.. note::
    To use CPython functions in RiskScape, you must have Python already installed locally on your system.

## Configuring CPython support in RiskScape

Normally when you run Python code *outside* of RiskScape, you will be using
a `python` binary (Linux or Mac) or a `python.exe` executable (Windows) that is installed somewhere on your system.

In order to use CPython in RiskScape, you have to first tell RiskScape *where* this Python binary is located.
We do this by specifying the full path to the Python binary in RiskScape's `settings.ini` file.

.. note::
    If you're using Linux, and you don't specify a path to the Python binary, 
    RiskScape will check `/usr/bin/python3`, and use that if available. 

Refer to the :ref:`settings_ini` instructions for where `settings.ini` is located on your system.
You may need to create the `settings.ini` file if it does not already exist.
On Windows, you can do this by entering the following commands into your terminal:

```none
mkdir %USERPROFILE%\RiskScape
notepad %USERPROFILE%\RiskScape\settings.ini
```

Next, you will need to add a `[cpython]` configuration section and specify the `python3-bin` path to the Python binary.
On Windows, a typical `settings.ini` might look something like this:

```ini
[cpython]
python3-bin = C:\Users\Ronnie\AppData\Local\Python\Python 3.8\python.exe
```

The actual `python3-bin` path will vary depending on where Python was installed on your system.
If you are not sure where Python is installed on your system, try running the following command:

- `where python` (Windows)
- `which python` (Linux or Mac)

.. note::
    If you use several different Python environments (e.g. with ``venv``), then each virtual environment will have its own Python executable.
    You will need to *activate* the Python environment you want to use first, *before* running ``where python``.

For example, if you use the [Anaconda Python distribution](https://en.wikipedia.org/wiki/Anaconda_(Python_distribution%29),
the default Python location might look like `C:\Users\Ronnie\Anaconda3\python.exe`,
whereas the path for a Python environment called `Geo` would look more like `C:\Users\Ronnie\Anaconda3\envs\Geo\python.exe`.
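
If you would rather ask Python itself, the interpreter's own path is available as `sys.executable`.
The following is a minimal sketch, not part of RiskScape: the `cpython_settings_section` helper name is made up for illustration.
Run it with the Python (or activated environment) that you want RiskScape to use, and it prints a `[cpython]` section that you can paste into `settings.ini`.

```python
import sys


def cpython_settings_section(python_bin=None):
    """Build a [cpython] section pointing at the given Python binary.

    Hypothetical helper for illustration only; it defaults to the
    interpreter that is running this script.
    """
    if python_bin is None:
        python_bin = sys.executable
    return "[cpython]\npython3-bin = %s\n" % python_bin


if __name__ == "__main__":
    print(cpython_settings_section())
```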

On Linux or Mac, the `settings.ini` file will look more like the following (note that the path may vary):

```ini
[cpython]
python3-bin = /usr/bin/python3
```

You can test your installation by running `riskscape -V`.
If any warnings or errors occur, then you may have specified the incorrect `python3-bin` file path.
Try following the Troubleshooting section below.

## Writing CPython functions for RiskScape

There is no special magic required to make your functions work with RiskScape, but there are a
few important points to know.

* You must have a function in your script called `function` - this is the one that
  RiskScape will call.

* To `import` a Python package in your RiskScape function, you must first have that CPython package
  installed locally. Refer to [python.org](https://packaging.python.org/en/latest/tutorials/installing-packages)
  for more details on how to install a Python package.

* Data is passed to CPython in simple JSON-like data types; it will be a mixture
  of numbers, strings, lists and dictionaries.  Trying to use more complicated data types,
  like coverages or lookup tables, will fail.

* You can use `print` for debugging your scripts within RiskScape - the output is sent to RiskScape
  and will be displayed alongside any other RiskScape output.

.. note::
    Your function will get called for *every* element-at-risk, so adding ``print`` statements can
    result in *a lot* of output for ``riskscape model run`` commands.
    We recommend only using ``print`` with ``riskscape expression evaluate``, or with a *small*
    sub-set of model data.

* You can test your functions with and without RiskScape.  You could use a separate Python script that
  imports and tests your CPython RiskScape function script, or you can test it in RiskScape
  using the `riskscape expression evaluate` command.

* For more efficient model processing with CPython, it can be worth defining the type that your function uses.
  A lot of the examples will simply use `argument-types = [ building: anything, hazard: nullable(floating) ]`.
  However, this can be inefficient because the `anything` type will pass *all* the exposure-layer data to your function,
  regardless of whether the function uses it or not.

* Not all data types are supported with CPython. For example, `decimal` and `date` types are currently not supported.
  This will result in an error if your exposure-layer data contains these types.
  You can avoid this problem by defining the exact :ref:`type <types>` that your function expects, instead of `building: anything`.
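
To make the points above concrete, here is a minimal sketch of a function that matches the
`argument-types = [ building: anything, hazard: nullable(floating) ]` example.
It assumes that the `building` data arrives as a dictionary and that a null `hazard` arrives as `None`;
the `construction` attribute and the damage numbers are invented purely for illustration.
Because it is ordinary Python, the same file can be exercised from a plain Python script before it is used in a model.

```python
def function(building, hazard):
    # Assumption: a null hazard value (no hazard at this location) arrives as None.
    if hazard is None:
        return 0.0

    # Assumption: 'building' is a dictionary of exposure-layer attributes.
    # 'construction' is a hypothetical attribute name, not a RiskScape one.
    construction = building.get("construction", "unknown")

    # Made-up damage ratios per unit of hazard intensity, for illustration only.
    ratio_per_unit = 0.5 if construction == "timber" else 0.25

    # Cap the result so the damage ratio never exceeds 1.0
    return min(1.0, hazard * ratio_per_unit)


if __name__ == "__main__":
    # Quick checks that run without RiskScape
    print(function({"construction": "timber"}, 1.0))
    print(function({"construction": "concrete"}, None))
```

A function like this would most likely be declared with `return-type = floating` in `project.ini`.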

### A simple example

To start with, we'll create a simple hello world Python function.  Create a `helloworld.py` file in
your favourite text editor and add the following:

```python
def function(name):
  return "Hello, %s" % name
```

In the same directory, create a `project.ini` and add the following:

```ini
[function helloworld]
framework = cpython
location = helloworld.py
return-type = text
argument-types = [text]
```

Now, in the same directory, open a terminal or command prompt and test your function
with the following command:

```none
riskscape expression eval "helloworld('Ronnie')"
```

If it has worked successfully, you should see:

```none
Hello, Ronnie
```

Fortunately, CPython integration is not limited to just "Hello, world" examples.

### Using Geometry in your CPython functions

RiskScape supports passing geometry to your CPython function,
but there are some important things you will need to know if you want to manipulate geometry in your Python code.

There are many Python packages available for working with geometry.
RiskScape can't predict what specific geometry package you will want to use,
so it passes the geometry to your function in the well-known bytes format (known as WKB).

In your function, you can then construct a geometry object from the WKB.
How you do this depends on what Python package you are using for geometry.

If your function also returns geometry, then you will need to return it in WKB format,
rather than returning the geometry object itself.

Here is a simple example that uses the `shapely` Python package to work with geometry.
The following function both takes geometry as an argument and also returns it.

``` ini
# project.ini
[function get_centroid]
location = centroid.py
framework = cpython
description = Return the centroid (point) of a simple feature
return-type = geometry
argument-types = [geometry]
```

``` python
# centroid.py
from shapely import wkb

# returns the centroid of the feature
def function(geom):
  # note the geometry argument gets passed as a tuple
  bytes, srid = geom
  # use shapely to turn the WKB into geometry
  shapely_geom = wkb.loads(bytes)

  # return the input geometry's centroid (as WKB)
  return (shapely_geom.centroid.wkb, srid)
```

Notice that the geometry argument is passed as a two-item tuple, containing the WKB `bytes` as well as an `srid`. This is because RiskScape
stores an opaque identifier, called a spatial reference identifier (SRID), to identify the coordinate reference system
the geometry belongs to.

.. tip::
    You only need to worry about the SRID if your function's return-type contains geometry, i.e.
    you want to modify the geometry yourself in Python. Returning the same SRID tells RiskScape
    that the coordinate reference system has not changed.
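
To try a geometry function outside of RiskScape, you need a (WKB, SRID) tuple to pass in.
The sketch below uses only Python's standard `struct` module to build and read the WKB for a single 2D point,
so the shape of the data is visible without installing a geometry package.
It is an illustration, not how RiskScape produces its WKB: the helper names are made up,
only little-endian 2D points are handled, and the SRID value `4326` is a placeholder (RiskScape's SRID is an opaque identifier).
For real work, a package such as `shapely` is a better choice.

```python
import struct

# WKB layout for a 2D point: 1 byte for byte order (1 = little-endian),
# a 4-byte unsigned geometry type (1 = Point), then x and y as 8-byte doubles.
POINT_FORMAT = "<BIdd"


def make_point_wkb(x, y):
    # Hypothetical helper: encode a 2D point as little-endian WKB
    return struct.pack(POINT_FORMAT, 1, 1, x, y)


def read_point_wkb(data):
    # Hypothetical helper: decode the WKB produced by make_point_wkb
    byte_order, geom_type, x, y = struct.unpack(POINT_FORMAT, data)
    if byte_order != 1 or geom_type != 1:
        raise ValueError("this sketch only handles little-endian 2D points")
    return x, y


def function(geom):
    # the geometry argument is a (WKB, SRID) tuple
    wkb_bytes, srid = geom
    x, y = read_point_wkb(wkb_bytes)
    # move the point one unit along the x axis and return it as WKB,
    # passing the same SRID back because the coordinate system is unchanged
    return (make_point_wkb(x + 1.0, y), srid)


if __name__ == "__main__":
    # 4326 is a placeholder SRID used only for this standalone check
    moved, srid = function((make_point_wkb(2.0, 3.0), 4326))
    print(read_point_wkb(moved), srid)
```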

### Inline functions

If you only need to write a small amount of Python code, you might prefer to do it *inline*. 
Python code can be added directly to your `project.ini` by using the `source` parameter. 

```ini
[function echo]
framework = cpython
argument-types = [text]
return-type = text
source = '''
def function(message):
    return message + ' back at you'
'''
```

#### Importing other python files

You may want to keep related code together in the same Python file, but then reuse the
code as separate functions in your RiskScape pipeline. 

When using a `.py` file as the location for your RiskScape function, you can only define a
_single_ function per `.py` file.
However, it is easy to use the same `.py` file to define _multiple_ RiskScape functions by importing the Python code as inline functions.

For example, say we had the following `functions.py` file that contains the following Python code:

```python
def square_root(num):
    return num ** (1/2)

def square(num):
    return num ** 2
```

Then you could import the python code and use it in a new inline function like this:

```ini
[function pythagoras]
description = Returns the hypotenuse of a right-angled triangle
framework = cpython
argument-types = [floating, floating]
return-type = floating
source = '''
from functions import square, square_root
def function(a, b):
    return square_root(square(a) + square(b))
'''
```
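
Because `functions.py` is plain Python, the logic behind `pythagoras` can be checked without RiskScape.
The sketch below repeats the two helpers inline so that it runs on its own;
in a real project you would keep the `from functions import ...` line instead.

```python
# Same helpers as in functions.py, repeated here so the sketch is self-contained
def square_root(num):
    return num ** (1/2)


def square(num):
    return num ** 2


# Same body as the inline 'pythagoras' function
def function(a, b):
    return square_root(square(a) + square(b))


if __name__ == "__main__":
    # Try it with a 3-4-5 right-angled triangle
    print(function(3.0, 4.0))
```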

.. note::
    When importing Python code like this, the path of the ``.py`` file should
    always be relative to your ``project.ini`` file.
    For example, if the ``functions.py`` file was located in a ``util`` directory,
    then you would use ``from util.functions import ...``.

### Importing functions directly

The Python code that will get turned into a RiskScape function is always called ``function``.
Therefore, you can import an individual Python function directly from a `.py` file simply by renaming it to `function` in the import statement.
For example:

```ini
[function sqrt]
framework = cpython
source = from functions import square_root as function
argument-types = [floating]
return-type = floating

[function square]
framework = cpython
source = from functions import square as function
argument-types = [floating]
return-type = floating
```

.. note::
    These examples were chosen for simplicity, but they already exist as *built-in* RiskScape
    functions, called ``pow()`` and ``square_root()``.

## CPython vs Jython

The CPython and Jython plugins are both enabled in your RiskScape installation by default.
If you have set up CPython correctly in your `settings.ini`, any Python functions
in your project will be executed using CPython by default.
Otherwise, Jython will be used by default.

In your INI file, you can explicitly define what Python implementation RiskScape should use
via the `framework` parameter, like we did in our `helloworld` example earlier.
This makes it possible to use a mix of both Jython (i.e. `framework = jython`) and
CPython (i.e. `framework = cpython`) functions in your project.
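
If you are ever unsure which implementation is actually running a function, Python can report it.
The sketch below uses the standard `platform` module; `describe_interpreter` is a made-up helper name,
and calling it from a `print` while debugging is just one way you might use it.
The code avoids Python 3-only syntax so that it should also run under Jython's Python 2.7.

```python
import platform


def describe_interpreter():
    # python_implementation() returns a name such as 'CPython' or 'Jython'
    return "%s %s" % (platform.python_implementation(), platform.python_version())


def function(name):
    # Debug output only: print is called every time the function runs
    print(describe_interpreter())
    return "Hello, %s" % name
```

As with any `print` debugging, this is best tried with `riskscape expression evaluate` rather than a full model run.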

Whether you use CPython or Jython will depend on your circumstances.
Using CPython requires extra setup, as you need to have a pre-existing
working Python environment available, but it gives you access to a
complete and familiar Python environment.

Jython requires no setup, but is limited to Python 2.7 and is
geared towards scripting a Java environment, rather than being a true
Python environment.  For simple functions this isn't a problem.

.. tip::
    If you have never used Python before, and only want to write a simple risk function,
    then we recommend using Jython. If you want to use other Python modules, such as ``numpy``
    then you should use CPython.

## Troubleshooting

First, check that the `python3-bin` path in your `settings.ini` points to a valid Python 3 executable
by running it with the `-V` option.

```none
C:\Your\Path\To\Python\python.exe -V
```

RiskScape only supports Python 3. If you see a `2.x.x` version displayed, then you will need
to install Python 3.
You may also want to check your version of Python 3 is [still supported](https://en.wikipedia.org/wiki/History_of_Python#Table_of_versions).
Older, unsupported versions of Python 3 may still fail to work with RiskScape due to compatibility issues.
RiskScape is known to work with Python v3.8 and v3.9.

Next, check that you can use your `python.exe` path to execute a simple Python statement, e.g.

```none
C:\Your\Path\To\Python\python.exe -c "print(1 + 1)"
2
```

If you see an error instead of `2`, it may mean you are using the wrong `python.exe` path, or you need to
install a more recent version of Python on your system.
For example, a bad Python install might throw out a cryptic error like:

```none
C:\Your\Path\To\Python\python.exe -c "print(1 + 1)"
ImportError: No module named site
```
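
The manual checks above can also be scripted.
This is a minimal sketch using the standard `subprocess` module; `check_python` is a hypothetical helper, not a RiskScape command,
and the path you pass in should be whatever you put in `python3-bin`.

```python
import subprocess
import sys


def run_snippet(python_bin, code):
    # Run a one-line Python snippet with the given binary and capture its output
    return subprocess.run(
        [python_bin, "-c", code],
        capture_output=True, text=True, timeout=30,
    )


def check_python(python_bin):
    """Return (ok, details) after repeating the manual checks on python_bin."""
    try:
        version = run_snippet(python_bin, "import sys; print(sys.version_info[0])")
        result = run_snippet(python_bin, "print(1 + 1)")
    except (OSError, subprocess.TimeoutExpired) as err:
        return False, "could not run %s: %s" % (python_bin, err)

    # The major version must be 3
    if version.returncode != 0 or version.stdout.strip() != "3":
        return False, "not a working Python 3: %s" % (version.stderr or version.stdout).strip()

    # A simple statement must execute and print 2
    if result.returncode != 0 or result.stdout.strip() != "2":
        return False, "could not execute Python code: %s" % result.stderr.strip()

    return True, "Python 3 found and working"


if __name__ == "__main__":
    # Pass your python3-bin path as the first argument;
    # without one, the interpreter running this script is checked instead.
    target = sys.argv[1] if len(sys.argv) > 1 else sys.executable
    print(check_python(target))
```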

Finally, if that all works, check that the `settings.ini` keywords are all lower-case, and that RiskScape is loading the correct `settings.ini` file.
The following command will display debug output containing the `settings.ini` location and the CPython path.

```none
riskscape --log-level .engine.cli.CliBootstrap=info,.cpython=info --version
```

You could also double-check the INI file definition for your function. If it contains the ``framework`` parameter,
then check it's set to the Python implementation that you want to use, i.e. ``cpython`` or ``jython``.

