Commit aa837b46 authored by Alex Wiens's avatar Alex Wiens

Add further notes to motivation/rationale

parent 37df6dd1
# Rule format - Motivation and rationale
The following documentation explains different design dimensions considered during implementation.
You can find documentation about the actual implementation here:
* [Rule Format](rule-format.md)
* [Rule - available data structures and functions](data-structures-and-functions.md)
## Rule format
The overall motivation for the rule format is a rule specification format that is easy to write and read while still supporting the necessary rule complexity.
Sufficiently complex rules require a sequence of arithmetic operations.
......@@ -44,7 +51,10 @@ Python also supports strings, but these could be, if necessary, replaced with id
Strings, scalar integers and scalar floating point numbers are an integral part of Python.
For the representation of matrices and vectors, one could use structures of arrays or types from additional libraries such as NumPy or Pandas.
We initially decided to use the Pandas DataFrame type for matrices and vectors in the hope that it would make the expressions more elegant.
Pandas DataFrame types support attached metadata, but the mechanism was not flexible enough.
We therefore decided to use a new type that wraps the metadata together with NumPy matrices and vectors.
The downside of this approach is that every function that needs to be applicable to the data (e.g. mean, sum) must be added to the new data type.
However, these functions have to adjust the metadata according to the operation, so they would need to be wrapped anyway.
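The wrapping approach described above can be sketched as follows. This is a minimal illustration, not the actual implementation: the class name `MetricData`, its attributes, and the axis names `'rows'` and `'all'` are assumptions made for this example.

```python
import numpy as np


class MetricData:
    """Hypothetical wrapper pairing a 2-D NumPy array with column metadata."""

    def __init__(self, values, columns, timestamps=None):
        self.values = np.asarray(values, dtype=float)
        if self.values.ndim != 2:
            raise ValueError("values must be a 2-D array")
        if self.values.shape[1] != len(columns):
            raise ValueError("one column name per data column is required")
        self.columns = list(columns)
        # Timestamps live in the metadata, not in the data array itself.
        self.timestamps = timestamps

    def mean(self, axis="rows"):
        """Reduce over samples ('rows') or over all values ('all')."""
        if axis == "rows":
            # One value per column remains: column names stay valid,
            # but the per-sample timestamps no longer apply.
            reduced = self.values.mean(axis=0, keepdims=True)
            return MetricData(reduced, self.columns)
        if axis == "all":
            # Everything collapses into one value with a synthetic column name.
            reduced = np.array([[self.values.mean()]])
            return MetricData(reduced, ["all"])
        raise ValueError(f"unknown axis: {axis!r}")


data = MetricData([[1.0, 2.0], [3.0, 4.0]], ["node1", "node2"], timestamps=[0, 60])
per_node = data.mean("rows")
overall = data.mean("all")
```

The point of the sketch is that `mean` cannot simply forward to NumPy: it also has to decide which parts of the metadata survive the reduction.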
## Scopes
......@@ -71,6 +81,8 @@ To attach timestamps to the measured data, there are two methods:
* Save timestamps as additional column to the input.
* Save timestamps as metadata attached to the matrix object.
In our case, the timestamps are stored as a separate column in the metadata.
## Operations
Operations are applied to matrix and vector values to perform arithmetic transformations and reductions.
......@@ -98,6 +110,7 @@ Therefore, there are two methods to achieve operator overloading:
* Assign functions to every new object.
In this case, every relevant function needs to be wrapped, to manipulate the resulting object.
In our case, we use a new class and additionally wrap the functions to adjust metadata.
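A minimal sketch of the class-based variant is shown below. The class name `Wrapped`, the `meta` dictionary, and the choice to copy the metadata unchanged into the result are assumptions for illustration; a real implementation would adjust the metadata per operation.

```python
import operator

import numpy as np


class Wrapped:
    """Hypothetical value type: a NumPy array plus a metadata dict."""

    def __init__(self, values, meta=None):
        self.values = np.asarray(values)
        self.meta = dict(meta or {})

    def _binary(self, other, op):
        # Unwrap the right-hand operand if it is also a wrapped value.
        rhs = other.values if isinstance(other, Wrapped) else other
        # Simplifying assumption: the metadata is carried over unchanged.
        return Wrapped(op(self.values, rhs), self.meta)

    def __mul__(self, other):
        return self._binary(other, operator.mul)

    def __truediv__(self, other):
        return self._binary(other, operator.truediv)

    def __lt__(self, other):
        return self._binary(other, operator.lt)


x = Wrapped([1.0, 2.0, 3.0], {"columns": ["cpu_load"]})
y = x * 2.0
below = y < 5.0
```

Because the special methods are defined on the class, Python dispatches `*`, `/` and `<` to them automatically, and each result is again a wrapped object that keeps its metadata.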
## Units
......@@ -113,6 +126,8 @@ There are two possible implementation methods:
* Values are stored in types that understand units and perform unit checks implicitly.
* Units are stored for each column in the metadata and unit checks are performed separately for every operation.
We are using `Pint` to wrap the NumPy values with the necessary unit information.
## Open ideas
......@@ -180,6 +195,51 @@ y = x * 2.0
Here the rule evaluation logic first evaluates the condition expression "c" and afterwards evaluates the expressions in either the "if" branch or the "else" branch.
### Rule format files
As an alternative to a JSON-based file, one could also create a file format that mixes plain text and JSON (or YAML).
Example:
```
---
{
    "name": "Low CPU load",
    "tag": "lowload",
    "parameters": ["lowcpuload_threshold_factor", "job_min_duration_seconds", "sampling_interval_seconds"],
    "metrics": ["cpu_load"],
    "requirements": [
        "job.exclusive == 1",
        "job.duration > job_min_duration_seconds",
        "required_metrics_min_samples > job_min_duration_seconds / sampling_interval_seconds"
    ],
    "output": "lowload",
    "output_scalar": "load_perc",
    "template": "Job ({{ job.jobId }})\nThis job was detected as lowload because the mean cpu load {{ load_mean }} falls below the threshold {{ load_threshold }}."
}
---
# rule terms
load_mean = cpu_load[cpu_load_pre_cutoff_samples:].mean('all')
load_threshold = job.numHwthreads * lowcpuload_threshold_factor
lowload_nodes = load_mean < load_threshold
lowload = lowload_nodes.any('all')
load_perc = 1.0 - (load_mean / load_threshold)
# next rule ...
---
[..]
---
[..]
```
Here, the rule attributes are defined in a JSON object in a separate section enclosed by two `---` lines, which makes the section easy to parse.
The rule terms are defined in the following section.
The left side of a term consists of a single variable name, and the rest of the line constitutes the expression part of the rule term; the two parts are optionally separated by an equals sign `=`.
The separation of the two parts would be easy to parse.
This format would also make it easier to use comments and blank lines.
Each pair of sections is followed by further section pairs for other rules.
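To illustrate how little parsing logic such a format would need, here is a minimal sketch of a reader. The function name `parse_rules` is hypothetical, and the sketch assumes that every term uses the `=` separator and that comment lines start with `#`.

```python
import json


def parse_rules(text):
    """Split the mixed format into (attributes, terms) pairs."""
    sections, current = [], []
    for line in text.splitlines():
        if line.strip() == "---":
            sections.append(current)
            current = []
        else:
            current.append(line)
    sections.append(current)

    # Anything before the first `---` line is ignored.
    sections = sections[1:]
    if len(sections) % 2 != 0:
        raise ValueError("expected attribute/term section pairs")

    rules = []
    for attr_lines, term_lines in zip(sections[0::2], sections[1::2]):
        attributes = json.loads("\n".join(attr_lines))
        terms = []
        for line in term_lines:
            line = line.strip()
            if not line or line.startswith("#"):
                continue  # blank lines and comments carry no term
            # Split at the first '=' only, so '==' in the expression survives.
            name, sep, expr = line.partition("=")
            if not sep:
                raise ValueError(f"missing '=' in rule term: {line!r}")
            terms.append((name.strip(), expr.strip()))
        rules.append((attributes, terms))
    return rules


example = """---
{"name": "Low CPU load", "tag": "lowload"}
---
# rule terms
load_mean = cpu_load.mean('all')

lowload = load_mean < load_threshold
"""
rules = parse_rules(example)
```

The expressions are returned as plain strings; evaluating them is left to the rule evaluation logic.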
### Rule requirements
......