Proof-Of-Concept Implementation
WARNING
The docker-compose example file should not be used in a production environment. The example file does not have an explicit handling of docker volumes for PostgreSQL and RabbitMQ.
Only use the example docker-compose file as a reference to write your own file. Otherwise you risk data loss if the volumes are not handled correctly for your setup.
Basic Concepts
JSON-Schema
The API of this implementation allows creating schema definitions for any type. These schema definitions are written in JSON-Schema. This implementation uses the most recent JSON-Schema draft (2020-12).
Whenever a new schema definition is given to the API, it is validated and then stored in the database. Both the validation of a schema definition and the validation of arbitrary data against such a schema definition are done in strict mode, which ensures that only valid schema definitions and data without unnecessary additional properties are stored.
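To illustrate what rejecting unnecessary additional properties means in practice, here is a minimal Python sketch. It is not the validator used by this implementation: it hand-rolls a tiny subset of JSON-Schema (object schemas with `properties` and `required`, plus the `number` and `string` types), and the name `validate_strict` is hypothetical.

```python
# Hypothetical helper: covers only a tiny subset of JSON-Schema for illustration.
TYPE_CHECKS = {
    # bool is a subclass of int in Python, so it is excluded explicitly.
    "number": lambda v: isinstance(v, (int, float)) and not isinstance(v, bool),
    "string": lambda v: isinstance(v, str),
}


def validate_strict(schema: dict, data: dict) -> list:
    """Return a list of error messages; an empty list means the data is valid."""
    errors = []
    properties = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in data:
            errors.append(f"missing required property: {name}")
    for name, value in data.items():
        if name not in properties:
            # Strict behaviour: undeclared properties are rejected, not ignored.
            errors.append(f"unexpected additional property: {name}")
            continue
        expected = properties[name].get("type")
        check = TYPE_CHECKS.get(expected)
        if check is not None and not check(value):
            errors.append(f"property {name} is not of type {expected}")
    return errors


vector_schema = {
    "type": "object",
    "properties": {
        "x": {"type": "number"},
        "y": {"type": "number"},
        "z": {"type": "number"},
    },
    "required": ["x", "y", "z"],
}

print(validate_strict(vector_schema, {"x": 1, "y": 2, "z": 3}))
print(validate_strict(vector_schema, {"x": 1, "y": 2, "z": 3, "w": 4}))
```

The second call reports the undeclared `w` property, which a lenient validator would silently accept.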
A Note on Licenses
This system supports data usage licenses as an integral and required part of all data. In the current implementation, all license fields in our system are only saved as links to a license hosted on an external system.
A list of JSON-LD licenses that can be used is hosted on GitHub.
Services
This diagram shows an overview of the system architecture. A full deployment consists of exactly one ingress service, a RabbitMQ instance, at least one ingestion service instance, a PostgreSQL instance and at least one API service. The schema validation service is an integrated part of our system and does not require an extra deployment. In addition, an MQTT broker is required, to which the sensor values are published and from which they are read.
Entities
This diagram shows a full overview of all entities and their relationship. A description of the entities and their uses is listed under this diagram.
Entity | Description |
---|---|
TypeDefinition | A TypeDefinition defines how data is structured and provides the semantic context information of associated data. It consists of a unique, human-readable name, a JSON-Schema describing the format of all data of this type, and the context information (a URL string or an object that is placed in the @context section whenever data of this type is serialized in JSON-LD). |
Component | Components represent real-world things. For example, a Component can be a production site, a production hall, a machine or a single sensor. Each Component can produce data which is ingested into the database. The relationship between different Components at specific times is modeled by the ComponentRelation entity. |
ComponentRelation | The relationship between different Components is modeled via the ComponentRelation entity. A Component can have subcomponents or be a subcomponent of another Component. For example, a sensor that is built into a machine is a subcomponent of this machine, and the machine is a subcomponent of the factory hall. The factory hall can also have multiple machines, which are all subcomponents of this factory hall. This creates a tree structure between Components, in which the ComponentRelation describes the edges between the Component nodes. Furthermore, Components can be moved around in the tree: a sensor can be built into one machine and later moved to another one in the real world. To model this, the ComponentRelation includes two dates. The from date describes when the relation was created and thus became active; the to date describes when the relation stopped being active because the Component was moved and the relation was replaced by a new one. If the to date is not set (null), the relation is currently active. There can be multiple root components and thus multiple trees within the system at the same time. |
ComponentInformation | The ComponentInformation stores the actual metadata of a Component. Since this metadata can change over time, the ComponentInformation belonging to a Component can be versioned via the previousVersion and nextVersion relations. The metadata itself can be any JSON. However, each ComponentInformation requires a metadataType, which is a relation to a TypeDefinition specifying the context information and JSON-Schema of the metadata. This effectively makes the metadata strongly typed, since the metadata JSON has to fulfill the requirements given by the associated TypeDefinition. Lastly, the ComponentInformation specifies which MQTT topic is used for its associated sensor data. This is used to link sensor data to exactly one ComponentInformation whenever a new value is received. To ensure that a topic is always associated with at most one ComponentInformation, the reference implementation checks whether the topic is already in use by another ComponentInformation when a new one is created. If the topic is associated with the ComponentInformation for which a new version is being created, the new version is created and is from this point on associated with all received measurement values. If the topic is in use by another, unrelated ComponentInformation, the versioning process is aborted with an error. |
Measurement | A Measurement entity is created for each sensor value that is received via MQTT. Whenever a measurement value is received through MQTT, the associated ComponentInformation is retrieved from the database by the MQTT topic that the value was received on. Each received sensor value also needs to include a TypeDefinition which specifies the format and semantic annotation for the value. Furthermore, the sensor itself can also generate metadata (JSON) for sensor values. This (optional) metadata is also required to adhere to the requirements of a TypeDefinition. The schemas given by the TypeDefinitions for the metadata and the sensor value are used to verify the format of the value and the metadata before this information is stored in the database. |
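The from and to dates of a ComponentRelation make it possible to ask where a Component sat in the tree at any point in time. The following Python sketch shows one way such a lookup could work. The class layout, the field names and the choice of a half-open interval (active from the from date up to, but not including, the to date) are assumptions for illustration, not the implementation's data model.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class ComponentRelation:
    # Field names are illustrative; "from" is a Python keyword, hence the suffix.
    parent_id: str
    child_id: str
    from_date: datetime
    to_date: Optional[datetime] = None  # None means the relation is still active


def parent_at(relations, child_id, when):
    """Return the parent id of child_id at time `when`, or None for a root."""
    for relation in relations:
        if relation.child_id != child_id:
            continue
        started = relation.from_date <= when
        not_ended = relation.to_date is None or when < relation.to_date
        if started and not_ended:
            return relation.parent_id
    return None


moved = datetime(2022, 3, 1, tzinfo=timezone.utc)
relations = [
    ComponentRelation("machine-a", "sensor-1", datetime(2022, 1, 1, tzinfo=timezone.utc), moved),
    ComponentRelation("machine-b", "sensor-1", moved),
]

print(parent_at(relations, "sensor-1", datetime(2022, 2, 1, tzinfo=timezone.utc)))
print(parent_at(relations, "sensor-1", datetime(2022, 4, 1, tzinfo=timezone.utc)))
```

In this example the sensor belongs to machine-a in February and to machine-b in April, because the first relation was closed when the sensor was moved.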
Setup
Requirements
The following software is required for this implementation:
- PostgreSQL 14 Database
- RabbitMQ
- Node.js 16 (Current LTS Version as of writing)
- MQTT Broker
Setup
Clone the git repository and install the dependencies by running npm install. An empty database also needs to be created in PostgreSQL. Afterwards, the tool can be started by running npm start.
Configuration is done by setting environment variables. For example, if the base URL of the implementation is https://example.com and the name of the PostgreSQL database is example (accessible without authentication in this case), the full command for running the implementation is DB_NAME=example API_BASE_URL=https://example.com MODE=development npm start.
The following section lists all supported environment variables together with their default values and meaning.
Configuration
Configuration is done via environment variables. If a variable is not set, it falls back to its default value.
Variable | Description | Default Value | Required |
---|---|---|---|
MODE | Specifies the services that the instance will start. Can be development, ingress, ingestion or api. development mode starts all services, at the cost of performance. Only one ingress service may be active at any time for each MQTT broker; otherwise duplicate data might be put into the database. ingestion mode workers can be scaled up as needed. All modes require access to the database, MQTT and RabbitMQ. | | * |
API_PORT | The port the API will listen on. | 8080 | |
API_BASE_URL | The base URL that the API will be accessible on. Example: https://example.com | | * |
MQTT_HOST | The host the MQTT client will connect to. | localhost | |
MQTT_PORT | The port the MQTT client will connect to. | 1883 | |
MQTT_USERNAME | The username the MQTT client will use. | | |
MQTT_PASSWORD | The password the MQTT client will use. | | |
MQTT_PROTOCOL | The protocol the MQTT client will use for the connection. Can be one of wss, ws, mqtt, mqtts, tcp, ssl, wx, wxs. | mqtt | |
DB_NAME | The name of the PostgreSQL database to use. | | * |
DB_USERNAME | The PostgreSQL username to use for authentication. | | |
DB_PASSWORD | The password for the PostgreSQL user. | | |
DB_HOST | The hostname of the PostgreSQL database. | localhost | |
DB_PORT | The port to connect to the PostgreSQL database. | 5432 | |
DB_LOGGING | Setting this value to true forces the DB driver to log every DB query for debugging purposes. Might negatively impact performance. | false | |
RABBITMQ_HOST | The hostname the RabbitMQ client will use. | localhost | |
RABBITMQ_PORT | The port the RabbitMQ client will use. | 5672 | |
RABBITMQ_USERNAME | The username the RabbitMQ client will use. | | |
RABBITMQ_PASSWORD | The password the RabbitMQ client will use. | | |
RABBITMQ_QUEUE | The name of the queue the RabbitMQ client will use. This will be created if it does not exist. | measurements | |
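The table can be read as a small resolution rule: take the value from the environment, fall back to the default, and fail early when a variable marked with * is missing. The Python sketch below illustrates that rule. It is not the implementation's own configuration code (which runs on Node.js); the function name `load_config` and the error messages are hypothetical, while the variable names and defaults are copied from the table.

```python
import os

# Defaults copied from the table above; variables without a default are omitted.
DEFAULTS = {
    "API_PORT": "8080",
    "MQTT_HOST": "localhost",
    "MQTT_PORT": "1883",
    "MQTT_PROTOCOL": "mqtt",
    "DB_HOST": "localhost",
    "DB_PORT": "5432",
    "DB_LOGGING": "false",
    "RABBITMQ_HOST": "localhost",
    "RABBITMQ_PORT": "5672",
    "RABBITMQ_QUEUE": "measurements",
}
REQUIRED = ("MODE", "API_BASE_URL", "DB_NAME")  # marked with * in the table
OPTIONAL = (
    "MQTT_USERNAME", "MQTT_PASSWORD",
    "DB_USERNAME", "DB_PASSWORD",
    "RABBITMQ_USERNAME", "RABBITMQ_PASSWORD",
)
MODES = ("development", "ingress", "ingestion", "api")


def load_config(env=None):
    """Resolve the configuration from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise ValueError("missing required variables: " + ", ".join(missing))
    if env["MODE"] not in MODES:
        raise ValueError("unsupported MODE: " + env["MODE"])
    config = dict(DEFAULTS)
    for name in (*DEFAULTS, *REQUIRED, *OPTIONAL):
        if name in env:
            config[name] = env[name]
    return config


config = load_config({
    "MODE": "development",
    "API_BASE_URL": "https://example.com",
    "DB_NAME": "example",
})
print(config["API_PORT"], config["DB_HOST"], config["RABBITMQ_QUEUE"])
```

Passing the environment as a plain mapping keeps the sketch easy to test; calling `load_config()` without arguments reads the process environment instead.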
API Documentation
Some of the endpoints support versioning by setting an Accept-Datetime header according to the HTTP Memento specification. All endpoints that support this header mention this in their documentation. If an endpoint supports this header and the header is not set in the request, the current date and time is used.
In case an endpoint does not support this header, the header is not required and will be ignored.
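As a rough illustration of a versioned request, the following Python sketch builds (but does not send) a GET request with an Accept-Datetime header. The timestamp is written in the same ISO 8601 form as the curl examples in this document; the helper name `component_request` is hypothetical, and the base URL assumes a local instance on the default port.

```python
from datetime import datetime, timezone
from urllib.request import Request


def component_request(base_url, component_id, at=None):
    """Build (but do not send) a GET request for a component."""
    request = Request(f"{base_url}/component/{component_id}", method="GET")
    if at is not None:
        # Same ISO 8601 form as the curl examples in this document.
        request.add_header("Accept-Datetime", at.isoformat())
    return request


at = datetime(2022, 2, 22, 16, 47, 33, tzinfo=timezone.utc)
request = component_request("http://localhost:8080", "root", at)
print(request.full_url)
print(request.header_items())
```

Omitting the `at` argument leaves the header out, in which case the endpoint falls back to the current date and time. The request can be sent with `urllib.request.urlopen(request)` once an instance is running. Note that urllib normalizes the capitalization of header names, which is harmless because HTTP header names are case-insensitive.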
The repository also contains the file API.paw, which can be opened with the HTTP client Paw for Mac. All of the examples shown below are also included in this Paw file.
Type Definition Controller
/type
POST Creates a new type definition. The name given in the JSON payload must be unique.
Body:
{
"name": <someName>, // The unique name of the type
"license": <licenseURL>, // A URL pointing towards the data usage license for this type
"context": <context>, // The context data for the type, can be a URL string or JSON payload
"schema": <schema> // The JSON schema used to verify all data associated with this type
}
Example:
curl -X "POST" "http://localhost:8080/type" \
-H 'Content-Type: application/json' \
-d $'{
"name": "Vector",
"context": "https://schema.org/Vector",
"license": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
"schema": {
"type": "object",
"properties": {
"x": {
"type": "number"
},
"y": {
"type": "number"
},
"z": {
"type": "number"
}
},
"required": [
"x",
"y",
"z"
]
}
}'
Or another example:
curl -X "POST" "http://localhost:8080/type" \
-H 'Content-Type: application/json' \
-d $'{
"name": "Person",
"context": "https://schema.org/Person",
"license": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
"schema": {
"type": "object",
"properties": {
"familyName": {
"type": "string"
},
"givenName": {
"type": "string"
},
"age": {
"type": "number"
}
},
"required": [
"familyName",
"givenName"
]
}
}'
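The same request can be assembled programmatically. The Python sketch below builds the POST request for the Vector type definition without sending it; the helper name `type_definition_request` is hypothetical, and the base URL assumes a local instance on the default port.

```python
import json
from urllib.request import Request


def type_definition_request(base_url, name, context, license_url, schema):
    """Build (but do not send) the POST request that creates a type definition."""
    body = {
        "name": name,
        "context": context,
        "license": license_url,
        "schema": schema,
    }
    return Request(
        f"{base_url}/type",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


vector_schema = {
    "type": "object",
    "properties": {axis: {"type": "number"} for axis in ("x", "y", "z")},
    "required": ["x", "y", "z"],
}
request = type_definition_request(
    "http://localhost:8080",
    "Vector",
    "https://schema.org/Vector",
    "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
    vector_schema,
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Sending the request with `urllib.request.urlopen(request)` requires a running instance.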
/type/<typeName>
GET Returns the type associated with the given typeName.
Example:
curl "http://localhost:8080/type/Person" \
-H 'Content-Type: application/json'
Component Controller
All routes associated with components are grouped under the path /component.
/component/root
GET Returns the root component of the tree. The returned root component will stay the root component and cannot be made a sub-component.
This endpoint supports the Accept-Datetime header.
Example:
curl "http://localhost:8080/component/root" \
-H 'Accept-Datetime: 2022-02-22T16:47:33+00:00'
/component/<componentId>
GET Returns the component associated with the given identifier.
This endpoint supports the Accept-Datetime header.
Example:
curl "http://localhost:8080/component/a6d43f32-2b20-4dd4-bfdf-e7eea55d7d47" \
-H 'Accept-Datetime: 2022-02-22T20:40:33+00:00'
/component/<componentId>/information
GET Returns the currently active ComponentInformation instance.
This endpoint supports the Accept-Datetime header.
Example:
curl "http://localhost:8080/component/b4df9852-dd91-448a-9d4f-2157df21f976/information" \
-H 'Accept-Datetime: 2022-06-21T11:02:11+02:00'
/component/
POST Creates a new component with the given initial metadata.
Required Body:
{
  "name": <initialName>, // The initial name of the new component's metadata
  "metadataType": <type>, // The name of the type definition used for the first version of this component's information
  "metadata": <metadata>, // The initial metadata of this component, must fulfill the schema of the metadataType
  "parentComponentId": <componentId>, // (optional) The identifier of the parent component. The new component will be a root node if this is not set.
  "topic": <MQTT Topic>, // The topic in MQTT that the data will be published under. Must not be currently in use by another component/component information
  "componentLicense": <licenseURL>, // The data usage license used for the component entity
  "informationLicense": <licenseURL>, // The data usage license used for the initial ComponentInformation instance
  "measurementLicense": <licenseURL> // The data usage license used for all measurements associated with the initial ComponentInformation instance
}
Example:
## Create Component
curl -X "POST" "http://localhost:8080/component" \
-H 'Content-Type: application/json; charset=utf-8' \
-d $'{
"metadata": {
"familyName": "Aaaaa",
"givenName": "Alice",
"age": 42
},
"metadataType": "Person",
"informationLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
"measurementLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
"componentLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
"parentComponentId": "b7fa1868-2fb9-45b6-8f2c-1988948850e3",
"topic": "Aaaaa/Alice",
"name": "Factory Operator"
}'
Component Information Controller
All routes associated with component information are grouped under the path /information.
/information/<informationId>
GET Returns the component information associated with the given id.
Example:
curl "http://localhost:8080/information/2a9ab310-2869-478d-b150-8e82b437731d"
/information/<informationId>
POST Creates a new version of the information identified by informationId. If the given component information already has a newer version, or the MQTT topic is currently in use by another information, the request returns a 409 - Conflict status code.
Required Body:
{
  "name": "SomeNewVersionName", // The name for the new component information
  "metadataType": "Person", // The type of the new component information's metadata
  "metadata": <some JSON Object>, // The metadata of this information instance
  "topic": <MQTT Topic>,
  "informationLicense": <licenseURL>, // The data usage license used for this ComponentInformation instance
  "measurementLicense": <licenseURL> // The data usage license used for all measurements associated with the new ComponentInformation instance
}
Example:
curl -X "POST" "http://localhost:8080/information/e3207e54-7e8c-402e-855f-4526e2253bda" \
-H 'Content-Type: application/json; charset=utf-8' \
-d $'{
"metadata": {
"x": 1,
"y": 42,
"z": -50
},
"topic": "machines/lasertracker",
"name": "NewVersionName",
"metadataType": "Vector",
"informationLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
"measurementLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld"
}'
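The two conflict conditions for this endpoint can be summarized as a small decision function. The Python sketch below is an assumption about how such a check could be structured, not the implementation's code; the function name, the dictionary keys and the returned messages are hypothetical.

```python
from typing import Optional


def version_conflict(previous: dict, topic_owner_id: Optional[str]) -> Optional[str]:
    """Return a reason for a 409 response, or None if versioning may proceed.

    previous:       the information being versioned, with "id" and "nextVersion"
    topic_owner_id: id of the information currently using the requested topic,
                    or None if the topic is unused
    """
    if previous.get("nextVersion") is not None:
        return "a newer version already exists"
    if topic_owner_id is not None and topic_owner_id != previous["id"]:
        return "topic is in use by another component information"
    return None


current = {"id": "info-1", "nextVersion": None}
print(version_conflict(current, None))       # unused topic
print(version_conflict(current, "info-1"))   # topic kept from the previous version
print(version_conflict(current, "info-2"))   # topic owned by an unrelated information
```

Only the last call yields a conflict: reusing the topic of the version being replaced is allowed, while taking over a topic from an unrelated information is not.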
Measurement Controller
All routes for reading measurement values are grouped under the path /measurement.
/measurement/<measurementId>
GET Returns the measurement object associated with the given id.
Example:
curl "http://localhost:8080/measurement/9d239f6a-718b-4493-b1a3-9d716982ef73"
/measurement
GET Search endpoint for measurements. This endpoint supports the following (optional) filtering arguments as query parameters and returns the measurement values sorted by their creation date in ascending order:
- from - Timestamp - Only measurement values newer than this timestamp are returned.
- to - Timestamp - Only measurement values older than this timestamp are returned.
- pageSize - Int (default: 500) - Return no more than pageSize measurements.
- page - Int (default: 1) - Used for pagination (the offset is determined by pageSize).
- valueType - String - Only measurements with this value type are returned.
- metadataType - String - Only measurements with metadata of this type are returned.
- informationType - String - Only measurements associated with a component information with this metadata type are returned.
- filter - String - Filter expression, explained below.
Example:
curl "http://localhost:8080/measurement?from=2022-03-01T10%3A36%3A21.587Z&page=1&filter=%22measurement%22.%22value%22-%3E%3E%27y%27%20%3D%20%27-0.5%27%20AND%20%22componentInformation%22.%22metadata%22-%3E%3E%27familyName%27%3D%27Hallo2%27&to=2022-03-01T10%3A36%3A21.589Z&valueType=Vector&metadataType=Vector&informationType=Person&pageSize=200000"
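Because timestamps and filter expressions contain characters such as colons, quotes and spaces, the query string has to be percent-encoded as in the curl example above. The following Python sketch shows one way to build such a URL with the standard library. The helper name `measurement_search_url` is hypothetical, and the filter value is the decoded expression from the curl example.

```python
from urllib.parse import quote, urlencode


def measurement_search_url(base_url, params):
    """Build the search URL, skipping parameters that are set to None."""
    query = {key: value for key, value in params.items() if value is not None}
    if not query:
        return f"{base_url}/measurement"
    # quote_via=quote encodes spaces as %20 instead of "+".
    return f"{base_url}/measurement?" + urlencode(query, quote_via=quote)


expression = (
    "\"measurement\".\"value\"->>'y' = '-0.5' "
    "AND \"componentInformation\".\"metadata\"->>'familyName'='Hallo2'"
)
url = measurement_search_url("http://localhost:8080", {
    "from": "2022-03-01T10:36:21.587Z",
    "to": "2022-03-01T10:36:21.589Z",
    "valueType": "Vector",
    "pageSize": 200,
    "filter": expression,
})
print(url)
```

The parameters are passed as a dictionary rather than keyword arguments because `from` is a reserved word in Python.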
Filtering Expression:
For more granular filtering, a filter expression can be given, for example to select measurements by properties of their value, their metadata or the metadata of the associated component information. This filter string supports the JSON functions of PostgreSQL.
The following sub-expressions give access to the component information's metadata, the measurement's metadata and the measurement value:
- "measurement"."value" - Access the measurement value
- "measurement"."metadata" - Access the measurement's metadata
- "componentInformation"."metadata" - Access the component information's metadata
Example that selects all (Vector) values with a y value of -0.5 and a givenName property value of Alice in the associated component information metadata:
"measurement"."value"->>'y' = '-0.5' AND "componentInformation"."metadata"->>'givenName'='Alice'
Component Relation Controller
All routes associated with the relations between components are grouped under the path /relation.
/relation/<relationId>
GET Returns the relation associated with the given id.
Example:
curl "http://localhost:8080/relation/e4b5f035-74f7-49c8-8a81-8c94db0d994d"
/relation/
POST Creates a new relationship between components, moving the component with the given componentId (together with all its subcomponents) to a new parent in the component tree. The old parent relation pointing to the componentId is marked as "old" by setting its to date to the time at which this request is made.
{
"componentId": "<UUID>",
"newParentComponentId": "<UUID>",
"relationLicense": <licenseURL> // The data usage license used for this relation.
}
Example:
curl -X "POST" "http://localhost:8080/relation" \
-H 'Content-Type: application/json; charset=utf-8' \
-d $'{
"componentId": "54730c0a-208e-46e4-9a93-80bdfe195508",
"newParentComponentId": "f0e87c59-945c-4342-9496-96bcdabf33b6",
"relationLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld"
}'
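Conceptually, moving a component closes its currently active parent relation and opens a new one at the same instant. The Python sketch below models that bookkeeping in memory. It is an illustration under assumed names (`Relation`, `move_component` and the field names are hypothetical), not the implementation's database logic.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class Relation:
    parent_id: str
    child_id: str
    from_date: datetime
    to_date: Optional[datetime] = None  # None means the relation is active


def move_component(relations, component_id, new_parent_id, now):
    """Close the active parent relation of component_id and open a new one."""
    for relation in relations:
        if relation.child_id == component_id and relation.to_date is None:
            relation.to_date = now  # mark the old relation as no longer active
    new_relation = Relation(new_parent_id, component_id, now)
    relations.append(new_relation)
    return new_relation


relations = [Relation("machine-a", "sensor-1", datetime(2022, 1, 1, tzinfo=timezone.utc))]
now = datetime(2022, 3, 1, tzinfo=timezone.utc)
move_component(relations, "sensor-1", "machine-b", now)
for relation in relations:
    print(relation.parent_id, relation.from_date.date(), relation.to_date)
```

Subcomponents need no extra handling in this model, since their relations point to the moved component and therefore travel with it.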
MQTT Measurements
This implementation requires that all received sensor values follow a specific schema:
{
  "value": <JSON>, // The actual sensor value, needs to fulfill the value type requirements
  "timestamp": <timestamp>, // Timestamp of when this value was captured
  "valueType": <type>, // Name of the type definition of this sensor value
  "metadata": <JSON>, // (Optional - Required if metadataType is set) Metadata of this measurement, needs to fulfill the metadataType requirements
  "metadataType": <type> // (Optional - Required if metadata is set) Type definition of this measurement's metadata
}
Example Value:
{
"value": {
"x": -0.5914874999999999,
"y": -0.5,
"z": -0.401
},
"timestamp": "2021-09-07T08:20:16.718803Z",
"valueType": "Vector",
"metadata": {
"x": 0,
"y": 1,
"z": 2
},
"metadataType": "Vector"
}
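Before a value is validated against its type definitions, the outer message structure itself can be checked. The Python sketch below covers only that envelope check: the three mandatory fields and the rule that metadata and metadataType must appear together. The function name `envelope_errors` and the error messages are hypothetical, and validation against the TypeDefinition schemas is deliberately left out.

```python
import json

REQUIRED_FIELDS = ("value", "timestamp", "valueType")


def envelope_errors(payload):
    """Check the outer structure of an MQTT message; return a list of errors."""
    try:
        message = json.loads(payload)
    except ValueError:  # covers invalid JSON and undecodable bytes
        return ["payload is not valid JSON"]
    if not isinstance(message, dict):
        return ["payload must be a JSON object"]
    errors = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in message]
    # metadata and metadataType are optional, but only as a pair.
    if ("metadata" in message) != ("metadataType" in message):
        errors.append("metadata and metadataType must be set together")
    return errors


good = b'{"value": {"x": 0, "y": 1, "z": 2}, "timestamp": "2021-09-07T08:20:16.718803Z", "valueType": "Vector"}'
bad = b'{"value": 1, "valueType": "Vector", "metadata": {"x": 0}}'
print(envelope_errors(good))
print(envelope_errors(bad))
```

The second message is rejected twice over: it lacks a timestamp and carries metadata without naming a metadataType.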
Vocabulary
The API also has a vocabulary which can be used to query the context data about the JSON-LD types. The vocabulary is accessible under /vocab/<type>. The available types are: Component, ComponentInformation, Measurement and ComponentRelation.