fair-sensor-ecosystem

    Jonas Schlabertz authored
    b8f53651

    Proof-Of-Concept Implementation

    WARNING

    The docker-compose example file should not be used in a production environment. The example file does not explicitly handle Docker volumes for PostgreSQL and RabbitMQ.

    Only use the example docker-compose file as a reference to write your own file. Otherwise you risk data loss if the volumes are not handled correctly for your setup.

    Basic Concepts

    JSON-Schema

    The API of this implementation allows creating schema definitions for any type. These schema definitions are written in JSON-Schema. This implementation uses the functionality of the most recent JSON-Schema draft (2020-12).

    Whenever a new schema definition is given to the API, it is validated and then stored in the database. Both the validation of a schema definition and the validation of arbitrary data against such a schema definition are done in strict mode. This ensures that only valid schema definitions and data without unnecessary additional properties are stored.
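To illustrate what rejecting additional properties means in practice, here is a minimal sketch. The function validateStrict is hypothetical and covers only a tiny subset of JSON-Schema (flat objects with primitive property types); it is not the validator used by this implementation.

```javascript
// Hypothetical helper, not part of this implementation: checks a flat object
// against a small JSON-Schema subset (properties, required, primitive types)
// and rejects every property that the schema does not declare.
function validateStrict(schema, data) {
  if (typeof data !== "object" || data === null || Array.isArray(data)) {
    return ["data must be an object"];
  }
  const has = (object, key) => Object.prototype.hasOwnProperty.call(object, key);
  const properties = schema.properties || {};
  const errors = [];
  for (const key of schema.required || []) {
    if (!has(data, key)) errors.push(`missing required property "${key}"`);
  }
  for (const [key, value] of Object.entries(data)) {
    if (!has(properties, key)) {
      errors.push(`additional property "${key}" is not allowed`);
    } else if (typeof value !== properties[key].type) {
      errors.push(`property "${key}" must be of type ${properties[key].type}`);
    }
  }
  return errors;
}

const vectorSchema = {
  type: "object",
  properties: { x: { type: "number" }, y: { type: "number" }, z: { type: "number" } },
  required: ["x", "y", "z"],
};

console.log(validateStrict(vectorSchema, { x: 1, y: 2, z: 3 }));       // empty list, data is valid
console.log(validateStrict(vectorSchema, { x: 1, y: 2, z: 3, w: 4 })); // one error for "w"
```

A real deployment would rely on a full JSON-Schema 2020-12 validator; the sketch only shows why data with an undeclared property such as w is refused.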

    A Note on Licenses

    This system supports data usage licenses as an integral and required part of all data. In the current implementation, all license fields in our system are only saved as links to a license hosted on an external system.

    A list of JSON-LD licenses that can be used is hosted on GitHub.

    Services

    This diagram shows an overview of the system architecture. A full deployment consists of exactly one ingress service, a RabbitMQ instance, at least one ingestion service instance, a PostgreSQL instance and at least one API service. The schema validation service is an integrated part of our system and does not require an extra deployment. In addition, an MQTT broker is required, to which the sensor values are published and from which they are read.

    Entities

    This diagram shows a full overview of all entities and their relationship. A description of the entities and their uses is listed under this diagram.

    Entity | Description
    TypeDefinition | A TypeDefinition defines how data is structured and provides the semantic context information of the associated data. It consists of a unique, human-readable name, a JSON-Schema describing the format of all data of this type, and the context information (a URL string or an object that is placed in the @context section whenever data of this type is serialized as JSON-LD).
    Component | Components represent real-world things. For example, a Component can be a production site, a production hall, a machine or a single sensor. Each Component can produce data which is ingested into the database. The relationship between different components at specific times is modeled by the ComponentRelation entity.
    ComponentRelation | The relationship between different Components is modeled via the ComponentRelation entity. A Component can have subcomponents or be a subcomponent of another component. For example, a sensor which is built into a machine is a subcomponent of this machine, and the machine is a subcomponent of the factory hall. The factory hall can also have multiple machines, which are all subcomponents of this factory hall. This essentially creates a tree structure between components. The ComponentRelation describes the edges between the Component nodes in that tree. Furthermore, components can be moved around in the tree. A sensor can be built into one machine, then moved to another one in the real world. To model this, the ComponentRelation model includes two dates. The from date describes when the ComponentRelation was created and thus became active; the to date describes when the relation stopped being active because the component was moved and the relation was replaced by a new one. If the to date is not set (null), the relation is currently active. There can be multiple root components and thus multiple trees within the system at the same time.
    ComponentInformation | The ComponentInformation stores the actual metadata of a Component. Since this metadata can change over time, the ComponentInformation belonging to a Component can be versioned via the previousVersion and nextVersion relations. The metadata itself can be any JSON. However, each ComponentInformation requires a metadataType, which is a relation to a TypeDefinition specifying the context information and JSON-Schema of the metadata. This effectively makes the metadata strongly typed, since the metadata JSON has to fulfill the requirements given by the associated TypeDefinition. Lastly, the ComponentInformation specifies which MQTT topic is used for its associated sensor data. This is used to link sensor data to exactly one ComponentInformation whenever a new value is received. To ensure that a topic is always associated with at most one ComponentInformation, the reference implementation checks on creation of a new ComponentInformation whether the topic is already in use by another ComponentInformation. If the topic is associated with the ComponentInformation for which a new version is being created, the new version is created and is from this point on associated with all received measurement values. If the topic is in use by another, unrelated ComponentInformation, the versioning process is aborted with an error.
    Measurement | A Measurement entity is created for each sensor value that is received via MQTT. Whenever a measurement value is received through MQTT, the associated ComponentInformation is retrieved from the database by the MQTT topic that the value was received on. Each received sensor value also needs to include a TypeDefinition which specifies the format and semantic annotation for the value. Furthermore, the sensor itself can also generate metadata (JSON) for sensor values. This (optional) metadata is also required to adhere to the requirements of a TypeDefinition. The schemas given by the TypeDefinitions for the metadata and the sensor value are used to verify the format of the value and the metadata before this information is stored in the database.
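The from and to dates of a ComponentRelation make it possible to reconstruct the tree at any point in time. The following sketch shows one way to look up the relation that was active at a given moment. The in-memory rows, the field names (componentId, parentComponentId, from, to) and the treatment of from as inclusive and to as exclusive are assumptions for illustration, not the actual database model.

```javascript
// Hypothetical ComponentRelation rows; field names are assumptions.
const relations = [
  { componentId: "sensor-1", parentComponentId: "machine-A", from: "2022-01-01T00:00:00Z", to: "2022-03-01T00:00:00Z" },
  { componentId: "sensor-1", parentComponentId: "machine-B", from: "2022-03-01T00:00:00Z", to: null },
];

// Returns the relation of a component that was active at the given time:
// from <= at, and either no to date (still active) or at < to.
function activeRelationAt(allRelations, componentId, at) {
  const time = new Date(at).getTime();
  const match = allRelations.find((relation) =>
    relation.componentId === componentId &&
    new Date(relation.from).getTime() <= time &&
    (relation.to === null || time < new Date(relation.to).getTime())
  );
  return match || null;
}

console.log(activeRelationAt(relations, "sensor-1", "2022-02-15T12:00:00Z").parentComponentId); // machine-A
console.log(activeRelationAt(relations, "sensor-1", "2022-06-01T12:00:00Z").parentComponentId); // machine-B
```

A component without any matching relation at the requested time (the function returns null) would be a root component, or did not exist yet.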

    Setup

    Requirements

    The following tools and software are required for this implementation:

    • PostgreSQL 14 Database
    • RabbitMQ
    • Node.js 16 (Current LTS Version as of writing)
    • MQTT Broker

    Setup

    Clone the git repository and install the dependencies by running npm install. An empty database also needs to be created in PostgreSQL. Afterwards, the tool can be started by running npm start.

    Configuration is done by setting environment variables. For example, if the base URL of the implementation is https://example.com and the name of the PostgreSQL database is example (accessible without authentication in this case), the full command for running the implementation is DB_NAME=example API_BASE_URL=https://example.com MODE=development npm start.

    All supported environment variables, their default values and their meaning are listed in the following section.

    Configuration

    Configuration is done via environment variables. If a variable is not set, its default value is used. Variables marked with * in the Required column have no default and must be set.

    Variable | Description | Default Value | Required
    MODE | Specifies the services that the instance will start. Can be development, ingress, ingestion or api. development mode starts all services, at the cost of performance. Exactly one ingress service may be active per MQTT broker at any time; otherwise duplicate data might be written to the database. ingestion mode workers can be scaled up as needed. All modes require access to the database, MQTT and RabbitMQ. | | *
    API_PORT | The port the API will listen on. | 8080 |
    API_BASE_URL | The base URL that the API will be accessible on. Example: https://example.com | | *
    MQTT_HOST | The host the MQTT client will connect to. | localhost |
    MQTT_PORT | The port the MQTT client will connect to. | 1883 |
    MQTT_USERNAME | The username the MQTT client will use. | |
    MQTT_PASSWORD | The password the MQTT client will use. | |
    MQTT_PROTOCOL | The protocol the MQTT client will use for the connection. Can be one of wss, ws, mqtt, mqtts, tcp, ssl, wx, wxs. | mqtt |
    DB_NAME | The name of the PostgreSQL database to use. | | *
    DB_USERNAME | The PostgreSQL username to use for authentication. | |
    DB_PASSWORD | The password for the PostgreSQL user. | |
    DB_HOST | The hostname of the PostgreSQL database. | localhost |
    DB_PORT | The port to connect to the PostgreSQL database. | 5432 |
    DB_LOGGING | Setting this value to true forces the DB driver to log every DB query for debugging purposes. Might negatively impact performance. | false |
    RABBITMQ_HOST | The hostname the RabbitMQ client will use. | localhost |
    RABBITMQ_PORT | The port the RabbitMQ client will use. | 5672 |
    RABBITMQ_USERNAME | The username the RabbitMQ client will use. | |
    RABBITMQ_PASSWORD | The password the RabbitMQ client will use. | |
    RABBITMQ_QUEUE | The name of the queue the RabbitMQ client will use. The queue is created if it does not exist. | measurements |
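As a rough illustration of how the defaults and required variables listed above interact, the following sketch merges an environment object over the documented defaults and fails early when a required variable is missing. The helper loadConfig is hypothetical and is not taken from the repository; only the variable names and default values come from this section.

```javascript
// Defaults as documented above; variables without a default are omitted.
const DEFAULTS = {
  API_PORT: "8080",
  MQTT_HOST: "localhost",
  MQTT_PORT: "1883",
  MQTT_PROTOCOL: "mqtt",
  DB_HOST: "localhost",
  DB_PORT: "5432",
  DB_LOGGING: "false",
  RABBITMQ_HOST: "localhost",
  RABBITMQ_PORT: "5672",
  RABBITMQ_QUEUE: "measurements",
};
const REQUIRED = ["MODE", "API_BASE_URL", "DB_NAME"];
const MODES = ["development", "ingress", "ingestion", "api"];

// Hypothetical helper: values from env override the defaults; a missing
// required variable or an unknown MODE raises an error.
function loadConfig(env) {
  const config = { ...DEFAULTS, ...env };
  const missing = REQUIRED.filter((name) => !config[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required variables: ${missing.join(", ")}`);
  }
  if (!MODES.includes(config.MODE)) {
    throw new Error(`Unknown MODE: ${config.MODE}`);
  }
  return config;
}

const config = loadConfig({ MODE: "development", DB_NAME: "example", API_BASE_URL: "https://example.com" });
console.log(config.API_PORT, config.RABBITMQ_QUEUE); // 8080 measurements
```

In a running service, process.env would be passed instead of the literal object used here.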

    API Documentation

    Some of the endpoints support versioning via an Accept-Datetime header according to the HTTP Memento specification. All endpoints which support this header mention this in their documentation. If an endpoint supports this header and the header is not set in the request, the value defaults to the current date and time.

    In case an endpoint does not support this header, the header is not required and will be ignored.
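The following sketch shows how a client might build the headers for such a versioned request. The helper versionedHeaders is hypothetical. It sends an ISO 8601 timestamp because the examples in this document do; whether the API accepts the exact form produced by toISOString (with milliseconds and a Z suffix) is an assumption.

```javascript
// Hypothetical helper: builds the headers for a versioned read. Without a
// timestamp no header is sent, so the API falls back to the current time.
function versionedHeaders(at) {
  if (at === undefined) return {};
  const date = new Date(at);
  if (Number.isNaN(date.getTime())) {
    throw new Error(`Invalid timestamp: ${at}`);
  }
  return { "Accept-Datetime": date.toISOString() };
}

console.log(versionedHeaders("2022-02-22T16:47:33+00:00")["Accept-Datetime"]);
// 2022-02-22T16:47:33.000Z
```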

    The repository also contains the file API.paw, which can be opened with the HTTP client Paw for Mac. All of the examples shown below are also included in this Paw file.

    Type Definition Controller

    POST /type

    Creates a new type definition. The name given in the JSON payload must be unique.

    Body:

    {
      "name": <someName>, // The unique name of the type
      "license": <licenseURL>, // A URL pointing towards the data usage license for this type 
      "context": <context>, // The context data for the type, can be a URL string or JSON payload
      "schema": <schema> // The JSON schema used to verify all data associated with this type
    }

    Example:

    curl -X "POST" "http://localhost:8080/type" \
         -H 'Content-Type: application/json' \
         -d $'{
      "name": "Vector",
      "context": "https://schema.org/Vector",
      "license": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
      "schema": {
        "type": "object",
        "properties": {
          "x": {
            "type": "number"
          },
          "y": {
            "type": "number"
          },
          "z": {
            "type": "number"
          }
        },
        "required": [
          "x",
          "y",
          "z"
        ]
      }
    }'
    

    Or another example:

    curl -X "POST" "http://localhost:8080/type" \
         -H 'Content-Type: application/json' \
         -d $'{
      "name": "Person",
      "context": "https://schema.org/Person",
      "license": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
      "schema": {
        "type": "object",
        "properties": {
          "familyName": {
            "type": "string"
          },
          "givenName": {
            "type": "string"
          },
          "age": {
            "type": "number"
          }
        },
        "required": [
          "familyName",
          "givenName"
        ]
      }
    }'
    

    GET /type/<typeName>

    Returns the type associated with the given typeName.

    Example:

    curl "http://localhost:8080/type/Person" \
         -H 'Content-Type: application/json'

    Component Controller

    All routes associated with components are grouped under the path /component.

    GET /component/root

    Returns the root component of the tree. The returned root component will stay the root component and cannot be made a sub-component.

    This endpoint supports the Accept-Datetime header.

    Example:

    curl "http://localhost:8080/component/root" \
         -H 'Accept-Datetime: 2022-02-22T16:47:33+00:00'

    GET /component/<componentId>

    Returns the component associated with the given identifier.

    This endpoint supports the Accept-Datetime header.

    Example:

    curl "http://localhost:8080/component/a6d43f32-2b20-4dd4-bfdf-e7eea55d7d47" \
         -H 'Accept-Datetime: 2022-02-22T20:40:33+00:00'

    GET /component/<componentId>/information

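    Returns the ComponentInformation instance of the given component that is active at the requested point in time (the current time if no Accept-Datetime header is sent).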

    This endpoint supports the Accept-Datetime header.

    Example:

    curl "http://localhost:8080/component/b4df9852-dd91-448a-9d4f-2157df21f976/information" \
         -H 'Accept-Datetime: 2022-06-21T11:02:11+02:00'

    POST /component/

    Creates a new component with the given initial metadata.

    Required Body:

    {
      "name": <initialName>, // The initial name of the new component's metadata
      "metadataType": <type>, // The name of the type definition used for the first version of this component's information
      "metadata": <metadata>, // The initial metadata of this component, must fulfill the schema of the metadataType
      "parentComponentId": <componentId>, // (optional) The identifier of the parent component; the new component will be a root node if this is not set
      "topic": <MQTT Topic>, // The MQTT topic that the data will be published under. Must not currently be in use by another component/component information
      "componentLicense": <licenseURL>, // The data usage license used for the component entity
      "informationLicense": <licenseURL>, // The data usage license used for the initial ComponentInformation instance
      "measurementLicense": <licenseURL> // The data usage license used for all measurements associated with the initial ComponentInformation instance
    }

    Example:

    ## Create Component
    curl -X "POST" "http://localhost:8080/component" \
         -H 'Content-Type: application/json; charset=utf-8' \
         -d $'{
      "metadata": {
        "familyName": "Aaaaa",
        "givenName": "Alice",
        "age": 42
      },
      "metadataType": "Person",
      "informationLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
      "measurementLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
      "componentLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
      "parentComponentId": "b7fa1868-2fb9-45b6-8f2c-1988948850e3",
      "topic": "Aaaaa/Alice",
      "name": "Factory Operator"
    }'

    Component Information Controller

    All routes associated with component information are grouped under the path /information.

    GET /information/<informationId>

    Returns the component information associated with the given id.

    Example:

    curl "http://localhost:8080/information/2a9ab310-2869-478d-b150-8e82b437731d"

    POST /information/<informationId>

    Creates a new version of the component information identified by informationId. If the given component information already has a newer version, or the MQTT topic is currently in use by another component information, the request returns a 409 - Conflict status code.

    Required Body:

    {
      "name": "SomeNewVersionName", // The name for the new component information
      "metadataType": "Person", // The type of the new component information's metadata
      "metadata": <some JSON Object>, // The metadata of this information instance
      "topic": <MQTT Topic>, // The MQTT topic for this information instance
      "informationLicense": <licenseURL>, // The data usage license used for this ComponentInformation instance
      "measurementLicense": <licenseURL> // The data usage license used for all measurements associated with the new ComponentInformation instance
    }

    Example:

    curl -X "POST" "http://localhost:8080/information/e3207e54-7e8c-402e-855f-4526e2253bda" \
         -H 'Content-Type: application/json; charset=utf-8' \
         -d $'{
      "metadata": {
        "x": 1,
        "y": 42,
        "z": -50
      },
      "topic": "machines/lasertracker",
      "name": "NewVersionName",
      "metadataType": "Vector",
      "informationLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld",
      "measurementLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld"
    }'
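The two conflict conditions for this endpoint can be sketched as a small decision function. The in-memory rows, the field names and the helper canCreateVersion are hypothetical; treating an information without a nextVersion as the one that currently owns its topic is an assumption based on the entity description.

```javascript
// Hypothetical ComponentInformation rows; field names are assumptions.
const informations = [
  { id: "info-1", topic: "machines/lasertracker", nextVersion: null },
  { id: "info-2", topic: "machines/drill", nextVersion: null },
];

// Decides whether a new version may be created. A result with ok === false
// corresponds to the 409 - Conflict response described above.
function canCreateVersion(all, informationId, topic) {
  const current = all.find((info) => info.id === informationId);
  if (!current) {
    throw new Error(`Unknown component information: ${informationId}`);
  }
  if (current.nextVersion !== null) {
    return { ok: false, reason: "a newer version already exists" };
  }
  // Only the latest versions are considered to be using their topic.
  const owner = all.find((info) => info.nextVersion === null && info.topic === topic);
  if (owner && owner.id !== informationId) {
    return { ok: false, reason: "topic is in use by another component information" };
  }
  return { ok: true };
}

console.log(canCreateVersion(informations, "info-1", "machines/lasertracker").ok); // true
console.log(canCreateVersion(informations, "info-1", "machines/drill").ok);        // false
```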
    

    Measurement Controller

    All routes for reading measurement values are grouped under the path /measurement.

    GET /measurement/<measurementId>

    Returns the measurement object associated with the given id.

    Example:

    curl "http://localhost:8080/measurement/9d239f6a-718b-4493-b1a3-9d716982ef73"

    GET /measurement

    Search endpoint for measurements. This endpoint supports these (optional) filtering arguments as query parameters and returns the measurement values in ascending order sorted by their creation date:

    • from - Timestamp - Only measurements newer than this timestamp are returned.
    • to - Timestamp - Only measurements older than this timestamp are returned.
    • pageSize - Int (default: 500) - Return no more than pageSize measurements.
    • page - Int (default: 1) - Used for pagination; the offset is derived from pageSize.
    • valueType - String - Only measurements with this value type are returned.
    • metadataType - String - Only measurements with metadata of this type are returned.
    • informationType - String - Only measurements associated with a component information with this metadata type are returned.
    • filter - String - Filter expression, explained below.

    Example:

    curl "http://localhost:8080/measurement?from=2022-03-01T10%3A36%3A21.587Z&page=1&filter=%22measurement%22.%22value%22-%3E%3E%27y%27%20%3D%20%27-0.5%27%20AND%20%22componentInformation%22.%22metadata%22-%3E%3E%27familyName%27%3D%27Hallo2%27&to=2022-03-01T10%3A36%3A21.589Z&valueType=Vector&metadataType=Vector&informationType=Person&pageSize=200000"
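Because the percent-encoded query string in the example above is hard to read and write by hand, a client would typically build it programmatically. The following sketch does this with the standard encodeURIComponent function; the helper measurementSearchUrl and the concrete parameter values are hypothetical.

```javascript
// Hypothetical helper: builds a search URL for GET /measurement. Parameters
// that are undefined are left out, so the API applies its defaults.
function measurementSearchUrl(baseUrl, params) {
  const query = Object.entries(params)
    .filter(([, value]) => value !== undefined)
    .map(([key, value]) => `${encodeURIComponent(key)}=${encodeURIComponent(String(value))}`)
    .join("&");
  return query ? `${baseUrl}/measurement?${query}` : `${baseUrl}/measurement`;
}

const searchFilter = `"measurement"."value"->>'y' = '-0.5' AND "componentInformation"."metadata"->>'givenName'='Alice'`;
const searchUrl = measurementSearchUrl("http://localhost:8080", {
  from: "2022-03-01T10:36:21.587Z",
  to: "2022-03-01T10:36:21.589Z",
  valueType: "Vector",
  pageSize: 100,
  page: undefined, // omitted from the URL
  filter: searchFilter,
});
console.log(searchUrl);
```

encodeURIComponent encodes spaces as %20, which matches the style of the curl example; the filter expression itself is described in the following section.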

    Filtering Expression:

    For more granular filtering, a filter expression can be given, for example to select measurements by certain properties of the value, the measurement metadata or the component information metadata. This filter string supports the JSON functions of PostgreSQL.

    There are multiple supported sub-expressions to access the component information's metadata, the measurement's metadata and the measurement value:

    • "measurement"."value" - Access the measurement value
    • "measurement"."metadata" - Access the measurement's metadata
    • "componentInformation"."metadata" - Access the component information's metadata

    Example that selects all (Vector) values with a y value of -0.5 and a givenName property value of Alice in the associated component information metadata:

    "measurement"."value"->>'y' = '-0.5' AND "componentInformation"."metadata"->>'givenName'='Alice'

    Component Relation Controller

    All routes associated with the relations between components are grouped under the path /relation.

    GET /relation/<relationId>

    Returns the relation associated with the given id.

    Example:

    curl "http://localhost:8080/relation/e4b5f035-74f7-49c8-8a81-8c94db0d994d"

    POST /relation/

    Creates a new relationship between components, moving the component with the given componentId (together with all its subcomponents) to a new parent in the component tree. The old parent relation pointing to the componentId is marked as "old" by setting its to date to the time at which this request is made.

    {
        "componentId": "<UUID>",
        "newParentComponentId": "<UUID>",
        "relationLicense": <licenseURL> // The data usage license used for this relation.
    }

    Example:

    curl -X "POST" "http://localhost:8080/relation" \
         -H 'Content-Type: application/json; charset=utf-8' \
         -d $'{
      "componentId": "54730c0a-208e-46e4-9a93-80bdfe195508",
      "newParentComponentId": "f0e87c59-945c-4342-9496-96bcdabf33b6",
      "relationLicense": "https://github.com/spdx/license-list-data/blob/master/jsonld/MIT-0.jsonld"
    }'
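The effect of this request on the stored relations can be sketched as follows. The in-memory rows, the field names and the helper moveComponent are hypothetical; the sketch only mirrors the described behavior of closing the active relation and opening a new one.

```javascript
// Hypothetical sketch of the described behavior: the currently active parent
// relation of the component gets its to date set, and a new relation to the
// new parent becomes active at the same moment. The input is not modified.
function moveComponent(relations, componentId, newParentComponentId, now) {
  const timestamp = new Date(now).toISOString();
  const updated = relations.map((relation) =>
    relation.componentId === componentId && relation.to === null
      ? { ...relation, to: timestamp } // old relation stops being active
      : relation
  );
  updated.push({ componentId, parentComponentId: newParentComponentId, from: timestamp, to: null });
  return updated;
}

const before = [
  { componentId: "sensor-1", parentComponentId: "machine-A", from: "2022-01-01T00:00:00.000Z", to: null },
];
const after = moveComponent(before, "sensor-1", "machine-B", "2022-03-01T00:00:00Z");
console.log(after.length);               // 2
console.log(after[0].to);                // 2022-03-01T00:00:00.000Z
console.log(after[1].parentComponentId); // machine-B
```

Subcomponents need no relation of their own to be updated in this model: their relations keep pointing at the moved component, so they follow it to the new parent.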

    MQTT Measurements

    This implementation requires that all received sensor values follow a specific schema:

    {
        "value": <JSON>, // The actual sensor value, needs to fulfill the value type requirements
        "timestamp": <timestamp>, // Timestamp of when this value was captured
        "valueType": <type>, // Name of the type definition of this sensor value
        "metadata": <JSON>, // (Optional - Required if metadataType is set) Metadata of this measurement, needs to fulfill the metadataType requirements
        "metadataType": <type> // (Optional - Required if metadata is set) Type definition of this measurement's metadata
    }

    Example Value:

    {
        "value": {
            "x": -0.5914874999999999,
            "y": -0.5,
            "z": -0.401
        },
        "timestamp": "2021-09-07T08:20:16.718803Z",
        "valueType": "Vector",
        "metadata": {
            "x": 0,
            "y": 1,
            "z": 2
        },
        "metadataType": "Vector"
    }
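The envelope rules above, in particular that metadata and metadataType must be set together, can be checked before a value is published. The helper checkMeasurementMessage below is a hypothetical client-side sketch; validating value and metadata against their type definitions is left to the implementation.

```javascript
// Hypothetical helper: checks the envelope of an MQTT payload (a JSON string)
// and returns a list of problems. JSON.parse throws on malformed JSON.
function checkMeasurementMessage(payload) {
  const message = JSON.parse(payload);
  const errors = [];
  if (message.value === undefined) errors.push("value is required");
  if (typeof message.valueType !== "string") errors.push("valueType is required");
  if (Number.isNaN(new Date(message.timestamp).getTime())) {
    errors.push("timestamp is missing or invalid");
  }
  const hasMetadata = message.metadata !== undefined;
  const hasMetadataType = message.metadataType !== undefined;
  if (hasMetadata !== hasMetadataType) {
    errors.push("metadata and metadataType must be set together");
  }
  return errors;
}

const payload = JSON.stringify({
  value: { x: -0.59, y: -0.5, z: -0.401 },
  timestamp: "2021-09-07T08:20:16.718Z",
  valueType: "Vector",
});
console.log(checkMeasurementMessage(payload).length); // 0
```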

    Vocabulary

    The API also has a vocabulary which can be used to query the context data about the JSON-LD types. The vocabulary is accessible under /vocab/<type>. The available types are: Component, ComponentInformation, Measurement and ComponentRelation.