Datasets
Datasets define the logical grouping for data stored and queried through Infuse DB. Use them to keep telemetry and operational streams organised by product, fleet, customer, environment, location, or asset group.
A dataset should make it clear which records belong together and how teams will query them later. For IoT data, that usually means preserving the device identifier and adding dimensions that match the way operators, customers, and backend services think about the fleet.
Common Dataset Dimensions
| Dimension | Use |
|---|---|
| deviceId | Join records back to Infuse IoT device identity and Marketplace workflows. |
| Customer | Filter data for a customer, tenant, or account. |
| Environment | Separate production, staging, test, or field-trial data. |
| Location | Query by site, region, installation, or geography. |
| Asset group | Group devices by physical asset, product line, or operational unit. |
| Board or hardware profile | Compare telemetry across hardware families. |
| Stream type | Separate raw telemetry, computed signals, events, or status records. |
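
As a rough illustration, a record tagged with the dimensions above might be modelled as follows. This is only a sketch: the `TelemetryRecord` class, its field names, and the sample values are hypothetical and are not part of any Infuse DB API. Only the `deviceId` dimension name comes from the table.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class TelemetryRecord:
    """Hypothetical record shape; every field name here is an assumption."""

    device_id: str         # corresponds to the deviceId dimension
    customer: str          # customer, tenant, or account
    environment: str       # e.g. "production", "staging", "test"
    location: str          # site, region, or installation
    asset_group: str       # physical asset, product line, or unit
    hardware_profile: str  # board or hardware family
    stream_type: str       # e.g. "raw", "computed", "event", "status"
    timestamp: datetime
    values: dict = field(default_factory=dict)  # measured or derived values


# Illustrative sample values only.
record = TelemetryRecord(
    device_id="dev-001",
    customer="acme",
    environment="production",
    location="warehouse-north",
    asset_group="cold-storage",
    hardware_profile="board-a",
    stream_type="raw",
    timestamp=datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc),
    values={"temperature_c": 4.2},
)
```

Carrying every dimension on each record keeps later filters simple, at the cost of some repetition per record.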
Raw And Computed Streams
Infuse DB can be used for both raw and computed device streams:
- Raw streams preserve device telemetry close to its source form.
- Computed streams store higher-value records derived from raw telemetry, such as operational signals, summaries, or enriched values.
Keeping both forms available lets teams inspect source data while also querying the signals used by applications, operations, and analytics workflows.
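
The relationship between the two stream types can be sketched in plain Python. The example below derives one computed summary record per device from a handful of raw readings. The `summarise` function, the dictionary keys, and the sample readings are all assumptions made for illustration; they do not describe how Infuse DB itself computes or stores signals.

```python
from statistics import mean


def summarise(raw_records):
    """Derive one computed summary record per device from raw readings.

    Hypothetical helper: the key names are illustrative assumptions.
    """
    readings_by_device = {}
    for rec in raw_records:
        readings_by_device.setdefault(rec["deviceId"], []).append(
            rec["temperature_c"]
        )

    # One computed record per device, ordered by device identifier.
    return [
        {
            "deviceId": device_id,
            "streamType": "computed",
            "mean_temperature_c": mean(readings),
            "sample_count": len(readings),
        }
        for device_id, readings in sorted(readings_by_device.items())
    ]


# Illustrative raw readings only.
raw = [
    {"deviceId": "dev-001", "streamType": "raw", "temperature_c": 20.0},
    {"deviceId": "dev-001", "streamType": "raw", "temperature_c": 22.0},
    {"deviceId": "dev-002", "streamType": "raw", "temperature_c": 18.5},
]

computed = summarise(raw)
```

The raw list is left untouched, so both forms remain available: `raw` for inspection and `computed` for downstream consumers.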
Dataset Design Guidance
Start with the queries your team needs to answer:
- Which devices for this customer reported a fault in the last hour?
- How did this asset group perform across the last deployment window?
- Which locations show changing environmental readings?
- What computed signal should downstream applications or Analytics consume?
Then choose dataset dimensions that make those queries direct and repeatable.
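
To show how dimensions make a query direct, here is a minimal sketch of the first question above ("Which devices for this customer reported a fault in the last hour?") as an in-memory filter. It assumes records carry `customer`, `streamType`, `event`, and `timestamp` keys alongside `deviceId`; those key names, the `devices_with_recent_faults` function, and the sample data are hypothetical and stand in for whatever query interface your deployment uses.

```python
from datetime import datetime, timedelta, timezone


def devices_with_recent_faults(records, customer, now, window=timedelta(hours=1)):
    """Return sorted device IDs for a customer with a fault inside the window.

    Hypothetical helper operating on plain dictionaries.
    """
    cutoff = now - window
    return sorted(
        {
            rec["deviceId"]
            for rec in records
            if rec["customer"] == customer
            and rec["streamType"] == "event"
            and rec.get("event") == "fault"
            and cutoff <= rec["timestamp"] <= now
        }
    )


# Illustrative sample data only.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
records = [
    {"deviceId": "dev-001", "customer": "acme", "streamType": "event",
     "event": "fault", "timestamp": now - timedelta(minutes=10)},
    {"deviceId": "dev-002", "customer": "acme", "streamType": "event",
     "event": "fault", "timestamp": now - timedelta(hours=3)},
    {"deviceId": "dev-003", "customer": "globex", "streamType": "event",
     "event": "fault", "timestamp": now - timedelta(minutes=5)},
]

faulty = devices_with_recent_faults(records, "acme", now)
```

Because customer, stream type, and time are explicit dimensions, the question maps onto one filter per dimension with no post-processing.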