A Trace describes the benchmark result of a Solution on a Definition for a specific Workload. The collection of all Trace files forms the database of benchmark results.
In a Trace object, solution and evaluation are optional. When they are omitted, the Trace describes
only a workload entry in the dataset.
The definition and solution are linked externally by name, whereas workload and evaluation are
embedded in the Trace object. Definitions and solutions are relatively large objects that are
reused throughout the dataset, so referencing them by name avoids duplicating them in every Trace.
JSON Schema Description
Top-Level Object Structure
| Field | Type | Required | Description |
|---|---|---|---|
| definition | string | Yes | The name of the Definition used in this run. |
| workload | object | Yes | An object describing the specific input configuration for this run. See Workload. |
| solution | string | No | The name of the Solution tested in this run. |
| evaluation | object | No | An object containing the detailed results of this run. |
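
As a rough illustration of the two shapes a Trace can take, the sketch below builds a workload-only entry and a benchmark entry as Python dictionaries and checks the top-level fields listed in the table. The names `gemm_n4096` and `my_triton_gemm`, the placeholder workload contents, and the helper `check_top_level` are hypothetical and not part of the schema; the real Workload layout is described in its own section.

```python
import json

# Required and optional top-level fields, taken from the table above.
REQUIRED_FIELDS = {"definition": str, "workload": dict}
OPTIONAL_FIELDS = {"solution": str, "evaluation": dict}


def check_top_level(trace):
    """Return a list of problems found in the top-level Trace fields."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in trace:
            problems.append(f"missing required field: {name}")
        elif not isinstance(trace[name], expected):
            problems.append(f"{name} must be {expected.__name__}")
    for name, expected in OPTIONAL_FIELDS.items():
        value = trace.get(name)
        if value is not None and not isinstance(value, expected):
            problems.append(f"{name} must be {expected.__name__}")
    return problems


# Workload-only entry: neither solution nor evaluation is present.
# The workload contents are a placeholder, not the real Workload schema.
workload_entry = {
    "definition": "gemm_n4096",  # hypothetical Definition name
    "workload": {"placeholder": True},
}

# Benchmark entry: also names the Solution and embeds an evaluation object.
# The other required evaluation fields are omitted here for brevity.
benchmark_entry = {
    **workload_entry,
    "solution": "my_triton_gemm",  # hypothetical Solution name
    "evaluation": {"status": "COMPILE_ERROR"},
}

print(json.dumps(workload_entry, indent=2))
print(check_top_level(benchmark_entry))
```

This check only covers the top level; the nested evaluation object has its own required fields, described in the following sections.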
evaluation: Benchmark Statistics Summary
This object represents a single, complete benchmark result.
| Field | Type | Required | Description |
|---|---|---|---|
| status | string | Yes | The final status of the evaluation run. Must be one of: "PASSED", "INCORRECT_SHAPE", "INCORRECT_NUMERICAL", "INCORRECT_DTYPE", "RUNTIME_ERROR", "COMPILE_ERROR". |
| log | string | Yes | The embedded record of the stdout and stderr of the evaluation run. |
| correctness | object | Yes | The summarized correctness results across all entries in the dataset. May be null depending on status; see the nullable table at the end of this section. |
| performance | object | Yes | The summarized performance metrics across all entries in the dataset. May be null depending on status; see the nullable table at the end of this section. |
| environment | object | Yes | A snapshot of the hardware and software execution environment. |
| timestamp | string | Yes | The ISO 8601 timestamp of when this summary was generated. |
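
The status values and the timestamp format lend themselves to a small helper. The following is a minimal sketch, assuming a Python consumer of the schema; `EvaluationStatus`, `parse_status`, and `utc_timestamp` are hypothetical names, and using UTC for the timestamp is an assumption (the schema only asks for ISO 8601).

```python
from datetime import datetime, timezone
from enum import Enum


class EvaluationStatus(str, Enum):
    """The six status strings listed in the table above."""

    PASSED = "PASSED"
    INCORRECT_SHAPE = "INCORRECT_SHAPE"
    INCORRECT_NUMERICAL = "INCORRECT_NUMERICAL"
    INCORRECT_DTYPE = "INCORRECT_DTYPE"
    RUNTIME_ERROR = "RUNTIME_ERROR"
    COMPILE_ERROR = "COMPILE_ERROR"


def parse_status(raw):
    """Convert a raw status string; raises ValueError for unknown values."""
    return EvaluationStatus(raw)


def utc_timestamp():
    """Return the current time as an ISO 8601 string (UTC is an assumption)."""
    return datetime.now(timezone.utc).isoformat()


print(parse_status("PASSED").value)
print(utc_timestamp())
```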
correctness: Correctness Summary
| Field | Type | Required | Description |
|---|---|---|---|
| max_relative_error | float | Yes | The maximum relative difference found. |
| max_absolute_error | float | Yes | The maximum absolute difference found. |
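
The schema does not spell out how the two error values are computed. One plausible reading, sketched below over flat lists of floats, compares a solution's output element-wise against the reference output. The function name `error_summary`, the `eps` guard against division by zero, and the choice of `|out - ref| / (|ref| + eps)` as the relative error are all assumptions rather than the benchmark's actual definition.

```python
def error_summary(output, reference, eps=1e-8):
    """Compute max absolute and max relative error between two flat sequences.

    Relative error is taken as |out - ref| / (|ref| + eps); this formula is an
    assumption, not something the schema specifies.
    """
    if len(output) != len(reference):
        raise ValueError("output and reference must have the same length")
    max_abs = 0.0
    max_rel = 0.0
    for out, ref in zip(output, reference):
        abs_err = abs(out - ref)
        rel_err = abs_err / (abs(ref) + eps)
        max_abs = max(max_abs, abs_err)
        max_rel = max(max_rel, rel_err)
    return {"max_relative_error": max_rel, "max_absolute_error": max_abs}


print(error_summary([1.5, 4.0], [1.0, 4.0]))
```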
performance: Performance Summary
| Field | Type | Required | Description |
|---|---|---|---|
| latency_ms | float | Yes | The mean latency in milliseconds per execution for this implementation. |
| reference_latency_ms | float | Yes | The mean latency of the Definition’s reference code on the same data/hardware. |
| speedup_factor | float | Yes | The calculated speedup (reference_latency_ms / latency_ms). |
Note that a very large speedup factor is normal, because the references are unoptimized, torch-only implementations.
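
The three performance fields are related by simple arithmetic. The sketch below derives them from per-execution latency samples; `performance_summary` and its two arguments are hypothetical names, and how the samples are actually collected is outside the scope of this schema.

```python
from statistics import mean


def performance_summary(latencies_ms, reference_latencies_ms):
    """Build a performance object from per-execution latency samples (ms)."""
    latency_ms = mean(latencies_ms)
    reference_latency_ms = mean(reference_latencies_ms)
    return {
        "latency_ms": latency_ms,
        "reference_latency_ms": reference_latency_ms,
        # Speedup is the reference latency divided by the solution latency.
        "speedup_factor": reference_latency_ms / latency_ms,
    }


print(performance_summary([2.0, 2.0], [10.0, 30.0]))
```

A value above 1 means the solution is faster than the reference; a value below 1 means it is slower.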
environment: Environment Definition Object
The environment object specifies the exact execution environment for this benchmark run.
| Field | Type | Required | Description |
|---|---|---|---|
| hardware | string | Yes | The name of the hardware, e.g., "NVIDIA_H100". |
| libs | object | Yes | A snapshot of the relevant software libraries and their versions. Keys are library names, and values are version strings. |
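
One way to produce and check such an object is sketched below. It uses the standard-library `importlib.metadata` to look up the versions of installed packages; the helper names `snapshot_libs` and `check_environment` are hypothetical, and which libraries count as "relevant" is left to the caller.

```python
from importlib import metadata


def snapshot_libs(names):
    """Map each installed package name to its version string.

    Packages that are not installed are skipped rather than reported.
    """
    libs = {}
    for name in names:
        try:
            libs[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            continue
    return libs


def check_environment(env):
    """Return a list of problems found in an environment object."""
    problems = []
    if not isinstance(env.get("hardware"), str):
        problems.append("hardware must be a string")
    libs = env.get("libs")
    if not isinstance(libs, dict):
        problems.append("libs must be an object")
    else:
        for name, version in libs.items():
            if not isinstance(name, str) or not isinstance(version, str):
                problems.append(f"libs entry {name!r} must map a string to a string")
    return problems


# The hardware name comes from the table above; the package list is illustrative.
env = {"hardware": "NVIDIA_H100", "libs": snapshot_libs(["torch", "triton"])}
print(check_environment(env))
```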
The correctness and performance Nullable Table
Whether the correctness and performance fields are populated or null depends on the status.
| status | correctness | performance |
|---|---|---|
| PASSED | Required | Required |
| INCORRECT_NUMERICAL | Required | None |
| INCORRECT_SHAPE / INCORRECT_DTYPE | None | None |
| RUNTIME_ERROR | None | None |
| COMPILE_ERROR | None | None |
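
The table above can be encoded as a lookup and used to check an evaluation object. This is a minimal sketch, assuming a strict reading in which "None" means the field must be null (or absent) for that status; `NULLABILITY` and `check_nullability` are hypothetical names.

```python
# status -> (correctness required, performance required), from the table above.
NULLABILITY = {
    "PASSED": (True, True),
    "INCORRECT_NUMERICAL": (True, False),
    "INCORRECT_SHAPE": (False, False),
    "INCORRECT_DTYPE": (False, False),
    "RUNTIME_ERROR": (False, False),
    "COMPILE_ERROR": (False, False),
}


def check_nullability(evaluation):
    """Return a list of violations of the status/nullability table."""
    status = evaluation.get("status")
    if status not in NULLABILITY:
        return [f"unknown status: {status!r}"]
    problems = []
    fields = ("correctness", "performance")
    for field, required in zip(fields, NULLABILITY[status]):
        value = evaluation.get(field)
        if required and value is None:
            problems.append(f"{field} is required when status is {status}")
        if not required and value is not None:
            # Strict reading of "None" in the table; this is an assumption.
            problems.append(f"{field} must be null when status is {status}")
    return problems


print(check_nullability({"status": "RUNTIME_ERROR", "correctness": None, "performance": None}))
```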

