Automating WCET and timing analysis in CI: integrating RocqStat into your pipeline

deployed
2026-02-01
10 min read

Automate WCET checks in CI: concrete Jenkins and GitHub Actions examples running RocqStat/VectorCAST, failing builds on regressions and surfacing results to PRs.

Stop timing regressions from reaching hardware: Automating WCET and timing analysis in CI with RocqStat

You ship embedded software to safety-critical hardware and trust your timing analysis — until a single pull request quietly increases a worst-case execution time (WCET) and triggers failures in the field. In 2026, timing regressions are a first-class CI concern: they break certifications, delay releases, and inflate validation costs. This guide shows how to integrate RocqStat (and VectorCAST where applicable) into Jenkins and GitHub Actions pipelines to run timing checks, fail builds on regressions, and surface results directly to pull requests.

Why automating timing analysis matters in 2026

The market changed in late 2025 and early 2026: Vector Informatik acquired StatInf and RocqStat in January 2026, signaling that timing analysis and statistical WCET methods are now a strategic part of mainstream test toolchains. Vector plans to integrate RocqStat into VectorCAST, creating tighter workflows between testing and timing verification (Automotive World, Jan 16, 2026). For teams building embedded, automotive, avionics or industrial control software, that means timing analysis can — and should — live in CI.

Automating WCET checks in CI solves common pain points:

  • Detect regressions early — before hardware testbeds and certification runs.
  • Enforce timing budgets as code-level gates, reducing last-minute design churn.
  • Provide traceable, auditable artifacts for safety standards (ISO 26262, DO-178C).
  • Reduce tool sprawl by centralizing timing and testing results into PR feedback loops.

High-level approach

At a high level the pipeline does three things:

  1. Build the firmware or binary under test in CI (reproducible toolchain + deterministic flags).
  2. Run timing analysis with RocqStat (standalone) or VectorCAST + RocqStat plugin to produce an artifact (JSON + HTML report + traces).
  3. Compare results to a baseline (main branch or golden build) and fail the job or mark the check as failed when WCET increases beyond configured thresholds. Surface results to the PR with a summary and links to full artifacts.

Key design decisions and best practices

  • Baseline strategy: Store a canonical WCET baseline per release or derive the baseline from the main branch build executed in CI. Keep baselines as artifacts and version them.
  • Threshold policy: Use absolute (ms) and relative (%) thresholds. For example, fail on >10% or >500 µs increase, whichever triggers first.
  • Representative environment: Run timing measurements on representative hardware or accurate emulation (QEMU + cycle-accurate models) to avoid false alarms from CI runners.
  • Statistical robustness: Use RocqStat's statistical WCET metrics (p99, p99.999, confidence intervals) rather than single-run maxima; consider ML-assisted detection as part of your regression signal processing (ML-assisted regression detection patterns are useful here).
  • Artifact retention: Attach JSON and HTML reports to CI runs and keep traces for at least one release cycle for audits — back them with a zero-trust storage and proven retention policy.
  • Security: Run analysis in an isolated container and limit access to private keys and testbeds. Avoid uploading traces with secrets; prefer local caches or local-first sync appliances for sensitive artifacts.
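The threshold policy above can be captured as a small predicate used by the compare step. This is a sketch: the thresholds mirror the example numbers, and the function name is illustrative.

```python
def breaches_policy(current_ms, baseline_ms, pct_threshold=10.0, ms_threshold=0.5):
    """Return True when the current WCET breaches either the relative (%)
    or the absolute (ms) budget, whichever triggers first."""
    if baseline_ms <= 0:
        return True  # fail safe on a missing or zero baseline
    pct_increase = (current_ms - baseline_ms) / baseline_ms * 100.0
    ms_increase = current_ms - baseline_ms
    return pct_increase > pct_threshold or ms_increase > ms_threshold
```

A 12% increase on a 1 ms path trips the relative gate, while a 0.6 ms jump on a 100 ms path trips the absolute one — the dual threshold catches regressions at both ends of the scale.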

GitHub Actions: Full pipeline example

The GitHub Actions example builds the firmware, runs RocqStat in a Docker container, compares results to the baseline built from main, fails on regressions, and posts a PR comment with a short summary and links to artifacts.

Assumptions:

  • RocqStat CLI is available as a Docker image (rocqstat/cli:latest). Substitute your vendor image or a local wrapper if different.
  • RocqStat produces a JSON report with fields we use in the sample script: wcet_p99 and wcet_ms. Adjust keys to match your RocqStat output.
  • GitHub token is available in secrets.GITHUB_TOKEN.
# .github/workflows/ci-rocqstat.yml
name: CI - Build & WCET checks

on:
  pull_request:
    types: [opened, synchronize, reopened]
  push:
    branches: [main]

jobs:
  build-and-wcet:
    runs-on: ubuntu-latest
    env:
      THRESHOLD_PCT: '10'      # fail if >10% increase
      THRESHOLD_MS: '0.5'      # fail if >0.5 ms increase

    steps:
      - name: Checkout PR
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Build firmware
        run: |
          ./scripts/build.sh --target=stm32 --release
          ls -la build/

      - name: Run RocqStat analysis
        run: |
          docker run --rm -v ${{ github.workspace }}:/work -w /work rocqstat/cli:latest \
            rcstat analyze --input build/firmware.elf --output reports/rocqstat.json --html reports/rocqstat.html

      - name: Save current report as artifact
        uses: actions/upload-artifact@v4
        with:
          name: rocqstat-current
          path: reports/rocqstat.json

      - name: Build baseline from main branch
        uses: actions/checkout@v4
        with:
          ref: main
          path: main_build

      - name: Build baseline firmware
        run: |
          pushd main_build
          ./scripts/build.sh --target=stm32 --release
          popd

      - name: Run RocqStat on baseline
        run: |
          docker run --rm -v ${{ github.workspace }}/main_build:/main_build -v ${{ github.workspace }}:/work -w /work rocqstat/cli:latest \
            rcstat analyze --input /main_build/build/firmware.elf --output reports/rocqstat_baseline.json

      - name: Compare WCETs and fail on regression
        run: |
          python3 tools/compare_wcet.py \
            --current reports/rocqstat.json \
            --baseline reports/rocqstat_baseline.json \
            --pct-threshold $THRESHOLD_PCT \
            --ms-threshold $THRESHOLD_MS

      - name: Post PR summary
        if: always() && github.event_name == 'pull_request'
        uses: actions/github-script@v7
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          script: |
            const fs = require('fs');
            const rpt = JSON.parse(fs.readFileSync('reports/rocqstat.json'));
            const base = JSON.parse(fs.readFileSync('reports/rocqstat_baseline.json'));
            const pct = ((rpt.wcet_p99 - base.wcet_p99) / base.wcet_p99 * 100).toFixed(2);
            const runUrl = `${process.env.GITHUB_SERVER_URL}/${context.repo.owner}/${context.repo.repo}/actions/runs/${context.runId}`;
            const body = `**RocqStat timing check**\n\n- p99 WCET: ${rpt.wcet_p99} ms\n- Baseline p99: ${base.wcet_p99} ms\n- Change: ${pct}%\n\n[Full report in workflow artifacts](${runUrl})`;
            await github.rest.issues.createComment({owner: context.repo.owner, repo: context.repo.repo, issue_number: context.issue.number, body});

      - name: Upload HTML report
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: rocqstat-html
          path: reports/rocqstat.html
  

Helper: compare_wcet.py

This tiny Python script parses RocqStat JSON and returns non-zero if a regression exceeds thresholds. Tune keys to match your RocqStat JSON schema.

#!/usr/bin/env python3
import argparse
import json
import sys

parser = argparse.ArgumentParser()
parser.add_argument('--current', required=True)
parser.add_argument('--baseline', required=True)
parser.add_argument('--pct-threshold', type=float, required=True)
parser.add_argument('--ms-threshold', type=float, required=True)
args = parser.parse_args()

def read_wcet(path):
    """Read the WCET value from a RocqStat JSON report.
    Adjust the keys here if your RocqStat schema differs."""
    with open(path) as f:
        data = json.load(f)
    value = data.get('wcet_p99', data.get('wcet_ms'))
    if value is None:
        print(f'{path}: no wcet_p99/wcet_ms key found; failing safe.')
        sys.exit(2)
    return float(value)

cur_wcet = read_wcet(args.current)
base_wcet = read_wcet(args.baseline)

if base_wcet <= 0:
    print('Baseline WCET is zero or negative; failing safe.')
    sys.exit(2)

pct_increase = (cur_wcet - base_wcet) / base_wcet * 100.0
ms_increase = cur_wcet - base_wcet

print(f'Baseline: {base_wcet} ms  Current: {cur_wcet} ms  Increase: {pct_increase:.2f}% / {ms_increase:.6f} ms')

if pct_increase > args.pct_threshold or ms_increase > args.ms_threshold:
    print('Timing regression detected: failing build')
    sys.exit(1)

print('Timing OK')

Jenkins pipeline example

For teams running Jenkins, use a Declarative Pipeline or Multibranch pipeline. The following Jenkinsfile runs the same flow: build, run RocqStat in Docker, compare with baseline taken from main branch, and post a PR comment using the GitHub CLI (gh).

// Jenkinsfile
pipeline {
  agent any
  environment {
    GITHUB_TOKEN = credentials('github-token')
    THRESHOLD_PCT = '10'
    THRESHOLD_MS = '0.5'
  }
  stages {
    stage('Checkout') {
      steps {
        checkout scm
      }
    }
    stage('Build') {
      steps {
        sh './scripts/build.sh --target=stm32 --release'
      }
    }
    stage('RocqStat Analysis') {
      steps {
        sh 'docker run --rm -v $WORKSPACE:$WORKSPACE -w $WORKSPACE rocqstat/cli:latest rcstat analyze --input build/firmware.elf --output reports/rocqstat.json --html reports/rocqstat.html'
        archiveArtifacts artifacts: 'reports/rocqstat.html, reports/rocqstat.json', fingerprint: true
      }
    }
    stage('Baseline & Compare') {
      steps {
        // get baseline by checking out main into a temp dir
        sh 'git clone --single-branch --branch main ${GIT_URL} main_build'
        sh 'cd main_build && ./scripts/build.sh --target=stm32 --release'
        sh 'docker run --rm -v $WORKSPACE:$WORKSPACE -w $WORKSPACE rocqstat/cli:latest rcstat analyze --input main_build/build/firmware.elf --output reports/rocqstat_baseline.json'
        sh 'python3 tools/compare_wcet.py --current reports/rocqstat.json --baseline reports/rocqstat_baseline.json --pct-threshold $THRESHOLD_PCT --ms-threshold $THRESHOLD_MS'
      }
    }
  }
  post {
    failure {
      script {
        sh 'echo "$GITHUB_TOKEN" | gh auth login --with-token'
        def body = sh(script: 'python3 tools/compact_pr_comment.py reports/rocqstat.json reports/rocqstat_baseline.json', returnStdout: true).trim()
        writeFile file: 'pr_comment.md', text: body
        sh "gh pr comment ${env.CHANGE_ID} --body-file pr_comment.md"
      }
    }
    success {
      archiveArtifacts artifacts: 'reports/rocqstat.html', fingerprint: true
    }
  }
}

Posting check results

Jenkins plugins exist to create GitHub checks and status contexts. If you prefer not to use the GitHub CLI, install the GitHub Checks Plugin or use the GitHub Pull Request Builder to set PR statuses. The key is to provide both a machine-readable status (the check) and a human summary (a comment or a link to the HTML report).
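If you prefer to script it yourself, GitHub's commit-status REST endpoint is enough. The sketch below builds and posts a status; the `wcet/rocqstat` context name and function names are illustrative choices.

```python
import json
import os
import urllib.request

API = "https://api.github.com"

def build_status_payload(state, pct_change):
    """Build the request body for GitHub's commit-status endpoint.
    The "wcet/rocqstat" context groups all timing checks under one name."""
    return {
        "state": state,  # "success", "failure", "pending" or "error"
        "context": "wcet/rocqstat",
        "description": f"p99 WCET change: {pct_change:+.2f}%",
    }

def post_commit_status(owner, repo, sha, payload):
    """POST the status so the timing gate appears as a machine-readable
    check on the pull request; requires GITHUB_TOKEN with repo scope."""
    req = urllib.request.Request(
        f"{API}/repos/{owner}/{repo}/statuses/{sha}",
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",
            "Accept": "application/vnd.github+json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status  # 201 when the status is created
```

Call it from the comparison script so the check flips to "failure" the moment a regression is detected, independent of any PR comment.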

VectorCAST integration strategy

Vector's acquisition of RocqStat opens a path to built-in timing analysis for teams already using VectorCAST. In 2026 you have two pragmatic options:

  • Short term: Continue using the RocqStat CLI integrated into CI as shown above. This gives immediate automation benefits without waiting for full VectorCAST integration.
  • Medium term: Migrate to VectorCAST with RocqStat plugin once integrated — this will centralize test, coverage, and timing analysis in a single toolchain and produce vendor-supported artifacts for certification.

Typical VectorCAST CI flows will look similar: use the VectorCAST automation CLI to run unit/integration tests, trigger the RocqStat timing pass, collect the resulting timing report, and perform the same baseline comparison logic. Consult your VectorCAST + RocqStat documentation for exact CLI flags and the canonical JSON output format.

Handling non-determinism and flaky timing

Timing measurements are inherently noisy. Here are actionable steps to reduce noise and avoid false positives:

  • Warm-up runs: Run multiple warm-up iterations and ignore outliers using interquartile range trimming.
  • Statistical WCET: Use RocqStat's statistical methods (confidence intervals, quantiles) and evaluate p99/p99.999 rather than raw max.
  • Controlled environment: Run on reserved bench hardware or pinned CI runners with real-time priorities if possible; consider portable power and reliable bench setups if you orchestrate physical testbeds.
  • Delta windows: Only enforce small delta thresholds on hot code paths and use larger windows for less critical subsystems.
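The warm-up and outlier-trimming bullet can be implemented with a standard Tukey fence. A sketch using only the Python standard library:

```python
import statistics

def trim_outliers(samples, k=1.5):
    """Discard measurements outside [Q1 - k*IQR, Q3 + k*IQR] before
    computing statistical WCET estimates; k=1.5 is the usual Tukey fence."""
    q1, _, q3 = statistics.quantiles(samples, n=4)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return [s for s in samples if lo <= s <= hi]
```

Feed the trimmed samples into the statistical WCET estimator rather than enforcing thresholds on raw maxima; a single cold-cache run then no longer fails the build.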

Integrating timing checks into release and certification workflows

Timing artifacts are required for safety cases. Automating generation and retention of timing evidence accelerates certification:

  • Attach JSON/HTML and raw traces to release artifacts and change requests.
  • Version baselines with each certified release and include per-release WCET budgets in release notes.
  • Automate generation of trace buckets and statistical summaries used in safety reports; protect those artifacts with zero-trust storage and provenance.
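Provenance can start as simply as hashing everything you archive. A minimal sketch (the manifest layout and file name are illustrative):

```python
import hashlib
import json
from pathlib import Path

def write_provenance_manifest(artifact_dir, out_path):
    """Record a SHA-256 digest for every timing artifact so auditors can
    later verify that archived reports and traces are unmodified."""
    root = Path(artifact_dir)
    manifest = {
        str(p.relative_to(root)): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
    Path(out_path).write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest
```

Archive the manifest alongside the release artifacts; re-hashing at audit time proves the timing evidence was not altered after the build.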

Monitoring and dashboards

Don’t let timing checks be a one-off. Stream WCET metrics to your metrics system (Prometheus, Datadog) via the CI job or a small exporter. Track:

  • p50/p90/p99 WCET over time
  • Number of PRs that fail timing checks
  • Average time to fix timing regressions
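A small exporter can be as simple as rendering the Prometheus text exposition format from the CI job; the metric names below are illustrative.

```python
def render_wcet_metrics(p50, p90, p99, failed_prs):
    """Render WCET trend metrics in the Prometheus text exposition format,
    one gauge per quantile plus a counter of timing-gated PR failures."""
    return "\n".join([
        "# TYPE wcet_milliseconds gauge",
        f'wcet_milliseconds{{quantile="0.50"}} {p50}',
        f'wcet_milliseconds{{quantile="0.90"}} {p90}',
        f'wcet_milliseconds{{quantile="0.99"}} {p99}',
        "# TYPE wcet_failed_prs_total counter",
        f"wcet_failed_prs_total {failed_prs}",
    ]) + "\n"
```

PUT the output to a Pushgateway endpoint (e.g. `/metrics/job/<job-name>`), or write it into a node_exporter textfile-collector directory on a bench host, and plot the quantiles over time.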

Trends to watch

Looking forward from 2026, expect the following trends to shape timing automation:

  • Tool convergence: Vector's acquisition of RocqStat is part of a larger consolidation: timing analysis will be embedded into mainstream test toolchains rather than specialist add-ons.
  • CI-first certification: Certification authorities will accept more machine-generated, versioned evidence from CI pipelines, shortening audit cycles; pairing CI evidence with trusted data strategies helps in regulated audits.
  • ML-assisted regression detection: Tooling will flag meaningful regressions by correlating code changes with timing distributions and test coverage (see related ML/observability patterns).
  • Edge-native timing checks: Running timing passes on physical testbeds orchestrated from CI will become common; expect managed local-first or remote testbed offerings and more robust bench power systems (portable power).

“Timing safety is becoming a critical ...” — Eric Barton, Vector (Automotive World, Jan 2026).

Checklist: implement WCET checks in your CI today

  • Build deterministic firmware artifacts in CI and keep compiler flags consistent.
  • Run RocqStat in containerized form to produce machine-readable JSON and human-friendly HTML reports.
  • Store or derive baselines from the main branch and version them with secure storage.
  • Compare current and baseline metrics with an automated script; fail the build on policy breaches.
  • Surface results to PRs using comments and GitHub Checks or Jenkins statuses.
  • Persist artifacts for certification and feed metrics dashboards for trend analysis (see observability & cost-control playbook).

Actionable takeaways

  • Start small: add a RocqStat step that produces JSON artifacts and a simple compare script in your PR pipeline.
  • Define an explicit timing policy (absolute + relative thresholds) and document it in your repo.
  • Automate baseline generation from main; don’t rely on manual golden files that drift.
  • Plan migration: if you use VectorCAST, budget the medium-term integration of RocqStat into VectorCAST for a single-vendor workflow.

Further reading and resources

  • Vector Informatik — official announcement of the RocqStat acquisition (Automotive World, Jan 16, 2026): https://www.automotiveworld.com
  • RocqStat and VectorCAST vendor docs for CLI usage and JSON schema — consult vendor docs for exact JSON keys and plugin instructions.

Final words & call to action

In 2026, timing analysis belongs in CI. Adding RocqStat-based WCET checks to PR pipelines prevents surprises in hardware, shortens certification cycles, and turns timing safety into a continuous engineering signal rather than a late-stage fire drill.

Start now: copy the GitHub Actions workflow and scripts in this article into a feature branch, run them against a representative build, and tune thresholds until you stop seeing noise. If you’d like a jumpstart, deployed.cloud provides CI templates, Jenkinsfiles and a hands-on workshop to integrate RocqStat/VectorCAST into your delivery pipeline. Contact us or check our repo to get a working starter pipeline you can drop into your project.
