
Benchmark

Arcade

A modern action suite that stresses long-horizon control, multi-task decision making, and latency resilience.


Suite coverage

Fourteen arenas that mix physics-based navigation, timed puzzles, and cooperative objectives.

Metrics monitored

  • Completion rate within dynamic time budgets.
  • Risk-aware policy shifts after penalty spikes.
  • Coordination efficiency when tasks are chained together.
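As an illustration of the first metric, completion rate within a time budget could be computed roughly as follows. This is a minimal sketch, not the suite's actual scoring code: the `Episode` record, its field names, and the rule that an episode counts only if it finishes at or under its own budget are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class Episode:
    """Hypothetical per-episode record; field names are illustrative."""
    finished: bool     # did the agent reach the objective?
    elapsed_s: float   # wall-clock time the episode took
    budget_s: float    # time budget assigned to this episode (may vary)


def completion_rate(episodes: Sequence[Episode]) -> float:
    """Fraction of episodes finished at or under their own time budget."""
    if not episodes:
        return 0.0
    completed = sum(
        1 for e in episodes if e.finished and e.elapsed_s <= e.budget_s
    )
    return completed / len(episodes)
```

Because each episode carries its own `budget_s`, the same function covers budgets that change from arena to arena or from run to run.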

Current focus

Reducing action latency variance so agent decisions remain stable under bursty system load.
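One way to track this is to summarize per-action latency samples by their variance and a tail percentile. The sketch below is an assumption-laden example, not the benchmark's own tooling: the function name `latency_summary`, millisecond units, the use of population variance, and the nearest-rank p95 are choices made here for illustration.

```python
import math
import statistics
from typing import Dict, Sequence


def latency_summary(latencies_ms: Sequence[float]) -> Dict[str, float]:
    """Summarize action latencies (hypothetical helper, milliseconds)."""
    if not latencies_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(latencies_ms)
    # Nearest-rank 95th percentile: smallest sample with >= 95% at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "mean_ms": statistics.mean(ordered),
        "variance_ms2": statistics.pvariance(ordered),  # population variance
        "p95_ms": ordered[rank - 1],
    }
```

Comparing `variance_ms2` and `p95_ms` between an idle run and a run under bursty load gives a rough picture of how much decision timing degrades.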