Agent evaluation

Design a task battery for an agent

Creates a repeatable evaluation set for an agent by covering easy, normal, adversarial, and regression-sensitive tasks.

  • agent evals
  • Theomatica
  • task battery
  • quality assurance

Prompt preview

The full prompt will be available when the prompt library launches.

For now, this entry is indexed by title, use case, summary, and tags. The complete reusable prompt stays private until the prompt library is released.