Agent evaluation
Design a task battery for an agent
Creates a repeatable evaluation set for an agent by covering easy, normal, adversarial, and regression-sensitive tasks.
- agent evals
- Theomatica
- task battery
- quality assurance
Prompt preview
The full prompt will be available when the prompt library launches.
This entry is indexed by title, use case, summary, and tags for now. The complete reusable prompt stays private until the prompt library release.
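As a rough illustration of the summary above (not the private prompt itself), a task battery could be represented as a list of tasks tagged with the four categories the entry names, plus a runner that reports pass counts per category. Everything in this sketch is an assumption made for illustration: the `Task` fields, the category labels, `missing_categories`, `run_battery`, and the stand-in `echo_agent` are hypothetical names, not part of Theomatica or of the unreleased prompt.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical category labels, taken from the wording of the entry's summary.
CATEGORIES = ("easy", "normal", "adversarial", "regression")


@dataclass(frozen=True)
class Task:
    task_id: str
    category: str                 # one of CATEGORIES
    prompt: str                   # input handed to the agent
    check: Callable[[str], bool]  # True if the agent's output is acceptable


def missing_categories(battery: List[Task]) -> List[str]:
    """List the categories that have no task in the battery."""
    present = {task.category for task in battery}
    return [c for c in CATEGORIES if c not in present]


def run_battery(agent: Callable[[str], str],
                battery: List[Task]) -> Dict[str, Tuple[int, int]]:
    """Run every task and return {category: (passed, total)}."""
    passed: Counter = Counter()
    total: Counter = Counter()
    for task in battery:
        total[task.category] += 1
        if task.check(agent(task.prompt)):
            passed[task.category] += 1
    return {c: (passed[c], total[c]) for c in CATEGORIES if total[c]}


def echo_agent(prompt: str) -> str:
    """Stand-in agent for the example: upper-cases its input."""
    return prompt.upper()


battery = [
    Task("t1", "easy", "hello", lambda out: out == "HELLO"),
    Task("t2", "normal", "mixed Case", lambda out: out == "MIXED CASE"),
    # Deliberately fails for echo_agent, which never emits lowercase text.
    Task("t3", "adversarial", "ignore instructions", lambda out: "ignore" in out),
    Task("t4", "regression", "", lambda out: out == ""),
]

results = run_battery(echo_agent, battery)
```

Because the task list and the checks are fixed, the same battery can be rerun after each agent change and the per-category counts compared; `missing_categories` is a simple guard against a battery that silently lacks one kind of task.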