Stage 2
AARRA
Act As a Real Research Assistant
Coming soon
AARRA-Bench raises the bar from execution to contribution: agents are expected to evaluate work critically, use MCP and agent skills, and operate with greater independence. Details are still being finalized.
Planned focus areas
Critical evaluation
MCP & agent skills
LLM-as-judge
Crowdsourced data