interaction, medium
conflicting-documents-resolution
Resolve conflicting experimental configurations across multiple documents.
Why this matters
This task evaluates whether the agent can resolve conflicting information across multiple external artifacts instead of relying on a single document.
Agent instruction
You are acting as a research assistant.
Please review the provided materials in /app:
- paper.md
- README.md
- config.yaml
- logs.txt
Your task is to identify inconsistencies in the reported experimental setup and determine the actual configuration used.
Write your result to /app/review.txt.
The file must contain exactly 4 lines:
- A score from 0 to 5 for documentation consistency
- A short issue label
- Whether the reported documentation is reliable: yes or no
- A brief explanation
The agent sees only this instruction and the files placed in its container. Reference solutions and verifier tests are intentionally hidden.
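The four-line output format described in the agent instruction can be sketched roughly as follows. This is a minimal illustration, not the reference solution or the verifier's logic: the helper names `format_review` and `write_review` are hypothetical, the values are placeholders, and it assumes each line holds the bare value with no leading dash, which the instruction does not state explicitly.

```python
from pathlib import Path


def format_review(score: int, label: str, reliable: bool, explanation: str) -> str:
    """Build the review text: score, issue label, yes/no, explanation (one per line)."""
    if not 0 <= score <= 5:
        raise ValueError("score must be between 0 and 5")
    fields = [
        str(score),
        label.strip(),
        "yes" if reliable else "no",
        explanation.strip(),
    ]
    for field in fields:
        # Each field must occupy exactly one line so the file has exactly 4 lines.
        if not field or "\n" in field:
            raise ValueError("each field must be non-empty and fit on one line")
    return "\n".join(fields) + "\n"


def write_review(path: str, text: str) -> None:
    """Write the review text to the given path (e.g. /app/review.txt)."""
    Path(path).write_text(text, encoding="utf-8")


# Placeholder values for illustration only; they are not a solution to the task.
example = format_review(2, "example-issue-label", False, "Example explanation.")
```

Calling `write_review("/app/review.txt", example)` would then produce the file. Validating the line count before writing is a design choice meant to catch a multi-line explanation early, since the instruction requires exactly 4 lines.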