interaction medium

conflicting-documents-resolution

Resolve conflicting experimental configurations across multiple documents.

Why this matters

This task evaluates whether the agent can resolve conflicting information across multiple external artifacts instead of relying on a single document.

Agent instruction

You are acting as a research assistant.

Please review the provided materials in /app:

  • paper.md
  • README.md
  • config.yaml
  • logs.txt

Your task is to identify inconsistencies in the reported experimental setup and determine the actual configuration used.
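One way to surface inconsistencies is to collect the values each document reports for a given setting and flag any setting where the documents disagree. The sketch below assumes the values have already been extracted by hand or by some parser into plain dictionaries; the function name `find_conflicts` and the input shape are hypothetical and not part of the task. It only detects disagreement; deciding which source reflects the configuration actually used is left to the reviewer.

```python
from collections import defaultdict


def find_conflicts(claims):
    """Return the parameters that are reported with more than one distinct value.

    `claims` maps a source name (e.g. a file name) to a dict of
    {parameter name: reported value}. This input shape is an assumption
    made for illustration only.
    """
    by_param = defaultdict(dict)
    for source, params in claims.items():
        for name, value in params.items():
            by_param[name][source] = value

    # Compare string forms so that unhashable values do not break the check.
    return {
        name: per_source
        for name, per_source in by_param.items()
        if len({str(v) for v in per_source.values()}) > 1
    }
```

A parameter that appears in only one source is not reported as a conflict, since there is nothing to compare it against.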

Write your result to /app/review.txt.

The file must contain exactly 4 lines:

  1. A score from 0 to 5 for documentation consistency
  2. A short issue label
  3. Whether the reported documentation is reliable: yes or no
  4. A brief explanation
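The four-line format above can be checked before the file is written. The following is a minimal sketch, not the hidden verifier: it assumes the score is a whole number from 0 to 5 and that line 3 is the literal lowercase word yes or no, neither of which the instruction states explicitly. The function name `check_review` is hypothetical.

```python
def check_review(text):
    """Return a list of format problems found in a review.txt body.

    An empty list means the text matches the assumed format:
    integer score 0-5, non-empty label, yes/no, non-empty explanation.
    """
    lines = text.splitlines()
    if len(lines) != 4:
        return [f"expected exactly 4 lines, found {len(lines)}"]

    score, label, reliable, explanation = (line.strip() for line in lines)
    problems = []
    if score not in {"0", "1", "2", "3", "4", "5"}:
        problems.append("line 1 must be a score from 0 to 5")
    if not label:
        problems.append("line 2 must be a short issue label")
    if reliable not in {"yes", "no"}:
        problems.append("line 3 must be yes or no")
    if not explanation:
        problems.append("line 4 must be a brief explanation")
    return problems
```

Running the check on the intended contents before writing /app/review.txt catches an extra blank line or a missing field early.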

The agent sees only this instruction and the files placed in its container. Reference solutions and verifier tests are intentionally hidden.