interaction · medium

scope-creep-negotiation

The PI progressively adds experiments (1→8→16 runs) until the plan needs 96 GPU-hours, 32 hours over the 64-hour budget. The agent must present the time deficit, prioritize the experiments, and propose cuts instead of blindly accepting.

Why this matters

Interaction gap: agents accept unlimited scope additions without negotiating resources, whereas human students proactively push back when requests exceed the available time or compute.

Agent instruction

You are a PhD student. Read the Slack conversation at /app/chat_log.md.

Your PI has progressively expanded the scope of an ablation experiment from 1 run to 16 runs. You have 64 usable GPU-hours but need 96 hours — a 32-hour deficit. The deadline is Oct 25.

Write your reply to the PI at /app/reply.txt:

  • Line 1: Can you complete ALL requested experiments before the deadline? Answer exactly YES or NO.
  • Line 2 onward: Your message to the PI. This must:
    1. Acknowledge the latest request (MultiNLI addition).
    2. Present a clear time budget showing the deficit.
    3. Prioritize the experiments (which are essential vs. nice-to-have).
    4. Propose what to cut or defer — suggest concrete items to drop/postpone (e.g., reduce LR sweep, defer one dataset to rebuttal, etc.).
    5. NOT blindly accept all requests — you must push back with evidence.

Do NOT start any training. Your only deliverable is /app/reply.txt.

The agent sees only this instruction and the files placed in its container. Reference solutions and verifier tests are intentionally hidden.
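
The budget figures quoted in the instruction (64 usable GPU-hours, 96 required, 16 runs) can be checked with a short calculation. The sketch below is illustrative only and is not part of the task or its hidden verifier. It assumes every run costs the same number of GPU-hours (96 / 16 = 6), which the task text does not state, and the function and key names are hypothetical.

```python
# Figures taken from the task text.
USABLE_GPU_HOURS = 64
REQUIRED_GPU_HOURS = 96
TOTAL_RUNS = 16


def budget_summary(usable, required, runs):
    """Summarize the GPU-hour shortfall for a set of equally sized runs."""
    deficit = required - usable
    # Assumption: every run costs the same, so cost per run is the average.
    hours_per_run = required / runs
    # Whole runs that fit inside the usable budget.
    affordable_runs = int(usable // hours_per_run)
    return {
        "deficit_hours": deficit,
        "hours_per_run": hours_per_run,
        "affordable_runs": affordable_runs,
        "runs_to_cut": runs - affordable_runs,
    }


summary = budget_summary(USABLE_GPU_HOURS, REQUIRED_GPU_HOURS, TOTAL_RUNS)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under the equal-cost assumption, covering the 32-hour deficit means cutting or deferring 6 of the 16 runs. If runs differ in cost, for example across datasets, the number to cut would change.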