Inputs: the initial retrieval results (top-K retrieved video indices) and a video breakdowns JSONL file (summaries, objects, actions, scene tags, and frame captions per video). It then prompts an LLM to compare pairs of ...
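
As a rough illustration of how the two inputs described above might be joined before any LLM prompting, the sketch below parses JSONL breakdown records and looks up the records for a list of top-K video indices. The field names (`video_index`, `summary`, `objects`, `actions`, `scene_tags`, `frame_captions`), the helper names, and the sample records are all assumptions made for this example; the actual schema is not specified here.

```python
import json
from typing import Any, Dict, Iterable, List


def load_breakdowns(lines: Iterable[str]) -> Dict[int, Dict[str, Any]]:
    """Parse JSONL lines into a dict keyed by video index.

    Assumes each record carries a hypothetical "video_index" field.
    """
    breakdowns: Dict[int, Dict[str, Any]] = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL input
        record = json.loads(line)
        breakdowns[record["video_index"]] = record
    return breakdowns


def breakdowns_for_top_k(
    top_k: List[int], breakdowns: Dict[int, Dict[str, Any]]
) -> List[Dict[str, Any]]:
    """Return breakdown records in retrieval order, skipping missing indices."""
    return [breakdowns[i] for i in top_k if i in breakdowns]


# Hypothetical sample data, for illustration only.
sample_jsonl = [
    json.dumps({
        "video_index": 0,
        "summary": "A person chops vegetables.",
        "objects": ["knife", "carrot"],
        "actions": ["chopping"],
        "scene_tags": ["kitchen"],
        "frame_captions": ["hands holding a knife"],
    }),
    json.dumps({
        "video_index": 1,
        "summary": "A dog runs on a beach.",
        "objects": ["dog"],
        "actions": ["running"],
        "scene_tags": ["beach"],
        "frame_captions": ["dog near the water"],
    }),
    json.dumps({
        "video_index": 2,
        "summary": "A cyclist rides uphill.",
        "objects": ["bicycle"],
        "actions": ["cycling"],
        "scene_tags": ["road"],
        "frame_captions": ["cyclist on a slope"],
    }),
]

breakdowns = load_breakdowns(sample_jsonl)
top_k = [2, 0, 5]  # index 5 has no breakdown record and is skipped
selected = breakdowns_for_top_k(top_k, breakdowns)
```

When reading from disk, an open file handle can be passed to `load_breakdowns` directly, since it only iterates over lines. Keeping the lookup keyed by video index preserves the retrieval order of the top-K list, which a later comparison step would presumably rely on.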