🎥 Supplementary Videos
MPE — Adversary Model Stochasticity
MPE Adversary: with the seed fixed, we evaluate three times to show the model's inherent stochasticity. DiffFSP learns more diverse strategies.
MPE — Tag Predator–Prey
Qualitative predator–prey results under sparse rewards show competitive gameplay.
RaceTrack — Robustness to Unseen Opponents
Trained agents (yellow) vs. unseen agents (blue). Agents learn to set up overtakes through cornering; several exhibit block-pass behavior.
RaceTrack — Robustness: Failure Modes
Left: QSMFSP fails a lane change, rear-ending the opponent (local observations only). Right: DiffFSP infers agents ahead and briefly violates track boundaries to overtake.
RaceTrack — Overtake (1)
The attacker overtakes at a turn and immediately blocks to prevent a re-overtake.
RaceTrack — Overtake (2)
Another strategic overtake on a curve followed by a blocking maneuver.
RaceTrack — Overtake (1v1)
1v1 overtake at a curve with immediate defensive positioning.
RaceTrack — Block (1)
The defender executes sustained blocking to prevent an overtake.
RaceTrack — Defensive Driving (1)
The attacker maintains its distance and matches the defender's speed, making occasional shoulder checks without attempting an overtake.
RaceTrack — Overtake Fail (1)
The attacker aborts an overtake, braking late to avoid a rear-end collision.
RaceTrack — Brake Check (Follower)
The defender performs a brake check; the attacker reacts defensively to avoid collision.