Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting

Yiming Xu; Hao Cheng; Monika Sester

doi:10.48550/arXiv.2402.03981

Details

Original language	English
Title of host publication	2024 IEEE Intelligent Vehicles Symposium (IV)
Pages	2397-2404
Number of pages	8
ISBN (electronic)	979-8-3503-4881-1
Publication status	Published - 6 Feb 2024

Publication series

Name	IEEE Intelligent Vehicles Symposium, Proceedings
ISSN (Print)	1931-0587
ISSN (electronic)	2642-7214

Abstract

In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.

ASJC Scopus subject areas

Computer Science (all)
Computer Science Applications
Engineering (all)
Automotive Engineering
Mathematics (all)
Modeling and Simulation

Cite this

Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. / Xu, Yiming; Cheng, Hao; Sester, Monika.
2024 IEEE Intelligent Vehicles Symposium (IV). 2024. p. 2397-2404 (IEEE Intelligent Vehicles Symposium, Proceedings).

Publication: Chapter in book/report/conference proceeding › Conference contribution › Research › Peer review

Xu, Y, Cheng, H & Sester, M 2024, Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. in 2024 IEEE Intelligent Vehicles Symposium (IV). IEEE Intelligent Vehicles Symposium, Proceedings, pp. 2397-2404. https://doi.org/10.48550/arXiv.2402.03981, https://doi.org/10.1109/IV55156.2024.10588486

Xu, Y., Cheng, H., & Sester, M. (2024). Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. In 2024 IEEE Intelligent Vehicles Symposium (IV) (pp. 2397-2404). (IEEE Intelligent Vehicles Symposium, Proceedings). https://doi.org/10.48550/arXiv.2402.03981, https://doi.org/10.1109/IV55156.2024.10588486

Xu Y, Cheng H, Sester M. Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. In 2024 IEEE Intelligent Vehicles Symposium (IV). 2024. p. 2397-2404. (IEEE Intelligent Vehicles Symposium, Proceedings). doi: 10.48550/arXiv.2402.03981, 10.1109/IV55156.2024.10588486

Xu, Yiming ; Cheng, Hao ; Sester, Monika. / Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting. 2024 IEEE Intelligent Vehicles Symposium (IV). 2024. p. 2397-2404 (IEEE Intelligent Vehicles Symposium, Proceedings).


@inproceedings{fe1c66175b504b3494eec26ea859e19e,

title = "Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting",

abstract = "In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.",

keywords = "cs.CV",

author = "Yiming Xu and Hao Cheng and Monika Sester",

year = "2024",

month = feb,

day = "6",

doi = "10.48550/arXiv.2402.03981",

language = "English",

isbn = "979-8-3503-4882-8",

series = "IEEE Intelligent Vehicles Symposium, Proceedings",

pages = "2397--2404",

booktitle = "2024 IEEE Intelligent Vehicles Symposium (IV)",

}


TY - GEN

T1 - Controllable Diverse Sampling for Diffusion Based Motion Behavior Forecasting

AU - Xu, Yiming

AU - Cheng, Hao

AU - Sester, Monika

PY - 2024/2/6

Y1 - 2024/2/6

N2 - In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.

AB - In autonomous driving tasks, trajectory prediction in complex traffic environments requires adherence to real-world context conditions and behavior multimodalities. Existing methods predominantly rely on prior assumptions or generative models trained on curated data to learn road agents' stochastic behavior bounded by scene constraints. However, they often face mode averaging issues due to data imbalance and simplistic priors, and could even suffer from mode collapse due to unstable training and single ground truth supervision. These issues lead the existing methods to a loss of predictive diversity and adherence to the scene constraints. To address these challenges, we introduce a novel trajectory generator named Controllable Diffusion Trajectory (CDT), which integrates map information and social interactions into a Transformer-based conditional denoising diffusion model to guide the prediction of future trajectories. To ensure multimodality, we incorporate behavioral tokens to direct the trajectory's modes, such as going straight, turning right or left. Moreover, we incorporate the predicted endpoints as an alternative behavioral token into the CDT model to facilitate the prediction of accurate trajectories. Extensive experiments on the Argoverse 2 benchmark demonstrate that CDT excels in generating diverse and scene-compliant trajectories in complex urban settings.

KW - cs.CV

UR - http://www.scopus.com/inward/record.url?scp=85199753378&partnerID=8YFLogxK

U2 - 10.48550/arXiv.2402.03981

DO - 10.48550/arXiv.2402.03981

M3 - Conference contribution

SN - 979-8-3503-4882-8

T3 - IEEE Intelligent Vehicles Symposium, Proceedings

SP - 2397

EP - 2404

BT - 2024 IEEE Intelligent Vehicles Symposium (IV)

ER -

By the same authors

Multi-modal Land Cover Classification of Historical Aerial Images and Topographic Maps: A Comparative Study

Visualization of Space Occupancy Uncertainty in a 3D Voxel-based Urban Model

Gap completion in point cloud scene occluded by vehicles using SGC-Net

Investigating Effects of Future Path Visualisation on Path Choices During Collision Encounters

3D Uncertain Implicit Surface Mapping Using GMM and GP