OVERVIEW


Swarm UAV autonomous flight for Embodied Long-Horizon (ELH) tasks is crucial for advancing the low-altitude economy. However, due to dataset limitations, existing methods focus only on specific basic tasks and fail in real-world ELH deployments. ELH tasks are not mere concatenations of basic tasks: they require handling long-term dependencies, maintaining embodied persistent states, and adapting to dynamic goal shifts. This paper presents U2UData+, the first large-scale swarm UAV autonomous flight dataset for ELH tasks and the first scalable platform for online swarm UAV data collection and closed-loop algorithm verification. The dataset is captured by 15 UAVs flying collaboratively and autonomously on ELH tasks, comprising 12 scenes, 720 traces, 120 hours of flight, 600-second trajectories, 4.32M LiDAR frames, and 12.96M RGB frames. The dataset also includes brightness, temperature, humidity, smoke, and airflow values covering all flight routes. The platform supports customization of simulators, UAVs, sensors, flight algorithms, formation modes, and ELH tasks. Through a visual control window, it lets users collect customized datasets via one-click online deployment and verify algorithms through closed-loop simulation. U2UData+ also introduces an ELH task for wildlife conservation and provides comprehensive benchmarks with 9 SOTA models.
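
For concreteness, one time-synchronized sample in such a dataset could be organized as below. This is an illustrative sketch only: the field names and shapes are hypothetical, but every modality listed (LiDAR, RGB, and the five environmental values) comes from the description above.

import numpy as np

# Hypothetical layout of one time-synchronized sample; field names and
# shapes are illustrative assumptions, not the dataset's actual schema.
sample = {
    "uav_id": 3,                                  # one of the 15 UAVs
    "timestamp": 12.4,                            # seconds into the 600 s trace
    "lidar": np.zeros((120_000, 4)),              # x, y, z, intensity
    "rgb": np.zeros((1080, 1920, 3), dtype=np.uint8),
    "environment": {                              # values along the flight route
        "brightness": 0.8, "temperature": 21.5, "humidity": 0.35,
        "smoke": 0.0, "airflow": 2.1,
    },
}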

SIMULATOR: U2USIM+

U2USim+ features 4 terrains, 7 weather conditions, 8 sensor types, a 9 km² flight area, 58 forest vegetation assets, and 15 animals.

Figure: U2USim weather conditions (Sunny, Rain, Snow, Sandstorm, Fog, Thunder, Wind) and terrains (e.g., Landscape, Lake Side).
U2USim+ supports customization along six axes: simulator, UAV, sensor, flight algorithm, formation mode, and ELH task, as sketched below.
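
To make these axes concrete, a one-click collection task on such a platform might be specified along the following lines. This is an illustrative sketch only; the platform's actual configuration schema is not documented here, and every key and value below is a hypothetical name.

# Hypothetical one-click collection spec covering the six customization
# axes; keys and values are assumptions, not the platform's real schema.
collection_task = {
    "simulator": {"terrain": "lake_side", "weather": "fog"},
    "uavs": {"count": 15},
    "sensors": [{"type": "lidar"}, {"type": "rgb"}],
    "flight_algorithm": "collaborative_tracking",
    "formation_mode": "autonomous",   # discipline | fixed | autonomous
    "elh_task": "wildlife_conservation",
    "trace_seconds": 600,
}

Under this reading, closed-loop verification would replay such a spec against a candidate algorithm inside the simulator rather than against pre-recorded data.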

Figure: U2USim+ and its visual control window: the scalable platform for online UAV data collection and closed-loop algorithm verification. Additional panels show wind throughout the map and scalable animal behavior.

DATASET: U2UDATA+

Dataset Scale

U2UData+ comprises 15 UAVs, 12 scenes, 720 traces, 120 hours of total flight time, 4.32M LiDAR frames, and 12.96M RGB frames.


Dataset Collection

Figure: Sensor positions and sensor types on each UAV, and the three formation modes: discipline, fixed, and autonomous.


Dataset Analysis

Table: A detailed comparison of swarm UAV datasets. "-" indicates that specific information is not provided. DF: discipline formation mode; FF: fixed formation mode; AF: autonomous formation mode. U2USim★ denotes the scalable U2USim.
Figure: ESTN, the number of trajectories in each scene.
Figure: ET-Length, the length of each trajectory; TNT, the total length of all trajectories.
Figure: A detailed comparison of data size between U2UData+ and existing swarm UAV datasets.

BENCHMARK: CODE

Figure: Swarm UAV collaborative tracking benchmark for ELH tasks on the U2UData+ dataset.

This benchmark covers 9 SOTA collaborative tracking algorithms: No Fusion, Late Fusion, Early Fusion, When2Com (Liu et al. 2020), DiscoNet (Li et al. 2021), V2VNet (Wang et al. 2020), V2X-ViT (Xu et al. 2022b), CoBEVT (Xu et al. 2022a), and Where2comm (Hu et al. 2022). The benchmark will be updated continuously.
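
To make the non-learned baselines concrete: early fusion merges raw point clouds before a single detection pass, late fusion merges per-UAV detections afterwards, and the intermediate methods (When2Com, DiscoNet, V2VNet, V2X-ViT, CoBEVT, Where2comm) exchange learned features in between. Below is a minimal sketch of that distinction; the detector callable and all shapes are placeholder assumptions, not the benchmark's actual code.

import numpy as np

def transform(points, pose):
    # Map (N, 3) points from a UAV's frame to the world frame using a
    # 4x4 pose matrix.
    return points @ pose[:3, :3].T + pose[:3, 3]

def early_fusion(point_clouds, poses, detector):
    # Early fusion: merge raw LiDAR from all UAVs in one world frame,
    # then run a single detector on the combined cloud.
    merged = np.concatenate([transform(p, T) for p, T in zip(point_clouds, poses)])
    return detector(merged)

def late_fusion(point_clouds, poses, detector):
    # Late fusion: each UAV detects independently; per-UAV detections
    # (here assumed to be (M, 3) box centers) are pooled in the world
    # frame. A real pipeline would also apply non-maximum suppression.
    return np.concatenate([transform(detector(p), T) for p, T in zip(point_clouds, poses)])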

EMBODIED LONG-HORIZON TASKS

Figure: Collaborative communication, collaborative perception, collaborative localization, and collaborative task re-allocation in ELH tasks.

Embodied Long-Horizon (ELH) tasks are crucial for advancing the low-altitude economy. ELH tasks are complex, multi-step tasks that require sustained embodied planning, sequential decision-making, and extended execution over prolonged periods to achieve a final goal.
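
As an illustrative sketch only (not an API from this project), an ELH task can be modeled as an ordered list of sub-goals plus persistent embodied state, where a dynamic goal shift replaces the remaining sub-goals without discarding accumulated state. All names below are hypothetical.

from dataclasses import dataclass, field

# Hypothetical model of an ELH task: sub-goals consumed in order,
# persistent state carried across steps, and dynamic goal shifts.
@dataclass
class ELHTask:
    subgoals: list                                # e.g. ["search", "track", "report"]
    state: dict = field(default_factory=dict)     # persistent embodied state

    def step(self, observation):
        # Long-term dependency: decisions condition on accumulated
        # state, not only the current observation.
        self.state[len(self.state)] = observation
        if self.subgoals and self.completed(self.subgoals[0]):
            self.subgoals.pop(0)                  # advance to the next sub-goal

    def shift_goal(self, new_subgoals):
        # Dynamic goal shift: replace the remaining sub-goals while
        # keeping the persistent state intact.
        self.subgoals = list(new_subgoals)

    def completed(self, subgoal):
        # Placeholder success check; a real system would evaluate the
        # sub-goal against perception results.
        return False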

TUTORIAL: DEMO

U2UData Tutorial Demo.
U2UData+ Tutorial Demo.

This project provides demo videos in 1080P and 4K resolutions. Due to webpage video size limitations, the embedded videos may appear somewhat blurry.

CITE

@inproceedings{feng2025u2udata,
title={U2UData+: A Scalable Swarm UAVs Autonomous Flight Dataset for Embodied Long-horizon Tasks},
author={Feng, Tongtong and Wang, Xin and Han, Feilin and Zhang, Leping and Zhu, Wenwu},
booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
year={2025}
}
@article{feng2025embodied,
title={Embodied AI: From LLMs to World Models},
author={Feng, Tongtong and Wang, Xin and Jiang, Yu-Gang and Zhu, Wenwu},
journal={IEEE Circuits and Systems Magazine},
year={2025}
}
@article{feng2025evoagent,
title={EvoAgent: Agent Autonomous Evolution with Continual World Model for Long-Horizon Tasks},
author={Feng, Tongtong and Wang, Xin and Zhou, Zekai and Wang, Ren and Zhan, Yuwei and Li, Guangyao and Li, Qing and Zhu, Wenwu},
journal={arXiv preprint arXiv:2502.05907},
year={2025}
}
@inproceedings{feng2024u2udata,
title={U2UData: A Large-scale Cooperative Perception Dataset for Swarm UAVs Autonomous Flight},
author={Feng, Tongtong and Wang, Xin and Han, Feilin and Zhang, Leping and Zhu, Wenwu},
booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
pages={7600--7608},
year={2024}
}
@inproceedings{han2024u2usim,
title={U2USim: A UAV Telepresence Simulation Platform with Multi-agent Sensing and Dynamic Environment},
author={Han, Feilin and Zhang, Leping and Wang, Xin and Zhao, Ke-Ao and Zhong, Ying and Su, Ziyi and Feng, Tongtong and Zhu, Wenwu},
booktitle={Proceedings of the 32nd ACM International Conference on Multimedia},
pages={11258--11260},
year={2024}
}
@inproceedings{feng2024multi,
author={Feng, Tongtong and Li, Qing and Wang, Xin and Wang, Mingzi and Li, Guangyao and Zhu, Wenwu},
title={Multi-weather cross-view geo-localization using denoising diffusion models},
year = {2024},
booktitle={Proceedings of the 2nd Workshop on UAVs in Multimedia: Capturing the World from a New Perspective},
pages={35--39}
}

TEAM

Tongtong Feng
Xin Wang
Feilin Han
Leping Zhang
Wenwu Zhu

AEAI Lab ©2025