
Chinese researchers publish the largest drone tracking dataset. In doing so, they lay the foundation for drone surveillance.

Despite all the criticism and warnings, the development of autonomous surveillance capabilities has made steady progress in recent years. Surveillance cameras equipped with AI image analysis are being deployed at airports, subway stations, or simply on the street. One of the pioneers of autonomous surveillance technology is China.

However, automated aerial surveillance using drones has so far been almost impossible. Compared with footage from static ground cameras, drone footage has unusual viewing angles, more motion blur, and varying resolutions because drones fly at different speeds and altitudes.

AI systems trained for image analysis on conventional image or video datasets therefore fail when analyzing drone footage.


Over 4000 videos for drone tracking

Researchers from the Chinese Academy of Sciences, Shenzhen Research Institute of Big Data and Chinese University of Hong Kong, Shenzhen have now released WebUAV-3M.

The dataset includes 4485 drone videos in which objects from 216 different categories, such as people, bears, bicycles, and agricultural machinery, were annotated jointly by humans and machines. According to the team, it is the largest public drone tracking dataset.

The videos are also accompanied by a linguistic description of their content in text and audio form. The researchers hope this will enable the development of multimodal tracking systems. Linguistic descriptions are also beneficial for interaction with blind people, they say.

WebUAV-3M includes labeled videos as well as linguistic descriptions of the content. | Image: Zhang et al.

In initial tests, image analysis systems trained with WebUAV-3M significantly improved their tracking accuracy on drone footage. Even with WebUAV-3M, however, accuracy remains below 50 percent for now.

AI surveillance: China leads the way in research

A dataset like WebUAV-3M can also be used outside of surveillance, but funding from government institutions suggests that surveillance is the goal of the research.


The dystopia of constant surveillance of all movements and intentions, driven by algorithms that combine digital traces and information collected via (flying) cameras, is getting closer in some countries. This development is driven by research from China, the EU, and the USA.

For example, the Center for Security and Emerging Technology recently examined the evolution of surveillance technology in AI research between 2015 and 2019. The team analyzed a total of 100 million English-language publications, grouped by six tasks:

  • face recognition,
  • person recognition,
  • action recognition,
  • emotion recognition,
  • crowd counting,
  • and face guard detection against surveillance technology.

Facial recognition, crowd counting, and detecting attempts to protect oneself from facial recognition were among the fastest-growing research areas during the time studied. Overall, about 5.5 percent of all papers studied dealt with surveillance technology.

Leading the way in research on this technology: China. During the period studied, the share of Chinese research in AI papers increased from 33 to 37 percent, and the share of AI work specialized in surveillance technologies increased from 36 to 42 percent. In the six categories studied, China was more than 20 percent ahead of the EU (second) and the U.S. (third).

Join our community
Join the DECODER community on Discord, Reddit or Twitter - we can't wait to meet you.

Max is managing editor at THE DECODER. As a trained philosopher, he deals with consciousness, AI, and the question of whether machines can really think or just pretend to.