“Greater Bay Area Data Super Initiative” Launched in Dongguan, Data Annotation Industrial Park Makes Its Debut

On Dec 2, 2025, Guangdong’s first High-Quality Dataset Competition and Dongguan Data Annotation Park launched in Dongguan, featuring 6 platforms and 22 settled enterprises to boost GBA’s data-driven industrial development.

On December 2, 2025, the launching ceremony of the first Guangdong High-Quality Dataset Innovation Competition was held in Dongguan. Attending the event were Zhang Guozhi, Member of the Standing Committee of the Provincial Party Committee and Vice Governor of Guangdong Province; Lü Chengxi, Deputy Secretary of the Dongguan Municipal Party Committee and Mayor of Dongguan; and Wang Tianguang, Secretary of the Party Leadership Group and Director of the Provincial Administration of Government Services and Data. Also present were Li Jun, a Dongguan city leader; officials in charge from the Provincial Administration of Government Services and Data and the provincial Departments of Education, Science and Technology, Industry and Information Technology, and Human Resources and Social Security; the heads of the government services and data administrations of the province’s 21 cities at or above prefecture level; officials in charge of pilot zones for data industry agglomeration; and officials from relevant Dongguan departments, together with representatives of enterprises, industry organizations, fund institutions and universities. Nearly 500 people took part in total.

Greater Bay Area Data Super Initiative

01 First High-Quality Dataset Competition Launched to Explore New Paths for Data Value Transformation

At present, data has become a core production factor driving industrial transformation, and high-quality datasets are the “source of living water” for releasing data value. As a pacesetter, forerunner and pilot zone of reform and opening up, Guangdong, based on its actual conditions, is fully committed to building a new highland for digital and intelligent development, and takes the lead in holding the High-Quality Dataset Innovation Competition.

This High-Quality Dataset Innovation Competition represents a pioneering “from 0 to 1” leap in China. Adhering to the principles of “real demands, real data, real solutions and real applications”, it adopts an innovative “phased listing and year-round competition” model through the “tackling key problems by open recruitment” competition mechanism. Focusing on the development needs of key fields such as industrial manufacturing, medical and health care, scientific and technological innovation, urban governance and transportation, it first identifies scenarios and then secures data. With the competition as the carrier and starting point, it aims to explore and build a number of high-quality, reusable datasets, providing “fuel” for artificial intelligence model training and industry applications.

At the launching ceremony, the first batch of high-quality dataset competition topics from key fields such as energy, biomedicine, finance, transportation, low-altitude economy and education were officially “released”. Units including China Southern Power Grid Co., Ltd., Guangzhou (National) Laboratory, Guangdong Provincial People’s Hospital, Ping An Property & Casualty Insurance Company of China, Ltd., Capital Bio-Medical Holdings Co., Ltd., Guangzhou Kingmed Diagnostics Group Co., Ltd., Guangdong Taiyi High-tech Development Co., Ltd., Guangdong Vocational Education Bridge Data Technology Co., Ltd., Dongguan Artificial Intelligence and Digital Economy Co., Ltd., and Dongguan Aohai Technology Co., Ltd. issued the first batch of “demand lists” for high-quality datasets.

In the next step, the competition will build a complete closed loop of “data supply – technology R&D – scenario implementation – industrial upgrading” through the “1+3+N” organizational system consisting of “1 listing mechanism + 3 competition stages + N supply-demand docking meetings”. By promoting application, integration and industrial development through the competition, while facilitating the replication and promotion of mature data application scenarios, it will better tap the digital and intelligent potential of emerging fields such as low-altitude economy and industrial internet, give full play to the enabling role of high-quality datasets, effectively release the precious value of data elements, and actively contribute “Guangdong experience” to the construction of a national integrated data market and the prosperous development of the data industry ecosystem.

02 Six Major Platforms of Dongguan Data Annotation Park Debut to Strengthen the Ecological Support for the Data Industry

Dongguan is a strong city in scientific and technological innovation and manufacturing, and its number of industrial enterprises above designated size ranks among the top three in China. It combines rich AI application scenarios with massive industrial data, and is a national pilot base for artificial intelligence application.

“Data is like oil; it cannot be only extracted but not refined.” At present, Dongguan is taking the construction of the “Greater Bay Area Data Valley” as the starting point, taking the lead in laying out basic links such as data annotation, and striving to build the country’s first large-scale edge intelligent computing network to realize in-depth mining and efficient processing of industrial production line data.

The planning and construction of the Dongguan Data Annotation Industrial Park (hereinafter referred to as “Dongguan Data Annotation Park”) is an important exploration in this regard. Dongguan will strive to form 100 industry-level high-quality datasets within three years and build the largest and most intelligent data annotation base in the Guangdong-Hong Kong-Macao Greater Bay Area.

It is understood that Dongguan Data Annotation Park is located in Wanjiang Sub-district, with a total investment of 330 million yuan. It has joined hands with two leading enterprises, China Telecom and Baidu Intelligent Cloud, to build an industrial ecosystem of “one park, two bases and six platforms”. On the morning of that day, Dongguan Data Annotation Park was officially unveiled and put into operation.

At the launching ceremony of the competition, Dongguan Data Annotation Park and six enabling platforms were released collectively, and 22 enterprises signed contracts to settle in the park simultaneously.

The six major platforms are the Data Annotation Exhibition Center, Multimodal Data Intelligent Annotation Platform, Data Talent Training and Certification Platform, Embodied Intelligence Data Collection and Annotation Laboratory, High-Quality Dataset and Large Model Evaluation Center, and Industry-Level Data Trusted Space. Covering technical support, talent training, achievement transformation and other aspects, they provide all-round infrastructure support for the development of the data annotation industry.

At the same time, the Bay Area Service Innovation Center of Shenzhen Data Exchange signed a contract to settle in Nancheng Sub-district, further promoting the innovative application of data elements in fields such as government affairs, security and transactions.

03 Clear Path for High-Quality Dataset Construction, Multiple Parties Discuss Innovative Practice Plans

High-quality datasets are a key factor determining the quality of large models and a cornerstone of progress in the digital industry. During the sharing session of the launching ceremony, three industry experts presented their insights and practices on high-quality datasets.

Li Shuai, Deputy Director of the Artificial Intelligence Center of the Fifth Electronics Research Institute of the Ministry of Industry and Information Technology, explained that data preprocessing, data annotation, data synthesis and data quality assessment are interlocking core links in the process of building high-quality datasets. Working together as a system, they ensure that the datasets can accurately support the training and application of large models.

Li Shuai mentioned that the Fifth Electronics Research Institute is jointly building a high-quality dataset evaluation service center with data annotation bases, industry-leading enterprises and artificial intelligence service providers, to provide standardized production processes and authoritative quality certification for high-quality datasets.

Shen Jian, Head of Autonomous Driving Business Operations at Baidu, focused on the field of embodied intelligence and shared solutions for data collection and annotation rooms. Shen Jian believes that data is a major bottleneck for the embodied intelligence industry as it moves towards general intelligence, and that obtaining massive, high-quality training data is the key to a breakthrough for humanoid robots.

It is understood that Baidu Intelligent Cloud can custom-build a variety of real-world collection scenarios for embodied intelligence according to customers’ collection needs, support scenario operation and management, carry out customized tasks and long-term, stable, large-scale collection operations, and support the improvement of model capabilities through a full-process platform covering collection, annotation, management and training.

Wei Wenbo, Deputy General Manager of the Business Development Department of China Telecom Artificial Intelligence Technology Co., Ltd., summarized many key points for high-quality dataset construction, including data security and compliance, and integrated platform toolchain support.

Wei Wenbo said that China Telecom is building a new paradigm of “one platform and three systems”. By building and operating an integrated platform, it supports the three systems of dataset construction, quality evaluation and data security, enabling the controllable, efficient and compliant construction and value release of high-quality datasets, and systematically solving the problems in dataset construction.

With the continuous investment of technical resources from multiple parties and the accumulation of practical experience, the construction of high-quality datasets is moving from single-point breakthroughs to progress on many fronts, which will provide more solid data support for the innovative development of the artificial intelligence industry.
