
Globe230k Dataset Unlocks Global-Scale Land Characterization

By charting the remarkable shifts in land use over the last century, global land cover maps offer crucial perspectives on how human settlement has influenced the environment.

Scientists from Sun Yat-sen University developed a large-scale annotated dataset (Globe230k) for highly generalized global land cover mapping. The annotated patches provide cues to help classification tools distinguish cropland, forest, wetland, grassland, and more. Image Credit: Qian Shi, Da He, Zhengyu Liu, Xiaoping Liu, Jingqian Xue, Sun Yat-sen University.

Scientists at Sun Yat-sen University developed an extensive remote sensing annotation dataset to aid Earth observation research and offer fresh perspectives on dynamically monitoring global land cover.

Their research, published on October 16th, 2023, in the Journal of Remote Sensing, examines the profound transformations in global land use/land cover (LULC) driven by industrialization and urbanization, including phenomena such as deforestation and flooding.

We urgently need high-frequency, high-resolution monitoring of LULC to mitigate the impact of human activities on the climate and the environment.

Qian Shi, Professor, Sun Yat-sen University

Monitoring global LULC depends on automated classification algorithms that categorize satellite remote sensing images pixel by pixel. Data-driven deep learning approaches extract inherent features from these images and predict the LULC label for each pixel.

In recent years, a technique known as semantic segmentation has gained popularity among researchers for deep-learning tasks in global land cover mapping with remote sensing images. Rather than assigning a single label to an entire image, semantic segmentation assigns a specific label to each pixel individually.
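The per-pixel idea can be sketched in a few lines. This is a minimal illustration, not the authors' model: a segmentation network outputs a score for every candidate class at every pixel, and the predicted land cover map is the per-pixel argmax over those scores. The class list and the random scores below are hypothetical stand-ins.

```python
import numpy as np

# Illustrative subset of land cover classes (hypothetical, not the Globe230k legend).
CLASSES = ["cropland", "forest", "grassland", "wetland", "water"]

# Stand-in for a segmentation network's output: one score per class per pixel,
# shaped (num_classes, height, width). Real scores would come from a trained model.
rng = np.random.default_rng(0)
H, W = 4, 4
logits = rng.normal(size=(len(CLASSES), H, W))

# The predicted land cover map: for each pixel, pick the highest-scoring class.
label_map = logits.argmax(axis=0)  # shape (H, W), one integer label per pixel

print(label_map.shape)            # (4, 4)
print(CLASSES[label_map[0, 0]])   # class name predicted for the top-left pixel
```

Unlike whole-image classification, the result is a full map the same size as the input, so object boundaries are preserved.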

Different from recognizing the commercial scene or residential scene in an image, the semantic segmentation network can delineate the boundaries of each land object in the scene and help us understand how land is being used.

Qian Shi, Professor, Sun Yat-sen University

Achieving a comprehensive high-level semantic understanding requires considering the contextual information of each pixel. Geographical objects are intricately linked to their surrounding scenes, offering cues for predicting individual pixel characteristics. For instance, airports host parked airplanes, harbors serve as docks for ships, and mangroves typically thrive along shorelines.

However, the effectiveness of semantic segmentation is constrained by the quantity and quality of training data. Shi notes that existing annotation data often lack in terms of quantity, quality, and spatial resolution.

Additionally, these datasets are typically regionally sampled, exhibiting limited diversity and variability, thereby complicating the scalability of data-driven models on a global scale.

To overcome these limitations, the research team has introduced a large-scale annotation dataset, Globe230k, specifically designed for semantic segmentation of remote sensing images. This dataset boasts three key advantages:

  • Scale: The Globe230k dataset comprises 232,819 annotated images with sufficient dimensions and spatial resolution.
  • Diversity: The annotated images are extracted from global regions, covering an expansive area of over 60,000 square kilometers. This signifies a notable level of variability and diversity in the dataset.
  • Multimodal features: In addition to RGB bands, the Globe230k dataset incorporates essential features for Earth system research, including vegetation, elevation, and polarization indices.
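A vegetation index of the kind bundled alongside the RGB bands is straightforward to compute. The sketch below shows the standard NDVI formula, NDVI = (NIR − Red) / (NIR + Red), derived from the near-infrared and red reflectance bands; the band values here are synthetic, and the source does not specify which vegetation index Globe230k uses.

```python
import numpy as np

# Synthetic reflectance values for two vegetated pixels (top row) and
# two bare-ground pixels (bottom row); real values come from satellite imagery.
nir = np.array([[0.60, 0.55],
                [0.10, 0.08]])
red = np.array([[0.10, 0.12],
                [0.09, 0.07]])

# Normalized Difference Vegetation Index; a small epsilon avoids division by zero.
ndvi = (nir - red) / (nir + red + 1e-8)

print(np.round(ndvi, 2))
```

Healthy vegetation reflects strongly in the near-infrared, so vegetated pixels score near +0.7 here while bare ground stays close to zero; stacking such an index as an extra input channel gives a segmentation model cues that RGB alone does not carry.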

The research team assessed the Globe230k dataset using various state-of-the-art semantic segmentation algorithms. The findings indicated its effectiveness in evaluating key aspects of land cover characterization, such as multiscale modeling, detail reconstruction, and generalization ability in algorithms.

We believe that the Globe230k dataset could support further Earth observation research and provide new insights into global land cover dynamic monitoring.

Qian Shi, Professor, Sun Yat-sen University

The dataset has been released to the public, serving as a benchmark to foster advancements in global land cover mapping and the development of semantic segmentation algorithms.

The study was financially supported by the National Key Research and Development Program of China and the National Natural Science Foundation of China.

Additional contributors to the project include Da He, Zhengyu Liu, Xiaoping Liu, and Jingqian Xue, all affiliated with Sun Yat-sen University and the Guangdong Provincial Key Laboratory for Urbanization and Geo-simulation.

Journal Reference:

Shi, Q., et al. (2023) Globe230k: A Benchmark Dense-Pixel Annotation Dataset for Global Land Cover Mapping. Journal of Remote Sensing.
