Classify and Label the Content from the German Text Snippets of the Sustainable Development Report

Authors

  • Kongqiang Wang Yunnan University
  • Qingli Tan College of Ecology and Environment, Yunnan University
  • Peng Zhang School of Information Science and Engineering, Yunnan University

Keywords:

Text Content Analysis, Text Classification, Multi-category Labeling, Pre-trained Model, BERT

Abstract

The ability to understand the content of text snippets is an essential component of human-like artificial intelligence, as such content greatly influences human cognition, decision making, and social interactions. Beyond intent recognition in sustainable development reports, identifying the underlying categories of German text snippets is of great importance in many application scenarios. Our research focuses on content classification and labeling of German text snippets from sustainable development reports: the task is to assign a content classification label to each German text snippet taken from a sustainability report. Each snippet contains 3–5 sentences and corresponds to one of the predefined categories based on the German Sustainability Code (DNK). We used a context-based text prediction method combined with a pre-trained model. During this process, we repeatedly tested different pre-trained models in an effort to achieve the best results. The best result on the test set was an accuracy of 0.635285412. Compared with other models, this result is at the most advanced level for this task. The project code is available from https://github.com/WangKongQiang/Sustaineval-2025.

Published

2026-06-30