Classify and Label the Content from the Chinese Text Snippets of the LLM-Generated Text Detection

Kongqiang Wang; Qingli Tan; Peng Zhang

Authors

Kongqiang Wang Yunnan University
Qingli Tan College of Ecology and Environment, Yunnan University, 650500, Kunming, Yunnan, China
Peng Zhang School of Information Science and Engineering, Yunnan University, 650500, Kunming, Yunnan, China

Keywords:

Text Content Analysis, Text Classification, Multi-category Labeling, Pre-trained Model, Machine Learning

Abstract

The ability to understand the content of Chinese text snippets is an essential component of human-like artificial intelligence, because such content greatly influences human cognition, decision making, and social interactions. In addition to intention recognition in Chinese text snippets, identifying the potential categories behind the Chinese text in LLM-generated text snippets is of great importance in many application scenarios. Our research focuses on content classification and labeling of Chinese text snippets for LLM-generated text detection, which aims to assign a content classification label to each Chinese text snippet taken from LLM-generated text. Each snippet contains many sentences and corresponds to one of a set of predefined categories. We used a context-based text prediction method combined with a pre-trained model or a machine learning (ML) model. During this process, we repeatedly tested different pre-trained models and ML models in an effort to achieve the best results. The best results, obtained with distilbert/distilbert-base-uncased, were a macro-averaged F1-score of 0.5746 on the testp1 set and 0.5869 on the testp2 set. Compared with the other participating models, our approach reached the most advanced level in this task.
The project code is available from https://github.com/WangKongQiang/NLPCC2026_Task6.
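
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a macro-averaged F1-score can be computed for a multi-category labeling task. It is not taken from the project code: the function name `macro_f1` and the labels `"a"`, `"b"`, `"c"` are hypothetical, chosen only for illustration. The per-class F1 is computed as 2·TP / (2·TP + FP + FN), and the macro average is the unweighted mean over all classes, so rare categories count as much as frequent ones.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over every label seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1      # correct prediction for this class
        else:
            fp[pred] += 1       # predicted class gets a false positive
            fn[truth] += 1      # true class gets a false negative

    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # A class with no true or predicted instances contributes 0.0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical gold labels and predictions for four snippets.
gold = ["a", "a", "b", "c"]
pred = ["a", "b", "b", "c"]
print(round(macro_f1(gold, pred), 4))  # 0.7778
```

In this toy example the per-class F1 values are 2/3, 2/3, and 1, so the macro average is 7/9. The scores of 0.5746 and 0.5869 reported in the abstract are averages of this kind over the task's predefined categories.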
