Classify and Label the Content from the Chinese Text Snippets of the LLM-Generated Text Detection
Keywords:
Text Content Analysis, Text Classification, Multi-category Labeling, Pre-trained Model, Machine Learning
Abstract
Understanding the content of Chinese text snippets is an essential component of human-like artificial intelligence, because such content strongly influences human cognition, decision making, and social interaction. Beyond intention recognition in Chinese text snippets, identifying the category that underlies an LLM-generated Chinese text snippet is important in many application scenarios. Our research addresses content classification and labeling of Chinese text snippets for LLM-generated text detection: the task is to assign a content classification label to each Chinese text snippet taken from LLM-generated text. Each snippet contains multiple sentences and corresponds to one of the predefined categories. We used a context-based text prediction method combined with either a pre-trained model or a machine learning (ML) model, and we repeatedly tested different pre-trained models and ML models to achieve the best results. Our best results, obtained with distilbert/distilbert-base-uncased, were a macro-averaged F1-score of 0.5746 on the testp1 set and 0.5869 on the testp2 set. Compared with the models of other participants, these results were the most advanced in this field.
The project code is available from https://github.com/WangKongQiang/NLPCC2026_Task6.
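For readers unfamiliar with the reported metric, the macro-averaged F1-score is the unweighted mean of the per-category F1-scores, so every category counts equally regardless of how many snippets it contains. The following is a minimal sketch in plain Python and is not taken from the project code; the function name `macro_f1` and the category names in the usage example are hypothetical, and the task's official scorer may handle edge cases differently.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-category F1 scores.

    Illustrative helper only (hypothetical name, not from the project code).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one labeled snippet is required")

    # Count true positives, false positives, and false negatives per category.
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, guess in zip(y_true, y_pred):
        if truth == guess:
            tp[truth] += 1
        else:
            fp[guess] += 1
            fn[truth] += 1

    # Every category seen in either list contributes one F1 score.
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2*TP / (2*TP + FP + FN); a category with no hits scores 0.
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Hypothetical category names, for illustration only.
    gold = ["news", "news", "fiction", "review"]
    pred = ["news", "fiction", "fiction", "review"]
    print(round(macro_f1(gold, pred), 4))
```

Because each category is weighted equally, a model that does well only on frequent categories will score lower on this metric than on plain accuracy.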
License
Copyright (c) 2026 Kongqiang Wang, Qingli Tan, Peng Zhang

This work is licensed under a Creative Commons Attribution 4.0 International License.