
China Terminology ›› 2024, Vol. 26 ›› Issue (1): 11-18. doi: 10.12339/j.issn.1673-8578.2024.01.002


Corpus-Based Term Extraction in Field of Chinese Teaching as a Foreign Language

LU Yixin

  • Received: 2023-07-09 Revised: 2023-08-25 Online: 2024-01-05 Published: 2023-11-16

Abstract:

This paper presents a method for extracting terms in the field of teaching Chinese as a foreign language. Taking texts from this field as the target material, we build a specialized corpus following the principles of subject orientation, scientific corpus, and limited sample representation, and preprocess it with word segmentation and part-of-speech (POS) tagging. We combine statistical measures with linguistic rules, use the C-value method to compute a termhood value for each candidate, and explore a “hybrid solution” to identify, define, and extract terms of different lengths in this field. The result is a terminology base for teaching Chinese as a foreign language that contains 238 single-word terms, 375 two-word terms, 121 three-word terms, and 50 long terms (consisting of 4-6 words).

Key words: specialized corpus, term extraction, Chinese teaching as a foreign language, terminology base for Chinese teaching, C-value algorithm
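
The C-value termhood score named in the abstract can be illustrated with a minimal sketch. It follows the commonly cited formulation: a candidate that is not nested in any longer candidate scores log2(length) times its frequency, while a nested candidate has the mean frequency of the longer candidates containing it subtracted first. Two points are assumptions and not confirmed by the abstract: the length weight here is log2(length + 1), a frequent variant that keeps single-word candidates from scoring zero, and the function name `c_value`, the helper `contains`, and the toy counts are hypothetical. How the paper's "hybrid solution" treats terms of different lengths is not specified in the abstract and is not modeled.

```python
import math


def contains(longer, shorter):
    """Return True if `shorter` occurs as a contiguous word sequence in `longer`."""
    n, m = len(longer), len(shorter)
    return any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Score candidate terms with a C-value variant.

    `freq` maps each candidate (a tuple of words) to its corpus frequency.
    Assumption: the length weight is log2(len + 1), so that single-word
    candidates receive a non-zero score.
    """
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of the longer candidates that contain `a`.
        nesting = [f_b for b, f_b in freq.items()
                   if len(b) > len(a) and contains(b, a)]
        weight = math.log2(len(a) + 1)
        if nesting:
            # Nested candidate: subtract the mean frequency of its containers.
            scores[a] = weight * (f_a - sum(nesting) / len(nesting))
        else:
            # Candidate never appears inside a longer candidate.
            scores[a] = weight * f_a
    return scores


# Hypothetical toy counts, for illustration only.
toy_freq = {
    ("language",): 20,
    ("language", "teaching"): 10,
    ("second", "language", "teaching"): 4,
}
toy_scores = c_value(toy_freq)
for term, score in sorted(toy_scores.items(), key=lambda kv: -kv[1]):
    print(" ".join(term), round(score, 3))
```

In practice the candidates would first be filtered by POS patterns on the segmented corpus, and a threshold on the score would decide which candidates enter the terminology base; both steps are outside this sketch.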