中国科技术语 (China Terminology) ›› 2019, Vol. 21 ›› Issue (6): 11-16. doi: 10.3969/j.issn.1673-8578.2019.06.002

• Terminology Research •

Construction of a Multilingual Domain-Specific Term Base and Ontology Knowledge Base for the Central Asian Region

原伟   

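  1. Luoyang School of Foreign Languages, Information Engineering University, Luoyang, Henan 471003
  • Received:2018-01-01 Revised:2019-11-01 Online:2019-12-25 Published:2020-05-11
  • About the author: YUAN Wei (born 1981), male, PhD, associate professor. His main research interests are computational linguistics and corpus linguistics. Contact: yw5811827@126.com.
  • Funding:
    National Social Science Fund of China project "Construction and Evaluation of an Ontology-Based Russian-Chinese Comparable Corpus" (14CYY051); National Social Science Fund of China project "Russian-Chinese Online News Topic Monitoring and Sentiment Recognition Based on Comparable Corpora and Ontology" (18BYY235)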

Construction of Multilingual Domain Term Base and Ontology for Central Asia Area

YUAN Wei   

  • Received:2018-01-01 Revised:2019-11-01 Online:2019-12-25 Published:2020-05-11

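Abstract:

Uzbek, Kazakh, and other Central Asian languages currently suffer from an acute shortage of domain-specific dictionaries, term bases, and ontology knowledge bases. To address this, the paper takes the security domain as an example: using existing terms as seed words, it automatically collects term pairs from Wikipedia and bilingual specialized dictionaries and, after manual proofreading, builds a medium-sized multilingual domain term base covering Chinese, Russian, Uzbek, and Kazakh. On the basis of this term base, a domain ontology is constructed comprising 7 top-level classes (personnel, organizations, regions, technologies, equipment, activities, and documents) and 35 subclasses. Finally, the paper discusses the potential for extending the term base and domain ontology and their application prospects. This is important foundational work of considerable practical value for terminological dictionary compilation, terminology studies, natural language processing, and language teaching research for Central Asian languages.

Keywords: Central Asia, terminology, ontology, Russian, Uzbek, Kazakh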

Abstract:

Research on Uzbek, Kazakh, and other Central Asian languages faces a shortage of domain-specific dictionaries, term bases, and knowledge ontologies. To address this problem, we take the military field as an example and use existing terms as seed words to automatically collect bilingual term pairs from Wikipedia and specialized dictionaries. After manual proofreading, we built a medium-sized Chinese, Russian, Uzbek, and Kazakh term base. On the basis of this term base, we built a military domain ontology that includes 7 categories (person, organization, region, technology, equipment, activity, and document) and 35 sub-categories. We also discuss the extension potential and application prospects of the term base and ontology. This is important foundational work with considerable practical significance for terminology dictionary compilation, terminology studies, natural language processing, and language teaching for Central Asian languages.

Key words: Central Asia, terminology, ontology, Russian, Uzbek, Kazakh
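
The collection step described in the abstract (seed terms, term pairs from Wikipedia, manual proofreading, seven top-level ontology classes) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the author's implementation: the abstract does not say how Wikipedia was queried, so the use of interlanguage links through the MediaWiki API is an assumption, and the names `TopClass`, `TermEntry`, `langlinks_url`, and `entry_from_response` are hypothetical. The sample response is hand-written to mimic the API's JSON shape and is not harvested data.

```python
from dataclasses import dataclass, field
from enum import Enum
from urllib.parse import urlencode

# Wikipedia language codes for the four languages of the term base.
LANGS = ("zh", "ru", "uz", "kk")


class TopClass(Enum):
    """The seven top-level ontology classes named in the abstract."""
    PERSON = "person"
    ORGANIZATION = "organization"
    REGION = "region"
    TECHNOLOGY = "technology"
    EQUIPMENT = "equipment"
    ACTIVITY = "activity"
    DOCUMENT = "document"


@dataclass
class TermEntry:
    """One multilingual term (hypothetical record layout)."""
    top_class: TopClass
    forms: dict = field(default_factory=dict)  # language code -> term
    checked: bool = False  # set to True after manual proofreading


def langlinks_url(seed: str, source_lang: str = "ru") -> str:
    """Build a MediaWiki API query for a page's interlanguage links."""
    params = {
        "action": "query",
        "prop": "langlinks",
        "titles": seed,
        "lllimit": "max",
        "format": "json",
    }
    return f"https://{source_lang}.wikipedia.org/w/api.php?{urlencode(params)}"


def entry_from_response(seed, source_lang, top_class, response):
    """Turn one decoded API response into an unchecked TermEntry."""
    forms = {source_lang: seed}
    for page in response.get("query", {}).get("pages", {}).values():
        for link in page.get("langlinks", []):
            if link["lang"] in LANGS:  # ignore languages outside the term base
                forms[link["lang"]] = link["*"]
    return TermEntry(top_class=top_class, forms=forms)


# Hand-written response in the shape the API returns (illustrative values only).
sample = {"query": {"pages": {"1": {"title": "Армия", "langlinks": [
    {"lang": "kk", "*": "Армия"},
    {"lang": "uz", "*": "Armiya"},
    {"lang": "zh", "*": "军队"},
    {"lang": "en", "*": "Army"},
]}}}}

entry = entry_from_response("Армия", "ru", TopClass.ORGANIZATION, sample)
```

Entries start with `checked` set to `False` so that automatically collected pairs stay distinguishable from those a human has proofread, mirroring the manual proofreading step the abstract mentions. The 35 subclasses are not listed in the abstract and are therefore left out of the sketch.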

CLC Number: