China Terminology ›› 2022, Vol. 24 ›› Issue (1): 14-25.doi: 10.12339/j.issn.1673-8578.2022.01.002
XIANG Lu1,2, ZHOU Yu1,2,3, ZONG Chengqing1,2
Received: 2021-07-30
Revised: 2021-10-09
Online: 2022-01-05
Published: 2021-12-27
CLC Number:
XIANG Lu, ZHOU Yu, ZONG Chengqing. Bilingual Terminology Alignment Based on Chinese-English Monolingual Terminological Bank[J]. China Terminology, 2022, 24(1): 14-25.
URL: https://www.term.org.cn/EN/10.12339/j.issn.1673-8578.2022.01.002
Source-language term | 逻辑卷轴管理 | 网路语音协定 | 北斗卫星导航系统 |
---|---|---|---|
Google Translate | Logical scroll management | Internet voice protocol | Beidou satellite navigation system |
Youdao Translate | Logical scroll management | Voip protocol | Beidou navigation system |
Baidu Translate | Logical scroll management | Network voice protocol | Beidou navigation system |
Sogou Translate | Logical scroll management | Voice over internet protocol | Beidou navigation system |
Bing Translator | Logic scroll management | Internet voice protocol | Beidou satellite navigation system |
Target-language term | Logical volume management | Network voice protocol | Beidou navigation satellite system |
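Domain | No. | Method | Accuracy (%) |
---|---|---|---|
Computer science | 1 | Baseline 1 | 43.34 |
Computer science | 2 | Baseline 2 | 6.09 |
Computer science | 3 | Baseline 3 | 65.82 |
Computer science | 4 | Proposed method | 74.46 |
Civil engineering | 5 | Baseline 1 | 39.48 |
Civil engineering | 6 | Baseline 2 | 3.68 |
Civil engineering | 7 | Baseline 3 | 61.42 |
Civil engineering | 8 | Proposed method | 74.62 |
Medicine | 9 | Baseline 1 | 46.23 |
Medicine | 10 | Baseline 2 | 2.87 |
Medicine | 11 | Baseline 3 | 65.35 |
Medicine | 12 | Proposed method | 74.84 |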
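Domain | No. | Method | Accuracy (%) |
---|---|---|---|
Computer science | 1 | Step (1) | 43.34 |
Computer science | 2 | +LCS | 69.59 |
Computer science | 3 | +mBERT | 74.46 |
Civil engineering | 4 | Step (1) | 39.48 |
Civil engineering | 5 | +LCS | 68.44 |
Civil engineering | 6 | +mBERT | 74.62 |
Medicine | 7 | Step (1) | 46.23 |
Medicine | 8 | +LCS | 71.47 |
Medicine | 9 | +mBERT | 74.84 |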
Chinese term | English term | English pseudo-term | LCS (Top 5) |
---|---|---|---|
三叉搜索树 | Ternary search tree | Trigeminal search tree | Ternary search tree; Optimal binary search tree; Finger search tree; Binary search tree; Priority search tree |
Windows驱动程式套件 | Windows Driver Kit | Windows driver package | Windows Driver Kit; Windows Live Spaces; Windows Driver Model; Windows DreamScene; Windows Server |
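
The LCS column above lists English terms retrieved for a machine-translated pseudo-term. A minimal sketch of how such a ranking could work follows. It assumes that LCS means the character-level longest common subsequence and that the similarity is normalized as 2·LCS/(|a|+|b|); this page does not confirm either choice. The function names and the five-entry `term_bank` are hypothetical and stand in for the full English monolingual term bank.

```python
from typing import List


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-by-row DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; the normalization is an assumption."""
    a, b = a.lower(), b.lower()
    if not a and not b:
        return 1.0
    return 2 * lcs_length(a, b) / (len(a) + len(b))


def rank_candidates(pseudo_term: str, term_bank: List[str], top_k: int = 5) -> List[str]:
    """Return the top_k bank terms most similar to the pseudo-term."""
    ranked = sorted(term_bank, key=lambda t: lcs_similarity(pseudo_term, t), reverse=True)
    return ranked[:top_k]


# Hypothetical miniature term bank built from the candidates in the table above.
term_bank = [
    "Windows Server",
    "Windows DreamScene",
    "Windows Driver Model",
    "Windows Live Spaces",
    "Windows Driver Kit",
]
best = rank_candidates("Windows driver package", term_bank, top_k=1)
print(best)
```

With this toy bank, the pseudo-term "Windows driver package" is closest to "Windows Driver Kit", because both share the prefix "windows driver " and the candidate is short. The candidate order in the table may differ, since the actual metric and term bank are not specified here.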
Chinese term | English term | English pseudo-term | LCS (Top 10) | mBERT |
---|---|---|---|---|
访问者模式 | Visitor pattern | Visitor mode | Story Mode; Monitor mode; Storage model; Transistor model; NIST RBAC model; Visitor pattern; … | Visitor pattern |