
China Terminology ›› 2020, Vol. 22 ›› Issue (3): 24-32. doi: 10.3969/j.issn.1673-8578.2020.03.004

• Research on Terminology •

Word Embedding: Concepts and Applications

LU Xiaolei, WANG Fanke   

  • Received: 2020-01-02  Revised: 2020-05-17  Online: 2020-06-25  Published: 2020-07-20

Abstract:

This article focuses on word embedding, a feature-learning technique in natural language processing that maps words or phrases to low-dimensional vectors. Beginning with the linguistic theories of contextual similarity, namely the "distributional hypothesis" and the "context of situation", the article introduces two ways of representing text numerically: one-hot representation and distributed representation. It then presents statistics-based language models (such as the co-occurrence matrix and singular value decomposition) as well as neural network language models (NNLM, such as continuous bag-of-words and skip-gram). Finally, it analyzes how word embedding can be applied to the study of word-sense disambiguation and diachronic linguistics.
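As a concrete illustration of the representations the abstract names, the following Python sketch (not from the article; the toy corpus, the window size of 1, and the embedding dimension k = 2 are assumptions chosen for brevity) contrasts one-hot vectors with distributed vectors obtained from a word-word co-occurrence matrix reduced by singular value decomposition:

import numpy as np

# Toy corpus; each sentence is one context window source (an assumption for brevity).
corpus = [
    "i like deep learning",
    "i like nlp",
    "i enjoy flying",
]

tokens = [sentence.split() for sentence in corpus]
vocab = sorted({w for sent in tokens for w in sent})
idx = {w: i for i, w in enumerate(vocab)}
V = len(vocab)

# One-hot representation: each word is a sparse V-dimensional unit vector.
one_hot = np.eye(V)

# Statistics-based representation: symmetric co-occurrence counts within a window of 1.
cooc = np.zeros((V, V))
for sent in tokens:
    for i, w in enumerate(sent):
        for j in range(max(0, i - 1), min(len(sent), i + 2)):
            if i != j:
                cooc[idx[w], idx[sent[j]]] += 1

# Truncated SVD compresses the co-occurrence matrix into dense low-dimensional
# vectors: the first k left singular vectors, scaled by their singular values.
k = 2
U, S, Vt = np.linalg.svd(cooc)
embeddings = U[:, :k] * S[:k]

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

# One-hot vectors of distinct words are always orthogonal (similarity 0), while
# the SVD vectors reflect shared contexts, per the distributional hypothesis.
w1, w2 = "like", "enjoy"
print(cosine(one_hot[idx[w1]], one_hot[idx[w2]]))        # 0.0
print(cosine(embeddings[idx[w1]], embeddings[idx[w2]]))  # nonzero: both follow "i"

The neural models the abstract mentions (continuous bag-of-words and skip-gram) arrive at comparable dense vectors by a different route: a shallow network is trained to predict a word from its context (CBOW) or the context from a word (skip-gram), and the learned weights serve as the embeddings, rather than factorizing explicit counts.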

Key words: natural language processing, text representation, word embedding

CLC Number: (terminology science)