Natural Language Processing and Representation Learning
Natural language generally refers to human language, which serves as a carrier of logical thought, a medium of communication, and a vehicle of cultural heritage. Processing natural language is an important research area in artificial intelligence, often described as the jewel in the crown of AI. An essential foundational step in natural language processing is language representation learning, which aims to construct formal or mathematical descriptions of natural language so that it can be represented in computers and processed automatically by computer programs. Early language representation methods relied mainly on discrete symbolic representations. In recent years, deep neural networks have been widely applied to natural language processing; they not only surpass traditional statistical methods on many tasks, such as text classification, sequence labeling, machine translation, and automatic question answering, but also enable end-to-end training, avoiding cumbersome feature engineering. The first part of this talk will introduce the basic tasks, application areas, research history, and technological trends of natural language processing. The second part will introduce neural network-based language representation learning at multiple granularities, including words, phrases, sentences, and sentence pairs, explaining how the latent syntactic and semantic features of language are stored in distributed fashion across a group of neurons and represented as dense, low-dimensional, continuous vectors. It will also discuss recent research directions in neural language representation learning from the perspectives of model design and learning.
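
As a rough illustration of the "dense, low-dimensional, continuous vector" idea mentioned above, the following sketch (in PyTorch, using an illustrative toy vocabulary; not drawn from the talk itself) maps each word to a learnable embedding vector and compares words with cosine similarity.

```python
# Minimal sketch of distributed word representations (illustrative only).
# Each word in a toy vocabulary is mapped to a dense, low-dimensional,
# continuous vector stored in an embedding table.
import torch
import torch.nn as nn
import torch.nn.functional as F

vocab = ["language", "speech", "banana", "translation"]   # toy vocabulary (assumed)
word_to_id = {w: i for i, w in enumerate(vocab)}

embedding_dim = 8                                # low-dimensional compared with |V|
embed = nn.Embedding(len(vocab), embedding_dim)  # randomly initialized; learned during training

ids = torch.tensor([word_to_id[w] for w in vocab])
vectors = embed(ids)                             # shape (4, 8): one dense vector per word

# Cosine similarity between "language" and every word in the vocabulary.
# With trained embeddings, semantically related words tend to score higher.
query = vectors[0].unsqueeze(0)
sims = F.cosine_similarity(query, vectors, dim=1)
for word, sim in zip(vocab, sims.tolist()):
    print(f"{word:12s} {sim:+.3f}")
```

Note that the vectors here are randomly initialized, so the printed similarities are not yet meaningful; meaningful geometry emerges only after the embeddings are trained on data, which is precisely what the representation learning methods discussed in the talk aim to achieve.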
