AJOU Central Library Repository: Improving Task specific Word embedding

Improving Task specific Word embedding

Subtitle: Improving Hierarchical Word embedding using semantic of word

Alternative Title: Minsoo Kim

Author(s): 김민수

Alternative Author(s): Minsoo Kim

Advisor: 손경아

Department: Graduate School, Department of Computer Engineering

Publisher: The Graduate School, Ajou University

Publication Year: 2019-02

Language: eng

Alternative Abstract: Neural network models make it possible to represent words as vectors of a fixed dimension. Existing word embedding models have focused on capturing the semantic and syntactic features of words from textual data such as books and news articles. GloVe [5], Word2Vec [6], and CBOW [8] are representative word embedding models, and models such as fastText [7] have recently been introduced to further improve on them. Relational data such as graphs and networks also play an important role in the field of artificial intelligence. Network and graph embedding models such as latent space embeddings [9], DeepWalk [4], and Node2Vec [3] have also been studied and are widely used in many applications. Although text embedding has become a common way to express the role of a unit of text in its context, text mining problems were previously solved with machine learning models that predate neural networks or with large lexical databases such as WordNet [14] [15]. Such a database does not match the representational intent of word embedding, yet its value should not be wasted, so there have recently been efforts to embed it with graph embedding techniques. In particular, there have been attempts to represent databases that store the semantics of words hierarchically through graph embedding techniques such as DeepWalk [4] and Node2Vec [3]. More recently, Poincaré embedding has shown that the hierarchical relationships of words can be expressed with sufficient performance even with low-dimensional vectors. In this paper, we embed the hierarchical word data of WordNet [14] using Poincaré embedding, and we add edges reflecting the sibling relations of words to the hierarchical structure of the WordNet data so that the representation is closer to the semantic meaning of each word. Converging the model with the ideal objective function proved difficult, but this can be improved through adjustments to the data structure and more complex neural network models. Through this study, we expect that previously unresolved natural language processing problems can be addressed through vector representations of the hierarchical structure and sibling relations of words.
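The two ideas named in the abstract, adding sibling edges to a hypernym hierarchy and measuring distances in the Poincaré ball, can be illustrated with a minimal sketch. This is not the thesis implementation: the function names `sibling_edges` and `poincare_distance` and the toy hierarchy are hypothetical, and the distance is the standard Poincaré ball formula rather than anything specific to this record.

```python
import math
from collections import defaultdict
from itertools import combinations


def sibling_edges(hypernym_edges):
    """Return sorted (a, b) pairs of words that share a direct parent.

    hypernym_edges is an iterable of (child, parent) pairs.
    """
    children = defaultdict(set)
    for child, parent in hypernym_edges:
        children[parent].add(child)
    pairs = set()
    for kids in children.values():
        # Every unordered pair of children under one parent is a sibling pair.
        pairs.update(combinations(sorted(kids), 2))
    return sorted(pairs)


def poincare_distance(u, v):
    """Distance between two points strictly inside the unit ball."""
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    sq_u = sum(a * a for a in u)
    sq_v = sum(b * b for b in v)
    return math.acosh(1 + 2 * sq_diff / ((1 - sq_u) * (1 - sq_v)))


# Toy hierarchy (hypothetical; not taken from the thesis data).
edges = [
    ("dog", "canine"),
    ("wolf", "canine"),
    ("canine", "mammal"),
    ("feline", "mammal"),
]

# Augmented edge list: original hypernym edges plus sibling edges.
augmented = edges + sibling_edges(edges)
```

In this sketch the augmented edge list would be what is passed to an embedding trainer in place of the plain hypernym edges; how the thesis weights or samples the sibling edges is not stated in the abstract.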

URI: https://dspace.ajou.ac.kr/handle/2018.oak/15064

Appears in Collections: Graduate School of Ajou University > Department of Computer Engineering > 3. Theses(Master)

Files in This Item: There are no files associated with this item.
