PhD Dissertation Defense by Yi Zhang: "Learning Embeddings for Academic Papers"

Friday, September 6, 2019 - 13:00 to 16:00



The School of Computer Science at the University of Windsor is pleased to present …

Doctoral Dissertation by:

Yi Zhang
Date: Friday, September 6, 2019
Time: 1:00pm – 4:00pm
Location: LT3105


Academic papers contain both text and citation links. Representing such data is crucial for many downstream tasks, such as classification, disambiguation, duplicate detection, and recommendation. The success of the Skip-gram with Negative Sampling (SGNS) model has inspired many algorithms for learning embeddings of words, documents, and networks. This presentation first discusses embeddings for directed graphs. Then, we study the norm convergence issue in SGNS and propose L2 regularization to fix the problem. We observe improvements of up to 17.47% for word embeddings, 1.85% for document embeddings, and 46.41% for network embeddings.
To learn embeddings for academic papers, we propose several neural-network-based algorithms that can learn high-quality embeddings from different types of data. The proposed algorithms are N2V (network2vector) for networks, D2V (document2vector) for documents, and P2V (paper2vector) for academic papers. Experiments show that our models outperform traditional algorithms and state-of-the-art neural network methods on various datasets across different machine learning tasks. Using these high-quality embeddings, we demonstrate applications on real-world datasets, such as academic paper and author search engines, author name disambiguation, and paper influence prediction.


Thesis Committee:

Internal Readers: Dr. Mehdi Kargar, Dr. Dan Wu
External Reader: Dr. Jonathan Wu (ECE)
External Examiner: Dr. Ying Zou (Queen’s University)
Advisor: Dr. Jianguo Lu
Chair: TBD


PhD Dissertation Announcement


Lambton Tower 5113, 401 Sunset Ave., Windsor, Ontario N9B 3P4 (519) 253-3000 Ext. 3716