Last update: May 7, 2021

Yige Xu

Scholar Links

GitHub

Google Scholar

Semantic Scholar

DBLP

Contact

Email:
ygxu18 [AT] fudan.edu.cn
xuyige1996 [AT] gmail.com

Interdisciplinary Building No.2, School of Computer Science, Fudan University, No. 2005, Songhu Road, Shanghai 200438, China

About Me

     I am a final-year Master's student in the School of Computer Science at Fudan University, supervised by Professor Xipeng Qiu. I am a member of the Fudan Natural Language Processing Group and the fastNLP development team.

     The fastNLP team mainly develops and maintains the fastNLP package and the fitlog package; I am one of the main contributors.

Education Bio

  1. 2018.9 - Present: M.S. student in Computer Science at Fudan University, working with Professor Xipeng Qiu and Professor Xuanjing Huang (expected to graduate in June 2021).
  2. 2014.9 - 2018.6: B.Eng. in Computer Science and Technology from Taishan College, Shandong University, where I worked with Professor Jun Ma. Taishan College is the honors college (elite class) of Shandong University; our major admits fewer than 20 students from more than 300 undergraduates each year.

Research Interests

I currently focus mainly on the following areas of NLP: pre-trained language models and their fine-tuning, text classification, and keyphrase generation.

Teaching

  1. MANA130376.01 Big Data driven Business Analytics and Application (Spring 2019). Teaching Assistant
  2. COMP130137.01 Pattern Recognition & Machine Learning (Spring 2020). Teaching Assistant
  3. DATA62004.01 Neural Network and Deep Learning (Spring 2020). Teaching Assistant

Awards

  1. How to Fine-Tune BERT for Text Classification?, CCL 2019 Best Paper Award
  2. Outstanding Students of Master's Degrees at Fudan University, 2020

Published Papers

  1. [New!] ONE2SET: Generating Diverse Keyphrases as a Set. ACL, 2021.
    Jiacheng Ye, Tao Gui, Yichao Luo, Yige Xu, Qi Zhang. [Abstract]
    Abstract: Recently, the sequence-to-sequence models have made remarkable progress on the task of keyphrase generation (KG) by concatenating multiple keyphrases in a predefined order as a target sequence during training. However, the keyphrases are inherently an unordered set rather than an ordered sequence. Imposing a predefined order will give wrong bias during training, which can highly penalize shifts in the order between keyphrases. In this work, we introduce a new training paradigm ONE2SET without predefining an order to concatenate the keyphrases. To fit this paradigm, we propose a novel model that consists of a fixed set of learned control codes to generate keyphrases in parallel. To solve the problem that there is no correspondence between each prediction and target during training, we introduce a K-step target assignment mechanism via bipartite matching, which greatly increases the diversity and reduces the duplication ratio of generated keyphrases. The experimental results on multiple benchmarks demonstrate that our approach significantly outperforms the state-of-the-art methods.
    (An illustrative code sketch of the set-based target assignment idea appears after this publication list.)
  2. Pre-trained Models for Natural Language Processing: A Survey. Science China Technological Sciences, 2020. [BibTeX] [DOI] [PDF]
    Xipeng Qiu, Tianxiang Sun, Yige Xu, Yunfan Shao, Ning Dai, Xuanjing Huang. [Abstract]
    Abstract: Recently, the emergence of pre-trained models (PTMs) has brought natural language processing (NLP) to a new era. In this survey, we provide a comprehensive review of PTMs for NLP. We first briefly introduce language representation learning and its research progress. Then we systematically categorize existing PTMs based on a taxonomy with four perspectives. Next, we describe how to adapt the knowledge of PTMs to the downstream tasks. Finally, we outline some potential directions of PTMs for future research. This survey is purposed to be a hands-on guide for understanding, using, and developing PTMs for various NLP tasks.
    BibTeX:
    @article{qiu2020:scts-ptms,
      author = {Xipeng Qiu and Tianxiang Sun and Yige Xu and Yunfan Shao and Ning Dai and Xuanjing Huang},
      title = {Pre-trained Models for Natural Language Processing: A Survey},
      journal = {SCIENCE CHINA Technological Sciences},
      publisher = {Science China Press},
      year = {2020},
      doi = {10.1007/s11431-020-1647-3}
    }
  3. How to Fine-Tune BERT for Text Classification? CCL (Best Paper Award), 2019. [BibTeX] [PDF] [Source]
    Chi Sun, Xipeng Qiu, Yige Xu, Xuanjing Huang. [Abstract]
    Abstract: Language model pre-training has proven to be useful in learning universal language representations. As a state-of-the-art language model pre-training model, BERT (Bidirectional Encoder Representations from Transformers) has achieved amazing results in many language understanding tasks. In this paper, we conduct exhaustive experiments to investigate different fine-tuning methods of BERT on text classification task and provide a general solution for BERT fine-tuning. Finally, the proposed solution obtains new state-of-the-art results on eight widely-studied text classification datasets.
    (A minimal code sketch of one such fine-tuning strategy, a layer-wise decreasing learning rate, appears after this list.)
    BibTeX:
    @inproceedings{sun2019fine,
      title={How to fine-tune {BERT} for text classification?},
      author={Sun, Chi and Qiu, Xipeng and Xu, Yige and Huang, Xuanjing},
      booktitle={China National Conference on Chinese Computational Linguistics},
      pages={194--206},
      year={2019},
      organization={Springer}
    }
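
To give a concrete feel for the set-based training paradigm in the ONE2SET paper above, the following toy Python sketch shows how a fixed number of prediction slots can be assigned to an unordered set of target keyphrases via bipartite matching, here solved with scipy's Hungarian-algorithm routine. The cost function, padding token, and slot count are illustrative placeholders, not the paper's actual implementation.

    # Toy sketch: assign a fixed set of prediction "slots" to an unordered set of
    # target keyphrases via bipartite matching (Hungarian algorithm).
    # The cost function, padding token, and slot count are placeholders.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    PAD = "<none>"  # placeholder target for slots that should stay empty

    def overlap_cost(pred, target):
        """Negative token overlap; lower cost means a better match."""
        if target == PAD:
            return 0.0  # matching the empty target is "free"
        return -len(set(pred.split()) & set(target.split()))

    def assign_targets(predictions, targets):
        """Pad the target set to the number of slots, then find a min-cost matching."""
        padded = targets + [PAD] * (len(predictions) - len(targets))
        cost = np.array([[overlap_cost(p, t) for t in padded] for p in predictions])
        rows, cols = linear_sum_assignment(cost)
        return [(predictions[r], padded[c]) for r, c in zip(rows, cols)]

    slots = ["neural keyphrase generation", "set prediction", "sequence to sequence", "unused slot"]
    gold = ["keyphrase generation", "set prediction"]
    for pred, tgt in assign_targets(slots, gold):
        print(f"{pred!r:35} -> {tgt!r}")

Each slot is either matched to a gold keyphrase or to the empty target, so no ordering of the gold set is ever imposed during training.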
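
The CCL 2019 paper above compares a range of BERT fine-tuning strategies; one commonly used strategy of this kind is a layer-wise decreasing learning rate. The sketch below shows how per-layer learning rates can be set up with the Hugging Face transformers package. The base learning rate and decay factor are illustrative values only, not the paper's recommended settings.

    # Toy sketch: layer-wise decreasing learning rates for BERT fine-tuning.
    # The base learning rate and decay factor are illustrative values only.
    import torch
    from transformers import BertForSequenceClassification

    model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=2)

    base_lr, decay = 2e-5, 0.95
    # The task head (and pooler) use the full base learning rate.
    param_groups = [{"params": list(model.classifier.parameters())
                               + list(model.bert.pooler.parameters()), "lr": base_lr}]

    # Each lower encoder layer gets a smaller rate: base_lr * decay ** depth_from_top.
    layers = model.bert.encoder.layer
    for k, layer in enumerate(layers):
        param_groups.append({"params": layer.parameters(),
                             "lr": base_lr * decay ** (len(layers) - k)})

    # The embeddings sit below all encoder layers and get the smallest rate.
    param_groups.append({"params": model.bert.embeddings.parameters(),
                         "lr": base_lr * decay ** (len(layers) + 1)})

    optimizer = torch.optim.AdamW(param_groups, lr=base_lr)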
    			

arXiv Papers

  1. Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement Learning. [arXiv] [BibTeX]
    Yichao Luo*, Yige Xu* (* Equal contribution), Jiacheng Ye, Xipeng Qiu, Qi Zhang. [Abstract]
    Abstract: Aiming to generate a set of keyphrases, Keyphrase Generation (KG) is a classical task for capturing the central idea from a given document. Typically, traditional KG evaluation metrics are only aware of the exact correctness of predictions on phrase-level and ignores the semantic similarities between similar predictions and targets, which inhibits the model from learning deep linguistic patterns. In this paper, we propose a new fine-grained evaluation metric that considers different granularity: token-level F1 score, edit distance, duplication, and prediction quantities. For learning more recessive linguistic patterns, we use a pre-trained model (e.g., BERT) to compute the continuous similarity score between predicted keyphrases and target keyphrases. On the whole, we propose a two-stage Reinforcement Learning (RL) training framework with two reward functions: our proposed fine-grained evaluation score and the vanilla F1 score. This framework helps the model identifying some partial match phrases which can be further optimized as the exact match ones. Experiments on four KG benchmarks show that our proposed training framework outperforms the traditional RL training frameworks among all evaluation scores. In addition, our method can effectively ease the synonym problem and generate a higher quality prediction.
    (A toy sketch of a token-level F1 score, one ingredient of such a fine-grained metric, appears after this list.)
    BibTeX:
    @article{luo2021keyphrase,
      title={Keyphrase Generation with Fine-Grained Evaluation-Guided Reinforcement Learning},
      author={Luo, Yichao and Xu, Yige and Ye, Jiacheng and Qiu, Xipeng and Zhang, Qi},
      journal={arXiv preprint arXiv:2104.08799},
      year={2021}
    }
    			
  2. Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation. [PDF] [arXiv] [BibTeX]
    Yige Xu, Xipeng Qiu, Ligao Zhou, Xuanjing Huang. [Abstract]
    Abstract: Fine-tuning pre-trained language models like BERT has become an effective way in NLP and yields state-of-the-art results on many downstream tasks. Recent studies on adapting BERT to new tasks mainly focus on modifying the model structure, re-designing the pre-train tasks, and leveraging external data and knowledge. The fine-tuning strategy itself has yet to be fully explored. In this paper, we improve the fine-tuning of BERT with two effective mechanisms: self-ensemble and self-distillation. Experiments on GLUE benchmark and Text Classification benchmark show that our proposed methods can significantly improve the adaption of BERT without any external data or knowledge. We conduct exhaustive experiments to investigate the efficiency of self-ensemble and self-distillation mechanisms, and our proposed methods achieve a new state-of-the-art result on the SNLI dataset.
    (A schematic sketch of self-distillation with a moving-average teacher appears after this list.)
    BibTeX:
    @article{xu2020improve,
      title={Improving BERT Fine-Tuning via Self-Ensemble and Self-Distillation},
      author={Xu, Yige and Qiu, Xipeng and Zhou, Ligao and Huang, Xuanjing},
      journal={arXiv preprint arXiv:2002.10345},
      year={2020}
    }
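
As a small illustration of the fine-grained evaluation idea in the first preprint above, the snippet below computes a token-level F1 score between a predicted and a target keyphrase, so that partial matches receive partial credit. It is a simplified sketch of one component only, not the paper's full metric, which also accounts for edit distance, duplication, and prediction quantities.

    # Toy sketch: token-level F1 between a predicted and a target keyphrase,
    # so that a partial match earns partial credit instead of a hard zero.
    from collections import Counter

    def token_f1(pred, target):
        pred_tokens, target_tokens = pred.split(), target.split()
        overlap = sum((Counter(pred_tokens) & Counter(target_tokens)).values())
        if overlap == 0:
            return 0.0
        precision = overlap / len(pred_tokens)
        recall = overlap / len(target_tokens)
        return 2 * precision * recall / (precision + recall)

    print(token_f1("graph neural network", "neural network"))        # 0.8 (partial credit)
    print(token_f1("graph neural network", "graph neural network"))  # 1.0 (exact match)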
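
The self-ensemble and self-distillation mechanisms in the second preprint can be pictured schematically: the current model (the student) is regularized toward a teacher that is a running average of its own earlier states. The toy PyTorch sketch below uses a plain linear classifier and dummy data in place of BERT; the loss weight and averaging rate are placeholders, and this is not the paper's exact formulation.

    # Toy sketch: fine-tuning with a self-distillation term, where the teacher is
    # a running parameter average of the student's earlier states (self-ensemble).
    # The model, data, and coefficients below are placeholders, not the paper's setup.
    import copy
    import torch
    import torch.nn.functional as F

    student = torch.nn.Linear(16, 4)   # stand-in for a BERT-based classifier
    teacher = copy.deepcopy(student)   # teacher starts as a frozen copy
    for p in teacher.parameters():
        p.requires_grad_(False)

    optimizer = torch.optim.AdamW(student.parameters(), lr=1e-3)
    lam, avg_rate = 1.0, 0.9           # distillation weight, parameter-averaging rate

    for step in range(100):
        x = torch.randn(8, 16)                 # dummy batch of features
        y = torch.randint(0, 4, (8,))          # dummy labels

        student_logits = student(x)
        with torch.no_grad():
            teacher_logits = teacher(x)

        # Task loss plus a distillation loss toward the teacher's predictions.
        loss = F.cross_entropy(student_logits, y) \
             + lam * F.mse_loss(student_logits, teacher_logits)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        # Self-ensemble: the teacher tracks a moving average of the student's weights.
        with torch.no_grad():
            for pt, ps in zip(teacher.parameters(), student.parameters()):
                pt.mul_(avg_rate).add_(ps, alpha=1 - avg_rate)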