Zhengxiang Shi

I am a PhD student supervised by Prof Aldo Lipani and Prof Emine Yilmaz at University College London, affiliated with the Web Intelligence Group and SpaceTimeLab. I am currently working as an Intern of Technical Staff on the Cohere Command Team in London. Previously, I completed two internships as an Applied Scientist at Amazon, in the London and Seattle offices.

Prior to pursuing my PhD, I obtained a Master's degree in Data Science (Statistics) with Distinction from University College London, and a Bachelor's degree in Mathematics with First Class Honours from the University of Liverpool and Xi'an Jiaotong-Liverpool University.

Central to my research is the ambition to leverage language models efficiently and robustly to solve general tasks. My selected publications below reflect this focus.

Google Scholar  /  Twitter  /  Github  /  LinkedIn  /  Email


Research (Selected)


Instruction Tuning With Loss Over Instructions
Zhengxiang Shi, Adam X. Yang, Bin Wu, Laurence Aitchison, Emine Yilmaz, Aldo Lipani
Preprint, 2024  
Paper / Github / Community Discussion

We show that in certain scenarios, applying the loss to instructions as well as outputs, which we refer to as Instruction Modelling, can substantially improve the performance of instruction tuning on both NLP tasks and open-ended generation benchmarks. Remarkably, in the most advantageous case, our approach boosts model performance on AlpacaEval 1.0 by over 100%.


DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
Zhengxiang Shi, Aldo Lipani
International Conference on Learning Representations (ICLR), 2024  
Paper / Github / Trending in Community

Improves the efficiency of Prompt Tuning in both time and memory by over 20% (with T5-Base as the backbone), while achieving better performance. DePT grows more efficient as the model size increases.



Don't Stop Pretraining? Make Prompt-based Fine-tuning Powerful Learner
Zhengxiang Shi, Aldo Lipani
Advances in Neural Information Processing Systems (NeurIPS), 2023  
Paper / Github / Trending in Community

Combines the ideas of instruction tuning and language modelling. Represents the first work to perform instruction tuning via unsupervised objectives. Boosts prompt-based fine-tuning performance by over 20% in absolute terms.


Rethinking Semi-supervised Learning with Language Models
Zhengxiang Shi, Francesco Tonolini, Nikolaos Aletras, Emine Yilmaz, Gabriella Kazai, Yunlong Jiao
Association for Computational Linguistics (Findings of ACL), 2023  
Paper / Github

Shows that Task-adaptive Pre-training (TAPT) is a simple yet effective method for semi-supervised learning, often achieving SoTA performance. Highlights the effectiveness of TAPT even with only a few hundred unlabelled samples, contrary to the common belief that continued pre-training requires a large amount of unlabelled data.

Teaching Activities

Guest Lecturer: Applied Artificial Intelligence
University College London, Academic year 2023/24

Guest Lecturer: Machine Learning for Data Science
University College London, Academic year 2023/24

Teaching assistant: Statistical Natural Language Processing
University College London, Academic year 2023/24

Teaching assistant: Geospatial Programming
University College London, Academic year 2023/24

Co-supervisor: MSc Research Project
University College London, Academic year 2022/23

Teaching assistant: Machine Learning for Data Science
University College London, Academic year 2022/23

Teaching assistant: Geospatial Programming
University College London, Academic year 2022/23

Teaching assistant: Machine Learning for Data Science
University College London, Academic year 2021/22

Teaching assistant: Geospatial Programming
University College London, Academic year 2021/22

Teaching assistant: Machine Learning for Data Science
University College London, Academic year 2020/21

Teaching assistant: Geospatial Programming
University College London, Academic year 2020/21

Academic Services

Program Committee: NeurIPS (2023, 2024), ICML (2024), AAAI (2023, 2024), COLM (2024), ACL ARR (Feb. 2023 - Jan. 2024), ACL (2023), EMNLP (2022, 2023), EACL (2023), COLING (2023, 2024), ECML/PKDD (2022), KDD (2023), SIGIR (2022, 2023, 2024), ECIR (2024), SDM (2024)