Heyang Qin

Microsoft · Redmond, WA · heyangqin@microsoft.com · qysnn1@gmail.com

I work as a Researcher at Microsoft, where I am part of the DeepSpeed team. My research focuses on enhancing the efficiency of large-scale machine learning systems, covering both training and inference. I completed my Ph.D. at the University of Nevada, Reno in 2022, under the guidance of Dr. Feng Yan and Dr. Lei Yang. Before that, I earned my Bachelor's degree from the University of Electronic Science and Technology of China in 2017. My work is dedicated to exploring ways to improve the efficiency of large models.

Research

SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement

Large-scale training requires massive parallelism to finish within a reasonable amount of time. Large batch training is the key enabler of such parallelism, but it often comes at the cost of generalization performance. We propose a fully automated and lightweight adaptive batching methodology that enables fine-grained batch size adaptation (e.g., at the mini-batch level) and achieves state-of-the-art performance with record-breaking batch sizes. The core component of our method is a lightweight yet efficient representation of the critical gradient noise information. Extensive evaluations on popular benchmarks (e.g., CIFAR10, ImageNet, and BERT-Large) demonstrate that the proposed methodology outperforms state-of-the-art adaptive batching approaches and hand-tuned static strategies in both performance and batch size. In particular, we achieve a new state-of-the-art batch size of 78k in BERT-Large pretraining with a SQuAD score of 90.69, compared to 90.58 reported by the previous state of the art with a 59k batch size.
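The intuition behind gradient-similarity-based adaptive batching can be illustrated with a toy sketch: split a batch into two halves, compare their gradients, and grow the batch when the halves agree (low gradient noise). This is only an illustrative simplification, not the actual SimiGrad controller; the thresholds, growth factor, and limits below are made-up parameters for demonstration.

```python
import math

def cosine_similarity(g1, g2):
    """Cosine similarity between two flattened gradient vectors."""
    dot = sum(a * b for a, b in zip(g1, g2))
    na = math.sqrt(sum(a * a for a in g1))
    nb = math.sqrt(sum(b * b for b in g2))
    return dot / (na * nb)

def adapt_batch_size(grad_half_a, grad_half_b, batch_size,
                     target=0.9, factor=2, min_bs=256, max_bs=78_000):
    """Toy adaptive-batching rule: if the gradients of the two
    half-batches point in nearly the same direction, the gradient
    estimate is low-noise, so a larger batch is affordable; if they
    disagree strongly, shrink the batch. Thresholds are illustrative."""
    sim = cosine_similarity(grad_half_a, grad_half_b)
    if sim > target:
        batch_size = min(batch_size * factor, max_bs)
    elif sim < target / 2:
        batch_size = max(batch_size // factor, min_bs)
    return batch_size, sim
```

In a real training loop, the two half-batch gradients would come essentially for free from the two halves of the data-parallel workers, which is what keeps the measurement lightweight.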

Region Based Reinforcement Learning Scheduling Framework for MLaaS

The parallelism settings in Machine Learning as a Service (MLaaS) have a critical impact on system performance. Tuning the parallelism configuration is challenging because of complex dependencies and a large search space. We propose a region-based reinforcement learning (RRL) approach that converges to a near-optimal configuration orders of magnitude faster than traditional reinforcement learning. The proposed RRL is prototyped and evaluated using several real-world machine learning workloads. Both theoretical analysis and experimental evaluation show that RRL outperforms state-of-the-art tuning algorithms for MLaaS.
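The core idea of searching by region rather than by individual configuration can be sketched with a simple bandit-style toy: partition the configuration range into regions, track a running average reward per region, and sample configurations mostly from the most promising region. This is a hypothetical simplification for intuition only, not the RRL algorithm itself; `evaluate`, `n_regions`, `eps`, and the update rule are all illustrative assumptions.

```python
import random

def region_search(evaluate, low, high, n_regions=4, steps=40, eps=0.2, seed=0):
    """Toy region-based search over a 1-D configuration range.
    Each region keeps a running average of observed rewards; an
    epsilon-greedy policy picks a region, then samples a concrete
    configuration uniformly inside it."""
    rng = random.Random(seed)
    width = (high - low) / n_regions
    totals = [0.0] * n_regions
    counts = [0] * n_regions
    best_cfg, best_reward = None, float("-inf")
    for _ in range(steps):
        if rng.random() < eps or not any(counts):
            r = rng.randrange(n_regions)          # explore a random region
        else:                                      # exploit the best region so far
            r = max(range(n_regions),
                    key=lambda i: totals[i] / counts[i] if counts[i] else float("-inf"))
        cfg = low + width * (r + rng.random())     # sample a config inside region r
        reward = evaluate(cfg)
        totals[r] += reward
        counts[r] += 1
        if reward > best_reward:
            best_cfg, best_reward = cfg, reward
    return best_cfg, best_reward
```

The benefit over plain per-configuration search is that one observation informs the value of a whole region, which is one plausible way the search space gets pruned quickly.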

RRL Plus: Adaptive Region Based Reinforcement Learning for Machine Learning

RRL is sensitive to region size. An excessive region size leads to a large performance gap between the RRL solution and the optimal one, whereas an inadequate region size leads to a longer learning process. We further extend the region-based reinforcement learning algorithm with Bayesian optimization and a heuristic algorithm, enabling it to automatically adjust the region size to achieve fast convergence and near-optimal solutions.

Semi-supervised Learning for Large Scale Noisy Data

A large amount of data is available for scientific use. Unfortunately, accurate labels for training data are usually produced manually and are expensive to obtain, leaving insufficient labeled data to train machine learning models. We address this problem by using a generative model to ensemble multiple state-of-the-art models, achieving better detection from very noisy training data.

Publications

Heyang Qin, Samyam Rajbhandari, Olatunji Ruwase, Feng Yan, Lei Yang, Yuxiong He, SimiGrad: Fine-Grained Adaptive Batching for Large Scale Training using Gradient Similarity Measurement, in Proceedings of the Neural Information Processing Systems 2021 (NeurIPS 2021), Virtual, December, 2021 (Acceptance rate: 2371/9122=26%). [Slides]

Heyang Qin, Syed Zawad, Yanqi Zhou, Sanjay Padhi, Lei Yang, and Feng Yan, Reinforcement Learning Empowered MLaaS Scheduling for Serving Intelligent Internet of Things, IEEE Internet of Things Journal, 2020 (Impact factor: 9.515).

Heyang Qin, Syed Zawad, Yanqi Zhou, Lei Yang, Dongfang Zhao, Feng Yan, Swift Machine Learning Model Serving Scheduling: A Region Based Reinforcement Learning Approach, in Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis (SC 2019), Denver, CO, USA, Nov, 2019 (Acceptance rate: 78/344=22%). [Slides]

Experience

Researcher

Microsoft

I am currently working as a researcher on the Microsoft DeepSpeed team. My work focuses on improving the efficiency and scalability of large-scale machine learning systems through system optimizations as well as CUDA kernel optimization.

January 2023 - Present

Research Intern

Microsoft

I worked as a research intern on the Microsoft DeepSpeed team. My work focused on using adaptive methods to optimize large-scale machine learning in terms of performance and scalability. My work SimiGrad achieved a record-breaking batch size of 78k in BERT pretraining while maintaining state-of-the-art model performance.

October 2020 - December 2022

Research Assistant

University of Nevada, Reno

During this period, I worked on research in reinforcement learning and deep learning, as well as their applications to cloud computing.

August 2017 - December 2022

Teaching Assistant

University of Nevada, Reno

As a teaching assistant, my duties included teaching lab sections, holding office hours, and grading.

August 2017 - January 2021

Education

University of Nevada, Reno

Ph.D.
Computer Science and Engineering

GPA: 4.00

August 2017 - December 2022

University of Electronic Science and Technology of China

Bachelor
Automation Engineering
August 2013 - May 2017

Teaching

CPE 201 - Digital Design

University of Nevada, Reno

I worked as a teaching assistant for CPE 201 with Dr. Hung La and Dr. Siming Liu.

August 2017 - June 2019

ENGR 100 - Introduction to Engineering Design

University of Nevada, Reno

I worked as a teaching assistant for ENGR 100 with Dr. Ann-Marie Vollstedt, Prof. Kelly Keselica, and Dr. Adam Kirn.

June 2019 - December 2019

Personal Interests

Apart from being a researcher, I participate in multiple open-source projects, contributing code, translations, and more. I also play table tennis and volleyball occasionally in my free time.

Indoors, I usually enjoy video games and detective fiction. I also have a strong interest in, and deep knowledge of, classical Chinese and ancient Chinese literature.