[DLS 2026-5] On Reusing Model Parameters
- Lecturer: Distinguished Professor Dacheng Tao (College of Computing & Data Science, Nanyang Technological University)
- Host: Mark Liao
- Time: 2026-07-06 (Mon.) 10:00 ~ 12:00
- Location: Auditorium N106 at IIS
Abstract
The rapid rise of large language models and foundation models has fundamentally reshaped the paradigm of deep learning. Beyond conventional end-to-end training, an increasingly important question is how to effectively reuse existing model parameters to improve performance, scalability, and data efficiency. Parameter reuse, through model fusion, knowledge transfer, model editing, and adaptation with limited data, offers a principled alternative to training ever-larger models from scratch.
In this talk, we revisit recent advances in deep model fusion from the broader perspective of parameter reuse. We present a systematic taxonomy of existing approaches and analyze their underlying mechanisms, scalability, and theoretical implications. In particular, we introduce our recent developments, including (1) weight-learning-based model fusion and data-adaptive MoE upscaling, (2) subspace learning approaches that exploit structured parameter geometry, and (3) enhanced multi-task fusion strategies that integrate pre- and post-finetuning to reduce representation bias between merged and task-specific models.
By framing model fusion as a principled study of parameter reuse, we highlight both its practical advantages (improved efficiency, robustness, and reduced reliance on annotated data) and the open challenges that remain for large-scale foundation models.