[Distinguished Lecture 2026-5] On Reusing Model Parameters
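- Speaker: Dacheng Tao (陶大程), Distinguished Professor (College of Computing & Data Science, Nanyang Technological University)
- Host: Hong-Yuan Mark Liao (廖弘源)
- Time: 2026-07-06 (Mon.) 10:00 ~ 12:00
- Venue: Auditorium 106, New Building, Institute of Information Science
- Online streaming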
Abstract
The rapid rise of large language models and foundation models has fundamentally reshaped the paradigm of deep learning. Beyond conventional end-to-end training, an increasingly important question is how to effectively reuse existing model parameters to improve performance, scalability, and data efficiency. Parameter reuse, through model fusion, knowledge transfer, model editing, and adaptation with limited data, offers a principled alternative to training ever-larger models from scratch.
In this talk, we revisit recent advances in deep model fusion from the broader perspective of parameter reuse. We present a systematic taxonomy of existing approaches and analyze their underlying mechanisms, scalability, and theoretical implications. In particular, we introduce our recent developments, including (1) weight-learning-based model fusion and data-adaptive MoE upscaling, (2) subspace learning approaches that exploit structured parameter geometry, and (3) enhanced multi-task fusion strategies that integrate pre- and post-finetuning to reduce representation bias between merged and task-specific models.
By framing model fusion as a principled study of parameter reuse, we highlight both its practical advantages (improved efficiency, robustness, and reduced reliance on annotated data) and the open challenges that remain for large-scale foundation models.