Institute of Information Science, Academia Sinica

Events

Academic Seminars

[Distinguished Lecture 2026-5] On Reusing Model Parameters

  • Speaker: Dacheng Tao (陶大程), Distinguished Professor (College of Computing & Data Science, Nanyang Technological University)
    Host: Hong-Yuan Mark Liao (廖弘源)
  • Time: 2026-07-06 (Mon.) 10:00 ~ 12:00
  • Venue: Auditorium 106, IIS New Building
Live stream
Dr Dacheng Tao is currently a Distinguished University Professor in the College of Computing & Data Science at Nanyang Technological University. He mainly applies statistics and mathematics to artificial intelligence and data science. His research is detailed in one monograph and over 200 publications in prestigious journals and leading conference proceedings, and has earned best paper awards, best student paper awards, and test-of-time awards. His publications have been cited over 112K times, and he has an h-index of 160+ on Google Scholar. He received the Australian Eureka Prize in 2015 and 2020, the 2018 IEEE ICDM Research Contributions Award, and the 2021 IEEE Computer Society McCluskey Technical Achievement Award. He is a Fellow of the Australian Academy of Science, AAAS, ACM, and IEEE.
Abstract

The rapid rise of large language models and foundation models has fundamentally reshaped the paradigm of deep learning. Beyond conventional end-to-end training, an increasingly important question is how to effectively reuse existing model parameters to improve performance, scalability, and data efficiency. Parameter reuse, through model fusion, knowledge transfer, model editing, and adaptation with limited data, offers a principled alternative to training ever-larger models from scratch.

In this talk, we revisit recent advances in deep model fusion from the broader perspective of parameter reuse. We present a systematic taxonomy of existing approaches and analyze their underlying mechanisms, scalability, and theoretical implications. In particular, we introduce our recent developments, including (1) weight-learning-based model fusion and data-adaptive MoE upscaling, (2) subspace learning approaches that exploit structured parameter geometry, and (3) enhanced multi-task fusion strategies that integrate pre- and post-finetuning to reduce representation bias between merged and task-specific models.

By framing model fusion as a principled study of parameter reuse, we highlight both its practical advantages (improved efficiency, robustness, and reduced reliance on annotated data) and the open challenges that remain for large-scale foundation models.