I’m a second-year master’s student at Zhejiang University. I have published several papers as the first author at top AI conferences such as ICLR and ACM MM. My research focuses on Multimodal Large Language Models, especially the applications of Vision-Language Models and effective fine-tuning strategies. Recently, I’ve been particularly interested in streaming video understanding, aiming to enable models to continuously interpret live video streams with strong temporal reasoning and timely responses. My long-term goal is to build a truly user-friendly AI assistant—reliable, practical, and proactive—that can understand visual content, communicate naturally, and help users accomplish real-world tasks with a consistently solid experience.

📖 Education

  • 2024.09 - 2027.06 (now), Master's student, Software School, Software Engineering, Zhejiang University.
  • 2020.09 - 2024.06, Undergraduate, Software College, Software Engineering (International, English), Northeastern University.

🎖 Honors and Awards

  • 2025.10 National Scholarship.

📝 Publications


Parameter-Efficient Fine-Tuning

ICLR 2025

Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision

Weicai Yan, Wang Lin, Zirun Guo, Ye Wang, Fangming Feng, Xiaoda Yang, Zehan Wang, Tao Jin

  • Code.
  • Prompt Visualization.

ACM MM 2024

Low-rank Prompt Interaction for Continual Vision-Language Retrieval

Weicai Yan, Ye Wang, Wang Lin, Zirun Guo, Zhou Zhao, Tao Jin

💻 Internships

Services

Reviewer: ACM MM 2025, ICLR 2025, ICLR 2026