I’m a second-year master’s student at Zhejiang University. I have published several first-author papers at top AI conferences such as ICLR and ACM MM. My research focuses on Multimodal Large Language Models, especially applications of Vision-Language Models and effective fine-tuning strategies. Recently, I’ve been particularly interested in streaming video understanding, which aims to enable models to continuously interpret live video streams with strong temporal reasoning and timely responses. My long-term goal is to build a truly user-friendly AI assistant (reliable, practical, and proactive) that can understand visual content, communicate naturally, and help users accomplish real-world tasks with a consistently solid experience.
📖 Education
- 2024.09 - 2027.06 (now), Software School, Software Engineering, Zhejiang University.
- 2020.09 - 2024.06, Undergraduate, Software College, Software Engineering (International (English)), Northeastern University.
🎖 Honors and Awards
- 2025.10 National Scholarship.
📝 Publications
Parameter-Efficient Fine-Tuning

Diff-Prompt: Diffusion-Driven Prompt Generator with Mask Supervision
Weicai Yan, Wang Lin, Zirun Guo, Ye Wang, Fangming Feng, Xiaoda Yang, Zehan Wang, Tao Jin
- Code.
- Prompt Visualization.


Low-rank Prompt Interaction for Continual Vision-Language Retrieval
Weicai Yan, Ye Wang, Wang Lin, Zirun Guo, Zhou Zhao, Tao Jin
- Code.
💻 Internships
- 2025.08 - 2026.02, MSRA, Social Computing Group, Beijing.
Services
Reviewer: ACM MM 2025, ICLR 2025, ICLR 2026