Department of Electronic and Computer Engineering Seminar - Probing CLIP's Comprehension of 360-Degree Textual and Visual Semantics

11:00am - 12:00pm
Room 4503 (Lift 25/26) Floor 4, Academic Building

The dream of instantly creating rich 360-degree panoramic worlds from text is rapidly becoming a reality, yet a crucial gap remains in our ability to reliably evaluate how well the generated panoramas align semantically with their text prompts. Contrastive Language-Image Pre-training (CLIP) models, the standard AI evaluators for this purpose, are trained predominantly on perspective image-text pairs, so it remains an open question how well they understand the unique characteristics of 360-degree panoramic image-text pairs. In this talk, we will present some of our preliminary efforts to address this gap. This is joint work with Hai Wang (UCL), Xiaochen Yang (Glasgow), and Mingzhi Dong (Bath).

 

Speaker/Performer:
Prof. Jinghao XUE

Jinghao Xue received the Dr.Eng. degree in signal and information processing from Tsinghua University in 1998, and the Ph.D. degree in statistics from the University of Glasgow in 2008. He is a Professor of Statistical Pattern Recognition in the Department of Statistical Science, University College London. His research interests include statistical pattern recognition, machine learning, and computer vision. He is a Senior Area Editor of the IEEE Transactions on Circuits and Systems for Video Technology.

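Language
English
Recommended for
Faculty and staff
Postgraduate students
Undergraduate students
Organizer
Department of Electronic and Computer Engineering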