Department of Electronic and Computer Engineering Seminar - Probing CLIP's Comprehension of 360-Degree Textual and Visual Semantics

11:00am - 12:00pm
Room 4503 (Lift 25/26) Floor 4, Academic Building, HKUST

The dream of instantly creating rich 360-degree panoramic worlds from text is rapidly becoming a reality, yet a crucial gap remains in our ability to reliably evaluate how well the generated panoramas align semantically with the text. Contrastive Language-Image Pre-training (CLIP) models, which are standard AI evaluators, are predominantly trained on perspective image-text pairs, so it remains an open question how well they understand the unique characteristics of 360-degree panoramic image-text pairs. In this talk, we will present some of our preliminary efforts to address this gap. This is joint work with Hai Wang (UCL), Xiaochen Yang (Glasgow), and Mingzhi Dong (Bath).

Event Format
Speakers / Performers:
Prof. Jinghao XUE
Department of Statistical Science, University College London, UK

Jinghao Xue received the Dr.Eng. degree in signal and information processing from Tsinghua University in 1998, and the Ph.D. degree in statistics from the University of Glasgow in 2008. He is a Professor of Statistical Pattern Recognition in the Department of Statistical Science, University College London. His research interests include statistical pattern recognition, machine learning, and computer vision. He is a Senior Area Editor of the IEEE Transactions on Circuits and Systems for Video Technology.

Language
English
Recommended For
Faculty and staff
PG students
UG students
Organizer
Department of Electronic & Computer Engineering