AI Thrust Seminar | Translating Across Modalities: Innovations in Visual-to-Text and Text-to-Visual Generation

9:00am - 10:00am
Zoom Meeting ID: 913 6608 1819, Passcode: 488490

Cross-modal generation is an important task under the generative AI umbrella; within it, I have focused on visual-to-text and text-to-visual generation. Translating semantic information freely across modalities poses two challenges: (1) handling unrecognizable visual instances, and (2) generating controllable, complex content with high quality. This talk introduces solutions from several viewpoints. First, I will discuss unsupervised language structure inference and the uncovering of domain-specific concepts, both aimed at improving the performance of visual-to-text generation models. Then, I will present inversion and online alignment frameworks that simultaneously achieve high-fidelity visual generation and cross-modal semantic matching. These findings have been validated in various scenarios and are potentially promising for domains such as game development and health care.

Event Format
Speakers / Performers:
Mr. Hao WANG
Final-year PhD candidate in the School of Computer Science and Engineering, Nanyang Technological University, Singapore

Hao WANG is a final-year PhD candidate in the School of Computer Science and Engineering at Nanyang Technological University, Singapore. He received his B.E. degree from Huazhong University of Science and Technology. His research interests lie in developing AI-powered perception and generation algorithms for the multimodal domain. In particular, his recent work investigates translation between visual and text data, with the aim of generating controllable content efficiently and robustly. He has published first-authored work in top-tier conferences and journals in computer vision and multimedia, including CVPR, ECCV, ACM MM, IEEE TPAMI, IEEE TIP, and IEEE TMM.

Language
English
Recommended For
Faculty and staff
General public
PG students
UG students
Organizer
Artificial Intelligence Thrust, HKUST(GZ)