Department of Electronic and Computer Engineering - Seminar - Post Deep Learning Era: Foundation Models
Abstract:
Foundation models, which have enormous numbers of parameters and are trained on broad data, were born out of questioning and criticism. Owing to these characteristics, they exhibit superior performance on many challenges, ranging from natural language processing to computer vision, and have therefore become mainstream in both academic research and industrial practice. Empowered by the exponential growth of data and computing capability in recent years, the conceptual shift towards adopting foundation models took only one year, in contrast to the mainstream adoption of deep learning, which took the research community over five years. With the success of foundation models, we are gradually moving towards the post-deep-learning era. However, there are still plenty of opportunities to explore and thoroughly uncover the value of foundation models on different practical tasks. Accordingly, we need to discover the working mechanisms of foundation models, establish the theoretical foundations of deep learning, and envision their potential development trajectory. In this talk, I will present some recent progress. Specifically, I will begin with the development history of neural networks. I will then briefly present some preliminary theoretical findings that validate the merits of depth (many layers), width, hugeness (a huge number of parameters), multi-modality (in the input, e.g. image and text), and multi-tasking (in the output, e.g. detection and segmentation). Guided by this theory, we introduce inductive biases to reduce the training cost of vision transformers by devising ViTAE, VSA, and RegionCL. In extensive experiments on various vision tasks, including object recognition, detection, pose estimation, human matting, text detection, and remote sensing image understanding, these models obtain promising performance.
The integration of the historical path, the preliminary theoretical implications, and the comprehensive empirical evaluations suggests the trend of foundation models: more is different, simpler yet refined.
Dacheng Tao (Fellow, IEEE) is the Inaugural Director of the JD Explore Academy and a Senior Vice President of JD.com. He is also an Advisor and Chief Scientist of the Digital Sciences Initiative at The University of Sydney. He mainly applies statistics and mathematics to artificial intelligence and data science. His research is detailed in one monograph and over 200 publications in prestigious journals and proceedings of leading conferences. He is a Fellow of the Australian Academy of Science, AAAS, ACM, and IEEE. He received the Australian Eureka Prize twice, in 2015 and 2020, as well as the 2018 IEEE ICDM Research Contributions Award and the 2021 IEEE Computer Society McCluskey Technical Achievement Award.