2024 International Conference on Image Processing and Computer Applications (IPCA 2024)




Prof. Guanbin Li

National Excellent Youth

Sun Yat-sen University, China

BIO: Guanbin Li is a Professor and PhD supervisor at the School of Computer Science, Sun Yat-sen University, and a recipient of the National Science Fund for Distinguished Young Scholars. His main research areas include cross-modal visual perception, understanding, and generation. To date, he has published over 140 papers in CCF Class A/Chinese Academy of Sciences Zone 1 journals, which have been cited more than 11,000 times on Google Scholar. He has received numerous awards, including the Wu Wenjun AI Excellent Youth Award, an ACM China Rising Star Nomination, the First Prize of Science and Technology from the China Society of Image and Graphics, and a Best Paper Nomination at ICCV.

The Speech Title: Cross-modal Alignment for Visual Content Understanding and Generation

Abstract: The rapid development of single-modal content understanding, in areas such as vision and language, has raised the requirements for cross-modal learning technologies such as cross-modal information retrieval, content generation, and intelligent human-computer interaction. Cross-modal representation and generation are the two core problems in cross-modal learning. Cross-modal representation aims to enhance feature expression by learning to align semantics across multiple modalities, while cross-modal generation relies on the semantic consistency between modalities to convert data from one modality into another. Explicit cross-modal semantic alignment is central to fine-grained, parsable cross-modal understanding. In this talk, I will introduce our research on cross-modal semantic alignment from several perspectives, including graph network information propagation, multimodal large model distillation, knowledge embedding, and structural consistency representation. I will also present the application and validation of these technologies in fields such as cross-modal visual target localization, cross-modal medical information processing, and digital human video generation.

Assoc. Prof. Jianguo Chen

Joint PhD program at the University of Illinois in the United States

Young academic backbone

Sun Yat-sen University, China

BIO: Jianguo Chen is currently an Associate Professor and one of the Hundred Academic Talents in the School of Software Engineering of Sun Yat-sen University (SYSU), China. He received his Ph.D. degree in Computer Science and Technology from Hunan University, China, in 2018. Before joining SYSU, he was a postdoc at the University of Toronto in Canada and a research scientist at the Agency for Science, Technology and Research (A*STAR) in Singapore. His major research interests include high-performance artificial intelligence, computer vision, medical image analysis, and their applications in intelligent medicine. He has published more than 60 research papers in international conferences and journals such as IEEE-TII, IEEE-TITS, IEEE-TPDS, IEEE-TKDE, IEEE/ACM-TCBB, ACM-TIST, and ACM-TCPS. He currently serves as an Associate Editor of the International Journal of Embedded Systems and the Journal of Current Scientific Research, a guest editor of Information Sciences and Natural Computing and Applications, and a technical committee member of multiple international academic conferences.

The Speech Title: Configurable Artificial Intelligence Platform for Medical Image Analysis and Disease Diagnosis

Abstract: The intersection of artificial intelligence (AI) and medical imaging has emerged as a transformative force in healthcare, promising enhanced diagnostic capabilities and improved patient outcomes. Traditional methods of medical image analysis often struggle with the intricacy and diversity of imaging data, which can lead to diagnostic errors and inefficiencies. This talk will present the development and applications of a configurable AI platform tailored for the comprehensive analysis of medical images and accurate diagnosis of various diseases. The platform integrates state-of-the-art image processing techniques and artificial intelligence models, allowing healthcare professionals to streamline the diagnostic workflow and detect disease more efficiently. Multiple applications will be introduced, such as identification of thyroid nodules, segmentation and diagnosis of hepatocellular carcinoma, and diagnosis of gastric ulcers. The project underscores the potential impact of this configurable AI platform in advancing medical image analysis and disease diagnosis within the healthcare domain.


Asst. Prof. Liming Zhang, IEEE Member

University of Macau, China

BIO: Liming Zhang obtained her B.S. degree in Computer Software from Nankai University, China, and her M.S. degree in Signal Processing from Nanjing University of Science and Technology, China. She completed her PhD in image processing at the University of New England, Australia. Currently, she serves as an assistant professor in the Faculty of Science and Technology at the University of Macau. Her research interests encompass computer vision, image processing, artificial intelligence, machine learning, and deep learning. She has over 100 publications to her credit, in venues including IEEE Transactions on Image Processing, IEEE Transactions on Signal Processing, and CVPR. Her primary contribution lies in the development of novel image and signal processing methodologies based on adaptive Fourier decomposition (AFD). The application of stochastic AFD (SAFD) to image and video compression has surpassed existing international standards such as JPEG2000 and MPEG, and has even outperformed the compression results of popular deep networks. Because its core technology is stable, it is expected to challenge future international image and video compression standards.

The Speech Title: Image Adaptive Sparse Learning based on Stochastic Adaptive Fourier Decomposition and Its Applications

Abstract: Deep learning has been widely used in the field of computer vision. From a broad perspective, deep learning is one of the sparse representation methods for images. Sparse representation is a widely used machine learning method that represents images based on dictionaries. It can be divided into two types according to how the dictionary is constructed: analytic and learning-based sparse representation. The dictionaries used in analytic methods are predefined, while those used in learning-based methods are obtained through training. The former has a formulaic decomposition process but no adaptivity to the image; the latter has the advantage of adaptivity, but the learned dictionary is unstructured. Adaptive Fourier decomposition (AFD) is a newly developed sparse representation theory originating in Macau that combines the advantages of both types: the dictionary is predefined using interpretable mathematical kernels, and the decomposition is achieved by adaptively choosing atoms from the dictionary. Stochastic AFD (SAFD) combines AFD and random signal theory to learn the common sparse representation of multiple images at once. This is the first time this learning method has been implemented in the literature. It is used for image and video compression, and the results exceed the current international standards for image and video compression, including JPEG, JPEG2000, and MPEG. Owing to the stability of its core technology, it has the potential to challenge existing standards and become a new international image and video compression standard.