Prof. Moncef Gabbouj


IEEE Fellow

Department of Signal Processing

Tampere University of Technology

Tampere, Finland

Web page

Title of the talk: 

"Novel Machine Learning Solutions for Pertinent Applications"

- A summary of the talk: 

In this talk, we present a hierarchical layered approach that exploits all types of sensor and non-sensor signals and designs suitable representation, processing and analysis algorithms to which machine learning, both deep and shallow, can be applied. This leads to specific applications in healthcare and wellbeing, surveillance, and media and entertainment, to mention a few. We focus on novel neural network topologies that enable machine learning solutions in several important applications.

- A short bio:

Dr. Gabbouj is a Professor of Signal Processing at the Department of Signal Processing, Tampere University of Technology, Tampere, Finland, where he leads the Multimedia Research Group. His research interests include Big Data analytics, multimedia content-based analysis, indexing and retrieval, artificial intelligence, machine learning, pattern recognition, nonlinear signal and image processing and analysis, voice conversion, and video processing and coding. He has published two books and over 700 journal and conference papers, and has supervised 46 doctoral and 58 Master's theses.

Dr. Gabbouj is an IEEE Fellow. He is a member of the IEEE Fourier Award Committee. He served as Distinguished Lecturer for the IEEE Circuits and Systems Society in 2004-2005, and is Past-Chairman of the IEEE-EURASIP NSIP (Nonlinear Signal and Image Processing) Board. He served as associate editor of the IEEE Transactions on Image Processing, and was guest editor of Multimedia Tools and Applications and of the European journal Applied Signal Processing. He is the past chairman of the IEEE Finland Section and of the IEEE Circuits and Systems Society Technical Committee on Digital Signal Processing. Dr. Gabbouj is the General Co-Chair of ICIP 2020. He is also a member of the EURASIP Advisory Board and a past member of AdCom. He also served as Publication Chair and Publicity Chair of IEEE ICIP 2005 and IEEE ICASSP 2006, respectively. He was also the supervisor of the main author receiving the IBM Best Paper Award at ICPR 2014 and IPTA 2016.

 

Prof. Hamid Krim


IEEE Fellow

Department of Electrical & Computer Engineering

North Carolina State University

Raleigh, USA

Web page

Title of the talk: 

"Deep Structure Learning: A Scale-based Paradigm"

- A short bio:

Hamid Krim received his B.Sc. and M.Sc. degrees in EE from the University of Washington and a Ph.D. degree in ECE from Northeastern University. He was a Member of Technical Staff at AT&T Bell Labs, where he conducted research and development in the areas of telephony and digital communication systems/subsystems. Following an NSF postdoctoral fellowship at Foreign Centers of Excellence, LSS/University of Orsay, Paris, France, he joined the Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA, as a Research Scientist, where he performed and supervised research. He is presently Professor of Electrical Engineering in the ECE Department, North Carolina State University, Raleigh, leading the Vision, Information and Statistical Signal Theories and Applications group. His research interests are in statistical signal/image analysis and Geometric Machine Learning, with a keen emphasis on applied problems in classification and recognition using geometric and topological tools.

 

Prof. Bhiksha Raj


IEEE Fellow

Language Technologies Institute

School of Computer Science

Carnegie Mellon University

USA

Web page

Title of the talk: 

"Predicting Faces From Voices"

- A summary of the talk: 

A person's face is predictive of their voice. Biologically, this is to be expected: the same genetic, physical and environmental influences that affect the face also affect the voice. More directly, the vocal tract that generates voice also partially shapes the face. It has also been demonstrated that human subjects are often able to correctly associate voices with images of the speaker's face, given a selection to choose from. The recent availability of multi-modal datasets has spurred research on whether a person's face can be detected from their voice by algorithmic means. In this talk I will explore this idea, and present recent work, by our research group and elsewhere, on approaches, largely neural-network-based, to predicting faces from voices. I will first describe methods that try to directly learn voice-to-face associations by assuming a direct dependence between the two. I will then move on to show how non-random performance in this regard may be explained by relationships to covariates such as gender, and show how similar, or possibly even better, results may be obtained by directly relating the individual modalities to covariates. Finally, I will demonstrate examples of how a face image may be completely reconstructed from voice.

- A short bio:

Bhiksha Raj is a professor in the School of Computer Science at Carnegie Mellon University, with primary affiliation to the Language Technologies Institute, and secondary affiliations to the Machine Learning, Electrical and Computer Engineering, and Music Technology departments. His areas of interest are automatic speech recognition, audio processing, machine learning, and privacy. Dr. Raj is a fellow of the IEEE.

 

 

Assoc. Prof. Huimin Ma


Department of Electronic Engineering

Director of 3D Image Lab

Tsinghua University

Beijing, China

Web page

Title of the talk: 

"3D Scene Cognition and Multi-Modal Learning"

- A summary of the talk: 

Extracting knowledge of objects in complex environments is a challenging task. This lecture introduces Prof. Ma’s work on knowledge-based object detection in autonomous driving, which includes a cognitive model, salient object detection, a semantic structure model, and multi-modal and multi-view learning. The cognitive model of “thinking in 3D” achieved outstanding performance on the KITTI, VOC, and Cityscapes datasets, and has been published in TPAMI, TIP, PR, CVPR, ICCV, etc.

- A short bio:

Huimin Ma is an associate professor in the Department of Electronic Engineering at Tsinghua University, and the director of the 3D Image Lab. She worked as a visiting scholar at the University of Pittsburgh in 2011. She is also the vice-chairman and secretary-general of the China Society of Image and Graphics. Her research and teaching interests are 3D image cognition theory and visual perception for autonomous driving. She introduces semantic priors from cognitive psychology into machine learning, and has put forward the method of “thinking in 3D” to study object detection and recognition in complex scenes.

She received the second-class prize of the Technical Invention Award of the Ministry of Education of China and a silver medal at the Geneva International Invention Exhibition in 2017, and the highest award for AI in China (Wu Wenjun Artificial Intelligence Science and Technology Innovation Award, First Class) in 2016. Her team achieved top performance in the international evaluation of object detection for autonomous driving on the KITTI Benchmark in 2015.

 

Prof. Vincent Charvillat


Department of Computer Science & Applied Maths

School of ENSEEIHT Engineering

University of Toulouse

France

Web page

Title of the talk: 

"Combining human computation and visual content analysis"

- A short bio:

Vincent CHARVILLAT received his Ph.D. degree in Computer Science from the National Polytechnic Institute of Toulouse in 1997. He is currently a full professor at the University of Toulouse, IRIT research lab, ENSEEIHT Eng. School, and an associate member of the IPAL laboratory, Singapore. Vincent CHARVILLAT is the head of the REVA research team at ENSEEIHT. His main research interests are visual processing and multimedia applications. Current topics of research include visual object processing, visual compositing, visual interaction design and crowdsourcing in multimedia.