Machine learning (ML) models, especially foundation models, have achieved great success in many perception applications, but every coin has two sides, and so does AI. Concerns have been raised about their security, robustness, privacy, and transparency when they are applied to real-world applications. Irresponsibly deploying foundation models in mission-critical and human-centric domains such as healthcare, education, and law can lead to serious misuse, inequity, negative economic and environmental impacts, and legal and ethical concerns. For example, ML models are often regarded as "black boxes" and can produce unreliable, unpredictable, and unexplainable outcomes, especially under domain shifts or maliciously crafted attacks, which challenges their reliability in safety-critical applications (e.g., autonomous driving); Stable Diffusion may generate NSFW or privacy-violating content.
Unlike conventional tutorials that focus on either the positive or the negative impacts of AI, this tutorial aims to provide a holistic and complementary overview of trustworthiness issues, including security, robustness, privacy, and societal concerns, so that researchers and developers can gain a fresh perspective on the induced impacts and responsibilities and learn about potential solutions. The tutorial is intended as a short lecture for researchers and students to become aware of the misuse and potential risks of existing AI techniques and, more importantly, to motivate them to rethink trustworthiness in their own research. Many case studies will be drawn from computer vision applications. The ultimate goal of this tutorial is to spark more discussion, effort, and action toward addressing the two sides of the same coin. The contents will provide participants with sufficient background to understand the motivation, research progress, known issues, and ongoing challenges in trustworthy perception systems, along with pointers to open-source libraries and surveys.