transformers

Improving Interpretation Faithfulness for Transformers

Di Wang, Assistant Professor, Computer Science

Nov 20, 11:30 - 12:30

B9 L2 H2 H2

transformers nlp interpretation faithfulness

The attention mechanism has become a standard fixture in most state-of-the-art NLP, vision, and GNN models, not only because of the outstanding performance it delivers, but also because it appears to provide a plausible innate explanation for the behavior of neural architectures, which are notoriously difficult to analyze. However, recent studies show that attention is unstable against randomness and perturbations during training or testing, such as changes of random seed and slight perturbations of the input or embedding vectors, which prevents it from serving as a faithful explanation tool. A natural question is therefore whether we can find a substitute for the current attention that is more stable while preserving its most important characteristics for explanation and prediction.

Structure-conforming Operator Learning via Transformers

Prof. Shuhao Cao, University of Missouri–Kansas City

Apr 22, 16:00 - 17:00

KAUST

Abstract: GPT, Stable Diffusion, AlphaFold 2, and other state-of-the-art deep learning models all use a neural architecture called the "Transformer". Since the publication of "Attention Is All You Need", the Transformer has become the ubiquitous architecture in deep learning. At the Transformer's heart and soul is the "attention mechanism". In this talk, we shall give a specific example of the following research program: whether and how one can benefit from the theoretical structure of a mathematical problem to develop task-oriented and structure-conforming deep neural networks. An attention-based deep direct