Course on Transformers and Their Applications in Health
This is a course on transformers: their architecture and their evolution. Transformers are a type of neural network that has driven enormous progress in natural language processing (NLP) tasks such as translation.
The course will first present the main concept behind the transformer architecture: attention. It will then describe the original transformer architecture, which includes six encoder layers and six decoder layers, covering embeddings, self-attention (first at a high level and then in detail), and positional encoding. The course will also discuss the evolution of the transformer architecture in chronological order, starting with the Bidirectional Encoder Representations from Transformers (BERT) model. This model is based on encoders only and is trained in two stages: pre-training of BERT with masked tokens and next-sentence prediction, followed by fine-tuning of BERT for a specific NLP task. BERT can be applied to a variety of NLP tasks, such as classification and question answering, and we will learn how to use it in health applications. We will also introduce how transformers can be evaluated with the General Language Understanding Evaluation (GLUE) benchmark and how their results compare with average human performance.
The course will introduce Colab, a Google web tool for working with neural networks (including transformers); TensorFlow, a Google library for creating, training and using neural networks; and Hugging Face, a library that simplifies the use of transformers with TensorFlow. Also presented will be BioBERT, a transformer that starts from the pre-trained BERT model and continues its training on biomedical text. This model is applied to a variety of tasks, such as named entity recognition (NER) of medical entities and text classification.
The course will continue with the next big revolution in transformers: the GPT-2 and GPT-3 architectures. These are ready-to-use neural networks that scale up the number of layers, attention heads and parameters in order to obtain better results. Additionally, we will review how to apply these networks to health tasks.
Finally, we will explore how the same transformer architecture and the same attention concept can be used in other fields, such as computer vision, using the Vision Transformer and its variants. Also presented will be AlphaFold2, a neural network based on transformers that is used to predict the three-dimensional structure of proteins.
Lesson 1: Learn how attention works, how the Transformer architecture is organized, and how the Transformer uses attention in parallel. What is an embedding? What is self-attention? Self-attention in detail. Positional encoding. The decoder. Introduction to BERT. Introduction to the GLUE tasks. Introduction to BioBERT. Introduction to Hugging Face, a simple way to use transformers and TensorFlow.
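The core operation covered in Lesson 1, scaled dot-product self-attention, can be sketched in a few lines of NumPy. This is a minimal illustrative sketch, not a course implementation; the matrix sizes and variable names are assumptions chosen for clarity:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention for one sequence.

    X:          (seq_len, d_model) token embeddings
    Wq, Wk, Wv: (d_model, d_k) learned projection matrices
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # (seq_len, seq_len) similarities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
X = rng.normal(size=(4, 8))                   # 4 tokens, d_model = 8
Wq, Wk, Wv = (rng.normal(size=(8, 8)) for _ in range(3))
out, w = self_attention(X, Wq, Wk, Wv)
print(out.shape, w.shape)                     # (4, 8) (4, 4)
```

In a real transformer this operation runs several times in parallel (multi-head attention), each head with its own projection matrices, which is the "attention in parallel" mentioned above.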
Codelab: Introduction to Colab, TensorFlow and Hugging Face. How to perform several transformer tasks using Hugging Face. Compare the results of an NER task on a medical text using BERT and BioBERT.
Lesson 2: Review of the Transformer architecture. Introduction to the BERT architecture: what the BERT input and output look like and how BERT works. Two-stage training: pre-training of BERT with masked tokens and next-sentence prediction, and fine-tuning of BERT for an NLP task. Learn how to apply BERT in a health application.
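The masked-token pre-training objective can be illustrated with the data-corruption step alone: BERT selects roughly 15% of positions as prediction targets and, of those, replaces 80% with a mask token, 10% with a random token, and leaves 10% unchanged. A self-contained sketch of that procedure, assuming integer token ids and an illustrative vocabulary size:

```python
import random

MASK_ID = 103        # [MASK] id in the standard BERT vocabulary
VOCAB_SIZE = 30522   # BERT-base vocabulary size

def mask_tokens(token_ids, mask_prob=0.15, seed=0):
    """BERT-style masking sketch: choose ~15% of positions as targets;
    of those, 80% become [MASK], 10% a random id, 10% stay unchanged.
    Returns the corrupted sequence and a dict of position -> original id."""
    rng = random.Random(seed)
    corrupted, targets = list(token_ids), {}
    for i, tok in enumerate(token_ids):
        if rng.random() < mask_prob:
            targets[i] = tok                  # the model must predict this id
            r = rng.random()
            if r < 0.8:
                corrupted[i] = MASK_ID
            elif r < 0.9:
                corrupted[i] = rng.randrange(VOCAB_SIZE)
            # else: keep the original token (but still predict it)
    return corrupted, targets

tokens = list(range(1000, 1200))              # 200 dummy token ids
corrupted, targets = mask_tokens(tokens)
print(len(corrupted), len(targets))
```

During pre-training, the model sees `corrupted` as input and is trained to recover the ids stored in `targets`; next-sentence prediction is the second objective, applied to pairs of sentences.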
Codelab: Fine-tuning a BioBERT model for NER.
Lesson 3: GPT-2 and GPT-3, and the text generation task. Learn the architectures of GPT-2 and GPT-3. Using AI models through a cloud API versus running them with TensorFlow on our own servers. How to train the GPT-2 model.
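The text generation task these models perform is autoregressive: feed the sequence so far, pick a next token, append it, and repeat. A minimal sketch of greedy decoding, where a toy scoring function stands in for a trained GPT-2 (an assumption made so the example is self-contained):

```python
def generate(prompt_ids, next_token_logits, max_new_tokens=5, eos_id=None):
    """Greedy autoregressive decoding: at each step, score every token in
    the vocabulary given the sequence so far, append the highest-scoring
    one, and stop at the end-of-sequence id (if any)."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = next_token_logits(ids)       # scores over the vocabulary
        next_id = max(range(len(logits)), key=logits.__getitem__)
        ids.append(next_id)
        if next_id == eos_id:
            break
    return ids

# Toy "model": always prefers (last token + 1) mod VOCAB.
VOCAB = 10
toy = lambda ids: [1.0 if t == (ids[-1] + 1) % VOCAB else 0.0 for t in range(VOCAB)]
print(generate([3], toy, max_new_tokens=4))   # [3, 4, 5, 6, 7]
```

GPT-2 and GPT-3 replace the toy function with a decoder-only transformer, and in practice sampling strategies (temperature, top-k, nucleus sampling) are used instead of pure greedy selection.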
Codelab: Use cases for GPT-2 and GPT-3 on medical texts. Compare the results obtained with GPT-2 and GPT-3.
Lesson 4: Transformers beyond NLP tasks. Vision Transformer (ViT): how to detect pneumonia in chest x-rays. Perceiver and Perceiver IO. AlphaFold2: prediction of three-dimensional protein structure.
Codelab: Vision Transformer for the diagnosis of pneumonia using chest x-ray analysis.
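The first step a Vision Transformer applies to an x-ray is to split the image into fixed-size patches and flatten each one into a vector, so the image becomes a sequence the transformer can attend over. A sketch of that patch-extraction step, assuming a 224×224 single-channel image and 16×16 patches (the standard ViT-Base configuration):

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) image into flattened non-overlapping patches.
    H and W must be divisible by `patch`. Returns (num_patches, patch*patch*C)."""
    H, W, C = image.shape
    assert H % patch == 0 and W % patch == 0
    x = image.reshape(H // patch, patch, W // patch, patch, C)
    x = x.transpose(0, 2, 1, 3, 4)            # (H/p, W/p, p, p, C)
    return x.reshape(-1, patch * patch * C)

img = np.zeros((224, 224, 1))                 # e.g. a grayscale chest x-ray
patches = patchify(img, 16)
print(patches.shape)                          # (196, 256): 14x14 patches of 16*16*1 values
```

In the full ViT, each flattened patch is then linearly projected to the model dimension, a positional embedding is added, and the resulting sequence is processed by the same encoder stack seen in Lesson 1.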
Main goals:
Present an introduction to Transformers and their evolution, from zero to hero (from basic to advanced topics).
Show how Transformers can be applied to tasks related to health care.
Specific objectives:
Use Colab to create machine learning projects with Transformers.
Use Google Drive and Colab together.
Learn to select a Transformer architecture for a specific task.
Study how to train a Transformer to be applied in a health task.