Pre-training of Deep Bidirectional Transformers for Language Understanding — BERT
Dec 4, 2021
BERT is a language representation model that pre-trains deep bidirectional representations from unlabeled text by jointly conditioning on both left and right context in all layers; the resulting representations can then be fine-tuned for a wide range of tasks. This sets it apart from context-free models (word2vec), shallowly bidirectional contextual models (ELMo), and unidirectional contextual models (OpenAI GPT). The motivation…
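The distinction between context-free and contextual representations can be sketched with a toy example. This is not a real model, just an illustration of the interface difference: a word2vec-style lookup returns one fixed vector per word regardless of the sentence, while a BERT-style contextual encoder produces a vector that depends on the surrounding words. All names and the tiny "embedding" scheme below are invented for illustration.

```python
# Toy sketch (not a real model): context-free vs. contextual embeddings.

# word2vec-style: a single fixed vector per word, stored in a table.
CONTEXT_FREE = {
    "bank": [0.1, 0.7],
}

def context_free_embed(word, sentence):
    # The sentence is ignored entirely; "bank" always maps to the same vector.
    return CONTEXT_FREE[word]

def contextual_embed(word, sentence):
    # Stand-in for a contextual encoder: the vector for `word` depends on
    # its neighbors. Here we crudely use the length of the previous word
    # as a "context feature" just to make the dependence visible.
    words = sentence.split()
    i = words.index(word)
    prev = words[i - 1] if i > 0 else ""
    return [float(len(word)), float(len(prev))]

s1 = "I deposited cash at the bank"   # financial sense
s2 = "We sat on the river bank"       # geographic sense

# Context-free: identical vectors for both senses of "bank".
assert context_free_embed("bank", s1) == context_free_embed("bank", s2)

# Contextual: the two occurrences of "bank" get different vectors.
assert contextual_embed("bank", s1) != contextual_embed("bank", s2)
```

A real contextual model (BERT, ELMo) learns this context dependence from data; the point here is only that the representation is a function of the whole sentence, not of the word alone.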