Components of Transformer Architecture
Sequence modelling is popularly done with Recurrent Neural Networks (RNNs) or their extensions such as gated RNNs and Long Short-Term Memory (LSTM) networks. Processing tokens one after another hinders parallelisation, and when sequences grow long the model can forget long-range dependencies in the input or mix up positional content.
The attention mechanism addresses this by modelling dependencies without regard to their distance in the input or output sequences, so long-range relationships are no longer forgotten. The Transformer, the neural sequence transduction model introduced in the paper "Attention Is All You Need", is built entirely on self-attention, without any sequence-aligned recurrent architecture.
The key components covered in this article are:
- Scaled dot-product attention (a minimal sketch follows this list)
- Multi-head attention
- Positional encoding
- Encoder-decoder architecture
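
To make the first component concrete, here is a minimal NumPy sketch of scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. The function names and toy shapes are illustrative choices rather than code from the paper; in a real model Q, K and V come from learned linear projections of the token embeddings.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V, mask=None):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    # Q: (..., seq_q, d_k), K: (..., seq_k, d_k), V: (..., seq_k, d_v)
    # mask: optional boolean array broadcastable to (..., seq_q, seq_k);
    #       True marks positions that must not be attended to.
    d_k = Q.shape[-1]
    scores = Q @ K.swapaxes(-2, -1) / np.sqrt(d_k)  # (..., seq_q, seq_k)
    if mask is not None:
        scores = np.where(mask, -1e9, scores)       # block masked positions
    weights = softmax(scores, axis=-1)              # each row sums to 1
    return weights @ V, weights

# Toy self-attention over 4 tokens with model dimension 8 (Q = K = V = x).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out, weights = scaled_dot_product_attention(x, x, x)
print(out.shape, weights.shape)  # (4, 8) (4, 4)
```

The division by √d_k keeps the dot products from growing with the key dimension, which would otherwise push the softmax into regions with vanishingly small gradients.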
The encoder and decoder are each stacks of 6 identical layers, where every layer has a multi-head self-attention sublayer followed by a simple position-wise fully connected feed-forward network. Each sublayer is wrapped in a residual connection followed by layer normalisation. The decoder has an additional first multi-head attention sublayer that is masked so it cannot attend to subsequent positions, since we don't want to look into the future of the target sequence when predicting.
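
A rough sketch of the two ideas in this paragraph, reusing scaled_dot_product_attention and rng from the snippet above: a causal (look-ahead) mask for the decoder's first attention sublayer, and the residual-plus-layer-normalisation wrapper applied around every sublayer. The learnable gain and bias of layer normalisation and the learned projections are omitted to keep the sketch short; this is illustrative, not the paper's reference implementation.

```python
def causal_mask(seq_len):
    # True above the diagonal: position i must not attend to positions j > i.
    return np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)

def layer_norm(x, eps=1e-6):
    # Per-position layer normalisation (learnable gain/bias omitted).
    mean = x.mean(axis=-1, keepdims=True)
    std = x.std(axis=-1, keepdims=True)
    return (x - mean) / (std + eps)

def sublayer(x, fn):
    # Residual connection around a sublayer, then layer normalisation:
    # LayerNorm(x + Sublayer(x)).
    return layer_norm(x + fn(x))

# Masked self-attention over a 4-token target sequence: each position only
# attends to itself and earlier positions.
y = rng.normal(size=(4, 8))
masked = sublayer(y, lambda t: scaled_dot_product_attention(
    t, t, t, mask=causal_mask(t.shape[0]))[0])
print(masked.shape)  # (4, 8)
```

During training this masking lets the decoder process the whole target sequence in parallel while still behaving autoregressively at each position.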