Components of Transformer Architecture

Sequence modelling is popularly done using Recurrent Neural Networks (RNNs) or their advancements such as gated RNNs and Long Short-Term Memory (LSTM) networks. Processing tokens one at a time hinders parallelization, and when sequences grow long, the model can forget long-range dependencies in the input or confuse positional content.
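To see why RNNs resist parallelization, consider a minimal sketch of a vanilla RNN step: each hidden state depends on the previous one, so the time loop must run strictly in order. The function name, weight matrices, and dimensions below are illustrative, not from any particular library.

```python
import numpy as np

def rnn_forward(x_seq, W_x, W_h, b):
    """Run a toy vanilla RNN over x_seq of shape (T, d_in); return the final hidden state."""
    h = np.zeros(W_h.shape[0])
    for x_t in x_seq:  # h_t depends on h_{t-1}: steps cannot run in parallel
        h = np.tanh(W_x @ x_t + W_h @ h + b)
    return h

# Toy dimensions and random weights (assumptions for illustration only)
rng = np.random.default_rng(0)
T, d_in, d_h = 8, 4, 16
h_final = rnn_forward(rng.normal(size=(T, d_in)),
                      rng.normal(size=(d_h, d_in)),
                      rng.normal(size=(d_h, d_h)),
                      np.zeros(d_h))
print(h_final.shape)  # (16,)
```

The Transformer removes this sequential dependency: self-attention lets every position attend to every other position in a single parallel step, which is what motivates the components discussed here.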
