Nikhil Verma
May 10, 2023


Backward ops run in the same dtype that autocast used for the corresponding forward ops, so running the backward pass under autocast is not recommended. Suppose autocast lowered the precision of some forward operation to FP16 while the parameter itself is stored in FP32. If the backward pass and parameter update are then run under autocast, the value of that parameter may not be updated correctly, and the resulting errors can propagate and degrade the model's learning.
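A minimal sketch of the recommended pattern: only the forward pass and the loss computation sit inside the autocast context, while `backward()` and the optimizer step run outside it. The tiny `Linear` model, the random data, and the use of CPU autocast with `bfloat16` are assumptions made so the example runs without a GPU; on CUDA one would typically use FP16 together with gradient scaling.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy model and data; parameters are stored in FP32.
model = torch.nn.Linear(4, 2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
inputs = torch.randn(8, 4)
targets = torch.randn(8, 2)

optimizer.zero_grad()

# Only the forward pass and the loss computation run under autocast.
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    outputs = model(inputs)  # autocast runs the linear op in lower precision
    loss = torch.nn.functional.mse_loss(outputs.float(), targets)

# Backward and the parameter update run outside the autocast context.
loss.backward()
optimizer.step()

forward_dtype = outputs.dtype
grad_dtype = model.weight.grad.dtype
print(forward_dtype, grad_dtype)
```

The forward output comes back in the lower-precision dtype, while the parameter and its gradient stay in FP32, so the optimizer updates the FP32 weights directly.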


Written by Nikhil Verma

Knowledge shared is knowledge squared | My Portfolio https://lihkinverma.github.io/portfolio/ | My blogs are living document, updated as I receive comments
