Obtaining predictions from deep learning models in parallel
Every machine learning or deep learning task, in a generic sense, consists of two steps:
- Training
- Prediction
Generally, when one searches the internet for running machine learning or deep learning models in parallel, one gets a handful of articles on parallel and distributed deep learning, and they almost always focus on the training side. In corporate practice, however, training a model is usually a one-time task. Once models have to go into a production-ready environment, the demand is to run them for prediction in parallel, and to do so often and quickly. In service-oriented architecture terms, this is best described as Prediction-as-a-Service.
Of course, there are ways to do multi-threading or multi-processing in Python. I have written in the past about multiprocessing via Process or Pool.
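As a refresher, here is a minimal sketch (not from the original article) of the two standard `multiprocessing` entry points mentioned above, with a trivial stand-in function in place of a real model:

```python
from multiprocessing import Pool, Process, Queue

def square(x):
    # stand-in for any CPU-bound prediction function
    return x * x

def worker(q, x):
    # run the function in a child process and send the result back
    q.put(square(x))

if __name__ == "__main__":
    # Option A: a Pool maps a function over many inputs in parallel
    with Pool(processes=2) as pool:
        print(pool.map(square, [1, 2, 3, 4]))  # [1, 4, 9, 16]

    # Option B: an explicit Process, with a Queue to retrieve the result
    q = Queue()
    p = Process(target=worker, args=(q, 5))
    p.start()
    p.join()
    print(q.get())  # 25
```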
Way 1: When multiple models each contribute a piece of the full output, it is easy to load each model in its own parallel process, run it on the input, and return the required prediction.
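A hypothetical sketch of Way 1 follows: each "model" (here a dummy loader returning a simple function, since the article does not name specific models) is loaded and run in its own process, and the parent merges the partial predictions into one full output.

```python
from multiprocessing import Process, Queue

def run_model(name, load_fn, x, q):
    # each process pays its own model-loading cost, then predicts
    model = load_fn()
    q.put((name, model(x)))

def load_doubler():
    # placeholder for e.g. loading a saved deep learning model
    return lambda x: 2 * x

def load_squarer():
    return lambda x: x * x

def predict_all(x):
    q = Queue()
    procs = [
        Process(target=run_model, args=("doubler", load_doubler, x, q)),
        Process(target=run_model, args=("squarer", load_squarer, x, q)),
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    # merge the partial predictions into the full output
    return dict(q.get() for _ in procs)

if __name__ == "__main__":
    print(predict_all(3))  # e.g. {'doubler': 6, 'squarer': 9}
```

Note that each child process loads its model independently, which is exactly the cost discussed next.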
It can be visualized as:
Problem: At the same time, we know that each model takes a while to load and to run its prediction. All that "time" required by each service to be run in…