Tips & Tricks for Deploying a Deep Learning Webapp on Heroku Cloud

Heroku Cloud is popular among web developers and machine learning enthusiasts. The platform makes it easy to deploy and maintain web applications, but if you are not familiar with deploying deep learning applications, you might struggle with storage and dependency issues. This guide will make your deployment process smoother so that you can focus on creating amazing web applications. We will cover DVC integration, Git & CLI-based deployment, error code H10, tweaking Python packages, and optimizing storage.

 

Git & CLI-based Deployment

 

The Streamlit app can be deployed with Git, GitHub integration, or Docker. The Git-based approach is by far the fastest and easiest way to deploy any data app on a Heroku server.

Simple Git-based

The Streamlit app can be deployed using:

git remote add heroku https://heroku:$HEROKU_API_KEY@git.heroku.com/.git

git push -f heroku HEAD:master
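The two commands above can also be assembled from a script, for example in a CI job. The sketch below is only an illustration: the helper names are hypothetical, and because the remote URL in the text does not show an app name, the code assumes the usual https://git.heroku.com/<app>.git form with the app name passed in by the caller.

```python
from typing import List


def heroku_remote_url(app_name: str, api_key: str) -> str:
    """Build an authenticated Heroku Git remote URL (hypothetical helper)."""
    if not app_name or not api_key:
        raise ValueError("both app_name and api_key are required")
    # Assumed URL shape: https://heroku:<key>@git.heroku.com/<app>.git
    return f"https://heroku:{api_key}@git.heroku.com/{app_name}.git"


def deploy_commands(app_name: str, api_key: str) -> List[List[str]]:
    """Return the two git commands from the text as argument lists."""
    return [
        ["git", "remote", "add", "heroku", heroku_remote_url(app_name, api_key)],
        ["git", "push", "-f", "heroku", "HEAD:master"],
    ]
```

Each inner list can be handed to subprocess.run as-is. Keeping the API key in an environment variable rather than in the script avoids committing it to the repository.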

 

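For this to work, you need a Heroku API key available in the HEROKU_API_KEY environment variable, and the remote URL must point at your app's Git repository on Heroku.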

CLI-based

CLI-based deployment is basic and easy to learn.

Image by Author.

  1. Create a free Heroku account here.
  2. Install Heroku CLI using this link.
  3. Either clone the remote repository or run git init.
  4. Run heroku login and heroku create dagshub-pc-app. This logs you into the server and creates an app on the web server.
  5. Now create a Procfile containing the command to run the app: web: streamlit run --server.port $PORT streamlit_app.py
  6. Finally, commit and push the code to the Heroku server: git push heroku master
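Step 5 is easy to get wrong by hand (a stray file extension, or a long dash pasted in place of the two hyphens), so here is a minimal sketch that writes the Procfile from Python. The function names are hypothetical; the process line itself is the one from step 5.

```python
from pathlib import Path


def procfile_contents(script: str = "streamlit_app.py") -> str:
    """Return the single web process line described in step 5."""
    # $PORT is left unexpanded on purpose: the dyno's shell fills it in at start-up.
    return f"web: streamlit run --server.port $PORT {script}\n"


def write_procfile(project_dir: str, script: str = "streamlit_app.py") -> Path:
    """Write a file named exactly 'Procfile' (no extension) and return its path."""
    path = Path(project_dir) / "Procfile"
    path.write_text(procfile_contents(script), encoding="utf-8")
    return path
```

Calling write_procfile(".") from the repository root creates the file next to the app script, ready to be committed in step 6.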

 

PORT

 

If you start the app with plain streamlit run app.py, the deployment fails with error code H10 (app crashed), because the Streamlit app did not bind to the $PORT assigned by the server.

You need to set the PORT config variable and pass $PORT to Streamlit in the Procfile:

heroku config:set PORT=8080

 

web: streamlit run --server.port $PORT app.py
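If you launch Streamlit from a small Python wrapper instead of directly from the Procfile, the same idea can be sketched as follows. This is an assumption-laden illustration rather than part of the original setup: streamlit_command is a hypothetical helper, and 8501 is used as the fallback because it is Streamlit's default port for local runs.

```python
import os


def streamlit_command(script="app.py", env=None, default_port=8501):
    """Build the streamlit argv, binding to PORT when the platform sets it."""
    env = os.environ if env is None else env
    # int() fails loudly if PORT is set to something non-numeric.
    port = int(env.get("PORT", default_port))
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return ["streamlit", "run", "--server.port", str(port), script]
```

The returned list can be passed to subprocess.run. Locally, where PORT is usually unset, the command falls back to the default port, so the same entry point works in both places.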

 

Tweaking Python Packages

 

This part took me two days to debug because Heroku comes with a 500MB slug size limit, and the new TensorFlow package is 489.6MB. To avoid dependency and storage issues, we need to make changes in the requirements.txt file:

  1. Add tensorflow-cpu instead of tensorflow, which will reduce our slug size from 765MB to 400MB.
  2. Add opencv-python-headless instead of opencv-python to avoid installing external dependencies. This will resolve all the cv2 errors.
  3. Remove all unnecessary packages except numpy, Pillow, and streamlit.
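The first two swaps in the list above can be applied mechanically. The sketch below is a hypothetical helper, not the author's tooling: it takes requirement lines without trailing newlines (for example from str.splitlines()), keeps any version specifier, and leaves comments and other packages untouched. Pruning unused packages (the third item) still needs a manual review.

```python
import re

# Swaps described in the list above; keys are the heavier packages.
REPLACEMENTS = {
    "tensorflow": "tensorflow-cpu",
    "opencv-python": "opencv-python-headless",
}

# Package name, then whatever version specifier follows (may be empty).
_REQ = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)(.*)$")


def slim_requirements(lines):
    """Return requirement lines with the heavy packages swapped out."""
    result = []
    for line in lines:
        match = _REQ.match(line)
        if not match:
            # Comments, blank lines, and pip options pass through unchanged.
            result.append(line)
            continue
        name, rest = match.groups()
        result.append(f"{REPLACEMENTS.get(name.lower(), name)}{rest}")
    return result
```

Only exact package names are replaced, so running it on an already slimmed file is a no-op.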

 

DVC Integration

 

Image by Author.

There are a few steps required for successfully pulling data from the DVC server.

  1. First, install a buildpack that allows the installation of apt packages, using the Heroku CLI:
heroku buildpacks:add --index 1 heroku-community/apt

 

  2. Create a file named Aptfile and add the URL of the latest DVC release: https://github.com/iterative/dvc/releases/download/2.8.3/dvc_2.8.3_amd64.deb
  3. In your app.py file, add these extra lines of code:
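import os
import sys

# Heroku sets DYNO on its dynos, so this block only runs in the deployed app.
if "DYNO" in os.environ and os.path.isdir(".dvc"):
    # Run DVC without Git, since the deployed slug is not a Git checkout.
    os.system("dvc config core.no_scm true")
    if os.system("dvc pull") != 0:
        sys.exit("dvc pull failed")
    # Remove DVC metadata and the DVC install to free up space.
    os.system("rm -r .dvc .apt/usr/lib/dvc")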

 

After that, commit and push your code to the Heroku server. Upon successful deployment, the app will automatically pull the data from the DVC server.
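As a variation on the app.py snippet, the same start-up pull can be written with subprocess.run instead of os.system, which raises an exception when a command fails. This is a sketch under assumptions, not the author's code: pull_dvc_data and its env, run, and root parameters are hypothetical and exist mainly so the function can be dry-run without DVC installed.

```python
import os
import shutil
import subprocess


def pull_dvc_data(env=None, run=subprocess.run, root="."):
    """Pull DVC-tracked data on a Heroku dyno; return True if a pull ran."""
    env = os.environ if env is None else env
    dvc_dir = os.path.join(root, ".dvc")
    # Same guard as the app.py snippet: only act on a dyno with a DVC project.
    if "DYNO" not in env or not os.path.isdir(dvc_dir):
        return False
    # check=True raises CalledProcessError if either command exits non-zero.
    run(["dvc", "config", "core.no_scm", "true"], check=True, cwd=root)
    run(["dvc", "pull"], check=True, cwd=root)
    # Free space once the data is in place (mirrors the rm -r in the snippet).
    shutil.rmtree(dvc_dir, ignore_errors=True)
    shutil.rmtree(os.path.join(root, ".apt", "usr", "lib", "dvc"), ignore_errors=True)
    return True
```

Calling pull_dvc_data() near the top of app.py gives the same behaviour as the original lines, while a failed pull surfaces as a traceback in the Heroku logs.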

 

Optimizing Storage

 

There are multiple ways to optimize storage, and the most common is to use Docker. With the Docker method, you can bypass the 500MB limit, and you are also free to install any third-party integrations or packages. To learn more about how to use Docker, check out this guide.

To optimize storage without Docker, pull only the files the app needs from DVC, such as the model and a few sample files:

dvc pull {model} {sample_data1} {sample_data2}..
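A selective pull like the one above can be built programmatically, which helps when the list of files lives in a config. The sketch assumes that dvc pull accepts tracked paths as positional targets; the helper name and the file names in the usage note are hypothetical.

```python
def dvc_pull_command(targets):
    """Build a `dvc pull` argv limited to the given tracked paths."""
    targets = [t for t in targets if t]
    if not targets:
        # Without targets, dvc pull would fetch every tracked file.
        raise ValueError("give at least one target to keep the pull selective")
    return ["dvc", "pull", *targets]
```

For example, dvc_pull_command(["model.h5", "samples/cat.jpg"]) yields an argument list that can be passed to subprocess.run in place of the plain dvc pull.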

 

 

Outcomes

 

The initial slug size was 850MB, but with storage and package optimizations, the final slug size was reduced to 400MB. We solved error code H10 with a simple command and added the opencv-python-headless package to solve dependency issues. This guide was created to overcome some of the common problems faced by beginners on Heroku servers.
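To get a rough feel for these numbers before pushing, you can measure a folder locally, for example the project directory or a virtual environment's site-packages. This is only an estimate under stated assumptions: Heroku measures the compressed slug, so the uncompressed total computed here will usually be larger, and both helper names are hypothetical.

```python
import os

SLUG_LIMIT_MB = 500  # limit quoted in the text


def dir_size_mb(root):
    """Sum the sizes of all regular files under root, in megabytes."""
    total = 0
    for folder, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(folder, name)
            if not os.path.islink(path):  # skip symlinks to avoid double counting
                total += os.path.getsize(path)
    return total / (1024 * 1024)


def fits_slug_limit(root, limit_mb=SLUG_LIMIT_MB):
    """True if the uncompressed tree is already under the limit."""
    return dir_size_mb(root) <= limit_mb
```

If fits_slug_limit returns True for everything that ends up in the build, the compressed slug should have comfortable headroom; if it returns False, the build may still pass, but it is worth trimming packages or data first.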
