Deploy Machine Learning Models with Flask and FastAPI
In the previous lesson, we learned how to export trained machine learning models using TensorFlow and Scikit-Learn. This step is crucial because it allows us to save our models in a format that can be reused later. Now, we will take the next step: deploying these models as web services. This is where Flask and FastAPI come into play. Both are lightweight web frameworks that make it easy to serve machine learning models through API endpoints.
I have faced situations where I needed to deploy a model quickly for testing or production. For example, I once built a sentiment analysis model that could predict whether a customer review was positive or negative. After training and exporting the model, I used Flask to create an API endpoint that could take in a review text and return the predicted sentiment. This allowed my team to integrate the model into our web app seamlessly.
In this tutorial, we will walk through the process of serving machine learning models using Flask and FastAPI. By the end, you will know how to create an API endpoint, serve predictions, and deploy your model using these frameworks.
Overview of Flask and FastAPI
Flask and FastAPI are two popular web frameworks in Python. Flask is a lightweight, flexible framework that is easy to pick up for small to medium-sized projects. FastAPI is a more recent framework designed specifically for building APIs; it is known for its speed, automatic request validation, and built-in support for asynchronous programming.
Both frameworks are great for serving machine learning models because they allow you to create API endpoints that can handle prediction requests. For example, if you have a model that predicts house prices, you can create an endpoint that takes in features like square footage and number of bedrooms and returns the predicted price.
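Concretely, a request to such an endpoint is usually just a small JSON payload. A hypothetical exchange might look like this (the path, port, and feature values are purely illustrative):
curl -X POST http://localhost:5000/predict \
     -H "Content-Type: application/json" \
     -d '{"features": [1500, 3]}'
# Possible response: {"prediction": [325000.0]}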
I have used Flask for many projects because of its simplicity. However, when I needed to handle a large number of requests, I switched to FastAPI because of its performance benefits. For instance, in a project where I had to process thousands of requests per second, FastAPI’s async capabilities made a huge difference.
Creating an API Endpoint with Flask
Let’s start by creating an API endpoint using Flask. First, you need to install Flask using pip:
pip install flask
Next, create a Python file (e.g., app.py) and add the following code:
from flask import Flask, request, jsonify
import pickle

app = Flask(__name__)

# Load the trained model once at startup
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

@app.route('/predict', methods=['POST'])
def predict():
    # Expect a JSON body like {"features": [...]}
    data = request.get_json()
    prediction = model.predict([data['features']])
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run(debug=True)
In this example, we load a trained model from a pickle file and create an endpoint called /predict. This endpoint accepts POST requests with JSON data containing the input features. The model then makes a prediction and returns the result as a JSON response.
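To verify that the endpoint works, you can send a test request from a separate script. Here is a minimal sketch using the requests library; it assumes the app is running locally on Flask's default port 5000, and the example features are hypothetical:
import requests

# Example payload; the number and meaning of features depend on your model
payload = {'features': [1500, 3]}

response = requests.post('http://localhost:5000/predict', json=payload)
print(response.json())  # e.g. {'prediction': [325000.0]}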
I have used this approach in a project where I deployed a spam detection model. The endpoint took in email text and returned whether the email was spam or not. This made it easy to integrate the model into an email filtering system.
Creating an API Endpoint with FastAPI
Now, let’s create the same API endpoint using FastAPI. First, install FastAPI and Uvicorn (an ASGI server) using pip:
pip install fastapi uvicorn
Next, create a Python file (e.g., main.py) and add the following code:
from fastapi import FastAPI
import pickle

app = FastAPI()

# Load the trained model once at startup
with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

@app.post('/predict')
def predict(features: list[float]):
    # FastAPI validates the request body against the type hint
    prediction = model.predict([features])
    return {'prediction': prediction.tolist()}
FastAPI uses type hints to define and validate the input and output of the endpoint. In this example, the /predict endpoint takes a list of floats; because features is the only body parameter, the request body is the JSON array itself. To run the app, use the following command:
uvicorn main:app --reload
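Once the server is running (Uvicorn defaults to port 8000), you can test the endpoint from the command line; the feature values below are just an illustration:
curl -X POST http://localhost:8000/predict \
     -H "Content-Type: application/json" \
     -d '[1500, 3]'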
I have found FastAPI to be very intuitive, especially when working with teams. The automatic generation of API documentation (available at /docs) is a feature that I often use to share endpoints with frontend developers.
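FastAPI's validation becomes even more useful with an explicit Pydantic model. The sketch below shows one way to mirror the Flask payload shape ({"features": [...]}) while rejecting malformed input automatically; the model and field names are just examples:
from fastapi import FastAPI
from pydantic import BaseModel
import pickle

app = FastAPI()

with open('model.pkl', 'rb') as f:
    model = pickle.load(f)

class PredictionRequest(BaseModel):
    features: list[float]  # hypothetical field; match it to your model's inputs

@app.post('/predict')
def predict(request: PredictionRequest):
    # Invalid bodies are rejected with a 422 error before this code runs
    prediction = model.predict([request.features])
    return {'prediction': prediction.tolist()}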
Deploying the Model
Once you have created the API endpoint, the next step is to deploy the model. Flask is a WSGI application, so it is typically served with a WSGI server like Gunicorn, while FastAPI is an ASGI application served with an ASGI server like Uvicorn. For example, to deploy a Flask app, you can use:
gunicorn app:app
For FastAPI, you can use:
uvicorn main:app --host 0.0.0.0 --port 80
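In production you usually want more than one worker process. One common pattern, assuming both Gunicorn and Uvicorn are installed, is to run Gunicorn with plain workers for Flask and with Uvicorn's worker class for FastAPI:
# Flask: 4 WSGI worker processes
gunicorn -w 4 -b 0.0.0.0:80 app:app

# FastAPI: Gunicorn managing Uvicorn ASGI workers
gunicorn -w 4 -k uvicorn.workers.UvicornWorker -b 0.0.0.0:80 main:app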
I have deployed models on local servers for testing and on cloud platforms like Heroku for production. For instance, I once deployed a recommendation system on Heroku using Flask. The process was straightforward, and the app was up and running in minutes.
Conclusion
In this tutorial, we covered how to serve machine learning models using Flask and FastAPI. We created API endpoints for making predictions and discussed how to deploy these endpoints. Both frameworks are powerful tools that can help you bring your machine learning models to life.
If you found this tutorial helpful, I encourage you to check out the next lesson, where we will explore how to deploy models to cloud platforms like Google Cloud, AWS, and Heroku. This will take your deployment skills to the next level and prepare you for real-world applications.