A load balancer is a critical component in any high-availability system.

It distributes incoming network traffic across multiple servers to ensure no single server becomes overwhelmed, improving system performance and reliability.

FastAPI, known for its high performance and simplicity, can be used to create a lightweight load balancer.

In this article, we'll walk through how to build one using FastAPI.


Prerequisites

  • Basic knowledge of Python and FastAPI.
  • Python 3.7 or higher installed on your system.
  • Familiarity with concepts like REST APIs and HTTP requests.

Understanding Load Balancing

Load balancing is the process of distributing incoming network traffic across multiple servers to ensure optimal resource utilization, minimize response times, and prevent any single server from becoming a bottleneck.

It is a critical component in scalable web applications, ensuring high availability and fault tolerance.

There are various load-balancing strategies, including:

  • Round Robin: Requests are distributed sequentially across servers.
  • Least Connections: The request is routed to the server with the fewest active connections.
  • IP Hashing: Clients are consistently mapped to the same server based on their IP address (a short sketch of this follows below).

Load balancers also perform additional tasks such as health checks, SSL termination, and request routing, improving system reliability and performance.
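
To illustrate the IP hashing strategy from the list above: a client can be mapped to a backend deterministically by hashing its IP address. Here is a minimal sketch, with a hypothetical list of server URLs:

import hashlib

# Hypothetical list of backend servers
servers = ["http://localhost:8001", "http://localhost:8002", "http://localhost:8003"]

def pick_server(client_ip: str) -> str:
    # Hash the client IP and map it onto the server list,
    # so the same client always lands on the same backend
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return servers[int(digest, 16) % len(servers)]

# The same IP always resolves to the same backend
print(pick_server("203.0.113.42"))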


Introduction to FastAPI

FastAPI is a modern, high-performance web framework for building APIs with Python.

It is designed to be easy to use while offering the speed of asynchronous programming with Python’s asyncio.

Some of its key features include:

  • Asynchronous support: FastAPI leverages async and await for handling concurrent requests efficiently.
  • Automatic validation: Built-in data validation using Pydantic ensures robust API request handling (see the short example after this list).
  • Fast execution: It is one of the fastest Python frameworks, comparable to Node.js and Go-based web frameworks.
  • Interactive API docs: Automatically generates OpenAPI documentation with Swagger UI and ReDoc.

Due to its performance and simplicity, FastAPI is an excellent choice for building scalable APIs and microservices, including a custom load balancer.
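
As a quick illustration of the asynchronous support and automatic validation mentioned above, here is a minimal FastAPI app (the Item model and endpoint are purely illustrative and not part of the load balancer we build below):

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

# Request bodies are validated automatically against this model
class Item(BaseModel):
    name: str
    price: float

@app.post("/items")
async def create_item(item: Item):
    # Invalid payloads are rejected with a 422 response before this code runs
    return {"name": item.name, "price": item.price}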


Step 1: Install Required Libraries

First, install FastAPI, Uvicorn (the ASGI server we'll use to run the application), and httpx (the HTTP client we'll use to forward requests to the backend servers):

pip install fastapi uvicorn httpx

Step 2: Define the Project Structure

Organize your project as follows:

load_balancer/
├── main.py
└── servers.json

  • main.py: Contains the FastAPI application.
  • servers.json: Stores the list of backend servers.

Step 3: Create the Backend Servers File

Create a servers.json file to define the backend servers the load balancer will manage:

[
    { "url": "http://localhost:8001" },
    { "url": "http://localhost:8002" },
    { "url": "http://localhost:8003" }
]

Each backend server should run a simple API or microservice. For simplicity, you can use FastAPI to create these servers; a minimal sketch is shown below, and an example is provided later in the article.
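
For reference, a minimal backend might look like the following (the file name backend.py, the port, and the response payload are assumptions for illustration):

from fastapi import FastAPI

app = FastAPI()

@app.get("/")
async def root():
    # Identify which backend answered, which makes it easy to verify the load balancer
    return {"message": "Hello from the backend on port 8001"}

Run one copy per entry in servers.json, for example with uvicorn backend:app --port 8001 (and likewise on ports 8002 and 8003).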


Step 4: Basic Proxying

Before implementing any load balancing algorithms, let's start with basic proxying to forward requests to backend servers:

from fastapi import FastAPI, Request
import httpx
import json

# Create a FastAPI app
app = FastAPI()

# Load backend servers from JSON file
with open("servers.json") as f:
    servers = json.load(f)


@app.get("/{path:path}")
@app.post("/{path:path}")
@app.put("/{path:path}")
@app.delete("/{path:path}")
@app.patch("/{path:path}")
async def proxy(request: Request, path: str):
    backend_url = servers[0]["url"]  # Select the first server for now
    url = f"{backend_url}/{path}"

    # Forward the request, preserving method, headers, query string, and body
    async with httpx.AsyncClient() as client:
        response = await client.request(
            request.method,
            url,
            headers=request.headers.raw,
            params=request.query_params,
            content=await request.body(),
        )

    return response.json()

This code snippet defines an application that forwards incoming HTTP requests to a backend server, whose URL is loaded from a JSON file named servers.json.

The proxy functionality is implemented using asynchronous HTTP requests with the httpx library.

The application listens for the common HTTP methods (GET, POST, PUT, DELETE, PATCH) and forwards them to the first backend server listed in the JSON file; proper server selection comes in the next step.

The path of the incoming request is appended to the backend server's URL to construct the full URL for the forwarded request.

When a request is received, the application extracts the method, headers, query parameters, and body of the request and uses them to make an asynchronous HTTP request to the backend server.

The response from the backend server is then returned to the client as a JSON object.
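
To try it out, start a backend and the load balancer, then send a request through the proxy. The file names and ports below assume the layout from the earlier steps and the backend sketch above:

# Terminal 1: start a backend on port 8001
uvicorn backend:app --port 8001

# Terminal 2: start the load balancer on port 8000
uvicorn main:app --port 8000

# Terminal 3: send a request through the proxy
curl http://localhost:8000/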


Step 5: Load Balancing Algorithms

Let's now see how we can implement the different load-balancing algorithms.

Round Robin Algorithm

Distributes requests evenly by cycling through available servers.

For example: Server A -> Server B -> Server C -> Server A.

from itertools import cycle

from fastapi import FastAPI, Request
import httpx
import json


# Create a FastAPI app
app = FastAPI()


# Load backend servers from JSON file
with open("servers.json") as f:
    servers = json.load(f)


# Implement a round-robin load balancer
class LoadBalancer:
    def __init__(self, servers):
        self.servers = servers
        self.pool = cycle(server["url"] for server in servers)

    def round_robin(self):
        return next(self.pool)


load_balancer = LoadBalancer(servers)


@app.get("/{path:path}")
@app.post("/{path:path}")
@app.put("/{path:path}")
@app.delete("/{path:path}")
@app.patch("/{path:path}")
async def proxy(request: Request, path: str):
    backend_url = load_balancer.round_robin()
    url = f"{backend_url}/{path}"

    # Forward the request, preserving method, headers, query string, and body
    async with httpx.AsyncClient() as client:
        response = await client.request(
            request.method,
            url,
            headers=request.headers.raw,
            params=request.query_params,
            content=await request.body(),
        )

    return response.json()

This code snippet is a FastAPI application that acts as a proxy server with a round-robin load balancer:

  • A LoadBalancer class is implemented to handle the round-robin load balancing.
  • The LoadBalancer class takes a list of servers and uses the cycle function from the itertools module to create an infinite iterator that cycles through the server URLs.
  • The round_robin method of the LoadBalancer class returns the next server URL in the cycle.
  • When a request is received, the proxy function uses the round_robin method of the LoadBalancer instance to get the next backend server URL.
  • The request is then forwarded to the selected backend server using the httpx library.
  • The response from the backend server is returned to the client as a JSON object.
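
With the three backends from servers.json running behind the load balancer, repeated requests should rotate across them. A quick check, assuming the ports from servers.json and a load balancer on port 8000:

# Each request should be answered by the next backend in the cycle:
# 8001, then 8002, then 8003, then back to 8001
curl http://localhost:8000/
curl http://localhost:8000/
curl http://localhost:8000/
curl http://localhost:8000/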

Least Connections Algorithm

Routes the request to the server with the least active connections.

For example, if Server A has 2 active connections and Server B has 1, the request goes to Server B.
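
A minimal sketch of how the selection logic could work, assuming an in-memory counter of active connections per backend (the class and method names here are illustrative):

class LeastConnectionsBalancer:
    def __init__(self, servers):
        # Track the number of in-flight requests per backend URL
        self.connections = {server["url"]: 0 for server in servers}

    def acquire(self):
        # Pick the backend with the fewest active connections
        url = min(self.connections, key=self.connections.get)
        self.connections[url] += 1
        return url

    def release(self, url):
        # Call this once the proxied request has completed
        self.connections[url] -= 1

In the proxy function, you would call acquire() before forwarding the request and release() in a try/finally block after the response has been received, so the counts stay accurate even when a backend errors out.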
