A load balancer is a critical component in any high-availability system.
It distributes incoming network traffic across multiple servers to ensure no single server becomes overwhelmed, improving system performance and reliability.
FastAPI, known for its high performance and simplicity, can be used to create a lightweight load balancer.
In this article, we'll walk through how to build one using FastAPI.
Prerequisites
- Basic knowledge of Python and FastAPI.
- Python 3.7 or higher installed on your system.
- Familiarity with concepts like REST APIs and HTTP requests.
Understanding Load Balancing
Load balancing is the process of distributing incoming network traffic across multiple servers to ensure optimal resource utilization, minimize response times, and prevent any single server from becoming a bottleneck.
It is a critical component in scalable web applications, ensuring high availability and fault tolerance.
There are various load-balancing strategies, including:
- Round Robin: Requests are distributed sequentially across servers.
- Least Connections: The request is routed to the server with the fewest active connections.
- IP Hashing: Clients are consistently mapped to the same server based on their IP address.
Load balancers also perform additional tasks such as health checks, SSL termination, and request routing, improving system reliability and performance.
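To make one of these strategies concrete, here is a minimal sketch of IP hashing using only the standard library. The `SERVERS` list and the `pick_by_ip` helper are hypothetical names chosen for illustration, and the ports simply mirror the ones used later in this article.

```python
import hashlib
from typing import Sequence

# Hypothetical backend list, mirroring the ports used later in this article.
SERVERS = [
    "http://localhost:8001",
    "http://localhost:8002",
    "http://localhost:8003",
]


def pick_by_ip(client_ip: str, servers: Sequence[str]) -> str:
    """Map a client IP to one server using a stable hash."""
    # hashlib produces the same digest on every run, unlike the built-in
    # hash(), which is randomized per process for strings.
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[index]


if __name__ == "__main__":
    # The same client always lands on the same backend.
    print(pick_by_ip("203.0.113.7", SERVERS))
    print(pick_by_ip("203.0.113.7", SERVERS))
```

One trade-off of this simple modulo approach is that adding or removing a server changes the mapping for most clients. Consistent hashing is a common way to reduce that churn.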
Introduction to FastAPI
FastAPI is a modern, high-performance web framework for building APIs with Python.
It is designed to be easy to use while offering the speed of asynchronous programming with Python's asyncio.
Some of its key features include:
- Asynchronous support: FastAPI leverages async and await for handling concurrent requests efficiently.
- Automatic validation: Built-in data validation using Pydantic ensures robust API request handling.
- Fast execution: Built on Starlette and Pydantic, it is among the fastest Python frameworks; its documentation describes performance on par with Node.js and Go.
- Interactive API docs: Automatically generates OpenAPI documentation with Swagger UI and ReDoc.
Due to its performance and simplicity, FastAPI is an excellent choice for building scalable APIs and microservices, including a custom load balancer.
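To see why asynchronous handling matters for a proxy, consider this small sketch that uses only the standard library's asyncio. The `fake_backend_call` and `handle_many` functions are made up for illustration, and `asyncio.sleep` stands in for the time spent waiting on a real backend.

```python
import asyncio
import time


async def fake_backend_call(name: str, delay: float) -> str:
    # asyncio.sleep stands in for waiting on a backend response.
    await asyncio.sleep(delay)
    return f"{name} done"


async def handle_many() -> list:
    # All three "requests" wait at the same time instead of one after another.
    # gather() returns results in the order the awaitables were passed in.
    return await asyncio.gather(
        fake_backend_call("a", 0.1),
        fake_backend_call("b", 0.1),
        fake_backend_call("c", 0.1),
    )


if __name__ == "__main__":
    start = time.perf_counter()
    results = asyncio.run(handle_many())
    elapsed = time.perf_counter() - start
    print(results, f"{elapsed:.2f}s")
```

Because the waits overlap, the total time should be close to a single delay rather than the sum of all three. A load balancer spends most of its time waiting on backends, which is the kind of workload that benefits from this.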
Step 1: Install Required Libraries
First, install FastAPI and Uvicorn, the ASGI server we'll use to run our application, and also install httpx for the proxy requests:
pip install fastapi uvicorn httpx
Step 2: Define the Project Structure
Organize your project as follows:
load_balancer/
├── main.py
├── servers.json
- main.py: Contains the FastAPI application.
- servers.json: Stores the list of backend servers.
Step 3: Create the Backend Servers File
Create a servers.json file to define the backend servers the load balancer will manage:
[
  { "url": "http://localhost:8001" },
  { "url": "http://localhost:8002" },
  { "url": "http://localhost:8003" }
]
Each backend server should run a simple API or microservice. For simplicity, you can use FastAPI to create these servers; an example is provided later in the article.
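Since the load balancer trusts this file at startup, it can help to validate it while loading. The following is a minimal sketch, not part of the original application: `parse_servers` is a hypothetical helper that takes the file's text and returns a clean list of URLs.

```python
import json
from typing import List


def parse_servers(raw: str) -> List[str]:
    """Parse the text of servers.json and return the backend URLs."""
    entries = json.loads(raw)
    if not isinstance(entries, list) or not entries:
        raise ValueError("servers.json must contain a non-empty list")
    urls = []
    for entry in entries:
        url = entry.get("url") if isinstance(entry, dict) else None
        if not isinstance(url, str) or not url.startswith(("http://", "https://")):
            raise ValueError(f"invalid server entry: {entry!r}")
        # Strip a trailing slash so joining with a path never yields "//".
        urls.append(url.rstrip("/"))
    return urls


if __name__ == "__main__":
    sample = '[{"url": "http://localhost:8001"}, {"url": "http://localhost:8002/"}]'
    print(parse_servers(sample))
```

Failing fast on a malformed entry is a design choice here; it surfaces configuration mistakes at startup instead of on the first proxied request.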
Step 4: Basic Proxying
Before implementing any load balancing algorithms, let's start with basic proxying to forward requests to backend servers:
from fastapi import FastAPI, Request
import httpx
import json

# Create a FastAPI app
app = FastAPI()

# Load backend servers from JSON file
with open("servers.json") as f:
    servers = json.load(f)


# Register the same handler for every HTTP method we want to forward
@app.get("/{path:path}")
@app.post("/{path:path}")
@app.put("/{path:path}")
@app.delete("/{path:path}")
@app.patch("/{path:path}")
async def proxy(request: Request, path: str):
    backend_url = servers[0]["url"]  # Select the first server for now
    url = f"{backend_url}/{path}"
    # Forward the request
    async with httpx.AsyncClient() as client:
        response = await client.request(
            request.method,
            url,
            headers=request.headers.raw,
            content=await request.body(),  # raw bytes go in content=, not data=
        )
    return response.json()
This code snippet is for an application designed to forward incoming HTTP requests to a backend server, whose URL is loaded from a JSON file named servers.json.
The proxy functionality is implemented using asynchronous HTTP requests with the httpx library.
The application listens for various HTTP methods (GET, POST, PUT, DELETE, PATCH) and forwards them to the backend server specified in the JSON file.
The path of the incoming request is appended to the backend server's URL to construct the full URL for the forwarded request.
When a request is received, the application extracts the method, headers, and body of the request and uses them to make an asynchronous HTTP request to the backend server.
The response from the backend server is then parsed as JSON and returned to the client, so this basic version assumes every backend replies with a JSON body.
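Two details of the forwarding step are worth a closer look: the URL built from the path alone does not carry the query string, and the incoming headers are passed through unchanged. The sketch below shows one possible way to handle both. `build_forward_request` and `HOP_BY_HOP` are hypothetical names, and the header set is a commonly used list rather than something the original code defines.

```python
from typing import Dict, Mapping, Tuple

# Headers commonly treated as hop-by-hop: they describe one connection
# and are usually not forwarded by a proxy.
HOP_BY_HOP = {
    "connection",
    "keep-alive",
    "proxy-authenticate",
    "proxy-authorization",
    "te",
    "trailer",
    "transfer-encoding",
    "upgrade",
}


def build_forward_request(
    backend_url: str, path: str, query: str, headers: Mapping[str, str]
) -> Tuple[str, Dict[str, str]]:
    """Return the (url, headers) pair to send to the backend."""
    url = f"{backend_url.rstrip('/')}/{path}"
    if query:
        # Keep the query string, which the path parameter alone does not include.
        url = f"{url}?{query}"
    forwarded = {
        name: value
        for name, value in headers.items()
        # Drop Host too, so the HTTP client can set it for the backend.
        if name.lower() not in HOP_BY_HOP and name.lower() != "host"
    }
    return url, forwarded
```

Inside the proxy handler, the query string and headers would come from `request.url.query` and `request.headers`. Whether to drop or rewrite the Host header depends on what your backends expect, so treat this as a starting point.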
Step 5: Load Balancing Algorithms
Let's now see how we can implement the different load-balancing algorithms.
Round Robin Algorithm
Distributes requests evenly by cycling through available servers.
For example: Server A -> Server B -> Server C -> Server A.
from itertools import cycle
from fastapi import FastAPI, Request
import httpx
import json

# Create a FastAPI app
app = FastAPI()

# Load backend servers from JSON file
with open("servers.json") as f:
    servers = json.load(f)


# Implement a round-robin load balancer
class LoadBalancer:
    def __init__(self, servers):
        self.servers = servers
        # cycle() yields the server URLs in order, forever
        self.pool = cycle(server["url"] for server in servers)

    def round_robin(self):
        return next(self.pool)


load_balancer = LoadBalancer(servers)


@app.get("/{path:path}")
@app.post("/{path:path}")
@app.put("/{path:path}")
@app.delete("/{path:path}")
@app.patch("/{path:path}")
async def proxy(request: Request, path: str):
    backend_url = load_balancer.round_robin()
    url = f"{backend_url}/{path}"
    # Forward the request
    async with httpx.AsyncClient() as client:
        response = await client.request(
            request.method,
            url,
            headers=request.headers.raw,
            content=await request.body(),  # raw bytes go in content=, not data=
        )
    return response.json()
This code snippet is a FastAPI application that acts as a proxy server with a round-robin load balancer:
- A LoadBalancer class is implemented to handle the round-robin load balancing.
- The LoadBalancer class takes a list of servers and uses the cycle function from the itertools module to create an infinite iterator that cycles through the server URLs.
- The round_robin method of the LoadBalancer class returns the next server URL in the cycle.
- When a request is received, the proxy function uses the round_robin method of the LoadBalancer instance to get the next backend server URL.
- The request is then forwarded to the selected backend server using the httpx library.
- The response from the backend server is returned to the client as a JSON object.
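To check the rotation order without starting any servers, the selection logic can be run on its own. This sketch reuses the URLs from servers.json and needs only the standard library; the `picks` variable is just for illustration.

```python
from itertools import cycle

# Same URLs as servers.json in Step 3.
servers = [
    {"url": "http://localhost:8001"},
    {"url": "http://localhost:8002"},
    {"url": "http://localhost:8003"},
]

pool = cycle(server["url"] for server in servers)

# Take four picks: the fourth wraps around to the first server.
picks = [next(pool) for _ in range(4)]
print(picks)
# ['http://localhost:8001', 'http://localhost:8002',
#  'http://localhost:8003', 'http://localhost:8001']
```

Note that cycle stores the URLs it has already seen, so changes to the server list after startup are not picked up. Supporting dynamic membership or skipping unhealthy servers would need a different structure.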
Least Connections Algorithm
Routes the request to the server with the least active connections.
For example, if Server A has 2 active connections and Server B has 1, the request goes to Server B.
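As a rough illustration of the idea, here is a minimal sketch of least-connections selection. It is one possible design under stated assumptions, not the article's implementation: the `LeastConnectionsBalancer` class and its `acquire` and `release` methods are hypothetical names, and the counters track in-flight proxied requests within a single process.

```python
from typing import Dict, List


class LeastConnectionsBalancer:
    """Track in-flight requests per server and pick the least busy one."""

    def __init__(self, urls: List[str]) -> None:
        self.active: Dict[str, int] = {url: 0 for url in urls}

    def acquire(self) -> str:
        # min() returns the first server with the lowest count, so ties
        # go to whichever server was listed first.
        url = min(self.active, key=self.active.get)
        self.active[url] += 1
        return url

    def release(self, url: str) -> None:
        # Call this when the proxied request finishes, even on error.
        self.active[url] -= 1


if __name__ == "__main__":
    lb = LeastConnectionsBalancer(["Server A", "Server B"])
    lb.active["Server A"] = 2
    lb.active["Server B"] = 1
    print(lb.acquire())  # Server B
```

In a proxy handler, the forwarding call would sit inside a try/finally block so that release always runs and the counts do not drift upward after failed requests.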