In the last post, we implemented a simple NoSQL database in Python, concentrating on storing JSON documents.
Now, we're going to improve this basic database by adding indexing and more advanced query tools.
These upgrades will make lookups on indexed fields faster and let the database handle more complex queries.
Getting Ready for the Upgraded Database
Make sure you have the basic NoSQL database we made earlier.
If you didn't see the last tutorial, you can check it out here: Creating Your Own NoSQL Database in Python.
We're going to use that code as our starting point.
Improving Search Speed with Indexing
Indexing is a key feature for speeding up queries. An index maps each value of a chosen field to the IDs of the documents that contain it, so a lookup on that field no longer has to scan every document in the store.
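To make the idea concrete before touching the class, here is a minimal standalone sketch. It assumes a plain dictionary of documents; the names `documents`, `scan`, and `build_index` are hypothetical and used only for illustration.

```python
# Hypothetical sample data: document ID -> document
documents = {
    "a1": {"name": "Alice", "age": 30},
    "b2": {"name": "Bob", "age": 24},
    "c3": {"name": "Carol", "age": 30},
}


def scan(docs, field, value):
    """Without an index: check every document (cost grows with the store)."""
    return {doc_id for doc_id, doc in docs.items() if doc.get(field) == value}


def build_index(docs, field):
    """Build a mapping of field value -> set of document IDs."""
    index = {}
    for doc_id, doc in docs.items():
        if field in doc:
            index.setdefault(doc[field], set()).add(doc_id)
    return index


age_index = build_index(documents, "age")

# With an index: a single dictionary lookup replaces the full scan
assert age_index.get(30, set()) == scan(documents, "age", 30)
print(sorted(age_index[30]))  # ['a1', 'c3']
```

The trade-off is that the index has to be kept in sync on every insert, update, and delete, which is what most of the changes to the class deal with.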
Let's modify the database class to support indexing:
import json
import os
import uuid


# Database class to store data in JSON format with indexing support
class JSONNoSQLDatabase:
    def __init__(self, filename='database.json', index_fields=None):
        self.filename = filename
        # Index fields are the fields that we want to index for faster search
        self.index_fields = index_fields if index_fields else []
        self.store = {}
        # Each index maps a field value to the set of IDs of documents with that value
        self.indexes = {field: {} for field in self.index_fields}
        if os.path.exists(self.filename):
            with open(self.filename, 'r') as file:
                data = json.load(file)
            # Load the store and indexes from the JSON file
            self.store = data.get("store", {})
            # Convert the indexes from list to set.
            # Note: JSON object keys are always strings, so a non-string index key
            # (such as the integer 30) comes back as "30" after a reload.
            self.indexes = {field: {key: set(value) for key, value in index.items()} for field, index in
                            data.get("indexes", {field: {} for field in self.index_fields}).items()}

    # Save the store and indexes to the JSON file
    def save(self):
        with open(self.filename, 'w') as file:
            # Convert the indexes from set to list (sets are not JSON-serializable)
            data = {
                "store": self.store,
                "indexes": {field: {key: list(value) for key, value in index.items()} for field, index in
                            self.indexes.items()}
            }
            json.dump(data, file, indent=4)

    # Insert a document into the database
    def insert(self, document):
        doc_id = str(uuid.uuid4())
        self.store[doc_id] = document
        # Update the indexes with the new document
        self._update_indexes(doc_id, document)
        self.save()
        return doc_id

    # Update a document in the database
    def update(self, doc_id, document):
        if doc_id not in self.store:
            raise KeyError("Document ID does not exist.")
        # Remove the old document's index entries first so stale values don't linger
        self._remove_from_indexes(doc_id, self.store[doc_id])
        self.store[doc_id] = document
        # Update the indexes with the updated document
        self._update_indexes(doc_id, document)
        self.save()

    # Add the document's indexed field values to the indexes
    def _update_indexes(self, doc_id, document):
        for field in self.index_fields:
            if field in document:
                value = document[field]
                # Add the document ID to the index
                if value not in self.indexes[field]:
                    self.indexes[field][value] = set()
                self.indexes[field][value].add(doc_id)

    # Get a document from the database
    def get(self, doc_id):
        return self.store.get(doc_id, None)

    # Delete a document from the database
    def delete(self, doc_id):
        if doc_id in self.store:
            document = self.store[doc_id]
            # Remove the document from the indexes
            self._remove_from_indexes(doc_id, document)
            del self.store[doc_id]
            self.save()
        else:
            raise KeyError("Document ID does not exist.")

    # Remove the document from the indexes
    def _remove_from_indexes(self, doc_id, document):
        for field in self.index_fields:
            if field in document:
                value = document[field]
                # Remove the document ID from the index
                if value in self.indexes[field]:
                    self.indexes[field][value].discard(doc_id)
                    # Drop the entry entirely once no document has this value
                    if not self.indexes[field][value]:
                        del self.indexes[field][value]

    # Query the database with a condition function (scans every document)
    def query(self, condition):
        results = {}
        for doc_id, document in self.store.items():
            if condition(document):
                results[doc_id] = document
        return results

    # Query the database by a field value
    def query_by_index(self, field, value):
        results = {}
        # Check if the field is indexed and the value exists in the index
        if field in self.indexes and value in self.indexes[field]:
            for doc_id in self.indexes[field][value]:
                results[doc_id] = self.store[doc_id]
        return results
Here's a breakdown of the code:
- Import the necessary libraries: json, os, and uuid.
- Define the JSONNoSQLDatabase class with the following methods:
  - __init__: Initialize the database with a filename and a list of fields to be indexed. If the file exists, load the store and indexes from it.
  - save: Save the store and indexes to the JSON file.
  - insert: Add a new document to the database, update the indexes, and save the changes.
  - update: Modify an existing document in the database, update the indexes, and save the changes.
  - _update_indexes: Update the indexes when a document is inserted or updated.
  - get: Retrieve a document from the database by its ID.
  - delete: Remove a document from the database and update the indexes accordingly.
  - _remove_from_indexes: Remove a document from the indexes when it's deleted.
  - query: Search the database using a condition function.
  - query_by_index: Search the database by a specific field value using indexes.
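Index maintenance is the part that is easiest to get wrong, so here is a small standalone sketch of what has to happen when an indexed value changes. The functions `add_to_index` and `remove_from_index` are simplified, hypothetical stand-ins that work on a single index dictionary rather than on the class.

```python
def add_to_index(index, doc_id, value):
    """Register doc_id under value (the idea behind _update_indexes)."""
    index.setdefault(value, set()).add(doc_id)


def remove_from_index(index, doc_id, value):
    """Unregister doc_id from value and drop the entry once it is empty."""
    if value in index:
        index[value].discard(doc_id)
        if not index[value]:
            del index[value]


age_index = {}
add_to_index(age_index, "id-1", 30)

# The document's age changes from 30 to 31: remove the old entry first,
# then add the new one, otherwise "id-1" would still be found under 30
remove_from_index(age_index, "id-1", 30)
add_to_index(age_index, "id-1", 31)

print(age_index)  # {31: {'id-1'}}
```

If the removal step is skipped, a lookup for the old value would keep returning a document that no longer matches.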
The database stores documents in a dictionary called store, where keys are document IDs (generated using uuid) and values are the documents themselves.
The indexes dictionary holds one index per indexed field, allowing for faster searches based on those fields.
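One detail of persisting the indexes as JSON is worth seeing in isolation: sets have to be written out as lists, and JSON object keys are always strings, so an integer key such as 30 comes back as "30" after a reload. The sketch below shows this with the standard json module. `rebuild_index` is a hypothetical helper, not part of the class above, illustrating one possible way to restore the original key types by recomputing an index from the store.

```python
import json

# An in-memory index as the class keeps it: value -> set of document IDs
index = {30: {"id-1"}, 24: {"id-2"}}

# Sets are not JSON-serializable, so they are written out as lists
serialized = json.dumps({key: sorted(ids) for key, ids in index.items()})

# After a round trip the integer keys have become strings
restored = json.loads(serialized)
print(restored)  # {'30': ['id-1'], '24': ['id-2']}
print(30 in restored, "30" in restored)  # False True


def rebuild_index(store, field):
    """Hypothetical helper: recompute an index from the documents themselves.
    Field values inside documents keep their types across a JSON round trip."""
    rebuilt = {}
    for doc_id, document in store.items():
        if field in document:
            rebuilt.setdefault(document[field], set()).add(doc_id)
    return rebuilt


store = json.loads(json.dumps({"id-1": {"age": 30}, "id-2": {"age": 24}}))
print(rebuild_index(store, "age"))  # {30: {'id-1'}, 24: {'id-2'}}
```

In practice this means a lookup like query_by_index("age", 30) may behave differently after a restart unless the keys are normalized or the indexes are rebuilt on load.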
Example Usage with Indexing
Let's see a simple example of how to use the indexing feature:
from database import JSONNoSQLDatabase
db = JSONNoSQLDatabase(index_fields=["age"])
# Insert some documents
doc1_id = db.insert({"name": "Alice", "age": 30})
doc2_id = db.insert({"name": "Bob", "age": 24})
# Retrieve documents by index
results = db.query_by_index("age", 30)
print(results)
# Output: {doc1_id: {"name": "Alice", "age": 30}}
Here's a step-by-step explanation of the code:
- Import the JSONNoSQLDatabase class from the database module.
- Create an instance of the JSONNoSQLDatabase class, specifying the "age" field as the indexed field.
- Insert two documents into the database, each containing a "name" and an "age" field. The insert method returns the ID of the inserted document.
- Query the database to find documents with a specific "age" value (in this case, 30) using the query_by_index method.
- Print the results of the query. The output will be a dictionary containing the document ID and the document itself. In this example, the output will be {doc1_id: {"name": "Alice", "age": 30}}, where doc1_id is the actual ID of the document.
Adding Advanced Query Mechanisms
Advanced query mechanisms allow for more complex queries, such as range queries and composite queries.
Let's see the code changes for advanced query mechanisms:
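As a rough sketch of the idea (not the implementation this series builds), range and composite queries can be expressed as standalone functions over a plain store dictionary. The names `range_query` and `composite_query`, their parameters, and the sample data are all assumptions made for illustration.

```python
def range_query(store, field, low=None, high=None):
    """Return documents whose field value lies in [low, high].
    A bound of None leaves that side of the range open."""
    results = {}
    for doc_id, document in store.items():
        if field not in document:
            continue
        value = document[field]
        if low is not None and value < low:
            continue
        if high is not None and value > high:
            continue
        results[doc_id] = document
    return results


def composite_query(store, conditions):
    """Return documents that satisfy every condition (logical AND).
    conditions maps a field name to a predicate taking that field's value."""
    return {
        doc_id: document
        for doc_id, document in store.items()
        if all(field in document and check(document[field])
               for field, check in conditions.items())
    }


# Hypothetical sample data: document ID -> document
store = {
    "id-1": {"name": "Alice", "age": 30, "city": "Paris"},
    "id-2": {"name": "Bob", "age": 24, "city": "Berlin"},
    "id-3": {"name": "Carol", "age": 35, "city": "Paris"},
}

print(sorted(range_query(store, "age", low=25, high=32)))  # ['id-1']
print(sorted(composite_query(store, {
    "city": lambda city: city == "Paris",
    "age": lambda age: age > 30,
})))  # ['id-3']
```

Both functions scan the whole store. A hash-based index like the one above only helps with exact matches, so speeding up range queries would need a different structure, such as a sorted list of values.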