RediSearch
Redis is famous for being a high-performance key-value store. But what if you need to find an object based on its attributes? What if you want to perform a full-text search across thousands of documents, or build a vector similarity search for an AI application? Scanning every key is an $O(N)$ operation, which will paralyze a production Redis cluster. That’s where RediSearch comes in.
1. The Challenge: Building an E-Commerce Product Search
Imagine you are tasked with building a search engine for an e-commerce platform with millions of products.
- Requirements: Users need to search by title, filter by price, and group by category. The search must be typo-tolerant (fuzzy search) and lightning-fast.
- The Naive Approach: Store each product as a JSON string in a standard Redis string, then use SCAN to retrieve them, deserialize them in the application layer, and filter. This is catastrophic for performance.
- The RediSearch Solution: RediSearch acts as a secondary indexing engine. It automatically indexes your Hashes or JSON documents as they are ingested, allowing you to query them instantly.
2. The Core Concept: The Inverted Index
To understand why RediSearch is so fast, we need to look under the hood at its core data structure: the Inverted Index.
Instead of scanning every document to find the word “Matrix”, RediSearch maintains a map where the key is the word, and the value is a list of all documents containing that word. It flips the relationship from “Document → Words” to “Word → Documents”.
graph LR
  subgraph Documents
    D1[Doc 1: "Redis is fast"]
    D2[Doc 2: "Redis is cool"]
    D3[Doc 3: "Fast cars"]
  end
  subgraph Inverted_Index [Inverted Index]
    style Inverted_Index fill:var(--bg-main),stroke:var(--border-muted),color:var(--fg-default)
    T1(redis) --> L1[Doc 1, Doc 2]
    T2(fast) --> L2[Doc 1, Doc 3]
    T3(cool) --> L3[Doc 2]
    T4(cars) --> L4[Doc 3]
  end
  D1 -.-> T1
  D2 -.-> T3
  D3 -.-> T4
3. Creating an Index and Searching
To use RediSearch, you define a Schema. This schema tells Redis which fields inside your Hashes (or JSON documents) should be indexed and how they should be treated (e.g., as text, numbers, or geographical coordinates).
Java
import java.util.Map;

import redis.clients.jedis.JedisPooled;
import redis.clients.jedis.exceptions.JedisDataException;
import redis.clients.jedis.search.IndexDefinition;
import redis.clients.jedis.search.IndexOptions;
import redis.clients.jedis.search.Schema;
import redis.clients.jedis.search.SearchResult;

public class RediSearchExample {
    public static void main(String[] args) {
        // JedisPooled is closeable, so release its connections when done
        try (JedisPooled client = new JedisPooled("localhost", 6379)) {
            // 1. Create Index: FT.CREATE movieIdx ON HASH PREFIX 1 movie: SCHEMA title TEXT year NUMERIC
            Schema schema = new Schema()
                .addTextField("title", 1.0)
                .addNumericField("year");
            IndexDefinition def = new IndexDefinition()
                .setPrefixes(new String[]{"movie:"});
            try {
                client.ftCreate("movieIdx", IndexOptions.defaultOptions().setDefinition(def), schema);
            } catch (JedisDataException e) {
                // The server rejected the command; most likely the index already exists
                System.out.println("Index might already exist: " + e.getMessage());
            }

            // 2. Add Data (Standard HSET - RediSearch indexes it automatically!)
            client.hset("movie:1", Map.of("title", "The Matrix", "year", "1999"));
            client.hset("movie:2", Map.of("title", "The Matrix Reloaded", "year", "2003"));

            // 3. Search: FT.SEARCH movieIdx "Matrix"
            SearchResult result = client.ftSearch("movieIdx", "Matrix");
            System.out.println("Found: " + result.getTotalResults()); // Output: 2
        }
    }
}
Go
package main

import (
	"context"
	"fmt"
	"log"

	"github.com/redis/go-redis/v9"
)

func main() {
	ctx := context.Background()
	rdb := redis.NewClient(&redis.Options{Addr: "localhost:6379"})
	defer rdb.Close()

	// 1. Create Index (sent as a raw command through Do)
	err := rdb.Do(ctx, "FT.CREATE", "movieIdx",
		"ON", "HASH",
		"PREFIX", "1", "movie:",
		"SCHEMA", "title", "TEXT", "year", "NUMERIC").Err()
	if err != nil {
		fmt.Println("Index might already exist:", err)
	}

	// 2. Add Data (standard HSET; RediSearch indexes it automatically)
	if err := rdb.HSet(ctx, "movie:1", "title", "The Matrix", "year", 1999).Err(); err != nil {
		log.Fatal(err)
	}
	if err := rdb.HSet(ctx, "movie:2", "title", "The Matrix Reloaded", "year", 2003).Err(); err != nil {
		log.Fatal(err)
	}

	// 3. Search: FT.SEARCH movieIdx "Matrix"
	res, err := rdb.Do(ctx, "FT.SEARCH", "movieIdx", "Matrix").Result()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Println(res)
}
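The movie index above uses only text and numeric fields. As a rough sketch of how the e-commerce requirements from section 1 (search by title, filter by price, group by category) might map onto a schema, the Go snippet below assembles the arguments of an FT.CREATE command. The index name `productIdx`, the `product:` prefix, and the field names, including the `location` field added only to show the GEO type, are hypothetical. The snippet only builds and prints the command and does not contact Redis.

```go
package main

import (
	"fmt"
	"strings"
)

// productIndexArgs returns the arguments of an FT.CREATE command for a
// hypothetical product index. The index name, key prefix, and field names
// are illustrative assumptions, not part of any real catalogue.
func productIndexArgs() []interface{} {
	return []interface{}{
		"FT.CREATE", "productIdx",
		"ON", "HASH",
		"PREFIX", "1", "product:",
		"SCHEMA",
		"title", "TEXT", // full-text searchable
		"price", "NUMERIC", "SORTABLE", // range filters and sorting
		"category", "TAG", // exact-match filtering and grouping
		"location", "GEO", // "lon,lat" coordinates
	}
}

// commandLine joins the arguments into a single line so the command can be
// inspected or pasted into redis-cli.
func commandLine(args []interface{}) string {
	parts := make([]string, len(args))
	for i, a := range args {
		parts[i] = fmt.Sprint(a)
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(commandLine(productIndexArgs()))
}
```

With a go-redis client like the one in the Go example, the same slice could be sent with `rdb.Do(ctx, productIndexArgs()...)`. A query such as `FT.SEARCH productIdx "@title:laptop @price:[500 1000] @category:{electronics}"` would then combine all three field types.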
4. Inverted Index Builder
Each time a document is added, its text is split into words and the document's ID is appended to the list kept for each word. Finding documents by a keyword then becomes a direct lookup rather than a slow table scan.
5. Advanced Features
- Vector Search: RediSearch supports KNN (K-Nearest Neighbors) to find semantically similar items, making it a foundational technology for Vector Databases in Generative AI.
- Aggregation: With FT.AGGREGATE, you can perform GROUPBY, SORTBY, and mathematical operations directly on your search results, bypassing the need to aggregate data in your application logic.
- Fuzzy Search: Wrapping a query term in % on both sides enables Levenshtein distance matching, catching typos gracefully (e.g., FT.SEARCH movieIdx "%Matrx%"). One pair of % allows a distance of 1; each additional pair raises the allowed distance by one, up to a maximum of 3.
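To illustrate what Levenshtein distance matching means for the "Matrx" example, the Go sketch below computes the edit distance between two words and checks it against a maximum. It is a textbook dynamic-programming version written for clarity. The function names are hypothetical, and it makes no claim about how RediSearch implements fuzzy matching internally.

```go
package main

import (
	"fmt"
	"strings"
)

// levenshtein returns the minimum number of single-character insertions,
// deletions, or substitutions needed to turn a into b.
func levenshtein(a, b string) int {
	ra, rb := []rune(a), []rune(b)
	// prev[j] holds the distance between the first i-1 runes of a
	// and the first j runes of b.
	prev := make([]int, len(rb)+1)
	for j := range prev {
		prev[j] = j
	}
	for i := 1; i <= len(ra); i++ {
		curr := make([]int, len(rb)+1)
		curr[0] = i
		for j := 1; j <= len(rb); j++ {
			cost := 1
			if ra[i-1] == rb[j-1] {
				cost = 0
			}
			curr[j] = min3(prev[j]+1, curr[j-1]+1, prev[j-1]+cost)
		}
		prev = curr
	}
	return prev[len(rb)]
}

// min3 returns the smallest of three integers.
func min3(a, b, c int) int {
	m := a
	if b < m {
		m = b
	}
	if c < m {
		m = c
	}
	return m
}

// fuzzyMatch reports whether word is within maxDist edits of term,
// ignoring case.
func fuzzyMatch(term, word string, maxDist int) bool {
	return levenshtein(strings.ToLower(term), strings.ToLower(word)) <= maxDist
}

func main() {
	fmt.Println(levenshtein("matrx", "matrix"))     // 1
	fmt.Println(fuzzyMatch("Matrx", "Matrix", 1))   // true
	fmt.Println(fuzzyMatch("Matrx", "Reloaded", 1)) // false
}
```

"Matrx" is one insertion away from "Matrix", so a query allowing a distance of 1 matches both sample movie titles through that word.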