# Vector Databases
> [!NOTE]
> Vector databases are the long-term memory for AI. They store data not as rows and columns, but as mathematical points in a high-dimensional space.
## 1. What are Embeddings?
Before understanding vector databases, you must understand embeddings. An embedding is a list of floating-point numbers (a vector) that represents the meaning of a piece of text.
- Input: “Apple”
- Output: `[0.12, -0.45, 0.88, ...]` (e.g., 1536 dimensions for OpenAI’s `text-embedding-3-small`)
The magic is that semantically similar words end up close together in this vector space.
- “Dog” and “Puppy” → Close together.
- “Dog” and “Car” → Far apart.
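The toy vectors below are made up purely for illustration (real embeddings have hundreds or thousands of dimensions), but they show the idea: a similarity score rates “Dog” as much closer to “Puppy” than to “Car”.

```python
import math

# Hand-made 3-dimensional "embeddings" -- illustrative only.
# Real models produce hundreds or thousands of dimensions.
dog   = [0.9, 0.1, 0.0]
puppy = [0.8, 0.2, 0.1]
car   = [0.0, 0.1, 0.9]

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

print(cosine_similarity(dog, puppy))  # high -> close together
print(cosine_similarity(dog, car))    # low  -> far apart
```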
## 2. Interactive: Embedding Space Visualizer
Visualize how semantic search works in a simplified 2D space. Drag the Query Point (Red) to see which concepts are considered “similar” based on their distance.
## 3. How Vector Search Works
Traditional databases (SQL) use Keyword Search (exact match or regex). Vector databases use Similarity Search.
### Distance Metrics
To find “similar” vectors, we calculate the distance between them.
- Cosine Similarity: Measures the angle between two vectors.
  - Range: -1 to 1.
  - Use case: NLP, text similarity (magnitude doesn’t matter).
  - Formula: `A · B / (||A|| * ||B||)`
- Euclidean Distance (L2): Measures the straight-line distance.
  - Use case: Image clustering.
- Dot Product: Measures magnitude and direction.
  - Use case: Recommendation systems (where magnitude = rating).
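All three metrics can be sketched in plain Python (no external libraries; real vector databases use heavily optimized implementations):

```python
import math

def dot_product(a, b):
    # Sum of element-wise products: measures magnitude and direction.
    return sum(x * y for x, y in zip(a, b))

def cosine_similarity(a, b):
    # Dot product normalized by vector lengths: angle only, magnitude ignored.
    return dot_product(a, b) / (
        math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    )

def euclidean_distance(a, b):
    # Straight-line (L2) distance between the two points.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

a, b = [1.0, 2.0], [3.0, 4.0]
print(dot_product(a, b))         # 11.0
print(euclidean_distance(a, b))  # ~2.83
print(cosine_similarity(a, b))   # ~0.98
```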
### Approximate Nearest Neighbor (ANN)
Searching millions of vectors by comparing every single one (Brute Force / KNN) is too slow. Vector DBs use ANN algorithms like HNSW (Hierarchical Navigable Small World).
- Trade-off: Slightly less accurate (might miss the absolute #1 closest), but blazing fast (milliseconds).
- Analogy: Instead of checking every house in the city, HNSW checks neighborhoods, then streets, then houses.
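For contrast, here is the brute-force approach that ANN indexes avoid: score the query against every stored vector and sort. It is exact but O(n) per query; HNSW indexes (available in libraries such as hnswlib and FAISS) trade a little recall for far fewer comparisons. The tiny `store` below is invented for illustration.

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (
        math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    )

def brute_force_search(query, store, k=2):
    # Compare the query against *every* vector -- exact, but O(n) per query.
    ranked = sorted(store, key=lambda name: cosine_similarity(query, store[name]),
                    reverse=True)
    return ranked[:k]

store = {
    "dog":   [0.9, 0.1, 0.0],
    "puppy": [0.8, 0.2, 0.1],
    "car":   [0.0, 0.1, 0.9],
}
print(brute_force_search([0.85, 0.15, 0.05], store))  # ['dog', 'puppy']
```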
## 4. Vector DB Landscape
| Database | Type | Open Source? |
|---|---|---|
| Pinecone | Managed Service | No |
| ChromaDB | Local / Server | Yes |
| Weaviate | Server | Yes |
| Milvus | Server | Yes |
| pgvector | Postgres Extension | Yes |
## 5. Code Example: Using ChromaDB
Here is how you ingest text and search for it using `chromadb` in Python.

```python
import chromadb

# 1. Setup
client = chromadb.Client()
collection = client.create_collection(name="demo")

# 2. Add data (embeddings are computed automatically by default!)
collection.add(
    documents=["I love python programming", "I hate snakes", "Pizza is great"],
    metadatas=[{"category": "tech"}, {"category": "animals"}, {"category": "food"}],
    ids=["id1", "id2", "id3"]
)

# 3. Query
# Searching for "coding" should match "I love python programming"
results = collection.query(
    query_texts=["coding"],
    n_results=1
)
print(results['documents'])
# Output: [['I love python programming']]
# Note: "coding" and "python programming" are semantically close!
```
## 6. Inverted Index vs Vector Index
| Feature | Inverted Index (Elasticsearch) | Vector Index (Pinecone) |
|---|---|---|
| Matches | Exact keywords (“bank” ≠ “river bank”) | Meanings (“bank” ≈ “finance”) |
| Handling Synonyms | Needs manual list | Automatic |
| Handling Typos | Fuzzy matching required | Robust to small errors |
| Best For | Specific product codes, names | Conceptual questions, recommendations |
> [!TIP]
> Hybrid Search is the best of both worlds. It combines keyword search (BM25) for precision with vector search for recall.
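One common way to combine the two is Reciprocal Rank Fusion (RRF). The sketch below assumes each retriever (BM25 and the vector index) has already produced a ranked list of document IDs; the document IDs are invented for illustration.

```python
def reciprocal_rank_fusion(rankings, k=60):
    # rankings: one ordered list of doc IDs per retriever.
    # Each doc earns 1 / (k + rank) per list it appears in;
    # k=60 is a conventional smoothing constant.
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

keyword_hits = ["doc_a", "doc_b", "doc_c"]   # e.g., from BM25
vector_hits  = ["doc_b", "doc_c", "doc_a"]   # e.g., from the vector index
print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
# ['doc_b', 'doc_a', 'doc_c']
```

Documents ranked highly by both retrievers rise to the top, without needing to normalize the two incompatible score scales.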
## 7. Next Steps
Now that we can retrieve data, how do we structure our RAG pipeline for complex queries? Learn about Advanced RAG Architectures next.