An operator used in the $group stage to calculate values across a group of documents. Examples include $sum, $avg, $max, $min, $push, and $addToSet.
Aggregation Framework
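A pipeline-based data processing framework in MongoDB. It allows you to transform, filter, and group documents to perform complex analytics. Its $group stage is similar to SQL GROUP BY, but a pipeline can chain many other stages (filtering, reshaping, joining) before and after the grouping.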
Arbiter
A replica set member that does not hold data. Its only function is to participate in elections to ensure a majority vote. Arbiters are used to maintain a quorum without the cost of storing an extra copy of the data.
Atlas
MongoDB’s fully managed cloud database service (DBaaS). It runs on AWS, Azure, and Google Cloud.
B
Term
Definition
Balancer
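A background process that monitors how chunks are distributed across shards. If the distribution is uneven, it migrates chunks from the more heavily loaded shards to the less loaded ones. Releases before MongoDB 6.0 balance by the number of chunks per shard; newer releases balance by the amount of data per shard.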
BSON
Binary JSON. The binary-encoded serialization format used to store documents and make remote procedure calls in MongoDB. It supports more data types than JSON (e.g., Date, ObjectId, Binary data).
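The Balancer entry in this section can be illustrated with a small simulation. This is a minimal sketch of a chunk-count policy only, not MongoDB's actual balancer: the function name `rebalance`, the `threshold` parameter, and the shard names are all hypothetical.

```python
def rebalance(chunks_per_shard, threshold=2):
    """Move one chunk at a time from the fullest shard to the emptiest.

    Stops once the gap between them is below `threshold` (hypothetical
    setting; must be at least 2 so the loop cannot oscillate).
    """
    if threshold < 2:
        raise ValueError("threshold must be >= 2")
    counts = dict(chunks_per_shard)  # copy so the caller's dict is untouched
    migrations = []
    while True:
        fullest = max(counts, key=counts.get)
        emptiest = min(counts, key=counts.get)
        if counts[fullest] - counts[emptiest] < threshold:
            break
        counts[fullest] -= 1
        counts[emptiest] += 1
        migrations.append((fullest, emptiest))
    return counts, migrations


counts, migrations = rebalance({"shardA": 8, "shardB": 2, "shardC": 2})
print(counts)      # final chunk count per shard
print(migrations)  # (source, destination) pairs, in order
```

The total number of chunks never changes; only their placement does, which is why balancing is safe to run in the background.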
C
Term
Definition
Cardinality
The number of elements in a set or other grouping, often used to describe the “many” side of a relationship (e.g., One-to-Few vs. One-to-Squillions).
Chunk
A contiguous range of shard key values. MongoDB partitions sharded data into chunks and distributes them across shards. The default chunk size is 128 MB (64 MB in releases before MongoDB 6.0).
Collection
A grouping of MongoDB documents. Equivalent to an RDBMS Table. Collections exist within a single database and do not enforce a schema by default.
Compass
The official GUI for MongoDB. It allows you to visually explore data, run queries, and optimize performance.
Config Server
A mongod instance that stores the metadata for a sharded cluster. It maps chunks of data to specific shards.
Cursor
A pointer to the result set of a query. Clients can iterate through a cursor to retrieve results.
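The Chunk and Config Server entries describe a mapping from shard key ranges to shards. The sketch below shows one way such a lookup could work, assuming a numeric shard key; the split points, shard names, and the `shard_for` helper are invented for illustration and do not reflect MongoDB's internal metadata format.

```python
from bisect import bisect_right

# Hypothetical split points. Chunk 0 covers keys below 100, chunk 1 covers
# [100, 200), chunk 2 covers [200, 300), and chunk 3 covers 300 and above.
split_points = [100, 200, 300]
chunk_owner = ["shardA", "shardB", "shardA", "shardC"]  # one owner per chunk


def shard_for(key):
    """Return the shard that owns the chunk containing `key`."""
    index = bisect_right(split_points, key)
    return chunk_owner[index]


for key in (50, 100, 250, 300):
    print(key, "->", shard_for(key))
```

Because each chunk is a contiguous range, a binary search over the sorted split points is enough to find the owner.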
D
Term
Definition
Denormalization
The process of optimizing read performance by adding redundant data or grouping data (Embedding). In MongoDB, this means storing related data in a single document to avoid joins.
Document
A record in a MongoDB collection and the basic unit of data. Documents are analogous to JSON objects but exist as BSON. Equivalent to an RDBMS Row.
E
Term
Definition
Election
The process by which members of a replica set determine which node will become the Primary. Elections occur during initialization or when the current Primary becomes unavailable.
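The majority rule behind elections (and the reason arbiters exist) comes down to simple arithmetic. The following is a minimal sketch, assuming every member has exactly one vote; `votes_needed` and `can_elect_primary` are hypothetical helper names, not part of any MongoDB API.

```python
def votes_needed(voting_members):
    """Smallest number of votes that is a strict majority."""
    return voting_members // 2 + 1


def can_elect_primary(voting_members, reachable):
    """True if the reachable members can form a majority."""
    return reachable >= votes_needed(voting_members)


# Two data-bearing nodes only: losing one leaves 1 of 2 votes, no majority.
print(can_elect_primary(voting_members=2, reachable=1))

# Two data-bearing nodes plus an arbiter: losing one data node leaves
# 2 of 3 votes, so the surviving node can still be elected.
print(can_elect_primary(voting_members=3, reachable=2))
```

This is why replica sets are usually deployed with an odd number of voting members.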
F
Term
Definition
Field
A key-value pair in a document. A document has zero or more fields. Fields are analogous to columns in a relational database.
I
Term
Definition
Index
A data structure that improves the speed of data retrieval operations on a collection. MongoDB uses B-Tree indexes.
J
Term
Definition
Journaling
A write-ahead logging mechanism used by the WiredTiger storage engine to ensure durability. Writes are first recorded in the journal before being applied to the data files, allowing recovery after a crash.
JSON Schema
A vocabulary that allows you to annotate and validate JSON documents. MongoDB uses it to enforce schema validation rules.
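As an illustration of the JSON Schema entry, a validation rule might look like the following, written here as a Python dictionary. The collection fields (`name`, `email`, `age`) are invented for the example; the `$jsonSchema`, `bsonType`, `required`, `properties`, and `minimum` keywords are the ones MongoDB's validator syntax uses.

```python
# Hypothetical validator for a "users" collection.
validator = {
    "$jsonSchema": {
        "bsonType": "object",
        "required": ["name", "email"],  # writes missing these are rejected
        "properties": {
            "name": {"bsonType": "string"},
            "email": {"bsonType": "string"},
            "age": {"bsonType": "int", "minimum": 0},
        },
    }
}

# With a driver such as PyMongo, a document like this would typically be
# passed as the `validator` option when creating the collection. No server
# call is made in this sketch.
print(sorted(validator["$jsonSchema"]["properties"]))
```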
M
Term
Definition
mongod
The primary daemon process for the MongoDB system. It handles data requests, manages data access, and performs background management operations.
mongos
The query router in a sharded cluster. It acts as the interface between client applications and the sharded cluster. Clients connect to mongos, which routes queries to the appropriate shards.
O
Term
Definition
ObjectId
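A 12-byte BSON type used as the default value for the _id field. It consists of a 4-byte timestamp, a 5-byte random value, and a 3-byte incrementing counter, which makes collisions extremely unlikely even when IDs are generated independently across distributed systems.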
Oplog (Operations Log)
A special capped collection (local.oplog.rs) that keeps a rolling record of all operations that modify the data. Secondaries replicate the Primary by tailing and replaying the oplog.
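The ObjectId layout described in this section can be made concrete by splitting a hex string into its three parts. This is a minimal sketch using only the standard library; `split_object_id` is a hypothetical helper, and the sample value is just an arbitrary 24-character hex string.

```python
def split_object_id(hex_string):
    """Split a 24-character hex ObjectId into timestamp, random, counter."""
    raw = bytes.fromhex(hex_string)
    if len(raw) != 12:
        raise ValueError("an ObjectId is exactly 12 bytes")
    return {
        # Bytes 0-3: seconds since the Unix epoch, big-endian.
        "timestamp": int.from_bytes(raw[0:4], "big"),
        # Bytes 4-8: random value.
        "random": raw[4:9],
        # Bytes 9-11: incrementing counter, big-endian.
        "counter": int.from_bytes(raw[9:12], "big"),
    }


parts = split_object_id("507f1f77bcf86cd799439011")
print(parts["timestamp"], parts["random"].hex(), parts["counter"])
```

Because the timestamp comes first, sorting by _id roughly orders documents by creation time.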
P
Term
Definition
Pipeline
A series of stages that documents pass through in the Aggregation Framework. Each stage transforms the documents as they pass through.
Primary
The single member in a replica set that receives all write operations. It records changes in its oplog.
R
Term
Definition
Read Concern
An option that controls the consistency and isolation properties of the data read from a replica set. Levels include local, available, majority, linearizable, and snapshot.
Read Preference
A setting that determines which member of a replica set the driver should read from. Options include primary (default), primaryPreferred, secondary, secondaryPreferred, and nearest.
Replica Set
A group of mongod processes that maintain the same data set. Replica sets provide redundancy and high availability. A replica set consists of one Primary and one or more Secondaries, optionally with an Arbiter.
S
Term
Definition
Schema Validation
A feature that allows you to define rules for document structure and data types, rejecting writes that do not conform.
Secondary
A member of a replica set that replicates the data from the Primary. Secondaries can serve read operations (if configured) but cannot accept writes.
Shard
A single replica set that holds a subset of the data in a sharded cluster.
Shard Key
The field or fields used to partition a collection’s documents across shards. The choice of shard key is critical for performance and scalability.
Sharding
The process of storing data records across multiple machines. It is MongoDB’s approach to meeting the demands of data growth (Horizontal Scaling).
Stage
A single operation in an aggregation pipeline, such as $match, $group, or $project. Stages process documents and pass the results to the next stage.
V
Term
Definition
View
A read-only queryable object created from an aggregation pipeline on a source collection.
W
Term
Definition
WiredTiger
The default storage engine for MongoDB (since 3.2). It provides document-level concurrency, compression, and data integrity (via journaling).
Write Amplification
The phenomenon where a small logical update results in a much larger physical write to disk. In MongoDB, changing a single field can cause the whole document to be rewritten, and every index that covers a changed field must be updated as well.
Write Concern
The level of acknowledgement requested from MongoDB for write operations. For example, w: 1 means acknowledge after writing to the Primary; w: "majority" means acknowledge after writing to a majority of replica set members.
Z
Term
Definition
Zone Sharding
A feature that allows administrators to associate ranges of shard key values with specific shards (zones). This is often used for geographic data distribution (e.g., keeping EU data on EU servers).
Operators
Term
Definition
$bucket
Categorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries.
$facet
Processes multiple aggregation pipelines within a single stage on the same set of input documents.
$group
Groups input documents by a specified identifier expression and applies the accumulator expression(s), if specified, to each group.
$limit
Limits the number of documents passed to the next stage in the pipeline.
$lookup
Performs a left outer join to another collection in the same database, adding the matching documents from the “joined” collection to each input document as an array field. Before MongoDB 5.1, the joined collection had to be unsharded.
$match
Filters the documents to pass only the documents that match the specified condition(s) to the next pipeline stage.
$project
Passes along the documents with the requested fields to the next stage in the pipeline. The specified fields can be existing fields from the input documents or newly computed fields.
$sort
Reorders the document stream by a specified sort key.
$unwind
Deconstructs an array field from the input documents to output a document for each element. Each output document is the input document with the value of the array field replaced by the element.
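To show how stages compose, the sketch below imitates $match, $unwind, and $group (with a $sum accumulator) in plain Python. It only illustrates the semantics described above and is not how MongoDB executes a pipeline; the helper functions and the sample `orders` data are invented for the example.

```python
from collections import defaultdict


def match(docs, predicate):
    """Like $match: keep only documents that satisfy the condition."""
    return [d for d in docs if predicate(d)]


def unwind(docs, field):
    """Like $unwind: emit one copy of the document per array element."""
    out = []
    for d in docs:
        for element in d.get(field, []):
            out.append({**d, field: element})
    return out


def group_sum(docs, key, value):
    """Like $group with $sum: total `value` for each distinct `key`."""
    totals = defaultdict(int)
    for d in docs:
        totals[d[key]] += d[value]
    return [{"_id": k, "total": v} for k, v in totals.items()]


orders = [
    {"status": "paid", "amount": 10, "tags": ["books", "gifts"]},
    {"status": "paid", "amount": 5, "tags": ["books"]},
    {"status": "refunded", "amount": 7, "tags": ["gifts"]},
]

# Roughly equivalent pipeline:
# [{"$match": {"status": "paid"}},
#  {"$unwind": "$tags"},
#  {"$group": {"_id": "$tags", "total": {"$sum": "$amount"}}}]
paid = match(orders, lambda d: d["status"] == "paid")
per_tag = unwind(paid, "tags")
result = group_sum(per_tag, "tags", "amount")
print(result)
```

Placing the filter first mirrors common pipeline advice: the later stages then process fewer documents.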