Tensors (Rank-3+) & Operations
1. Introduction: Beyond Matrices
So far, we’ve dealt with:
- Scalars: Single numbers (Rank 0).
- Vectors: Lists of numbers (Rank 1).
- Matrices: Grids of numbers (Rank 2).
But real-world data is often more complex.
- Color Images: A standard image is Height × Width × 3 Color Channels (Red, Green, Blue). This is a Rank-3 Tensor.
- Video Batches: A batch of videos is (Batch Size × Time × Height × Width × Channels). This is a Rank-5 Tensor.
In Deep Learning frameworks like PyTorch and TensorFlow, everything is a Tensor.
2. Tensor Ranks & Shapes
The Rank (or Order) of a tensor is the number of indices required to access a specific element.
| Rank | Name | Example Shape | Analogy |
|---|---|---|---|
| 0 | Scalar | () |
A single point. |
| 1 | Vector | (3) |
A line of points. |
| 2 | Matrix | (3, 3) |
A sheet of paper (grid). |
| 3 | Tensor | (3, 3, 3) |
A book (stack of sheets). |
| 4 | Tensor | (10, 3, 3, 3) |
A shelf of books. |
| 5 | Tensor | (5, 10, 3, 3, 3) |
A library of shelves. |
3. Interactive Visualizer: The Tensor Operator
Explore two key concepts: Slicing (cutting a tensor) and Broadcasting (stretching a tensor).
4. Key Operations
A. Broadcasting: The Elastic Ruler
What if you add a Vector (4) to a Matrix (4, 4)?
The vector acts like an Elastic Ruler. It “stretches” (copies) itself across the missing dimension to match the matrix shape.
The Rules of Broadcasting:
- Align Shapes from Right: Start with the last dimension.
- Match or Stretch:
- If dimensions are equal, great.
- If one dimension is 1 (or missing), it is stretched to match the other.
- If dimensions mismatch (e.g., 3 vs 4) and neither is 1, Error.
Matrix A: (4, 4)
Vector B: ( , 4) <-- Stretches vertically to (4, 4)
Result: (4, 4)
B. Slicing
Extracting a lower-rank sub-tensor.
T[0, :, :]gives you the first matrix (Rank 2).T[0, 0, :]gives you the first row of the first matrix (Rank 1).
5. Summary
- Tensor: A multidimensional array of numbers.
- Rank: The number of axes (dimensions).
- Broadcasting: Implicitly copying data to match shapes, saving memory and code lines.