Matrix Multiplication: The Engine of Neural Nets

1. Introduction

Profile any Deep Learning library (PyTorch, TensorFlow) and you will find that the vast majority of compute time — often quoted around 90% — is spent on one operation: Matrix Multiplication (GEMM, for General Matrix Multiply).

Why? Because a Neural Network layer is just a matrix multiplication followed by an activation function:

output = σ(W · x + b)
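In code, a dense layer is exactly this formula. A minimal sketch with NumPy — the weight values, shapes, and the choice of ReLU as the activation σ are illustrative assumptions:

```python
import numpy as np

def relu(z):
    # Activation σ: here ReLU, i.e. max(0, z) elementwise
    return np.maximum(0, z)

# A layer mapping 3 inputs to 2 outputs (values are made up)
W = np.array([[1.0, -2.0, 0.5],
              [0.0,  1.0, 1.0]])   # weights, shape (2, 3)
b = np.array([0.1, -0.5])          # bias, shape (2,)
x = np.array([1.0, 0.5, 2.0])      # input vector, shape (3,)

output = relu(W @ x + b)           # σ(W·x + b)
print(output)                      # [1.1 2. ]
```

The `@` operator is Python's matrix-multiplication operator; under the hood NumPy dispatches it to an optimized GEMM routine.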

2. The Dot Product

The fundamental building block is the Dot Product (or Scalar Product) of two vectors. It returns a single number.

a · b = ∑ᵢ aᵢbᵢ = a₁b₁ + a₂b₂ + … + aₙbₙ

Geometric Interpretation

a · b = ||a|| ||b|| cos(θ)

  • If vectors point in the same direction, dot product is Positive (High Similarity).
  • If vectors are perpendicular (90°), dot product is Zero (Orthogonal/Unrelated).
  • If vectors point in opposite directions, dot product is Negative.

[!TIP] ML Application: In Recommendation Systems, if User Vector u and Movie Vector m have a high dot product, the user will likely enjoy the movie. Cosine Similarity is just the dot product of the normalized vectors — cos(θ) from the formula above.
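Both the raw dot product and cosine similarity are one-liners in NumPy. The user/movie vectors below are made-up toy data:

```python
import numpy as np

u = np.array([1.0, 2.0, 0.0])   # hypothetical user preference vector
m = np.array([2.0, 4.0, 0.0])   # hypothetical movie feature vector

dot = np.dot(u, m)              # ∑ uᵢmᵢ = 1·2 + 2·4 + 0·0 = 10
cos = dot / (np.linalg.norm(u) * np.linalg.norm(m))  # cos(θ)

print(dot)   # 10.0
print(cos)   # 1.0 — m is a scaled copy of u: same direction, maximal similarity
```

Note that cosine similarity ignores magnitude: scaling either vector changes `dot` but leaves `cos` untouched.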


3. Matrix-Vector Multiplication (Ax)

When we multiply a matrix A by a vector x, we are transforming the vector x. Ax = b

The matrix A acts as a function f(x). It can rotate, scale, or skew the vector space.

[1 0]   [1]   [1]
[0 2] × [1] = [2]

(This matrix stretches the y-component by a factor of 2.)
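The same stretch, verified with NumPy:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0]])    # scales the y-component by 2
x = np.array([1.0, 1.0])

b = A @ x                     # the matrix-vector product Ax
print(b)                      # [1. 2.]
```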


4. Matrix-Matrix Multiplication (AB)

Multiplying two matrices is just applying two transformations in sequence (Composition). C = AB

Calculating Cij involves taking the dot product of Row i of A and Column j of B.

Rule: Inner dimensions must match! (m × n) · (n × p) → (m × p)

[!WARNING] Order Matters! Unlike scalar multiplication (2 × 3 = 3 × 2), Matrix Multiplication is not commutative: AB ≠ BA. Applying a Rotation then a Shear is different from a Shear then a Rotation.
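A quick numerical check of non-commutativity, using a rotation and a shear as in the warning above (the specific matrices are illustrative; any 2×2 pair works, so the inner dimensions always match):

```python
import numpy as np

R = np.array([[0.0, -1.0],
              [1.0,  0.0]])   # 90° counter-clockwise rotation
S = np.array([[1.0, 1.0],
              [0.0, 1.0]])    # horizontal shear

# Each entry C[i, j] is the dot product of row i of the first
# factor with column j of the second.
RS = R @ S   # applied to a vector x: shear first, then rotate
SR = S @ R   # applied to a vector x: rotate first, then shear

print(np.array_equal(RS, SR))  # False — AB ≠ BA
```

Note the reading order: `(R @ S) @ x` applies S to x first, because composition reads right to left.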


5. Interactive Visualizer: The Linear Transformer

Modify the 2x2 Matrix M to see how it transforms the grid space. The basis vectors i (Red) and j (Green) show where the x and y axes land.

The columns of M tell us where i and j land.
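That claim — the columns of M are the landing spots of the basis vectors — is easy to verify numerically. The matrix below is an arbitrary example:

```python
import numpy as np

M = np.array([[2.0, -1.0],
              [1.0,  3.0]])   # an arbitrary 2×2 transformation

i_hat = np.array([1.0, 0.0])  # basis vector along x
j_hat = np.array([0.0, 1.0])  # basis vector along y

# M @ i_hat picks out the first column of M; M @ j_hat the second.
print(M @ i_hat)  # [2. 1.] — where i lands
print(M @ j_hat)  # [-1. 3.] — where j lands
```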

6. Summary

  • Dot Product: Measures similarity between vectors.
  • Matrix-Vector: Transforms a vector (Scale, Rotate, Skew).
  • Matrix-Matrix: Combines multiple transformations.

Next: Systems of Equations →