Understanding Non-Square Matrix Multiplication

Hongtao Hao / 2022-12-12

In this tutorial, I want to explain how to compute products of non-square matrices; for example, the product of a $5 \times 2$ matrix and a $2 \times 5$ matrix ($c \times a$ in the following example).

import numpy as np
a = np.array([[1, 2, 7, 9, 10], [3, 4, 5, 12, 11]])
array([[ 1,  2,  7,  9, 10],
       [ 3,  4,  5, 12, 11]])

a is a $2 \times 5$ matrix. It represents a transformation from a 5d space to a 2d space. Each column of a holds the 2d coordinates where the corresponding axis (basis vector) of the original 5d space lands.
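
To make the "columns are where the axes land" idea concrete, here is a minimal sketch. The names `basis`, `images`, and `columns_match` are mine, introduced only for illustration. It sends each of the five axes through a and compares the result with the matching column.

```python
import numpy as np

a = np.array([[1, 2, 7, 9, 10], [3, 4, 5, 12, 11]])

# Standard basis of the 5d space: row i of the identity matrix is the i-th axis.
basis = np.eye(5, dtype=int)

# Send each axis through a; each result is a vector in the 2d space.
images = [a @ basis[i] for i in range(5)]

# Each image should equal the matching column of a.
columns_match = all(np.array_equal(images[i], a[:, i]) for i in range(5))
print(columns_match)   # True
```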

Suppose we have a vector in a 5d space, i.e., a $5 \times 1$ vector:

b = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)

Computing $a \times b$ maps b from the 5d space into the 2d space. The resulting $2 \times 1$ matrix is the transformed vector.

a @ b
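
One way to see where this result comes from is to weight each column of a by the matching entry of b and add them up. The sketch below does that by hand (`by_columns` is a name I made up for illustration) and compares it with `a @ b`.

```python
import numpy as np

a = np.array([[1, 2, 7, 9, 10], [3, 4, 5, 12, 11]])
b = np.array([1, 2, 3, 4, 5]).reshape(-1, 1)

# Weight each column of a by the matching entry of b, then add the columns up.
by_columns = sum(b[i, 0] * a[:, i] for i in range(5)).reshape(-1, 1)

print(by_columns.ravel())                  # [112 129]
print(np.array_equal(by_columns, a @ b))   # True
```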

Matrix multiplication

Suppose we have a $5 \times 2$ matrix:

c = np.array([[1,2], [3,4], [4,7], [9,10], [11,-1]])
array([[ 1,  2],
       [ 3,  4],
       [ 4,  7],
       [ 9, 10],
       [11, -1]])

c represents a transformation from a 2d space to a 5d space. A 2d space has two axes ($\vec{x}$ and $\vec{y}$). The first column of c gives the coordinates of $\vec{x}$ in the new 5d space, and the second column gives those of $\vec{y}$.

What does $c \times a$ mean? This matrix multiplication is the result of two space transformations applied in sequence: first a and then c (the matrix on the right acts first). That is to say, we first transform a 5d space to 2d and then transform that 2d space to 5d.
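
The shapes tell the same story. As a quick sketch, assuming the same a and c as above: the inner dimensions must agree, and the outer dimensions give the shape of the product. Swapping the order gives a different transformation with a different shape.

```python
import numpy as np

a = np.array([[1, 2, 7, 9, 10], [3, 4, 5, 12, 11]])         # 2 x 5
c = np.array([[1, 2], [3, 4], [4, 7], [9, 10], [11, -1]])   # 5 x 2

# (5 x 2) @ (2 x 5): 5d -> 2d -> 5d, so the product is 5 x 5.
print((c @ a).shape)   # (5, 5)

# (2 x 5) @ (5 x 2): 2d -> 5d -> 2d, so the product is 2 x 2.
print((a @ c).shape)   # (2, 2)
```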

To understand what $c \times a$ means, let’s first consider multiplying a $2 \times 1$ matrix, i.e., a vector in two-dimensional space, by c:

d = np.array([[1], [2]])

$\vec{d}$ means $1 \vec{x} + 2 \vec{y}$. Since, in the new 5d space, the coordinates of $\vec{x}$ are [1, 3, 4, 9, 11] and the coordinates of $\vec{y}$ are [2, 4, 7, 10, -1], the result of $c \times d$ is $1 \cdot [1, 3, 4, 9, 11] + 2 \cdot [2, 4, 7, 10, -1]$:

c @ d
array([[ 5],
       [11],
       [18],
       [29],
       [ 9]])

These are the coordinates of $\vec{d}$ in the new 5d space.
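
Here is a small sketch that spells out the $1 \vec{x} + 2 \vec{y}$ reading. The names `x_new`, `y_new`, and `combo` are just illustrative; they are the two columns of c and their weighted sum.

```python
import numpy as np

c = np.array([[1, 2], [3, 4], [4, 7], [9, 10], [11, -1]])
d = np.array([[1], [2]])

x_new = c[:, [0]]   # where x lands in 5d, kept as a 5 x 1 column
y_new = c[:, [1]]   # where y lands in 5d, kept as a 5 x 1 column

# d is 1 * x + 2 * y, so its image is the same combination of the columns of c.
combo = 1 * x_new + 2 * y_new

print(combo.ravel())                  # [ 5 11 18 29  9]
print(np.array_equal(combo, c @ d))   # True
```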

The calculation of $c \times a$ works the same way. You can view each column of a as a vector like $\vec{d}$ in the 2d space: column $j$ of $c \times a$ is c applied to column $j$ of a.

e = c @ a
array([[  7,  10,  17,  33,  32],
       [ 15,  22,  41,  75,  74],
       [ 25,  36,  63, 120, 117],
       [ 39,  58, 113, 201, 200],
       [  8,  18,  72,  87,  99]])
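
To check the column-by-column view, the following sketch builds the product one column at a time and compares it with `c @ a`. The name `stacked` is mine, used only for this illustration.

```python
import numpy as np

a = np.array([[1, 2, 7, 9, 10], [3, 4, 5, 12, 11]])
c = np.array([[1, 2], [3, 4], [4, 7], [9, 10], [11, -1]])
e = c @ a

# Treat each column of a as a 2d vector, send it through c,
# and place the resulting 5d vectors side by side as columns.
stacked = np.column_stack([c @ a[:, j] for j in range(a.shape[1])])

print(np.array_equal(stacked, e))   # True
```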

For any vector in a five-dimensional space, say f = [1, 5, -9, 9, 8], applying a and then c gives the same result as applying e directly.

f = np.array([[1], [5], [-9], [9], [8]])
g = a @ f
h = c @ g
array([[ 457],
       [1023],
       [1654],
       [2721],
       [1025]])
e @ f
array([[ 457],
       [1023],
       [1654],
       [2721],
       [1025]])
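
The same holds for other vectors, not just f. As a rough check (not a proof), the sketch below tries 100 random integer vectors at once. The random seed, the value range, and the names `vectors`, `two_steps`, and `one_step` are arbitrary choices of mine.

```python
import numpy as np

a = np.array([[1, 2, 7, 9, 10], [3, 4, 5, 12, 11]])
c = np.array([[1, 2], [3, 4], [4, 7], [9, 10], [11, -1]])
e = c @ a

rng = np.random.default_rng(0)
# Each column is one random 5d vector with integer entries in [-10, 10).
vectors = rng.integers(-10, 10, size=(5, 100))

two_steps = c @ (a @ vectors)   # first a (5d -> 2d), then c (2d -> 5d)
one_step = e @ vectors          # the combined transformation in one go

print(np.array_equal(two_steps, one_step))   # True
```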

Last modified on 2022-12-12