My Understanding of the np.transpose() Function

As a machine and deep learning engineer in China, I often run into coding scenarios that require matrix transposition. Although I have long used NumPy's transpose() function for this, I had never taken the time to understand this common function in depth.

The concept of transposition

The idea comes from 2D matrix operations, where transposition flips a matrix so that its rows become columns and its columns become rows.

Let's illustrate this with some code examples.

import numpy as np

# create a 2x2 matrix
a = np.array([[1, 2], [3, 4]])
# a: 
# array([[1, 2],
#        [3, 4]])

# the transposed version of a
a_t = np.transpose(a)
# a_t: 
# array([[1, 3],
#        [2, 4]])

Here we created a 2D square matrix a. Its first row is [1, 2] and its second row is [3, 4]. As in math, we can also say that the first column of a is [1, 3] and the second column is [2, 4].
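To check the rows and columns directly, here is a small sketch using plain NumPy indexing. The variable names first_row, first_col and second_col are only for illustration.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# a[i] selects row i; a[:, j] selects column j
first_row = a[0]      # array([1, 2])
first_col = a[:, 0]   # array([1, 3])
second_col = a[:, 1]  # array([2, 4])
```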

Using the numpy.transpose() function, we created a_t, the transposed version of a. Because a is a 2D square matrix, a_t has the same shape as a, but with the rows and columns swapped. We can see that the first row of a_t is [1, 3], which is the first column of a, and the second row of a_t is [2, 4], which is the second column of a. Simple, right?
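The square case hides one detail: for a non-square matrix, the shape changes as well. The sketch below uses a hypothetical 2x3 matrix b, which is not part of the example above, and also shows the .T attribute, which gives the same result as the default transpose.

```python
import numpy as np

# b is a hypothetical 2x3 matrix used only for illustration
b = np.array([[1, 2, 3],
              [4, 5, 6]])

b_t = np.transpose(b)

print(b.shape)    # (2, 3)
print(b_t.shape)  # (3, 2)

# each row of b_t is a column of b
print(b_t[0])     # [1 4]

# the .T attribute gives the same result as np.transpose(b)
print(np.array_equal(b_t, b.T))  # True
```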

Notable usage of np.transpose()

Here comes the first advanced scenario. NumPy's transpose() function also supports transposing multi-dimensional arrays.

Let's illustrate this with a code example.

import numpy as np

# create a 2x2x2 3D array
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
# a: 
# array([[[1, 2],
#         [3, 4]],
#        [[5, 6],
#         [7, 8]]])

# the transposed version
a_t = np.transpose(a)
# a_t: 
# array([[[1, 5],
#         [3, 7]],
#        [[2, 6],
#         [4, 8]]])

Here we first create a 2x2x2 3D array a that holds two 2x2 sub-matrices. The first sub-matrix is [[1, 2], [3, 4]], which we name a_s1, and the second is [[5, 6], [7, 8]], which we name a_s2.

As before, we use the transpose() function to create the transposition of a, a_t. From the result, we can see that a_t still has two sub-matrices: the first is [[1, 5], [3, 7]], which we call at_s1, and the second is [[2, 6], [4, 8]], which we call at_s2. What happened?

Let's dive into the details. Take the first sub-matrix at_s1. In its first row [1, 5], the first element 1 is a_s1[0][0], the element in the first row and first column of a_s1, and the second element 5 is a_s2[0][0]. The same logic applies to its second row [3, 7]: the first element 3 is a_s1[1][0], the first element of the second row of a_s1, and the second element 7 is a_s2[1][0]. Do you see the pattern?

Okay, let's continue. For the second sub-matrix at_s2, the first row is [2, 6]: the first element 2 is a_s1[0][1], the second element of the first row of a_s1, and the second element 6 is a_s2[0][1], the second element of the first row of a_s2. Likewise, its second row [4, 8] comes from a_s1[1][1] and a_s2[1][1].
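The element-by-element walk-through can be checked in code. The sketch below reuses the names a_s1 and a_s2 from the text. The general rule it tests, a_t[i, j, k] == a[k, j, i], is my own summary of the pattern rather than something stated earlier.

```python
import numpy as np
from itertools import product

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
a_t = np.transpose(a)

# the two sub-matrices of a, named as in the text
a_s1, a_s2 = a[0], a[1]

# first sub-matrix of a_t (at_s1)
assert a_t[0, 0].tolist() == [int(a_s1[0, 0]), int(a_s2[0, 0])]  # [1, 5]
assert a_t[0, 1].tolist() == [int(a_s1[1, 0]), int(a_s2[1, 0])]  # [3, 7]

# second sub-matrix of a_t (at_s2)
assert a_t[1, 0].tolist() == [int(a_s1[0, 1]), int(a_s2[0, 1])]  # [2, 6]
assert a_t[1, 1].tolist() == [int(a_s1[1, 1]), int(a_s2[1, 1])]  # [4, 8]

# general rule: the index order is reversed
for i, j, k in product(range(2), repeat=3):
    assert a_t[i, j, k] == a[k, j, i]
```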

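Let's put this in simple words. By default, numpy.transpose() rearranges the axes of an array in reversed order. For a 2D matrix, that just swaps the row and column axes. For the 3D array above, the first and last axes trade places while the middle axis stays put, which is why each row of a_t pairs an element of a_s1 with the element at the same position in a_s2.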
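Reversed order is only the default. np.transpose() also accepts an optional axes argument that gives the new axis order explicitly. The sketch below uses a hypothetical array c with three different axis lengths (my own choice, so that the shapes make the reordering visible) to show that axes=(2, 1, 0) matches the default and that other orders are possible too.

```python
import numpy as np

# c is a hypothetical 2x3x4 array used only for illustration
c = np.arange(24).reshape(2, 3, 4)

# default: axes are reversed, so shape (2, 3, 4) becomes (4, 3, 2)
c_default = np.transpose(c)
print(c_default.shape)  # (4, 3, 2)

# spelling out the reversed order gives the same result
c_reversed = np.transpose(c, axes=(2, 1, 0))
print(np.array_equal(c_default, c_reversed))  # True

# swap only the first two axes and keep the last one in place
c_swapped = np.transpose(c, axes=(1, 0, 2))
print(c_swapped.shape)  # (3, 2, 4)
```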