PCA reduces dimensionality by projection. Intuitively, it drops a perpendicular from each high-dimensional data point onto a low-dimensional hyperplane and records the landing spot. Because the hyperplane is chosen to capture the most variance in the data, these orthogonal projections lose as little information as possible, which is why PCA compresses the data well.
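
As a quick sanity check of this picture, here is a minimal sketch on made-up 2D data (the toy array and variable names are just for illustration): project onto the single direction of maximum variance and confirm that every residual is perpendicular to that direction.

import numpy as np
from sklearn.decomposition import PCA

rng = np.random.RandomState(0)
toy = rng.randn(100, 2) * [3.0, 0.5]      # synthetic data, stretched along the x-axis

pca1 = PCA(n_components=1)
scores = pca1.fit_transform(toy)          # 1D coordinates along the top component
landing = pca1.inverse_transform(scores)  # the "landing spots" back in 2D

residuals = toy - landing
np.allclose(residuals @ pca1.components_[0], 0.0)  # True: the projection is orthogonal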

import numpy as np
from sklearn.decomposition import PCA
from sklearn import datasets
np.random.seed(5)

iris = datasets.load_iris()
X = iris.data
y = iris.target

X.shape, y.shape
((150, 4), (150,))
mu = np.mean(X, axis=0)
mu.shape
(4,)

Task 1: Reduce 4D to 2D

X has 4 features per example, i.e. the data lives in a 4-dimensional feature space.

pca = PCA(n_components=2)
P = pca.fit_transform(X)
P.shape
(150, 2)
pc2 = pca.components_
pc2
array([[ 0.36138659, -0.08452251,  0.85667061,  0.3582892 ],
       [ 0.65658877,  0.73016143, -0.17337266, -0.07548102]])

Task 2: Recover from 2D to 4D

pc2 holds the two highest-ranked principal components. Both vectors live in the original feature space, so each is 4-dimensional.

The matrix P above has shape (150, 2), i.e. 150 examples in the reduced 2D space. Each row of P is a 2D vector whose first coordinate corresponds to the first principal component and whose second coordinate corresponds to the second.
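
We can check this correspondence directly: PCA centers the data before projecting, so each column of P is the dot product of the centered data with the matching component.

np.allclose(P[:, 0], (X - mu) @ pc2[0])  # True: coordinates along the 1st component
np.allclose(P[:, 1], (X - mu) @ pc2[1])  # True: coordinates along the 2nd component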

The 2D plane we are projecting onto is spanned by the two principal components in pc2; they are the basis vectors of the plane. Let's call them u and v. This 2D plane lives inside the 4D space, so from the original feature space's point of view, what PCA does is project the original 4D data points onto this 2D plane.
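
Since PCA returns orthonormal components, u and v form an orthonormal basis for the plane, which is easy to verify:

u, v = pc2
np.dot(u, v)                          # ~0.0: u and v are orthogonal
np.linalg.norm(u), np.linalg.norm(v)  # (1.0, 1.0): both are unit vectors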

We can get the 4D coordinates of each projection by using the two vectors in pc2 as basis vectors: every row of P holds the 2D coordinates in the (u, v) basis, and the corresponding linear combination of u and v is the projection's location in 4D. This reconstruction is the point on the plane closest to the original data point, so it "recovers" the compressed data as faithfully as possible.
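
For a single example, say the first one, the recovery is literally a weighted sum of u and v (using u and v unpacked above); the matrix multiplication below does the same thing for all 150 rows at once.

proj0 = P[0, 0] * u + P[0, 1] * v  # 4D landing spot of the first (centered) example
proj0.shape                        # (4,)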

pc2.shape, P.shape
((2, 4), (150, 2))
X_recover = P.dot(pc2)  # still centered: the per-feature mean has not been added back yet
X_recover.shape
(150, 4)

Find the distance between X_recover and the original feature matrix X. Remember that fit_transform subtracted the per-feature mean before projecting, so X_recover is in centered coordinates and mu must be added back before comparing:

X[0]
array([5.1, 3.5, 1.4, 0.2])
X_recover[0] + mu
array([5.08303897, 3.51741393, 1.40321372, 0.21353169])
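
Eyeballing one row looks good; to quantify the error over the whole dataset we can also compute the distance directly:

diff = X - (X_recover + mu)
np.linalg.norm(diff)                # total reconstruction error (Frobenius norm)
np.linalg.norm(diff, axis=1).max()  # largest per-example distance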

In short, to recover the original data from the reduced dimensions: matrix-multiply P (the projected 2D coordinates in the basis u and v) by the basis vectors pc2 to get the 4D coordinates, then add back the mean mu.
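
As a cross-check, scikit-learn bundles both steps (multiplying by the components and adding back the mean) into PCA.inverse_transform:

np.allclose(pca.inverse_transform(P), X_recover + mu)  # True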

Great resource: the CrossValidated answer to the question "How to reverse PCA and reconstruct original variables from several principal components?"