Linear Algebra - Week 3#

Dive into the fundamentals of linear algebra for machine learning and data science. This week you’ll learn about vectors, projections and linear transformations.

[1]:
import math
from functools import partial

import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import sympy as sp
from IPython.display import display, Math
from scipy.ndimage import rotate

plt.style.use("seaborn-v0_8-whitegrid")

Vectors#

Let’s consider these two vectors:

$\vec{a} = \begin{bmatrix} 1 \\ 3 \end{bmatrix}$ and $\vec{b} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$

[2]:
a = np.array([1, 3])
b = np.array([4, 1])

plt.quiver(
    [0, 0],
    [0, 0],
    [a[0], b[0]],
    [a[1], b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange"],
)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.5, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 - 0.7], color="tab:orange", fontsize=12)
plt.annotate("$\\theta$", [0.4, 0.4], fontsize=10)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.gca().set_aspect("equal")
plt.title("Vectors $\\vec{a}$ and $\\vec{b}$ and their angle $\\theta$")
plt.show()
../_images/linear_algebra_la_w3_5_0.png

The angle between vectors#

To calculate $\theta$ we can use the Law of Cosines

📐 $\|\vec{c}\|^2 = \|\vec{a}\|^2 + \|\vec{b}\|^2 - 2\|\vec{a}\|\|\vec{b}\|\cos\theta$

which relates the lengths of the sides of a triangle to the cosine of one of its angles.

We don’t have $\vec{c}$ though, but we can demonstrate that $\vec{c} = \vec{b} - \vec{a}$

[3]:
a = np.array([1, 3])
b = np.array([4, 1])
c = b - a

plt.quiver(
    [0, 0, 0, a[0]],
    [0, 0, 0, a[1]],
    [a[0], b[0], c[0], c[0]],
    [a[1], b[1], c[1], c[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange", "tab:pink", "tab:pink"],
    alpha=[1.0, 1.0, 1.0, 0.3],
)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.5, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 - 0.7], color="tab:orange", fontsize=12)
plt.annotate("$\\vec{c}$", [c[0] / 2, c[1] / 2 - 0.6], color="tab:pink", fontsize=12)
plt.annotate(
    "$\\vec{c}$ from tip of $\\vec{a}$",
    [b[0] / 2, a[1] - 0.5],
    color="tab:pink",
    fontsize=12,
)
plt.annotate("$\\theta$", [0.4, 0.4], fontsize=10)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title("Proof $\\vec{c} = \\vec{b} - \\vec{a}$")
plt.gca().set_aspect("equal")
plt.show()
../_images/linear_algebra_la_w3_9_0.png

🔑 Vectors are unique in that they maintain their direction and magnitude regardless of where they “start” or “end” in space. Vectors are typically drawn starting from the origin to clearly depict their direction and magnitude. However, the true essence of a vector is that it represents a direction and magnitude in space and can be shifted anywhere. When we compute $\vec{c} = \vec{b} - \vec{a}$ we’re calculating the vector that starts from the tip of $\vec{a}$ and goes to the tip of $\vec{b}$. We can draw it starting from the origin or starting from the tip of $\vec{a}$.
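
A minimal check (a sketch reusing the vectors defined above) that the displaced copy of $\vec{c}$ really ends at the tip of $\vec{b}$:

[ ]:
# Sketch: starting at the tip of a and walking along c = b - a lands exactly on the tip of b
a = np.array([1, 3])
b = np.array([4, 1])
c = b - a
assert np.allclose(a + c, b)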

Now that we’ve established $\vec{c} = \vec{b} - \vec{a}$, let’s isolate $\cos\theta$ from the cosine formula

$\|\vec{c}\|^2 = \|\vec{a}\|^2 + \|\vec{b}\|^2 - 2\|\vec{a}\|\|\vec{b}\|\cos\theta$.

$\|\vec{c}\|^2 = \vec{c} \cdot \vec{c}$

$\|\vec{c}\|^2 = (\vec{b} - \vec{a}) \cdot (\vec{b} - \vec{a})$

$\|\vec{c}\|^2 = \vec{b} \cdot \vec{b} + \vec{a} \cdot \vec{a} - 2\,\vec{a} \cdot \vec{b}$

$\|\vec{c}\|^2 = \|\vec{b}\|^2 + \|\vec{a}\|^2 - 2\,\vec{a} \cdot \vec{b}$

Let’s verify what we’ve derived so far.

[4]:
assert np.isclose(np.linalg.norm(c) ** 2, np.dot(c, c))
assert np.isclose(np.linalg.norm(c) ** 2, np.dot(b - a, b - a))
assert np.isclose(
    np.linalg.norm(c) ** 2, np.dot(b, b) + np.dot(a, a) - 2 * np.dot(a, b)
)
assert np.isclose(
    np.linalg.norm(c) ** 2,
    np.linalg.norm(b) ** 2 + np.linalg.norm(a) ** 2 - 2 * np.dot(a, b),
)

Let’s substitute it into the cosine formula.

$\|\vec{b}\|^2 + \|\vec{a}\|^2 - 2\,\vec{a} \cdot \vec{b} = \|\vec{a}\|^2 + \|\vec{b}\|^2 - 2\|\vec{a}\|\|\vec{b}\|\cos\theta$

$-2\,\vec{a} \cdot \vec{b} = -2\|\vec{a}\|\|\vec{b}\|\cos\theta$

$\frac{-2\,\vec{a} \cdot \vec{b}}{-2\|\vec{a}\|\|\vec{b}\|} = \cos\theta$

📐 $\frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|\|\vec{b}\|} = \cos\theta$

The numerator is the dot product of $\vec{a}$ and $\vec{b}$. The denominator is a normalization scalar.

We can actually rewrite it as

$\frac{\vec{a}}{\|\vec{a}\|} \cdot \frac{\vec{b}}{\|\vec{b}\|} = \cos\theta$

where $\frac{\vec{a}}{\|\vec{a}\|}$ and $\frac{\vec{b}}{\|\vec{b}\|}$ are the unit vectors of $\vec{a}$ and $\vec{b}$.

And we can verify that the two are indeed the same.

[5]:
assert np.isclose(
    np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)),
    np.dot(a / np.linalg.norm(a), b / np.linalg.norm(b)),
)

Once we have $\cos\theta$ we can calculate $\theta$ with the inverse cosine function.

[6]:
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
print(f"cos(theta): {cos_theta:.2f}")
print(f"theta (radians): {np.arccos(cos_theta):.2f}")
print(
    f"theta (degrees): {np.degrees(np.arccos(cos_theta)):.2f}\N{DEGREE SIGN}"
)  # or multiply radians by 180/math.pi

a = np.array([1, 3])
b = np.array([4, 1])

plt.quiver(
    [0, 0],
    [0, 0],
    [a[0], b[0]],
    [a[1], b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange"],
)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.5, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 - 0.7], color="tab:orange", fontsize=12)
plt.annotate(
    f"{np.degrees(np.arccos(cos_theta)):.1f}\N{DEGREE SIGN}", [0.4, 0.4], fontsize=10
)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.gca().set_aspect("equal")
plt.title("Value of $\\theta$")
plt.show()
cos(theta): 0.54
theta (radians): 1.00
theta (degrees): 57.53°
../_images/linear_algebra_la_w3_16_1.png

Vector projections#

Now, let’s say we want to project $\vec{b}$ onto $\vec{a}$.

🔑 The vector projection of $\vec{b}$ onto $\vec{a}$ (denoted $\mathrm{proj}_{\vec{a}}\vec{b}$) is a vector with the same direction as $\vec{a}$ whose tip is the foot of the perpendicular dropped from the tip of $\vec{b}$ onto the line through $\vec{a}$.

It’s like $\vec{b}$ casting its shadow onto $\vec{a}$.

[7]:
a = np.array([1, 3])
b = np.array([4, 1])
proj_b = (np.dot(a, b) / np.linalg.norm(a)) * (a / np.linalg.norm(a))
d = b - proj_b

img = plt.imread("../_static/flashlight.jpg")
angle = math.degrees(math.atan2(a[1], a[0])) - 90
imgbox = mpl.offsetbox.OffsetImage(
    rotate(img, angle, reshape=False, cval=255), zoom=0.05
)
imgabb = mpl.offsetbox.AnnotationBbox(imgbox, (5, 0.5), xycoords="data", frameon=False)
angle = math.degrees(math.atan2(a[1], a[0]))

shadow = plt.Polygon(
    [proj_b, b, [0, 0]],
    closed=True,
    fill=True,
    edgecolor="gray",
    facecolor="gray",
    alpha=0.2,
)

plt.quiver(
    [0, 0],
    [0, 0],
    [a[0], b[0]],
    [a[1], b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange"],
)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.gca().add_artist(imgabb)
plt.gca().add_patch(shadow)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.5, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 - 0.7], color="tab:orange", fontsize=12)
plt.annotate("$\\theta$", [0.4, 0.4], fontsize=10)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title("Projection as the 'shadow' cast by the vector")
plt.gca().set_aspect("equal")
plt.show()
../_images/linear_algebra_la_w3_19_0.png

The definition of $\cos\theta$ in a right triangle is $\text{adjacent}/\text{hypotenuse}$.

[Image: a right triangle with its sides labelled. Source: www.bbc.co.uk/bitesize]

The hypotenuse is the length of the vector we want to project ($\|\vec{b}\|$).

The adjacent is the length of the projection ($\|\mathrm{proj}_{\vec{a}}\vec{b}\|$).

So, by definition:

$\cos\theta = \frac{\|\mathrm{proj}_{\vec{a}}\vec{b}\|}{\|\vec{b}\|}$

and the length of the projection of $\vec{b}$ is:

$\|\mathrm{proj}_{\vec{a}}\vec{b}\| = \|\vec{b}\|\cos\theta$

[Image: the unit circle with $\cos\theta$ as the projection onto the horizontal axis. Source: www.ncetm.org.uk]

In the image above, we can see an interesting fact.

If the length of the vector we want to project is 1, then the length of the projection is $\cos\theta$.

$\|\mathrm{proj}_{\vec{a}}\vec{b}\| = \cos\theta$ when $\|\vec{b}\| = 1$
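
A quick sanity check (a sketch using the vectors from above and the projection formula already used in the plotting cell): projecting the unit vector $\frac{\vec{b}}{\|\vec{b}\|}$ onto $\vec{a}$ gives a projection whose length is exactly $\cos\theta$.

[ ]:
# Sketch: the projection of the unit vector b/||b|| onto a has length cos(theta)
a = np.array([1, 3])
b = np.array([4, 1])
b_hat = b / np.linalg.norm(b)  # unit vector: same direction as b, length 1
proj_b_hat = (np.dot(a, b_hat) / np.linalg.norm(a)) * (a / np.linalg.norm(a))
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(np.linalg.norm(proj_b_hat), cos_theta)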

It turns out we don’t need $\cos\theta$ to calculate the length of the projection.

We can substitute the definition of $\cos\theta$ into the definition of the length of the projection.

Definition of $\cos\theta$:

$\frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|\|\vec{b}\|} = \cos\theta$

Definition of the length of the projection:

$\|\mathrm{proj}_{\vec{a}}\vec{b}\| = \|\vec{b}\|\cos\theta$

So it becomes:

$\|\mathrm{proj}_{\vec{a}}\vec{b}\| = \|\vec{b}\| \, \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|\|\vec{b}\|}$

which simplifies to

$\|\mathrm{proj}_{\vec{a}}\vec{b}\| = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|}$
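
A quick check (a sketch with the vectors from above) that the simplified expression agrees with $\|\vec{b}\|\cos\theta$:

[ ]:
# Sketch: ||b|| * cos(theta) and (a . b) / ||a|| give the same projection length
a = np.array([1, 3])
b = np.array([4, 1])
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
assert np.isclose(np.linalg.norm(b) * cos_theta, np.dot(a, b) / np.linalg.norm(a))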

What about the direction?

By definition, the projection of $\vec{b}$ onto $\vec{a}$ must have the same direction as $\vec{a}$.

🔑 A unit vector has a direction $(a_1, a_2, \ldots, a_n) \in \mathbb{R}^n$ and a length of 1 ($\|\vec{a}\| = 1$).

Letting $\|\mathrm{proj}_{\vec{a}}\vec{b}\|$ be the length of the projection and $\frac{\vec{a}}{\|\vec{a}\|}$ the unit vector of $\vec{a}$, we get

$\mathrm{proj}_{\vec{a}}\vec{b} = \|\mathrm{proj}_{\vec{a}}\vec{b}\| \, \frac{\vec{a}}{\|\vec{a}\|}$

Finally, we can substitute the definition of $\|\mathrm{proj}_{\vec{a}}\vec{b}\|$ and obtain the formula for the projection of $\vec{b}$ onto $\vec{a}$:

📐 $\mathrm{proj}_{\vec{a}}\vec{b} = \frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|} \, \frac{\vec{a}}{\|\vec{a}\|}$

[8]:
a = np.array([1, 3])
b = np.array([4, 1])
proj_b = (np.dot(a, b) / np.linalg.norm(a)) * (a / np.linalg.norm(a))

plt.quiver(
    [0, 0, 0],
    [0, 0, 0],
    [a[0], b[0], proj_b[0]],
    [a[1], b[1], proj_b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange", "tab:green"],
    alpha=[0.5, 1.0, 1.0],
)
plt.plot([proj_b[0], b[0]], [proj_b[1], b[1]], "k--", alpha=0.5)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate(
    "$\\vec{a}$",
    [a[0] / 2 - 0.1, a[1] / 2 + 1],
    color="tab:blue",
    fontsize=12,
    alpha=0.5,
)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 - 0.7], color="tab:orange", fontsize=12)
plt.annotate(
    "$\\vec{proj_{a}b}$",
    [proj_b[0] / 2 - 1.1, proj_b[1] / 2],
    color="tab:green",
    fontsize=12,
)
plt.annotate("$\\theta$", [0.4, 0.4], fontsize=10)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title("Projection of $\\vec{b}$ onto $\\vec{a}$")
plt.gca().set_aspect("equal")
plt.show()
../_images/linear_algebra_la_w3_26_0.png

We can see that $\mathrm{proj}_{\vec{a}}\vec{b}$ (adjacent) and $\vec{b}$ (hypotenuse) form a right triangle.

[9]:
a = np.linalg.norm(proj_b)      # adjacent: length of the projection (shadows the vector a)
h = np.linalg.norm(b)           # hypotenuse: length of b
o = np.linalg.norm(proj_b - b)  # opposite: distance from the tip of b to the tip of the projection

cos_theta = a / h
sin_theta = o / h

From the Pythagorean theorem we have

$h^2 = o^2 + a^2$

Equivalently:

$1 = \left(\frac{o}{h}\right)^2 + \left(\frac{a}{h}\right)^2$

$1 = \sin^2\theta + \cos^2\theta$

Let’s verify it.

[10]:
assert np.isclose(h**2, o**2 + a**2)
assert np.isclose(1, (o / h) ** 2 + (a / h) ** 2)
assert np.isclose(1, cos_theta**2 + sin_theta**2)

We can also verify that the angles of the triangle sum up to 180°.

We already have one angle, and one is 90° by definition. We only need the one between $\vec{b}$ and the adjacent side $\mathrm{proj}_{\vec{a}}\vec{b} - \vec{b}$.

[11]:
a = np.array([1, 3])
b = np.array([4, 1])
proj_b = (np.dot(a, b) / np.linalg.norm(a)) * (a / np.linalg.norm(a))
c = proj_b - b

a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc_1 = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc_1)
b_deg = math.degrees(math.atan2(b[1], b[0]))
c_deg = math.degrees(math.atan2(c[1], c[0]))
arc_2 = mpl.patches.Arc((b[0], b[1]), 1, 1, angle=0, theta1=-180 - b_deg, theta2=-c_deg)
plt.gca().add_patch(arc_2)
arc_3 = plt.Rectangle(
    proj_b,
    -0.3,
    -0.3,
    angle=a_deg,
    fill=False,
    edgecolor="k",
)
plt.gca().add_patch(arc_3)

plt.quiver(
    [0, 0, 0],
    [0, 0, 0],
    [a[0], b[0], proj_b[0]],
    [a[1], b[1], proj_b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange", "tab:green"],
    alpha=[0.5, 1.0, 1.0],
)
plt.plot([proj_b[0], b[0]], [proj_b[1], b[1]], "k--", alpha=0.5)

plt.annotate(
    "$\\vec{a}$",
    [a[0] / 2 - 0.1, a[1] / 2 + 1],
    color="tab:blue",
    fontsize=12,
    alpha=0.5,
)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 - 0.7], color="tab:orange", fontsize=12)
plt.annotate(
    "$\\vec{proj_{a}b}$",
    [proj_b[0] / 2 - 1.1, proj_b[1] / 2],
    color="tab:green",
    fontsize=12,
)
plt.annotate(
    f"{np.degrees(np.arccos(cos_theta)):.1f}\N{DEGREE SIGN}", [0.4, 0.4], fontsize=10
)
plt.annotate("$\\theta_3$", [3.0, 0.95], fontsize=10)
plt.annotate("90\N{DEGREE SIGN}", [0.9, 1.4], fontsize=10)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.gca().set_aspect("equal")
plt.title("The sum of the 3 angles is 180")
plt.show()
../_images/linear_algebra_la_w3_33_0.png

Let’s find $\cos\theta_3$ and verify that the sum of the 3 angles is 180°.

[12]:
theta_1_deg = np.degrees(np.arccos(cos_theta))  # angle at the origin, between a and b

a = np.linalg.norm(proj_b - b)  # adjacent side of the angle at the tip of b
h = np.linalg.norm(b)           # hypotenuse
o = np.linalg.norm(proj_b)      # opposite side

cos_theta_2 = a / h
theta_2_deg = np.degrees(np.arccos(cos_theta_2))  # the angle labelled theta_3 in the plot

theta_3_deg = 90  # right angle at the foot of the projection

assert np.isclose(theta_1_deg + theta_2_deg + theta_3_deg, 180)

Let’s consider a different pair of vectors.

$\vec{a} = \begin{bmatrix} -2 \\ 3 \end{bmatrix}$ and $\vec{b} = \begin{bmatrix} 4 \\ 1 \end{bmatrix}$

[13]:
a = np.array([-2, 3])
b = np.array([4, 1])

plt.quiver(
    [0, 0],
    [0, 0],
    [a[0], b[0]],
    [a[1], b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange"],
)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.7, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 + 0.3], color="tab:orange", fontsize=12)
plt.annotate("$\\theta$", [0.1, 0.6], fontsize=10)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.gca().set_aspect("equal")
plt.title("Two vectors that form an obtuse angle")
plt.show()
../_images/linear_algebra_la_w3_37_0.png

We can use the ‘shadow’ metaphor to get an intuition of what the projection of $\vec{b}$ onto $\vec{a}$ might look like.

[14]:
a = np.array([-2, 3])
b = np.array([4, 1])
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))  # recompute for the new pair
proj_b = (np.dot(a, b) / np.linalg.norm(a)) * (a / np.linalg.norm(a))
d = b - proj_b

img = plt.imread("../_static/flashlight.jpg")
angle = math.degrees(math.atan2(a[1], a[0])) - 90
imgbox = mpl.offsetbox.OffsetImage(
    rotate(img, angle, reshape=False, cval=255), zoom=0.05
)
imgabb = mpl.offsetbox.AnnotationBbox(imgbox, (5, 1.5), xycoords="data", frameon=False)
angle = math.degrees(math.atan2(a[1], a[0]))

shadow = plt.Polygon(
    [proj_b, b, [0, 0]],
    closed=True,
    fill=True,
    edgecolor="gray",
    facecolor="gray",
    alpha=0.2,
)

plt.quiver(
    [0, 0],
    [0, 0],
    [a[0], b[0]],
    [a[1], b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange"],
)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.gca().add_artist(imgabb)
plt.gca().add_patch(shadow)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.7, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 + 0.3], color="tab:orange", fontsize=12)
plt.annotate(
    f"{np.degrees(np.arccos(cos_theta)):.1f}\N{DEGREE SIGN}", [0.1, 0.6], fontsize=10
)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title("Projection as the 'shadow' cast by the vector")
plt.gca().set_aspect("equal")
plt.show()
../_images/linear_algebra_la_w3_39_0.png

Let’s project $\vec{b}$ onto $\vec{a}$.

[15]:
a = np.array([-2, 3])
b = np.array([4, 1])
cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))  # recompute for the new pair
proj_b = (np.dot(a, b) / np.linalg.norm(a)) * (a / np.linalg.norm(a))

plt.quiver(
    [0, 0, 0],
    [0, 0, 0],
    [a[0], b[0], proj_b[0]],
    [a[1], b[1], proj_b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange", "tab:green"],
)
plt.plot([proj_b[0], b[0]], [proj_b[1], b[1]], "k--", alpha=0.5)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.7, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 + 0.3], color="tab:orange", fontsize=12)
plt.annotate(
    "$\\vec{proj_{a}b}$",
    [proj_b[0] / 2 - 1.2, proj_b[1] / 2 - 0.2],
    color="tab:green",
    fontsize=12,
)
plt.annotate(
    f"{np.degrees(np.arccos(cos_theta)):.1f}\N{DEGREE SIGN}", [0.1, 0.6], fontsize=10
)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title("Projection of $\\vec{b}$ onto $\\vec{a}$")
plt.gca().set_aspect("equal")
plt.show()
../_images/linear_algebra_la_w3_41_0.png

Geometric intuition of the dot product#

Let’s revisit the definition of $\cos\theta$ which we obtained from the Law of Cosines.

📐 $\frac{\vec{a} \cdot \vec{b}}{\|\vec{a}\|\|\vec{b}\|} = \cos\theta$

If we move $\|\vec{a}\|\|\vec{b}\|$ to the right-hand side we get

$\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|\cos\theta$

And when $\cos\theta > 0$ we can substitute $\|\vec{b}\|\cos\theta$ with $\|\mathrm{proj}_{\vec{a}}\vec{b}\|$ (an equivalence obtained from the general definition $\cos\theta = \text{adjacent}/\text{hypotenuse}$)

$\vec{a} \cdot \vec{b} = \|\vec{a}\| \, \|\mathrm{proj}_{\vec{a}}\vec{b}\|$

🔑 When $\vec{a}$ and $\vec{b}$ “agree” on the direction ($0° < \theta < 90°$, that is $\cos\theta > 0$) the dot product of $\vec{a}$ and $\vec{b}$ is the length of $\vec{a}$ times the length of the projection of $\vec{b}$ onto $\vec{a}$.

Let’s verify it.

[16]:
a = np.array([1, 3])
b = np.array([4, 1])

cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
proj_b = np.linalg.norm(b) * cos_theta * a / np.linalg.norm(a)

assert cos_theta > 0
assert np.isclose(np.dot(a, b), np.linalg.norm(a) * np.linalg.norm(proj_b))

Let’s imagine $\vec{b}$ were parallel to $\vec{a}$, that is, $\cos\theta = 1$ (a 0° angle).

Then $\vec{b} = \mathrm{proj}_{\vec{a}}\vec{b}$. In other words, $\vec{b}$ is already projected onto $\vec{a}$.

In this case

$\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|$
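
We can check this with an explicitly parallel vector (an assumed example, $\vec{b}_{\parallel} = 2\vec{a}$, not one of the vectors used above):

[ ]:
# Sketch: for b_par = 2a (parallel to a), cos(theta) = 1 and a . b_par = ||a|| * ||b_par||
a = np.array([1, 3])
b_par = 2 * a
cos_theta = np.dot(a, b_par) / (np.linalg.norm(a) * np.linalg.norm(b_par))
assert np.isclose(cos_theta, 1.0)
assert np.isclose(np.dot(a, b_par), np.linalg.norm(a) * np.linalg.norm(b_par))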

[17]:
a = np.array([1, 3])
b = np.array([4, 1])

cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
proj_b = np.linalg.norm(b) * cos_theta * a / np.linalg.norm(a)

plt.quiver(
    [0, 0],
    [0, 0],
    [a[0], proj_b[0]],
    [a[1], proj_b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:green"],
)
plt.annotate(
    "$\\vec{a}$",
    [a[0] / 2 - 0.1, a[1] / 2 + 1],
    color="tab:blue",
    fontsize=12,
)
plt.annotate(
    "$\\vec{proj_{a}b}$",
    [proj_b[0] / 2 - 1.1, proj_b[1] / 2],
    color="tab:green",
    fontsize=12,
)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title(r"$\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|$ when $\cos\theta = 1$")
plt.show()
../_images/linear_algebra_la_w3_46_0.png

Let’s see the case when the equivalence $\vec{a} \cdot \vec{b} = \|\vec{a}\| \, \|\mathrm{proj}_{\vec{a}}\vec{b}\|$ doesn’t hold, but $\vec{a} \cdot \vec{b} = \|\vec{a}\|\|\vec{b}\|\cos\theta$ does.

[18]:
a = np.array([-2, 3])
b = np.array([4, 1])

cos_theta = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
proj_b = np.linalg.norm(b) * cos_theta * a / np.linalg.norm(a)

plt.quiver(
    [0, 0, 0],
    [0, 0, 0],
    [a[0], b[0], proj_b[0]],
    [a[1], b[1], proj_b[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    color=["tab:blue", "tab:orange", "tab:green"],
)
plt.plot([proj_b[0], b[0]], [proj_b[1], b[1]], "k--", alpha=0.5)
a_deg = math.degrees(math.atan2(a[1], a[0]))
b_deg = math.degrees(math.atan2(b[1], b[0]))
arc = mpl.patches.Arc((0, 0), 1, 1, angle=0, theta1=b_deg, theta2=a_deg)
plt.gca().add_patch(arc)
plt.annotate("$\\vec{a}$", [a[0] / 2 - 0.7, a[1] / 2], color="tab:blue", fontsize=12)
plt.annotate("$\\vec{b}$", [b[0] / 2, b[1] / 2 + 0.3], color="tab:orange", fontsize=12)
plt.annotate(
    "$\\vec{proj_{a}b}$",
    [proj_b[0] / 2 - 1.2, proj_b[1] / 2 - 0.2],
    color="tab:green",
    fontsize=12,
)
plt.annotate(
    f"{np.degrees(np.arccos(cos_theta)):.1f}\N{DEGREE SIGN}", [0.1, 0.6], fontsize=10
)
plt.xticks(np.arange(-3, 7, 1))
plt.yticks(np.arange(-3, 6, 1))
plt.title("Projection of $\\vec{b}$ onto $\\vec{a}$")
plt.show()
../_images/linear_algebra_la_w3_48_0.png

Since the angle is more than 90°, $\cos\theta < 0$.

So $\vec{a} \cdot \vec{b}$ will be negative.

But $\|\vec{a}\| \, \|\mathrm{proj}_{\vec{a}}\vec{b}\|$ is always positive, because norms are never negative.

[19]:
print(f"Dot product: {np.dot(a, proj_b):.2f}")
print(
    f"Norm of a times norm of projection: {np.linalg.norm(a) * np.linalg.norm(proj_b):.2f}"
)
print(
    f"Norm of a times norm of b times cos theta: {np.linalg.norm(a) * np.linalg.norm(b) * cos_theta:.2f}"
)
Dot product: -5.00
Norm of a times norm of projection: 5.00
Norm of a times norm of b times cos theta: -5.00

Linear transformations#

Let’s define some transformation matrices.

Horizontal scaling by 2:

$A_1 = \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$

Horizontal reflection:

$A_2 = \begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$

Rotation by 90 degrees clockwise:

$A_3 = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$

Horizontal shear by 0.5:

$A_4 = \begin{bmatrix} 1 & 0.5 \\ 0 & 1 \end{bmatrix}$

[20]:
hscaling = np.array([[2, 0], [0, 1]])
reflection_yaxis = np.array([[-1, 0], [0, 1]])
rotation_90_clockwise = np.array([[0, 1], [-1, 0]])
shear_x = np.array([[1, 0.5], [0, 1]])

Let’s apply these transformations to the basis vectors.

$\vec{e_1} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\vec{e_2} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$

[21]:
e1 = np.array([1, 0])
e2 = np.array([0, 1])

A transformation is applied by multiplying $A_k$ by $\vec{e_i}$.

For $\vec{e_1}$ we have:

$\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \times 1 + 0 \times 0 \\ 0 \times 1 + 1 \times 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$

For $\vec{e_2}$ we have:

$\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 2 \times 0 + 0 \times 1 \\ 0 \times 0 + 1 \times 1 \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$

Let’s verify it.

[22]:
display(
    Math(
        "T(\\vec{e_1})="
        + sp.latex(sp.Matrix(list(hscaling @ e1)))
        + "T(\\vec{e_2})="
        + sp.latex(sp.Matrix(list(hscaling @ e2)))
    )
)
$T(\vec{e_1}) = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \quad T(\vec{e_2}) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}$

Let’s visualize it.

[23]:
def plot_transformation(T, title, ax, basis=None, lim=5):
    if basis is None:
        e1 = np.array([[1], [0]])
        e2 = np.array([[0], [1]])
    else:
        e1, e2 = basis
    zero = np.zeros(1, dtype="int")
    c = "tab:blue"
    c_t = "tab:orange"
    ax.set_xticks(np.arange(-lim, lim))
    ax.set_yticks(np.arange(-lim, lim))
    ax.set_xlim(-lim, lim)
    ax.set_ylim(-lim, lim)
    _plot_vectors(e1, e2, c, ax)
    ax.plot(
        [zero, e2[0], e1[0] + e2[0], e1[0]],
        [zero, e2[1], e1[1] + e2[1], e1[1]],
        color=c,
    )
    _make_labels(e1, "$e_1$", c, y_offset=(-0.2, 1.0), ax=ax)
    _make_labels(e2, "$e_2$", c, y_offset=(-0.2, 1.0), ax=ax)
    e1_t = T(e1)
    e2_t = T(e2)
    _plot_vectors(e1_t, e2_t, c_t, ax)
    ax.plot(
        [zero, e2_t[0], e1_t[0] + e2_t[0], e1_t[0]],
        [zero, e2_t[1], e1_t[1] + e2_t[1], e1_t[1]],
        color=c_t,
    )
    _make_labels(e1_t, "$T(e_1)$", c_t, y_offset=(0.0, 1.0), ax=ax)
    _make_labels(e2_t, "$T(e_2)$", c_t, y_offset=(0.0, 1.0), ax=ax)
    ax.set_aspect("equal")
    ax.set_title(title)


def _make_labels(e, text, color, y_offset, ax):
    e_sgn = 0.4 * np.array([[1] if i == 0 else i for i in np.sign(e)])
    return ax.text(
        e[0] - 0.2 + e_sgn[0],
        e[1] + y_offset[0] + y_offset[1] * e_sgn[1],
        text,
        fontsize=12,
        color=color,
    )


def _plot_vectors(e1, e2, color, ax):
    ax.quiver(
        [0, 0],
        [0, 0],
        [e1[0], e2[0]],
        [e1[1], e2[1]],
        color=color,
        angles="xy",
        scale_units="xy",
        scale=1,
    )


def T(A, v):
    w = A @ v
    return w


fig, axs = plt.subplots(nrows=2, ncols=3, figsize=(3 * 4, 2 * 4))
ax1, ax2, ax3, ax4, ax5, ax6 = axs.flatten()
plot_transformation(partial(T, hscaling), title="Horizontal scaling by 2", ax=ax1)
plot_transformation(partial(T, reflection_yaxis), title="Horizontal reflection", ax=ax2)
plot_transformation(
    partial(T, rotation_90_clockwise), title="Rotation by 90 degrees clockwise", ax=ax3
)
plot_transformation(partial(T, shear_x), title="Horizontal shear by 0.5", ax=ax4)
plot_transformation(
    partial(T, rotation_90_clockwise @ shear_x), title="Rotation and shear", ax=ax5
)
plot_transformation(
    partial(T, shear_x @ rotation_90_clockwise), title="Shear and rotation", ax=ax6
)
plt.tight_layout()
plt.show()
../_images/linear_algebra_la_w3_60_0.png
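
The last two panels use the same two matrices composed in different orders and produce different results: matrix composition is not commutative. A minimal check (a sketch, not part of the original cells, reusing the matrices defined above) confirms the two products differ.

[ ]:
# Sketch: composing the rotation and the shear in different orders gives different matrices,
# hence different linear transformations
assert not np.allclose(rotation_90_clockwise @ shear_x, shear_x @ rotation_90_clockwise)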

Linear transformations and rank#

Since linear transformations are represented by matrices, they can be singular or non-singular and they also have a rank.

[24]:
non_sing_tr = np.array([[3, 1], [1, 2]])
sing_tr = np.array([[1, 1], [2, 2]])

fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(2 * 4, 1 * 4))
plot_transformation(
    partial(T, non_sing_tr), title="Non-singular transformation", ax=ax1
)
plot_transformation(partial(T, sing_tr), title="Singular transformation", ax=ax2)
plt.tight_layout()
plt.show()
../_images/linear_algebra_la_w3_63_0.png

We can also verify that the first linear transformation has rank 2, while the second one has rank 1.

So the first linear transformation doesn’t reduce the amount of information of the original matrix, while the second one does: it reduces the rank from 2 to 1, that is, it transforms a matrix with 2 linearly independent rows into one with only 1 linearly independent row.

🔑 The singularity of a linear transformation determines whether there is dimensionality reduction

🔑 The rank of a linear transformation quantifies the dimensionality reduction
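
Besides sympy’s rref below, numpy can report the same ranks directly; a quick cross-check (a sketch, not part of the original cells, using the matrices defined above):

[ ]:
# Sketch: cross-check the ranks with numpy's matrix_rank
assert np.linalg.matrix_rank(non_sing_tr) == 2  # full rank: no dimensionality reduction
assert np.linalg.matrix_rank(sing_tr) == 1      # rank 1: the plane collapses onto a line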

[25]:
m, p = sp.Matrix(non_sing_tr).rref()
print("Number of pivots (rank):", len(p))
m
Number of pivots (rank): 2
[25]:
$\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
[26]:
m, p = sp.Matrix(sing_tr).rref()
print("Number of pivots (rank):", len(p))
m
Number of pivots (rank): 1
[26]:
$\begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$

Linear transformations and determinant#

A linear transformation also has a determinant.

🔑 The determinant of a linear transformation is the area (or volume) of the parallelogram (or parallelepiped) spanned by the transformed basis vectors

Let’s consider this non-singular transformation

$\begin{bmatrix} 3 & 1 \\ 1 & 2 \end{bmatrix}$

whose determinant is 5.

If we apply it to the basis vectors (which span a unit square of area 1) we get a parallelogram with area 5.
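
Before the geometric verification below, a quick numeric check (a sketch, not in the original) that the area of the parallelogram spanned by the transformed basis vectors, $|x_1 y_2 - x_2 y_1|$, equals the determinant:

[ ]:
# Sketch: area of the parallelogram spanned by T(e1) and T(e2) via the 2D cross product
t_e1 = non_sing_tr @ np.array([1, 0])
t_e2 = non_sing_tr @ np.array([0, 1])
area = abs(t_e1[0] * t_e2[1] - t_e2[0] * t_e1[1])
assert np.isclose(area, np.linalg.det(non_sing_tr))  # both are 5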

[27]:
fig, ax = plt.subplots()
plot_transformation(partial(T, non_sing_tr), title="Non-singular transformation", ax=ax)
t_e1 = partial(T, non_sing_tr)(e1)
t_e2 = partial(T, non_sing_tr)(e2)
b_area = plt.Rectangle(
    [0, 0],
    1,
    1,
    fill=True,
    facecolor="tab:blue",
    alpha=0.2,
)
t_area = plt.Polygon(
    [[0, 0], t_e1, t_e1 + t_e2, t_e2],
    closed=True,
    fill=True,
    facecolor="tab:orange",
    alpha=0.2,
)
plt.gca().add_patch(b_area)
plt.gca().add_patch(t_area)
plt.title("Determinant as the area of the parallelogram")
plt.show()
../_images/linear_algebra_la_w3_70_0.png

To verify it, we can use the formula for the area of a triangle, $A_t = \frac{bh}{2}$. For a parallelogram it’s just $A_p = bh$.

To calculate $A_p = bh$ we only need $h$, because the base is $b = \|T(\vec{e_1})\|$.

To find $h$ we can project $T(\vec{e_2})$ onto $T(\vec{e_1})$ and subtract the projection from $T(\vec{e_2})$.

[28]:
t_e1 = partial(T, non_sing_tr)(e1)
t_e2 = partial(T, non_sing_tr)(e2)
proj_t_e2 = (np.dot(t_e1, t_e2) / np.linalg.norm(t_e1)) * (t_e1 / np.linalg.norm(t_e1))
h = t_e2 - proj_t_e2

plt.quiver(
    [0, 0, 0, proj_t_e2[0], t_e1[0]],
    [0, 0, 0, proj_t_e2[1], t_e1[1]],
    [t_e1[0], t_e2[0], proj_t_e2[0], h[0], h[0]],
    [t_e1[1], t_e2[1], proj_t_e2[1], h[1], h[1]],
    angles="xy",
    scale_units="xy",
    scale=1,
    fc=["tab:orange", "tab:orange", "tab:pink", "none", "none"],
    ec=["none", "none", "none", "tab:green", "tab:green"],
    ls=["solid", "solid", "solid", "dashed", "dashed"],
    linewidth=1,
)
t_area = plt.Polygon(
    [[0, 0], t_e1, t_e1 + t_e2, t_e2],
    closed=True,
    fill=True,
    facecolor="tab:orange",
    alpha=0.2,
)
plt.gca().add_patch(t_area)
plt.plot(
    [0, t_e2[0], t_e1[0] + t_e2[0], t_e1[0]],
    [0, t_e2[1], t_e2[1] + t_e1[1], t_e1[1]],
    color="tab:orange",
)
plt.annotate("$T(e_1)$", [t_e1[0], t_e1[1] - 0.4], color="tab:orange", fontsize=12)
plt.annotate("$T(e_2)$", [t_e2[0], t_e2[1] + 0.4], color="tab:orange", fontsize=12)
plt.annotate(
    "$proj_{T(e_1)}T(e_2)$",
    [proj_t_e2[0] - 1.0, proj_t_e2[1] - 1.0],
    color="tab:pink",
    fontsize=12,
)
plt.annotate(
    "$h$",
    [t_e2[0] + 0.5, t_e2[1] - 0.8],
    color="tab:green",
    fontsize=12,
)
plt.xticks(np.arange(-5, 5))
plt.yticks(np.arange(-5, 5))
plt.xlim(-5, 5)
plt.ylim(-5, 5)
plt.gca().set_aspect("equal")
plt.title("The height of the triangles/parallelogram")
plt.show()
../_images/linear_algebra_la_w3_72_0.png

Now that we have $h$, let’s calculate $A_p$ and verify that it’s the same as the determinant of the linear transformation.

[29]:
assert np.isclose(np.linalg.norm(t_e1) * np.linalg.norm(h), np.linalg.det(non_sing_tr))