Michael Lugo: Fun with the Separating Axis Theorem

Overview

Like most things mathematical, there is a nice Wikipedia article describing the Hyperplane Separation Theorem. An immediate consequence of the Hyperplane Separation Theorem is the Separating Axis Theorem (SAT): two closed convex sets are disjoint if and only if there exists a hyperplane between the two. This theorem is extremely useful in game programming to detect object collisions. Although this theorem works for any dimension, I will focus on the easiest to draw: the second dimension.

The Second Dimension

First, lets consider how to interpret the components of the SAT in my favorite space, $\mathbb{R}^2$. If you are familiar with hyperplanes in the second dimension, then you can safely skip this section. In any dimension, a hyperplane is the set of all vectors $\vec{u}$ that satisfy the equation $\vec{v} \cdot \vec{u} = c$ for a fixed vector $\vec{v}$ and constant $c$. When we specialize to $\mathbb{R}^2$, then we can expand this equation in terms of components. Let $\vec{v} = \langle v_1, v_2 \rangle$ and $\vec{u} = \langle x,y \rangle$.

$$ \vec{v} \cdot \vec{u} = c \Rightarrow v_1x + v_2y = c $$

This form should look familiar, since it is the standard form for a line in $\mathbb{R}^2$! So hyperlanes in $\mathbb{R}^2$ are straight lines. As I continuously remind my students, the only information you need to define a line is a point and a slope. If we simply reinterpret "slope" as "direction", then we recover that any hyperplane can be defined by a point and a direction. Only now, our direction is normal to the line we want. Note that this avoids the slope issue of a vertical line. Use the canvas below to drag the base point of the line and rotate the normal direction.

Convex Polygons

Now that we have the first ingredient for the SAT (a 2D definition of hyperplane), we focus on the hypothesis of a closed convex set. A set is simply a collection of objects, which is not very enlightening. In this setting, we specifically want a set of points in $\mathbb{R}^2$, which is a little more enlightening since this has a geometric interpretation as some points on the plane. A rough definition of a closed set is a set of points that contains its boundary. So if you were to draw it, there would be no dashed lines. A convex set of points satisfies the following condition: if $x$ and $y$ are points of your set, then any point $c$ lying on the line segment between $x$ and $y$ is also in the set. For example, a circle is closed, but it is not convex. A disk without its boundary circle is convex, but it is not closed. A disk with its boundary is closed and convex.

The class of closed convex sets in $\mathbb{R}^2$ is quite large, and many such sets are computationally expensive to verify that they are indeed convex and closed. That is why here we focus on a familiar and easy to use class of sets: polygons. If you have a filled polygon (you include all the points within the polygon) then the set of those points is automatically closed. The only thing we would need to check quickly and efficiently is if the set is also convex. For our most famous regular polygons, this is automatically true. However, this is not true for all polygons. To quickly test for convexity, a programmer may assume the polygon's information is presented in a certain way. In my canvas below, I have assumed that the points of my polygon are given clockwise, with the edges formed sequentially along pairs of vertices. With information like this, a simple cross product (imagining the vectors nested in $\mathbb{R}^3$) reveals the orientation between adjacent edges, and a consistent orientation signals the presence of a convex polygon.

Use the canvas below to test this with a polygon containing . Click and drag points to reposition them. If you break the convexity of the polygon (with non-intersecting lines), the background will turn red.

Projections onto Normals

We now have all the pieces necessary to discuss the SAT in the case of two convex polygons: two convex polygons are disjoint if and only if there exists some line that can be drawn between them. The question now becomes, "How do we find such a line?" or "Where can this line be?". Once again, our choice to use polygons aids us. If you stare at two disjoint polygons, you will find that you can always draw a line betwen them, and this line can be chosen to be parallel to an edge. It's a fact that two disjoint convex polygons will have a separating line whose normal direction is a normal direction to an edge of one of the polygons. Thus our question simplifies to, "How can we tell when there is a gap between two convex polygons along in a fixed direction?". The answer is to use projections.

When we fix a direction vector $\vec{v}$, then we can project any other vector $\vec{p}$ onto $\vec{v}$ using the equation $$ \operatorname{proj}\,_{\vec{v}}(\vec{p}) = \frac{\vec{v}\cdot\vec{p}}{\vec{v}\cdot \vec{v}}\,\vec{v} $$ If we really only cared about the direction of our vector $\vec{v}$, then we might as well use a unit vector for $\vec{v}$, which simplifies the above to $\operatorname{proj}\,_{\vec{v}}(\vec{p}) = (\vec{v} \cdot \vec{p})\vec{v}$. Now, if we think of $\vec{v}$ as defining the direction of an axis, then all the information we need is the magnitude of the projection, which is called the component of projection. Still using a unit vector $\vec{v}$, this simplifies to $$ \operatorname{comp}\,_{\vec{v}}(\vec{p}) = |\operatorname{proj}\,_{\vec{v}}(\vec{p})| = \vec{v} \cdot \vec{p} $$

So now we can quickly calculate the component of projection in any fixed direction. If we do that for every point of a polygon, we get a list of numbers. Now, suppose I told you to project the entire polygon onto the axis in the direction of unit vector $\vec{v}$? We would geometrically think of casting a shadow of the polygon onto the axis, which would result in an interval along this axis. To find this interval, we simply need to find the maximum and minimum components of projection for each point of the polygon, which is again an incredibly quick operation! Then to determine if there is a gap between the convex polygons in this fixed direction, we simply check for a gap in the projected intervals.

Use the canvas below to rotate the axis. The projection of these shapes onto the axis are shown in red and green for square and triangle respectively. Note that by rotating our axis, we can make the intervals intersect. However, this (clearly) doesn't imply the shapes intersect. To show intersection, we would need to show that for EVERY axis normal to an edge, the projection intervals intersect.

Putting It Together

Here you can see a complete example. The shape you formed above and a box are places on a canvas, and you can drag around your shape. The program projects each shape against the normal axis for each edge. The box's projections are green, while your shape's projections are blue. If the shapes are colliding, then the you will see each normal axis will have intersecting projection intervals. Otherwise, there will be at least one axis where the intervals are disjoint. (If the canvas is blank, then your shape above is not convex.)

Conclusion

The Separating Axis Theorem is used to determine collisions. Using convex polygons, this is quickly computed using projections onto normal axes. This enables simple and efficient collision detection between two convex polygons.

P.S. Try extending this idea to collisions between a disk and a convex polygon.