How to Move? Lie Group Robotics

Suppose we’d like to design a humanoid robot, which data-structures do we need to represent the state of the robot at each point in time and how do we actually move the robot around?

We can abstract away the muscles and nervous system of a humanoid and only work with the skeleton, imagine taking an X-ray of the above image to get.

With the skin stripped away we’re left with a graph like structure with vertices V called joints and edges E called links.

There are many different types of joints but the most popular one is Revolute joint which represents a rotation about an axis.

Other useful joints to know about are the Prismatic, Universal and Spherical joint. Each joint defines a particular topology, as in we can describe an any motion of this joint using a particular kind of shape.

E.g: A Revolute joint is a rotation which can represented as a complex number which can in turn be represented by a simple sphere S¹.

Revolute joints can only rotate so their movement can be represented with a complex number z = cos θ + i sin θ where i² = -1.

Let’s say you start at a position p = a + ib, a rotation by z is performed via multiplication

And if you stack the real part on top of the imaginary part, you end with what’s a called a rotation matrix which pops up often in Computer Graphics.

However, a robot is made up of many joints and we can determine a configuration space by taking the Cartesian product of the topology of each joint.

E.g: A robot with 2 Revolute joints is determined by taking S¹ x S¹ = T² which is just shorthand for a torus.

Representing Rotations

Let’s say we’d like to build a robot that can hand write letters. We have an arm with 3 links, 2 Revolute joints and an end effector equipped with a pencil.

The arm is in 3-d space but the paper is in 2-d so working with robotics involves all sorts of coordinate transforms very similarly to the way projective geometry is relevant for Computer Graphics.

We can represent the position of the entire robot by specifying the positions of each joint and the end effector.

Euler Angles via roll, pitch and yaw are a popular way to represent the position of a 3-d object but they have 2 major issues

  1. Singularities in representation: Let’s say the pen is facing 90 degrees upwards, then a sensor at the top of it can’t tell the difference between yaw and roll. Motors can burn themselves out, joints can snap and are likely to get you hurt.
  2. Coordinate dependent: Let’s say we have 10 joints, which one of them should be the reference frame. How do you transform the coordinates of joint 9 relative to joint 2? Each new kind of robot would require deriving complex trigonometric expressions and applying them.

Quaternions help alleviate the singularity issue but it still requires a coordinate transform for each joint which is why we’ll be looking at an alternate formalism inspired by Lie Groups called Screws and Twists.

Rotations form a Group

Rotations have particularly interesting properties, more specifically Rotations form an algebraic group which is just a fancy way to say that rotations can be inverted and that the composition of a bunch of rotations is still a rotation.

A group G is a group if for any two elements of the group a, b the following properties hold.

  1. Closure: ∀ a, b ∈ G → a ∘ b ∈ G
  2. Associativity: ∀ a, b, c ∈ G → (a ∘ b) ∘ c = a ∘ (b ∘ c)
  3. Identity: ∃ e ∈ G s.t ∀ a ∈ G → a ∘ e = e ∘ a = a
  4. Inverse: ∀ a ∈ G, ∃ b s.t a ∘ b = e

But if the above definitions are too opaque, here’s how I identity a Group.

  1. Closure: You can compose elements of the group
  2. Associativity: Parentheses don’t matter
  3. Identity: There’s a null operator that does nothing
  4. Inverse: The action of any element can be reversed

2-d rotations have an additional property called Commutativity a ∘ b = b ∘ a which means that the order in which you apply 2 different rotations doesn’t matter.

Examples are heavily inspired by this Stack Overflow answer

5 x 3 = 3 x 5 but if you have two matrices A, B then AB ≠ BA.

3-d rotations are not commutative, as in the order in which you apply 2 rotations in 3-d does matter.

Lie Group

A Lie Group is a smoothly differentiable Group which means it’s an algebraic group in which there are no singularities i.e: holes or points at infinity. This means that no single small change in a joint configuration will cause a large change in the end effector position which means smooth and stable motions.

The group of rotation matrices is called the Special Orthogonal Group SO(n) and it’s a Lie Group.

In 3-d SO(3) is equivalent to all 3 × 3 matrices orthogonal matrices with determinant = 1.

A matrix is orthogonal if

so(3) consists of all 3 × 3 skew-symmetric real matrices which is just a fancy way of saying

Now the mind blowing part

If SO(3) represents a rotation then so(3) represents a tiny step towards that rotation or more specifically a differentiation.

SO(3) is a Lie Group and so(3) is a Lie algebra.

It’s important to remember that rotation matrices have distinct applications

  1. Displacement: multiplication by a rotation matrix represents a displacement
  2. Representation of position: a rotation matrix by itself represents a position
  3. Change of reference frame: a base reference frame can be multiplied by a rotation matrix to get a new reference frame

Rigid Body Motion

Lie Groups only let us represent rotations but motions are a combination of translations and rotations so we need to introduce the Special Euclidean Group SE(3) and it’s associated Algebra se(3).

To fully describe a rigid body in space (e.g: joint, link) we need a total of 6 numbers

  • 3 numbers for the x, y, z position
  • 3 numbers for the rotation yaw, roll, pitch

A rigid body means anything that doesn’t look like cloth, jello or air. In particular joints and links are rigid bodies.

So given some starting position p we can represent a rigid body motion to a new position p’ via a translation t and a rotation matrix R.

Rigid body motions are also known as the special Euclidean Group SE(3) and can be represented as a matrix where R ∈ SO(3) and p ∈ ℝ³.

Transformation matrices form a group, so they can be composed and inverted.

An alternate way of thinking of a transformation matrix is via what’s called a Screw. Think of a Screw as a stick attached to the back of an object like a link or joint. You can turn the screw about its axis \hat{w} to rotate the end object by angle of θ and you can move the screw around to translate the object.

A Twist has both a linear velocity v (translation) and angular velocity w (rotation) component so we can package them up in a matrix format.

A rotation of size θ about an axis w, can be represented via an axis angle representation using Rodrigues’ formula.

The [] notation is a shorthand for skew symmetric matrices. Skew symmetric matrices simply calculations of angular velocity since w × R = [w]R.

Similarly to Rotations matrices, Transformation matrices can be differentiated if we take the skew symmetric representation of a screw S.

Where [S] is defined as

Screws provide us with a technique for smooth rigid body motions. For any kind of motion we can construct its derivative (infinitely small step towards that motion). The advantage is that our new representation lacks the singularities associated with the Euler Angle representation which means we won’t get unstable behavior when we start modeling the Forward and Inverse Kinematics problem.

The screw representation allows us to easily transform coordinates from some point a to a point b using the Adjoint operation.

Where the Adjoint operation of a transformation matrix T is

Product of Exponentials Formula

Suppose we’re working with a robot with an open chain robot with 3 joints.

We’d like to solve the Forward Kinematics problem where given a robot configuration and changes to its joint positions θ, we ask what is the end effector position?

Typically we would need to derive the Forward Kinematics formula analytically depending on the configuration of the robot but this is tedious and error prone.

Instead if we imagine each joint as applying a screw motion to all its descendants then we can solve the Forward Kinematics problem very easily using the Product of Exponentials Formula.

Where

  • T ∈ SE(3) is the final end effector position
  • θ_n is the new position of the n’th joint
  • parents represents the parents of the end effector
  • [S_n] = (w_n, v_n) is the screw axis of joint n
  • M ∈ SE(3) is the end effector position when all joint values are zeroed out

If we apply the formula to our 3 joint open chain robot we get.

The mirror problem of Forward Kinematics is Inverse Kinematics. It’s much more complex and is typically solved using numerical approximations. Given a desired end effector position T, what changes should be made to the joint positions θ?

We can rewrite the Product of Exponentials Formula as

where

  • T is given
  • The inverse of M can be computed or approximated by its pseudo-inverse
  • S are known
  • We solve for θ

Code Deep Dive

To make sure that everything make sense we’ll be going over the implementation of these algorithms by ferrolho which are written in Julia. The codebase is in turn based off the code companion to the Modern Robotics textbook.

The entire codebase is less than a 1,000 lines of code so I’d highly recommend you skim through the first 500 or so lines.

Vector to so(3)

Adjoint

Rotation Matrix Exponential

Transformation Matrix Exponential

Forward Kinematics

Putting everything together

Next Steps

This will most likely be my last chapter before 2020 but I have enough next steps to last you well beyond that.

  • Read Modern Robotics — it’s by far my favorite robotics textbook, it comes with a free online class and an invaluable code companion
  • Inverse Kinematics in F# — is a codebase that really helped me grok the inverse and forward kinematics problem, it uses numerical techniques but is still a valuable resource because of how clear and short the code is

On the more theoretical side, this chapter used a lot of ideas from abstract algebra and topology but we barely scratched the surface. Mathematicians often work on extremely useful problems but can’t communicate why they’re useful. Both Robotics and Machine Learning involve working with spaces and their shapes so it’s only natural that Mathematics designed for that very purpose would prove very useful.

  • Visual Group Theory — this is my favorite theoretical math book because it takes a subject as dense and opaque as group theory and makes it extremely intuitive. The book is filled with colorful explanations that do a lot more than the Greek text dumps that are more traditional in this area. You’ll feel a lot more comfortable reading other Abstract Algebra and Category Theory books after reading this one
  • Elementary Applied Topology — this book makes it explicit that it sacrifices rigor but with the purpose of giving you a broad overview of all the applications of Topology. There are tons of algorithms in this area that have not become mainstream.
  • Naive Lie Theory — goes deeper in talking about Lie Groups and Lie algebras. This book is a more traditional Theorem, Proof format but it’s valuable as a reference to check whether your intuition is well founded or not

There’s been more math invented in the last 50 years than the entire history of mankind. Many of those innovations have not found a natural applications but if you do find one that’s a company or at least an important OSS project.

Robots will save us

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store