# 6DoF Visual Positioning

It is possible to position and orientate an object relative to a camera, given only three of its points (e.g. LEDs) in image-space. In fact, this is the minimal visual positioning system. It infers depth from the object’s size, rather than using a stereo camera system. It also infers the object’s 3-DOF orientation (with a single 2-way ambiguity).

The pose (position + orientation) of an object is a 6-DOF system (see Expressing Rotation). The three points in image-space each provide 2 coordinate variables, giving 6 variables in total.

The transformation from the 3×2 coordinate variables to 6-DOF pose is a non-trivial, non-linear transformation. By making a “far-field” approximation (that the object’s extent is much smaller than its distance from the camera) we can derive closed-form formulas that accomplish this transformation extremely efficiently.
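To give a feel for how depth can be inferred from apparent size in the far field, here is a minimal sketch. It is not the closed-form solution referred to above, only the simplest special case: two points a known distance apart, assumed to lie roughly perpendicular to the camera's optical axis, seen through an ideal pinhole camera. The function name `far_field_range`, the focal length in pixels, and the sample coordinates are all illustrative assumptions.

```python
import math


def far_field_range(focal_px, separation_m, p1, p2):
    """Estimate range from the apparent size of a known baseline.

    Assumptions (not from the original article): an ideal pinhole camera
    with focal length `focal_px` in pixels, and two points `separation_m`
    metres apart lying roughly perpendicular to the optical axis.
    `p1` and `p2` are their (x, y) image coordinates in pixels.
    """
    # Apparent separation of the two points in the image, in pixels.
    apparent_px = math.hypot(p2[0] - p1[0], p2[1] - p1[1])
    if apparent_px == 0.0:
        raise ValueError("points coincide in the image; range is undefined")
    # Similar triangles: apparent_px / focal_px = separation_m / range.
    return focal_px * separation_m / apparent_px


# Hypothetical numbers: two LEDs 10 cm apart appear 40 px apart
# through a lens with an 800 px focal length.
estimated_range = far_field_range(800.0, 0.10, (300.0, 240.0), (340.0, 240.0))
print(f"estimated range: {estimated_range:.2f} m")
```

With three points the object's tilt also foreshortens the apparent distances, which is why the full problem needs all six image coordinates rather than a single separation.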

These formulas provide the position in spherical coordinates ($\phi, \theta, r$) and the orientation as a quaternion ($a + b \boldsymbol{i} + c \boldsymbol{j} + d \boldsymbol{k}$).
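For readers who want to use such an output downstream, the sketch below converts a spherical position and a quaternion into a Cartesian position and a rotation matrix. The angle convention is an assumption, since the article does not state one: here $\theta$ is the polar angle measured from the $z$-axis (taken as the camera's optical axis) and $\phi$ is the azimuth about that axis. The function names are illustrative.

```python
import math


def spherical_to_cartesian(phi, theta, r):
    """Convert (phi, theta, r) to (x, y, z).

    Assumed convention: theta is the polar angle from the +z axis,
    phi is the azimuth measured from the +x axis in the x-y plane.
    """
    x = r * math.sin(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.sin(phi)
    z = r * math.cos(theta)
    return (x, y, z)


def quaternion_to_matrix(a, b, c, d):
    """Convert the quaternion a + b i + c j + d k to a 3x3 rotation matrix."""
    # Normalise first so a slightly non-unit quaternion still yields a rotation.
    norm = math.sqrt(a * a + b * b + c * c + d * d)
    if norm == 0.0:
        raise ValueError("zero quaternion does not represent a rotation")
    a, b, c, d = a / norm, b / norm, c / norm, d / norm
    return [
        [1 - 2 * (c * c + d * d), 2 * (b * c - a * d), 2 * (b * d + a * c)],
        [2 * (b * c + a * d), 1 - 2 * (b * b + d * d), 2 * (c * d - a * b)],
        [2 * (b * d - a * c), 2 * (c * d + a * b), 1 - 2 * (b * b + c * c)],
    ]


# Example: an object 2 units away on the optical axis,
# rotated a quarter turn about that axis.
position = spherical_to_cartesian(0.0, 0.0, 2.0)
rotation = quaternion_to_matrix(math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4))
```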

The video below demonstrates this positioning system in action. It uses 4 LEDs rather than 3. Within this set of 4 LEDs, there are four different triples of LEDs that can be positioned using the technique above. This redundancy gives the system additional reliability, in case it temporarily loses track of one of the LEDs.
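The redundancy argument can be checked by enumerating the triples directly. The sketch below is only an illustration: the LED labels and the helper `usable_triples` are hypothetical and are not taken from the system shown in the video.

```python
from itertools import combinations

# Hypothetical labels for the four LEDs on the tracked object.
LED_IDS = ("A", "B", "C", "D")


def usable_triples(visible_leds):
    """Return every triple of currently visible LEDs that could be positioned."""
    return list(combinations(sorted(visible_leds), 3))


# All four LEDs tracked: four independent triples are available.
all_triples = usable_triples(LED_IDS)

# LED "C" temporarily lost: exactly one triple remains, which is still
# enough for the three-point technique.
remaining = usable_triples({"A", "B", "D"})

print(len(all_triples), remaining)
```

Each LED belongs to three of the four triples, so losing any single LED always leaves one complete triple to work with.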

### 3 Responses to “6DoF Visual Positioning”

• Wang Yu says:

nice~

• Max DeVos says:

Could you help me understand how the coordinates are returned? I understand the quaternion, but I don’t understand how all three spherical coordinates are returned, and I don’t understand how “r” is found. Could you, or someone, please help me with this?

Anything is appreciated,
-Max DeVos