Time for a math-post.

Linear regression, or fitting a line on a list of 2D points, is quite common with lots of code samples. But once you want to go further torwards more dimensions, there are no examples whatsoever… You find the wikipedia article about the theory, but eveything is in general terms. Its about calculating the invers of a matrix and lots of stuff that you dont want to do when thinking about performance.

Additionally, those versions all assume a form

y = Sum(x_i*b_i)

Means you minimize the distance in one dimension, not the distance to the plane itself. Furthermore if the pointcloud is spreading wide in the choosen dimension and less in the others (e.g. the plan would be vertical), then you run into heaps of errors from floating-precision.

### Minimize point to plane distance

So i decided to try it myself and calculate a closed form, ready to be coded. You can describe a plane with 4 parameters like that:

a*x+b*y+c*z+d = 0

Fortunatly, this also gives you the distance of any point to the plane, given that (a^2+b^2+c^2) = 1. and (a,b,c) is the normal vector of the plane.

So what i wanted to do is minimize this value over all given points:

Sum(a*x_i+b*y_i+c*z_i+d) -> min

Best way to find a minimum is throu partial derivatives. If all of them are zero, you either got a minimum or maximum. Since you can choose values that make the distance arbitrarily bad, you will never find a maximum. And since its a linear function, we dont need to worry about local minima… sounds easy as pie, doesnt it?

(just one note: of course, making all paramters 0 is a solution to the minima, but we just ignore that one)

### Partial derivatives in the morning

So lets do some calculus:

4 Parameters means 4 partial derivatives. To make writing easier, the sum always goes from 0 to n-1 with n being the number of points.

A: a*Sum(x_i^2) + b*Sum(x_i*y_i) + c*Sum(x_i*z_i) + d*n*meanX = 0 B: a*Sum(x_i*y_i) + b*Sum(y_i*y_i) + c*Sum(y_i*z_i) + d*n*meanY = 0 C: a*Sum(x_i*z_i) + b*Sum(y_i*z_i) + c*Sum(z_i*z_i) + d*n*meanZ = 0 D: a*meanX + b*meanY + c*meanZ + d = 0

From D you will get an easy estimation for d once the estimations for a, b and c are there. So lets eliminate d from the rest by substracting. Again for easier ways of writing i introduce the term

Tab = Sum(a_i*b_i) - n*meanA*meanB

so for example:

Txx = Sum(x_i^2) - n*meanX^2 Tyz= Sum(y_i*z_i) - n*meanY*meanZ

Then we get the following equations:

I : (A - n*meanX*D): a*Txx + b*Txy + c*Txz = 0 II : (B - n*meanY*D) : a*Txy + b*Tyy + c* Tyz = 0 II: (C - n*meanZ*D) : a*Txz + b*Tyz + c*Tzz = 0

As it turns out Txx is a scaled version of the estimate for the variance of the pointcloud in the x-dimension. Till now we didn’t need to make any assumption on the data to perform the calculations. We have to change that now, since we need to divide by a Taa term to get further. Its safe to say that there has to be a Term with Taa != 0 for one dimension. Otherwise all points are exactly the same and there is nothing to do.

Lets assume **Txx != 0** and that** |Txx| >= |Tyy| **>= ** |Tzz|** (otherwise just substitute with y or z). This means that the pointcloud spreads the most in x, therefore we can assume that *a* will probably be the smallest (possibly 0) parameter of the 3. This means we want to calculate it from the others.

Lets introduce another short term (otherwise the equations will get really lenghty):

Sab = Tab*Txx - Txa*Txb

U: II - (Txy/Txx)*I : b*Syy + c*Syz = 0 V: III - (Txz/Txx)*I : b*Syz + c*Szz = 0

Since there is an infinite number of possible solutions (we do not require (a^2+b^2+c^2 ) = 1 yet ), we need to choose at least one parameter. To make it easier we choose it to be 1, its just important to not choose one that should be zero.

First and easiest: all S are zero -> then we can choose b and c as we want so b = c= 1;

If Szz != 0 then we set b= 1 and c= -Syz/Szz.

If Szz = 0 but Syy != 0, then c= 1 and b= 0 (since Syz is automatically 0 now)

### Once b and c are done, a and d is easy

Determining a with given b and c is easy. Specially since we already “know” that Txx != 0. We just use equation I:

a = -(b*Txy+c*Txz)/Txx

As said earlier d is already known from D :

d= -a*meanX - b*meanY - c*meanZ

And thats it. A simple, closed form to fitting a plane into a pointcloud. Why didnt anyone else write that down before?

Feel free to leave a comment for questions or if you spot an error.