In my previous post we walked through the first algorithm we are putting together in this series - Linear Regression. I noted in the post that before we moved on it would be wise to cover the MathUtil and Matrix classes in detail to save us some time later. So let’s do it!
MatrixOperations
We are going to first deal with the MatrixOperations class as we will find later on that we will be using this class to perform some of the operations within our MathUtil class. The first method we will discuss is multiply.
Going back to our initial discussion of matrices in part one, remember that a matrix is a set of numbers in rows and columns such as:
[A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \ a_{21} & a_{22} & a_{23} \ a_{31} & a_{32} & a_{33} \ \end{bmatrix}]
where the subscript reperesents the row and column, i.e. 32 means row 3 column 2. Unlike programming arrays, the index starts at 1 and mathematicians describe matrices as being of type m x n, where m is the number of rows and n is the number of columns.
We won’t cover the reasoning behind why matrix multiplication works in the way it does here, but we will describe how it works. The why is a little outside the scope of what we wish to cover, but you can find some good explanations online. In order to multiply matrices A and B together, denoted AB, the number of columns in A must be the same as the number of rows in B. So if A is 3x2 and B is 2x7 we are all good. However, were to to try and perform the operation BA, this would not work because the number of columns in B (7) is not the same as the number of rows in A (3). This fact:
[AB \neq BA]
is called non-commutativity.
So, what does AB actually look like? What we do is take the entries of a column in A and a row in B, multiply them term-by-term and then sum them to get our new entry. This means that if we multiply an m x n matrix A with an n x p matrix B then AB is an m x p matrix. Let’s see this written down in an example. Firstly lets take a 2 x 3 matrix A:
[A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \ a_{21} & a_{22} & a_{23} \ \end{bmatrix}]
and another 3 x 4 matrix B:
[B = \begin{bmatrix} b_{11} & b_{12} & b_{13} & b_{14} \ b_{21} & b_{22} & b_{23} & b_{24} \ b_{31} & b_{32} & b_{33} & b_{34} \ \end{bmatrix}]
then AB is:
[AB = \begin{bmatrix}
a_{11} & a_{12} & a_{13} \
a_{21} & a_{22} & a_{23} \
\end{bmatrix}
\begin{bmatrix}
b_{11} & b_{12} & b_{13} & b_{14} \
b_{21} & b_{22} & b_{23} & b_{24} \
b_{31} & b_{32} & b_{33} & b_{34} \
\end{bmatrix}
= \begin{bmatrix}
(a_{11}b_{11} + a_{12}b_{21} + a_{13}b_{31}) & (a_{11}b_{12} + a_{12}b_{22} + a_{13}b_{32}) & (a_{11}b_{13} + a_{12}b_{23} + a_{13}b_{33}) & (a_{11}b_{14} + a_{12}b_{24} + a_{13}b_{34})\ (a_{21}b_{11} + a_{22}b_{21} + a_{23}b_{31}) & (a_{21}b_{12} + a_{22}b_{22} + a_{23}b_{32}) & (a_{21}b_{13} + a_{22}b_{23} + a_{23}b_{33}) & (a_{21}b_{14} + a_{22}b_{24} + a_{23}b_{34}) \ \end{bmatrix}]
As we can see here we have a 2 x 4 matrix produced.
So we can see how this works in theory, but how do we implement this in our MatrixOperations class?
public static Matrix multiply(Matrix A, Matrix B) {
if(A.columns != B.rows) {
return null;
}
Integer A_rows = A.rows, A_cols = A.columns, B_cols = B.columns;
Matrix C = new Matrix(A_rows, B_cols);
C.zeros();
for(Integer i = 0; i < A_rows; i++) {
List<Double> A_row = A.getRow(i);
for(Integer j = 0; j < B_cols; j++) {
for(Integer k = 0; k < A_cols; k++) {
C.setElement(i, j, C.getElement(i, j) + A.getElement(i,k)*B.getElement(k, j));
}
}
}
return C;
}
The first thing we do is check that we have matrices that the number of columns in A needs to be the same as the number of rows in B and if that is not true we return nothing. The next thing we do is instantiate some integers for us to use in our counting. We create a new matrix C to hold our results and fill it with zeros. We then start to loop through our counter variables, incrementing through the rows of A, then the columns of B, then the columns of A. We could perform these loops in any order, but I have structured them like this to help in maintaining the sort of sequencing given in the mathematical description (i.e. row of A with column of B term-by-term). We set the value of the corresponding element of C to be the existing value plus our new value which is why we initially filled the matrix with zeros. Note we have structured our loops to use the best loop implementation.
The next operation is the transpose function. This is where a matrix is flipped along its leading diagonal, so that an m x n matrix becomes an n x m one. The transpose is denoted AT. So as an example:
[A = \begin{bmatrix} 1 & 2 & 3 \ 4 & 5 & 6 \ \end{bmatrix} \Rightarrow A^T = \begin{bmatrix} 1 & 2 \ 3 & 4 \ 5 & 6 \ \end{bmatrix}]
In code we implement this as follows:
public static Matrix transpose(Matrix A) {
Integer A_rows = A.rows, A_cols = A.columns;
Matrix B = new Matrix(A_cols, A_rows);
for(Integer i = 0; i < A_rows; i++) {
for(Integer j = 0; j < A_cols; j++) {
B.setElement(j, i, A.getElement(i, j));
}
}
return B;
}
Again, we instantiate some integers for use in our loops and create a new matrix with the row and column numbers swapped. We then loop using these counters and set the elements on our new matrix using the values from our existing matrix.
Next up is addition and subtraction. To add 2 matrices together, they must have the same dimensionality (number of rows and columns), and we add term-by-term, exactly the same as we did for the subtraction as we discussed in the first post. Implemented in code they look like
public static Matrix add(Matrix A, Matrix B) {
if((A.rows != B.rows) && (A.columns != B.columns)) {
return null;
}
Integer A_rows = A.rows, A_cols = A.columns;
Matrix C = new Matrix(A_rows, A_cols);
for(Integer i = 0; i < A_rows; i++) {
for(Integer j = 0; j < A_cols; j ++) {
C.setElement(i, j) = A.getElement(i, j) + B.getElement(i, j);
}
}
return C;
}
public static Matrix subtract(Matrix A, Matrix B) {
if((A.rows != B.rows) && (A.columns != B.columns)) {
return null;
}
Integer A_rows = A.rows, A_cols = A.columns;
Matrix C = new Matrix(A_rows, A_cols);
for(Integer i = 0; i < A_rows; i++) {
for(Integer j = 0; j < A_cols; j ++) {
C.setElement(i, j) = A.getElement(i, j) - B.getElement(i, j);
}
}
return C;
}
Much the same as before, some simple looped and setting of items on a new matrix.
Our final methods are pointwise multiply and pointwise exponent (which we described in the first post). if you cast your mind back to that first post, I noted that “pointwise exponent” is correctly called the “Hadamard Power”, and similarly what we will term “pointwise multiplication” is known as the “Hadamard Product”. For this operation both matrices must have the same dimensionality, and we can write out the operation as:
[A \circ B = \begin{bmatrix} a_{11} & a_{12} \ a_{21} & a_{22} \ \end{bmatrix} \circ \begin{bmatrix} b_{11} & b_{12} \ b_{21} & b_{22} \ \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} & a_{12}b_{22} \ a_{21}b_{21} & a_{22}b_{22} \ \end{bmatrix}]
Here the small circle symbol denotes the Hadamard product instead of the standard matrix mnultiplication action. In code this looks like:
public static Matrix pointwiseMultiply(Matrix A, Matrix B) {
if((A.rows != B.rows) && (A.columns != B.columns)) {
return null;
}
Integer A_rows = A.rows, A_cols = A.columns;
Matrix C = new Matrix(A_rows, A_cols);
for(Integer i = 0; i < A_rows; i++) {
for(Integer j = 0; j < A_cols; j ++) {
C.setElement(i, j) = A.getElement(i, j)*B.getElement(i, j);
}
}
return C;
}
public static Matrix pointwiseExponent(Matrix A, Double exponent) {
Integer A_rows = A.rows, A_cols = A.columns;
for(Integer i = 0; i < A_rows; i++) {
for(Integer j = 0; j < A_cols; j++) {
A.setElement(i, j, Math.pow(A.getElement(i, j), exponent));
}
}
return A;
}
Again for both of these functions we are using our speedier loops implementation to move through the items within the matrix, setting each value as we go. In the case of the pointwise multiplication, this is simple multiplying the 2 matrix entries together, and for the exponent is applying the exponent to each entry.
MathUtil
The MathUtil class is a holding place for all the thngs we need to be able to do quickly that there is no default Apex implementation for. The first such method is sum.
public static Double sum(List<Double> items) {
Double sum = 0;
Integer numItems = items.size();
for(Integer i = 0; i < numItems; i++) {
sum += items[i];
}
return sum;
}
Sum takes in a list of numbers (doubles) and returns their sum, that is, all of them added together. As you can see from the code above, this is pretty simple, the only thing worth noting really is that we are using our faster loop implementation as this will help us as the system scales to summing some larger lists of numbers.
Next up we have a function to calculate the mean (or average) which we discussed in the first post. As we highlighted the mean of a list of numbers is their sum divided by the size of the list. Our implementation is:
public static Double mean(List<Double> items) {
Double sum = sum(items);
Integer numItems = items.size();
return sum/numItems;
}
which simply uses our sum function to generate the sum and then divides by the number of items we have.
We are going to discuss the variance function next, as the standard deviation function simply returns the square rrot of the variance that we calculate. As we discussed in the first post, the variance is calculated as the sum of the square of the difference of each item from the mean, divided by the total sample size - 1. Our code implementation is:
public static Double variance(List<Double> items) {
Double mean = mean(items);
Matrix itemMatrix = new Matrix(items);
Matrix meanMatrix = new Matrix(1, items.size());
meanMatrix.fill(mean);
Matrix diffs = MatrixOperations.subtract(itemMatrix, meanMatrix);
diffs = MatrixOperations.pointwiseExponent(diffs, 2);
List<Double> diffList = diffs.getRow(0);
return sum(diffList)/(diffList.size() - 1);
}
For our sample list of items we start by calculating the mean and then creating 2 matrices, one to hold our sample data and another of equal size holding the mean value everywhere. We then subtract the mean matrix from the sample matrix and run our pointwise exponent (Hadamard power) function on this new matrix to square these differences. We then cast this back to a list of data and calculate the sum of this data divided by the size - 1. This function brings together a lot of the work we have done previously in the MathUtil and MatrixOperations classes to provide an important function.
Last but by no means least is our function to calculate the Pearson correlation coefficient for us to use in the Linear Regression model.
public static Double pearsonCorrelation(List<Double> x, List<Double> y) {
Double mean_x = mean(x);
Double mean_y = mean(y);
Double std_x = standardDeviation(x);
Double std_y = standardDeviation(y);
List<Double> xy = new List<Double>();
Integer numItems = x.size();
Double x_diff, y_diff;
for(Integer i = 0; i < numItems; i++) {
x_diff = x[i] - mean_x;
y_diff = y[i] - mean_y;
xy.add((x_diff/std_x)*(y_diff/std_y));
}
return sum(xy)/(numItems - 1);
}
Again, because we worked to have functions that calculate items such as the standard deviation and mean, we can simply use these functions and plug them into the appropriate places for us in the formula we saw in the previous post.
Our Tour Stops Here For Now
Next up is working on the Gradient Descent algorithm and starting to move deeper into some harder statistics as we cross the barrier into machine learning.