I recently finished Jose Portilla’s excellent Udemy course on PySpark, and of course I wanted to try out some things I learned in the course. I have been transitioning over to AWS SageMaker for a lot of my work, but I haven’t tried using it with PySpark yet …

In the last post I talked about how to find the coefficients that give us the line of best fit for an OLS regression problem using the normal solution. The core of this approach is the equation:
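
For reference, the normal solution for OLS is usually written in the form below. The symbols are assumptions, since this excerpt does not define them: $X$ is taken to be the design matrix (one row per observation, with a column of ones for the intercept), $y$ the vector of observed responses, and $\hat{\beta}$ the vector of fitted coefficients. The form also assumes $X^{\top}X$ is invertible.

$$
\hat{\beta} = (X^{\top} X)^{-1} X^{\top} y
$$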

When I first learned least-squares linear regression in my undergrad degree, I remember that we approached it in the “calculus” way: taking the sum of the squared differences for each observation and solving a massive (and tedious) equation until we arrived at our coefficients and line of best fit. I …

Today we will continue our discussion of the basic operations you can do with matrices in linear algebra. In the last post we covered addition, subtraction, scalar multiplication and matrix multiplication. This week, we’ll cover matrix inversion.

So far we’ve seen matrix addition, subtraction and multiplication, but what …