Linear Regression

Model Equation

f(x) = w*x + b (in code, the model output is written as f_wb = w * x + b)

Simply plot the output of f(x) and observe the pattern.

1. Model Representation

i) Key Objective

ii) Problem Statement (Housing Price Prediction)

This lab will use a simple data set with only two data points:

| Size (1000 sqft) | Price (1000s of dollars) |
| ---------------- | ------------------------ |
| 1.0              | 300                      |
| 2.0              | 500                      |

You would like to fit a linear regression model through these two points, so you can then predict price for other houses - say, a house with 1200 sqft.
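
A minimal setup for this data set (a sketch; the variable names x_train and y_train and the use of NumPy are assumptions, not prescribed above):

```python
import numpy as np

# x_train: input feature, size of the house in 1000s of sqft
# y_train: target value, price of the house in 1000s of dollars
x_train = np.array([1.0, 2.0])
y_train = np.array([300.0, 500.0])

m = x_train.shape[0]  # number of training examples (here, m = 2)
print(f"x_train = {x_train}, y_train = {y_train}, m = {m}")
```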

Linear Regression Function

$$ f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{1}$$

Suppose $w = 100$ and $b = 100$.

Let's compute the value of $f_{w,b}(x^{(i)})$ for your two data points. You can explicitly write this out for each data point as -

for $x^{(0)}$, f_wb = w * x[0] + b

for $x^{(1)}$, f_wb = w * x[1] + b
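
With $w = 100$ and $b = 100$ these evaluate to $f_{w,b}(x^{(0)}) = 100 \cdot 1.0 + 100 = 200$ and $f_{w,b}(x^{(1)}) = 100 \cdot 2.0 + 100 = 300$, which do not match the targets of 300 and 500; a line that passes exactly through both data points would need $w = 200$ and $b = 100$.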

Now let's call the compute_model_output function and plot the output.
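
A sketch of what compute_model_output might look like, assuming the x_train and y_train arrays defined earlier (the plotting details are illustrative):

```python
import matplotlib.pyplot as plt
import numpy as np

def compute_model_output(x, w, b):
    """Compute the model prediction f_wb(x) = w*x + b for every example in x."""
    m = x.shape[0]
    f_wb = np.zeros(m)
    for i in range(m):
        f_wb[i] = w * x[i] + b
    return f_wb

w = 100
b = 100
f_wb = compute_model_output(x_train, w, b)

# Plot the model's predictions together with the actual data points
plt.plot(x_train, f_wb, label="Our Prediction")
plt.scatter(x_train, y_train, marker="x", c="r", label="Actual Values")
plt.title("Housing Prices")
plt.xlabel("Size (1000 sqft)")
plt.ylabel("Price (in 1000s of dollars)")
plt.legend()
plt.show()
```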

iii) Prediction

Let's predict the price of a house with 1200 sqft.
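
Since size is measured in 1000s of sqft, a 1200 sqft house corresponds to x = 1.2. As a sketch, using w = 200 and b = 100 (the values that make the line pass exactly through both training points):

```python
w = 200
b = 100
x_i = 1.2                      # 1200 sqft, in units of 1000 sqft
price_1200sqft = w * x_i + b   # 200 * 1.2 + 100 = 340
print(f"${price_1200sqft:.0f} thousand dollars")
```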

2. Cost Function

i) Key Objective

ii) Problem Statement (Housing Price Prediction)

The same data set as before:

| Size (1000 sqft) | Price (1000s of dollars) |
| ---------------- | ------------------------ |
| 1.0              | 300                      |
| 2.0              | 500                      |

You would like a model which can predict housing prices given the size of the house.

iii) Computing Cost

The term 'cost' in this assignment might be a little confusing, since the data is housing cost. Here, cost is a measure of how well our model is predicting the target price of the house. The term 'price' is used for the housing data.

The equation for cost with one variable is: $$J(w,b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})^2 \tag{1}$$

where $$f_{w,b}(x^{(i)}) = wx^{(i)} + b \tag{2}$$
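
A sketch of a compute_cost function implementing equations (1) and (2), assuming the x_train and y_train arrays used earlier (the lab's own version may differ in detail):

```python
def compute_cost(x, y, w, b):
    """Compute the cost J(w,b) for linear regression with one variable."""
    m = x.shape[0]
    cost_sum = 0.0
    for i in range(m):
        f_wb = w * x[i] + b             # model prediction, equation (2)
        cost_sum += (f_wb - y[i]) ** 2  # squared error for example i
    return cost_sum / (2 * m)           # equation (1)

# Example: with w = 100, b = 100 the predictions are 200 and 300,
# so J = ((200 - 300)**2 + (300 - 500)**2) / (2 * 2) = 12500
print(compute_cost(x_train, y_train, w=100, b=100))
```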

3. Gradient Descent for Linear Regression

i) Key Objective

ii) Problem Statement (Housing Price Prediction)

The same data set as before:

| Size (1000 sqft) | Price (1000s of dollars) |
| ---------------- | ------------------------ |
| 1.0              | 300                      |
| 2.0              | 500                      |

You would like a model which can predict housing prices given the size of the house.

iii) Compute Cost

iv) Gradient Descent

In lecture, gradient descent was described as:

$$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline \; w &= w - \alpha \frac{\partial J(w,b)}{\partial w} \tag{3} \; \newline b &= b - \alpha \frac{\partial J(w,b)}{\partial b} \newline \rbrace \end{align*}$$

where, parameters $w$, $b$ are updated simultaneously.
The gradient is defined as: $$ \begin{align} \frac{\partial J(w,b)}{\partial w} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)})x^{(i)} \tag{4}\\ \frac{\partial J(w,b)}{\partial b} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{w,b}(x^{(i)}) - y^{(i)}) \tag{5}\\ \end{align} $$

Here, 'simultaneously' means that you calculate the partial derivatives for all the parameters before updating any of the parameters; the loop sketched in the next section makes this explicit.

v) Implement Gradient Descent

You will implement the gradient descent algorithm for one feature. You will need three functions.
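
As an illustration of the simultaneous update, here is a sketch of the gradient descent loop itself; it assumes the compute_gradient function shown in the next section and the compute_cost function sketched earlier, and the learning rate and iteration count are illustrative choices:

```python
def gradient_descent(x, y, w_init, b_init, alpha, num_iters):
    """Run gradient descent and return the fitted parameters (w, b)."""
    w, b = w_init, b_init
    for i in range(num_iters):
        # Simultaneous update: both partial derivatives are computed from the
        # current (w, b) before either parameter is changed.
        dj_dw, dj_db = compute_gradient(x, y, w, b)
        w = w - alpha * dj_dw   # update rule (3)
        b = b - alpha * dj_db
        if i % 1000 == 0:
            print(f"iteration {i:5d}: cost = {compute_cost(x, y, w, b):.2f}")
    return w, b

# Example run on the two-point data set
w_final, b_final = gradient_descent(x_train, y_train, w_init=0, b_init=0,
                                    alpha=1.0e-2, num_iters=10000)
print(f"w = {w_final:.2f}, b = {b_final:.2f}")  # should approach w = 200, b = 100
```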

Conventions: in the code sketches in this section, $\frac{\partial J(w,b)}{\partial w}$ is written as dj_dw and $\frac{\partial J(w,b)}{\partial b}$ as dj_db.

vi) compute_gradient

compute_gradient implements (4) and (5) above and returns $\frac{\partial J(w,b)}{\partial w}$, $\frac{\partial J(w,b)}{\partial b}$.

The embedded comments in the sketch below describe the operations.
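
A sketch of compute_gradient for the one-feature case, assuming the same x/y arrays used throughout:

```python
def compute_gradient(x, y, w, b):
    """Compute the gradient of J(w,b) for linear regression with one variable."""
    m = x.shape[0]
    dj_dw = 0.0
    dj_db = 0.0
    for i in range(m):
        f_wb = w * x[i] + b   # model prediction for example i
        err = f_wb - y[i]     # prediction error
        dj_dw += err * x[i]   # term of the sum in equation (4)
        dj_db += err          # term of the sum in equation (5)
    dj_dw /= m                # average over the m examples
    dj_db /= m
    return dj_dw, dj_db
```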
