Multiple Variable Linear Regression

1. Vectorization

i) Numpy Vector Operations
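
NumPy performs arithmetic element-wise on whole vectors at once, without explicit loops. A minimal sketch (the example values here are arbitrary):

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([-1, 4, 3, 2])

print(f"a + b = {a + b}")    # element-wise sum: [0 6 6 6]
print(f"a * b = {a * b}")    # element-wise product (not a dot product): [-1 8 9 8]
print(f"5 * a = {5 * a}")    # scalar multiple: [5 10 15 20]
print(f"a**2  = {a**2}")     # element-wise square: [1 4 9 16]
```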

ii) Dot Product using Custom Function

Let's implement our own version of the dot product:

Using a for loop, implement a function that returns the dot product of two vectors. Given inputs $a$ and $b$, the function should return $$ x = \sum_{i=0}^{n-1} a_i b_i $$ Assume both $a$ and $b$ have the same shape.
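
One way to write this is sketched below; the function name `my_dot` is illustrative, not prescribed by the lab:

```python
import numpy as np

def my_dot(a, b):
    """Compute the dot product of two vectors with an explicit loop.

    Args:
      a (ndarray (n,)): input vector
      b (ndarray (n,)): input vector with the same shape as a

    Returns:
      x (scalar): sum over i of a[i] * b[i]
    """
    x = 0
    for i in range(a.shape[0]):   # accumulate a_i * b_i over all n elements
        x = x + a[i] * b[i]
    return x

print(my_dot(np.array([1, 2, 3, 4]), np.array([-1, 4, 3, 2])))  # 24
```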

iii) Dot Product using Numpy np.dot() Function
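
`np.dot` computes the same sum of products as the loop above, but in optimized compiled code, which is dramatically faster on large vectors. A quick sketch with the same example values:

```python
import numpy as np

a = np.array([1, 2, 3, 4])
b = np.array([-1, 4, 3, 2])

# same result as the loop-based my_dot, computed by NumPy's optimized routine
print(f"np.dot(a, b) = {np.dot(a, b)}")  # 24
```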

2. Multiple Variable Linear Regression

i) Goals

ii) Notations

Here is a summary of some of the notations you will encounter, updated for multiple features.

| Notation | Description | Python (if applicable) |
| --- | --- | --- |
| **General** | | |
| $a$ | scalar, non-bold | |
| $\mathbf{a}$ | vector, bold | |
| $\mathbf{A}$ | matrix, bold capital | |
| **Regression** | | |
| $\mathbf{X}$ | training example matrix | `X_train` |
| $\mathbf{y}$ | training example targets | `y_train` |
| $\mathbf{x}^{(i)}$, $y^{(i)}$ | $i$-th training example | `X[i]`, `y[i]` |
| $m$ | number of training examples | `m` |
| $n$ | number of features in each example | `n` |
| $\mathbf{w}$ | parameter: weight | `w` |
| $b$ | parameter: bias | `b` |
| $f_{\mathbf{w},b}(\mathbf{x}^{(i)})$ | The result of the model evaluation at $\mathbf{x}^{(i)}$, parameterized by $\mathbf{w},b$: $f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)}+b$ | `f_wb` |

iii) Problem Statement

You will use the motivating example of housing price prediction. The training dataset contains three examples with four features (size, bedrooms, floors, and age), shown in the table below. Note that, unlike in earlier labs, size is given in sqft rather than 1000s of sqft. This causes an issue, which you will solve in the next lab!

| Size (sqft) | Number of Bedrooms | Number of Floors | Age of Home | Price (1000s dollars) |
| ---: | ---: | ---: | ---: | ---: |
| 2104 | 5 | 1 | 45 | 460 |
| 1416 | 3 | 2 | 40 | 232 |
| 852 | 2 | 1 | 35 | 178 |

You will build a linear regression model using these values so you can then predict the price of other houses; for example, a house with 1200 sqft, 3 bedrooms, 1 floor, and an age of 40 years.

Please run the following code cell to create your X_train and y_train variables.
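
A cell matching the table above might look like this (the names `X_train` and `y_train` come from the notation table):

```python
import numpy as np

# each row is one training example: (size, bedrooms, floors, age)
X_train = np.array([[2104, 5, 1, 45],
                    [1416, 3, 2, 40],
                    [ 852, 2, 1, 35]])
y_train = np.array([460, 232, 178])   # price in 1000s of dollars
```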

iv) Parameter vector w, b

$$\mathbf{w} = \begin{pmatrix} w_0 \\ w_1 \\ \cdots\\ w_{n-1} \end{pmatrix} $$

For demonstration, $\mathbf{w}$ and $b$ will be loaded with initial values that are near optimal. $\mathbf{w}$ is a 1-D NumPy vector.
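
A sketch of such an initialization is below. The specific numbers are illustrative placeholders chosen to be near the optimum for this dataset, not values derived here:

```python
import numpy as np

# illustrative near-optimal parameters (placeholders, not fitted in this cell)
b_init = 785.1811367994083
w_init = np.array([0.39133535, 18.75376741, -53.36032453, -26.42131618])
```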

v) Non-Vectorized Implementation of f(x) with Multiple Variables
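
Element by element, the prediction $f_{\mathbf{w},b}(\mathbf{x}) = w_0 x_0 + w_1 x_1 + \cdots + w_{n-1} x_{n-1} + b$ can be computed with an explicit loop over the features. A sketch (the function name is illustrative):

```python
def predict_single_loop(x, w, b):
    """Predict for one example using a loop over the n features.

    Args:
      x (ndarray (n,)): example with n features
      w (ndarray (n,)): model weights
      b (scalar):       model bias

    Returns:
      p (scalar): prediction
    """
    p = 0
    for i in range(x.shape[0]):
        p = p + x[i] * w[i]   # accumulate w_i * x_i
    return p + b
```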

vi) Vectorized Implementation of f(x) with Multiple Variables
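
Using the dot product from section 1, the same prediction collapses to a single line. A sketch:

```python
import numpy as np

def predict(x, w, b):
    """Predict for one example using the vectorized dot product.

    Args:
      x (ndarray (n,)): example with n features
      w (ndarray (n,)): model weights
      b (scalar):       model bias

    Returns:
      p (scalar): prediction
    """
    return np.dot(x, w) + b   # w · x + b in one call
```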

vii) Compute Cost With Multiple Variables

The equation for the cost function with multiple variables $J(\mathbf{w},b)$ is: $$J(\mathbf{w},b) = \frac{1}{2m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})^2 \tag{3}$$ where: $$ f_{\mathbf{w},b}(\mathbf{x}^{(i)}) = \mathbf{w} \cdot \mathbf{x}^{(i)} + b \tag{4} $$

In contrast to previous labs, $\mathbf{w}$ and $\mathbf{x}^{(i)}$ are vectors rather than scalars, supporting multiple features.

Below is an implementation of equations (3) and (4), following the standard pattern for this course: a for loop over all $m$ examples.
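
A sketch of that implementation, assuming the variable names used elsewhere in this lab:

```python
import numpy as np

def compute_cost(X, y, w, b):
    """Compute the cost J(w,b) per equation (3).

    Args:
      X (ndarray (m,n)): m examples with n features
      y (ndarray (m,)):  target values
      w (ndarray (n,)):  model weights
      b (scalar):        model bias

    Returns:
      cost (scalar): the cost of using w,b for prediction
    """
    m = X.shape[0]
    cost = 0.0
    for i in range(m):
        f_wb_i = np.dot(X[i], w) + b        # equation (4)
        cost += (f_wb_i - y[i]) ** 2        # squared error for example i
    return cost / (2 * m)                   # equation (3)
```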

3. Gradient Descent With Multiple Variables

Gradient descent for multiple variables:

$$\begin{align*} \text{repeat}&\text{ until convergence:} \; \lbrace \newline\; & w_j = w_j - \alpha \frac{\partial J(\mathbf{w},b)}{\partial w_j} \tag{5} \; & \text{for j = 0..n-1}\newline &b\ \ = b - \alpha \frac{\partial J(\mathbf{w},b)}{\partial b} \newline \rbrace \end{align*}$$

where $n$ is the number of features, the parameters $w_j$ and $b$ are updated simultaneously, and where

$$ \begin{align} \frac{\partial J(\mathbf{w},b)}{\partial w_j} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)})x_{j}^{(i)} \tag{6} \\ \frac{\partial J(\mathbf{w},b)}{\partial b} &= \frac{1}{m} \sum\limits_{i = 0}^{m-1} (f_{\mathbf{w},b}(\mathbf{x}^{(i)}) - y^{(i)}) \tag{7} \end{align} $$

i) Compute Gradient with Multiple Variables (Features)
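
Equations (6) and (7) can be computed together in a single pass over the examples. A sketch (the function name is illustrative):

```python
import numpy as np

def compute_gradient(X, y, w, b):
    """Compute the gradient of J(w,b) per equations (6) and (7).

    Args:
      X (ndarray (m,n)): m examples with n features
      y (ndarray (m,)):  target values
      w (ndarray (n,)):  model weights
      b (scalar):        model bias

    Returns:
      dj_dw (ndarray (n,)): gradient of the cost with respect to w
      dj_db (scalar):       gradient of the cost with respect to b
    """
    m, n = X.shape
    dj_dw = np.zeros(n)
    dj_db = 0.0
    for i in range(m):
        err = (np.dot(X[i], w) + b) - y[i]   # prediction error for example i
        for j in range(n):
            dj_dw[j] += err * X[i, j]        # summand of equation (6)
        dj_db += err                         # summand of equation (7)
    return dj_dw / m, dj_db / m
```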

ii) Compute Gradient Descent With Multiple Variables (Features)
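
Putting it together per equation (5), reusing the compute_gradient sketch above. Both gradients are computed from the old parameter values before either parameter changes, so the updates are simultaneous:

```python
import copy

def gradient_descent(X, y, w_in, b_in, alpha, num_iters):
    """Run num_iters steps of gradient descent per equation (5).

    Args:
      X (ndarray (m,n)):   m examples with n features
      y (ndarray (m,)):    target values
      w_in (ndarray (n,)): initial weights
      b_in (scalar):       initial bias
      alpha (scalar):      learning rate
      num_iters (int):     number of iterations

    Returns:
      w (ndarray (n,)), b (scalar): fitted parameters
    """
    w = copy.deepcopy(w_in)   # avoid mutating the caller's array
    b = b_in
    for _ in range(num_iters):
        dj_dw, dj_db = compute_gradient(X, y, w, b)
        w = w - alpha * dj_dw   # simultaneous update of all w_j
        b = b - alpha * dj_db
    return w, b
```

Because size is in raw sqft, the features have very different scales, so a very small learning rate is needed for stable convergence on this dataset; this is the issue the next lab addresses.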
