Midterm Question: When optimizing for class g, why do the parameters corresponding to the other classes need to be updated as well?
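One way to see the answer, assuming the classifier is trained with a softmax cross-entropy loss (an assumption; these notes do not state which loss is used): the class probabilities must sum to 1, so raising the probability of class g means lowering the others. The gradient of the loss with respect to every class score is therefore nonzero, and every class's parameters receive an update. The scores below are hypothetical values chosen only for illustration.

```python
import math

def softmax(scores):
    # Subtract the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def score_gradient(scores, g):
    # Gradient of -log(softmax(scores)[g]) with respect to each score:
    # p_k - 1 for the target class g, and p_k for every other class.
    probs = softmax(scores)
    return [p - (1.0 if k == g else 0.0) for k, p in enumerate(probs)]

scores = [2.0, 1.0, 0.1]  # hypothetical class scores
g = 0                     # index of the target class
grad = score_gradient(scores, g)
print(grad)
```

The entry for class g is negative (gradient descent pushes its score up), while the entries for the other classes are positive (their scores are pushed down), so no class is left untouched.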
Linear Classifier
A linear classifier is a hyperplane that separates the space into two half-spaces.
But it has a limit: when the data are not linearly separable, a linear classifier does not work. Consider the XOR problem.
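The XOR limitation can be checked with a small sketch. The search below tries a grid of weights and biases for a linear classifier on the four XOR points. The grid range and the decision rule (predict 1 when the score is positive) are choices made for this example. A grid search is not a proof, but it matches the algebraic argument: classifying all four points correctly would require b <= 0, w1 + b > 0, w2 + b > 0, and w1 + w2 + b <= 0, which cannot all hold at once.

```python
from itertools import product

# The four XOR points with their labels.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def predict(w1, w2, b, x):
    # Linear classifier: which side of the hyperplane does x fall on?
    return 1 if w1 * x[0] + w2 * x[1] + b > 0 else 0

def num_correct(w1, w2, b):
    return sum(predict(w1, w2, b, x) == y for x, y in XOR)

# Candidate values from -4.0 to 4.0 in steps of 0.5 (an arbitrary grid).
grid = [i / 2 for i in range(-8, 9)]
best = max(num_correct(w1, w2, b) for w1, w2, b in product(grid, repeat=3))
print(best)  # 3: no hyperplane on the grid classifies all 4 points
```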
Multi-Layer Neural Networks
The “deep” approach: stack multiple layers of linear transformations interspersed with nonlinearities.
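As a rough illustration of why the nonlinearity matters, here is a two-layer network that solves XOR. This is only a sketch: the ReLU nonlinearity, the bias terms, and the hand-picked weights are assumptions made for this example, not values from the lecture.

```python
def relu(v):
    # Elementwise nonlinearity: max(0, a) for each entry.
    return [max(0.0, a) for a in v]

def affine(W, b, x):
    # Linear transformation W x + b, with W given as a list of rows.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

# Hand-picked (hypothetical) parameters.
W1 = [[1.0, 1.0], [1.0, 1.0]]   # first layer: 2 inputs -> 2 hidden units
b1 = [0.0, -1.0]
W2 = [[1.0, -2.0]]              # second layer: 2 hidden units -> 1 output
b2 = [0.0]

def two_layer(x):
    hidden = relu(affine(W1, b1, x))
    return affine(W2, b2, hidden)[0]

outputs = {x: two_layer(x) for x in [(0, 0), (0, 1), (1, 0), (1, 1)]}
print(outputs)
```

The outputs are 0, 1, 1, 0 for the four inputs, which is exactly XOR. Without the `relu` call, the two layers would collapse into a single linear transformation and the network would be no more expressive than a linear classifier.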

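Before:
$f(x) = Wx$
Where $W$ is the weight matrix and $x$ is the input vector.
Now:
$f(x) = W_2\,\sigma(W_1 x)$, where $\sigma$ is an elementwise nonlinearity.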