Optimization

Loss Function, Convexity & Optimization

In this section we will look at Optimization in Machine Learning and the related concepts of Loss Functions and Convexity.


💡 Whenever we build a Machine Learning model, we try to make sure that the model makes as few mistakes as possible in its predictions.
How do we measure and minimize these mistakes in predictions made by the model?

To measure how wrong the predictions made by a Machine Learning model are, every model is formulated as
minimizing a loss function.
💡 Find the point on the line 2x + 3y = 13 that is closest to the origin.

Objective: To minimize the distance between point (x,y) on the line 2x + 3y = 13 and the origin (0,0).
distance, d = \(\sqrt{(x-0)^2 + (y-0)^2}\)
=> Objective function = minimize distance = \( \underset{x^*, y^*}{\mathrm{argmin}}\ f(x,y) = \underset{x^*, y^*}{\mathrm{argmin}}\ x^2+y^2\)
Constraint: Point (x,y) must be on the line 2x + 3y = 13.
=> Constraint (equality) function = \(g(x,y) = 2x + 3y - 13 = 0\)
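Before solving this analytically with a Lagrange multiplier, the constrained problem can also be handed directly to a numerical solver as a sanity check. A minimal sketch using SciPy's SLSQP method (assuming `scipy` is available):

```python
from scipy.optimize import minimize

# Objective: squared distance from the origin, f(x, y) = x^2 + y^2
def f(p):
    x, y = p
    return x**2 + y**2

# Equality constraint: g(x, y) = 2x + 3y - 13 = 0
constraint = {"type": "eq", "fun": lambda p: 2 * p[0] + 3 * p[1] - 13}

# SLSQP supports equality constraints; start from an arbitrary point
result = minimize(f, x0=[0.0, 0.0], method="SLSQP", constraints=[constraint])
print(result.x)  # approximately [2.0, 3.0]
```

The solver converges to the same point the analytical derivation below will produce.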
Lagrangian function:

\[ \mathcal{L}(\lambda, x, y) = f(x,y) - \lambda(g(x,y)) \\[10pt] => \mathcal{L}(\lambda, x, y) = x^2+y^2 - \lambda(2x + 3y - 13) \]

To find the optimal solution, we solve the following unconstrained optimization problem.

\[ \underset{x^*, y^*, \lambda}{\mathrm{argmin}}\ \mathcal{L}(\lambda, x, y) = \underset{x^*, y^*, \lambda}{\mathrm{argmin}}\ x^2+y^2 - \lambda(2x + 3y - 13) \]

Take the derivative and equate it to zero.
Since it is a multi-variable function, we take the partial derivatives w.r.t. x, y and \(\lambda\).

\[ \tag{1} \frac{\partial}{\partial x} \mathcal{L}(\lambda, x, y) = \frac{\partial}{\partial x} (x^2+y^2 - \lambda(2x + 3y - 13)) = 0 \\[10pt] => 2x - 2\lambda = 0 \\[10pt] => x = \lambda \]


\[ \frac{\partial}{\partial y} \mathcal{L}(\lambda, x, y) = \frac{\partial}{\partial y} (x^2+y^2 - \lambda(2x + 3y - 13)) = 0 \\[10pt] \tag{2} => 2y - 3\lambda = 0 \\[10pt] => y = \frac{3}{2} \lambda \]


\[ \frac{\partial}{\partial \lambda} \mathcal{L}(\lambda, x, y) = \frac{\partial}{\partial \lambda} (x^2+y^2 - \lambda(2x + 3y - 13)) = 0 \\[10pt] \tag{3} => -2x -3y + 13 = 0 \]


Now we have 3 variables and 3 equations, (1), (2) and (3); let's solve them.

\[ -2x - 3y + 13 = 0 \\[10pt] => 2x + 3y = 13 \\[10pt] => 2\lambda + 3 \cdot \frac{3}{2}\lambda = 13 \\[10pt] => \lambda\left(2 + \frac{9}{2}\right) = 13 \\[10pt] => \frac{13}{2}\lambda = 13 \\[10pt] => \lambda = 2 \\[10pt] => x = \lambda = 2 \\[10pt] => y = \frac{3}{2}\lambda = \frac{3}{2} \cdot 2 = 3 \\[10pt] => x = 2,\ y = 3 \]
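The hand derivation above can be cross-checked symbolically. A short sketch using SymPy (assuming `sympy` is available), which builds the Lagrangian, takes the three partial derivatives, and solves the resulting system:

```python
import sympy as sp

x, y, lam = sp.symbols("x y lambda")

# Lagrangian: L(lambda, x, y) = x^2 + y^2 - lambda * (2x + 3y - 13)
L = x**2 + y**2 - lam * (2 * x + 3 * y - 13)

# Stationary point: set all three partial derivatives to zero and solve
solution = sp.solve([sp.diff(L, v) for v in (x, y, lam)], (x, y, lam))
print(solution)  # {x: 2, y: 3, lambda: 2}
```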

Hence, (x=2, y=3) is the point on the line 2x + 3y = 13 that is closest to the origin.
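This result also agrees with the closed-form orthogonal projection of the origin onto a line \(ax + by = c\), which is \(\left(\frac{ac}{a^2+b^2}, \frac{bc}{a^2+b^2}\right)\). A quick pure-Python check:

```python
import math

a, b, c = 2, 3, 13  # line: 2x + 3y = 13

# Closest point to the origin on ax + by = c (orthogonal projection)
scale = c / (a**2 + b**2)  # = 13 / 13 = 1
px, py = a * scale, b * scale
print((px, py))            # (2.0, 3.0)
print(math.hypot(px, py))  # minimum distance = sqrt(13)
```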



End of Section