The separating hyperplane is the set of points $\mathbf{x}$ that satisfy:

$$\mathbf{w} \cdot \mathbf{x} + b = 0$$

where $\mathbf{w}$ is the normal vector to the plane and $b$ is a bias term.
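
As a minimal sketch of what the normal vector and bias do geometrically, the snippet below assumes the plane is written in the common form $\mathbf{w} \cdot \mathbf{x} + b = 0$. It computes the signed distance of a point from the plane; the sign tells you which side the point is on. The helper names `dot` and `signed_distance` and the sample values are illustrative only.

```python
import math

def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def signed_distance(w, b, x):
    """Signed distance from x to the plane w.x + b = 0 (assumed form).

    Positive values lie on the side w points toward, negative values
    on the opposite side, and zero means x is on the plane.
    """
    return (dot(w, x) + b) / math.sqrt(dot(w, w))

w = [3.0, 4.0]   # normal vector, with norm 5
b = -5.0         # bias term

print(signed_distance(w, b, [3.0, -1.0]))  # 0.0  (on the plane)
print(signed_distance(w, b, [3.0, 4.0]))   # 4.0  (positive side)
print(signed_distance(w, b, [0.0, 0.0]))   # -1.0 (negative side)
```

Dividing by the norm of $\mathbf{w}$ turns the raw score into a geometric distance, which is why rescaling $\mathbf{w}$ and $b$ together does not move the plane.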
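An SVM maximizes the margin between classes. With $\mathbf{w}$ and $b$ scaled so that the closest points of each class satisfy $|\mathbf{w} \cdot \mathbf{x} + b| = 1$, the width of the margin is:

$$\frac{2}{||\mathbf{w}||}$$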
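
To make the relationship between the margin and $\mathbf{w}$ concrete, here is a small sketch that assumes the standard hard-margin result that the width equals $2 / ||\mathbf{w}||$. The function name `margin_width` is illustrative only.

```python
import math

def margin_width(w):
    """Margin width 2 / ||w|| under the usual canonical scaling (assumed)."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

print(margin_width([3.0, 4.0]))  # 0.4
print(margin_width([1.5, 2.0]))  # 0.8
```

Halving $\mathbf{w}$ doubles the width, which is why a wider margin corresponds to a smaller norm.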
Maximizing this width is equivalent to minimizing $||\mathbf{w}||$, so training is expressed as a constrained optimization problem. Find $\mathbf{w}$ and $b$ that...
Minimize: $\frac{1}{2} ||\mathbf{w}||^2$
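...subject to the constraint that all points are correctly classified and lie on or outside the margin:

$$y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1 \quad \text{for all } i$$

where $y_i$ is the class label (+1 or -1) for each data point $\mathbf{x}_i$.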
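
The sketch below illustrates the optimization problem on a toy dataset. It assumes the standard hard-margin constraint $y_i(\mathbf{w} \cdot \mathbf{x}_i + b) \geq 1$ and only evaluates candidate solutions; it does not solve the problem. The names `objective` and `is_feasible`, the two data points, and the candidate values are all made up for illustration.

```python
def dot(u, v):
    """Dot product of two equal-length sequences."""
    return sum(ui * vi for ui, vi in zip(u, v))

def objective(w):
    """The quantity being minimized: (1/2) * ||w||^2."""
    return 0.5 * dot(w, w)

def is_feasible(w, b, X, y, tol=1e-12):
    """True if every point satisfies y_i * (w.x_i + b) >= 1 (assumed form)."""
    return all(yi * (dot(w, xi) + b) >= 1.0 - tol for xi, yi in zip(X, y))

# Toy data: one point per class.
X = [[2.0, 2.0], [0.0, 0.0]]
y = [+1, -1]

# Candidate (w, b) pairs; each one describes the same separating line.
candidates = [
    ([1.0, 1.0], -2.0),    # feasible, but the norm is larger than needed
    ([0.5, 0.5], -1.0),    # feasible, both constraints hold with equality
    ([0.25, 0.25], -0.5),  # smaller norm, but violates the constraints
]

for w, b in candidates:
    print(w, b, is_feasible(w, b, X, y), objective(w))
```

Among the feasible candidates, the one with the smallest objective is the one whose closest points sit exactly on the margin, which is what the minimization selects.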