Sum of Squared Errors (SSE)

The sum of squared errors (SSE), also known as the residual sum of squares (RSS), is a fundamental metric in statistics and machine learning, used to measure the discrepancy between observed and predicted values in a regression model. It is the sum of the squared differences between each actual data point and the corresponding value predicted by the model. The formula for SSE is \( SSE = \sum_{i=1}^{n} (y_i - \hat{y}_i)^2 \), where \( y_i \) represents the observed values, \( \hat{y}_i \) represents the predicted values, and \( n \) is the total number of observations. Introduced alongside the method of least squares in the early 19th century, SSE is pivotal in fitting regression models: ordinary least squares chooses its parameters precisely by minimizing this quantity.
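
As a quick illustration, here is a minimal NumPy sketch (with made-up observations and predictions) that computes SSE directly from the formula above:

```python
import numpy as np

# Illustrative observed values y_i and model predictions y_hat_i
y = np.array([3.0, 5.0, 7.1, 9.2])
y_hat = np.array([2.8, 5.1, 7.0, 9.5])

# SSE = sum over i of (y_i - y_hat_i)^2
residuals = y - y_hat
sse = np.sum(residuals ** 2)
print(sse)  # approximately 0.15 for these numbers
```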

https://en.wikipedia.org/wiki/Residual_sum_of_squares

In practice, SSE is widely used for evaluating the performance of predictive models: a smaller SSE indicates that the model fits the data better, since the residual errors are smaller. However, SSE is a raw sum, so its magnitude depends on both the scale of the data and the number of observations, which makes it unsuitable for comparing models across different datasets or sample sizes. To address this, the mean squared error \( MSE = SSE / n \) and the root mean squared error \( RMSE = \sqrt{MSE} \) are often preferred: MSE normalizes by the number of observations, and RMSE additionally returns the error to the original units of the response variable, yielding a more interpretable measure of model performance.
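
A short sketch of how the three quantities relate (same made-up data as above), cross-checked against scikit-learn's mean_squared_error:

```python
import numpy as np
from sklearn.metrics import mean_squared_error

y = np.array([3.0, 5.0, 7.1, 9.2])
y_hat = np.array([2.8, 5.1, 7.0, 9.5])

sse = np.sum((y - y_hat) ** 2)   # grows with n and with the scale of y
mse = sse / len(y)               # normalized by the number of observations
rmse = np.sqrt(mse)              # back in the original units of y

# scikit-learn computes the same averaged quantity directly
assert np.isclose(mse, mean_squared_error(y, y_hat))
print(sse, mse, rmse)
```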

https://en.wikipedia.org/wiki/Mean_squared_error

SSE is also integral to iterative algorithms like gradient descent, where it commonly serves as the cost function. At each step, gradient descent computes the gradient of the SSE with respect to the model parameters and adjusts the parameters in the direction that reduces the error, progressively improving the fit to the data. Modern machine learning libraries, such as scikit-learn and TensorFlow, automate the computation of SSE and its derivatives, enabling efficient optimization of complex models; a sketch of the underlying mechanics follows below. Despite its simplicity, SSE remains a cornerstone of statistical and predictive modeling frameworks.
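
To make the mechanics concrete, here is a minimal NumPy sketch (toy data, hypothetical learning rate and iteration count) of gradient descent driving down the SSE of a one-variable linear model; the libraries mentioned above perform equivalent gradient computations automatically:

```python
import numpy as np

# Gradient descent for the linear model y_hat = w*x + b, with SSE as the cost.
# Data, learning rate, and step count are illustrative choices.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([3.1, 4.9, 7.2, 8.8])  # roughly follows y = 2x + 1

w, b = 0.0, 0.0
lr = 0.01  # learning rate

for _ in range(5000):
    y_hat = w * x + b
    resid = y - y_hat
    # Gradients of SSE = sum(resid^2) with respect to w and b
    grad_w = -2.0 * np.sum(resid * x)
    grad_b = -2.0 * np.sum(resid)
    # Step each parameter opposite its gradient to reduce the SSE
    w -= lr * grad_w
    b -= lr * grad_b

print(w, b)  # converges toward the least-squares fit, about w = 1.94, b = 1.15
```

Each iteration evaluates the SSE's gradient at the current parameters and takes a small step downhill, so the SSE decreases until the parameters settle at the least-squares solution.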

https://scikit-learn.org/stable/

https://www.tensorflow.org/