Question

# What’s a simple way to describe bias and variance in machine learning?

By Justin Stoltzfus | Last updated: June 28, 2022

There are many complicated ways to describe bias and variance in machine learning. Many of them rely on complex mathematical equations and use graphs to show how specific examples exhibit different amounts of bias and variance.

Here’s a simple way to describe bias, variance and the bias/variance trade-off in machine learning.

At its core, bias is an oversimplification. A fuller definition adds that the simplification rests on some assumption, and that the assumption introduces error.

If a highly biased result happened to be on the money, it would be highly accurate. The problem is that the simplified model builds in a systematic error, so it misses the bull’s-eye in the same way every time, and that error keeps getting repeated or even amplified as the machine learning program works.
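The repeated, systematic miss described above can be sketched in a few lines of Python. This is only an illustration under assumed conditions that are not from the article: the "true" relationship is taken to be `y = x * x`, the oversimplified model is a straight line, and the sample size, noise level and helper names (`true_f`, `sample_training_set`, `fit_line`) are all made up for the example.

```python
import random
import statistics

def true_f(x):
    # Assumed "real" relationship for this sketch: a curve, not a line.
    return x * x

def sample_training_set(rng, n=50, noise=0.1):
    # Draw a fresh noisy training set from the assumed relationship.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    # Ordinary least squares for a straight line (the oversimplified model).
    mx = statistics.fmean(xs)
    my = statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

rng = random.Random(0)
x0 = 0.0  # the point where we compare prediction and truth
preds = []
for _ in range(500):
    xs, ys = sample_training_set(rng)
    slope, intercept = fit_line(xs, ys)
    preds.append(slope * x0 + intercept)

# Bias: how far the average prediction sits from the true value.
bias_at_x0 = statistics.fmean(preds) - true_f(x0)
# Spread: how much the prediction moves from one training set to the next.
spread_at_x0 = statistics.pstdev(preds)
print(f"bias at x0: {bias_at_x0:.3f}, spread at x0: {spread_at_x0:.3f}")
```

In this setup the line's predictions cluster tightly, but the cluster sits away from the true value, and drawing more training sets does not move it closer. That is the "gathered together but inaccurate" behavior of a high-bias model.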

The simple definition of variance is that the results are too scattered. It typically goes hand in hand with an overly complex model and shows up as a gap between results on the training set and results on the test set.

High variance means that small changes in the training data create great changes in outputs or results.

Another way to simply describe variance is that the model picks up too much noise from the data, and so it gets harder for the machine learning program to isolate and identify the real signal.
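A rough way to see this sensitivity is to retrain the same kind of model on many slightly different training sets and watch how much one prediction moves. The sketch below is an illustration only, and every specific in it is an assumption rather than something from the article: the data come from a made-up relationship `y = x * x` plus noise, the "flexible" model is a 1-nearest-neighbor predictor, and the "smoother" model averages the 25 nearest neighbors.

```python
import random
import statistics

def true_f(x):
    # Assumed underlying signal for this sketch.
    return x * x

def sample_training_set(rng, n=50, noise=0.5):
    # Each call gives a slightly different noisy training set.
    xs = [rng.uniform(-1.0, 1.0) for _ in range(n)]
    ys = [true_f(x) + rng.gauss(0.0, noise) for x in xs]
    return xs, ys

def knn_predict(xs, ys, x0, k):
    # Average the targets of the k training points closest to x0.
    nearest = sorted(zip(xs, ys), key=lambda p: abs(p[0] - x0))[:k]
    return statistics.fmean([y for _, y in nearest])

def prediction_spread(k, trials=500, seed=0):
    # Standard deviation of the prediction at x0 = 0 across many training sets.
    rng = random.Random(seed)
    preds = []
    for _ in range(trials):
        xs, ys = sample_training_set(rng)
        preds.append(knn_predict(xs, ys, 0.0, k))
    return statistics.pstdev(preds)

spread_flexible = prediction_spread(k=1)   # follows individual noisy points
spread_smooth = prediction_spread(k=25)    # averages much of the noise away
print(f"k=1 spread: {spread_flexible:.3f}, k=25 spread: {spread_smooth:.3f}")
```

With `k=1` the prediction copies whichever noisy point happens to be closest, so it jumps around from one training set to the next. Averaging over more neighbors damps that scatter, at the cost of blurring the underlying curve, which is a step back toward bias.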

So one of the simplest ways to compare bias and variance is to suggest that machine learning engineers have to walk a fine line between too much bias or oversimplification, and too much variance or overcomplexity.
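For readers who do want one equation, the fine line is often written as the standard decomposition of expected squared error. The notation here is an assumption introduced for illustration and does not appear elsewhere in this article: $f$ is the true relationship, $\hat{f}$ is the model fitted on a randomly drawn training set, and $y = f(x) + \varepsilon$ where the noise $\varepsilon$ has mean zero and variance $\sigma^2$.

$$
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
= \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
+ \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
+ \underbrace{\sigma^2}_{\text{irreducible noise}}
$$

Simpler models tend to raise the first term and lower the second, while more complex models tend to do the opposite. The third term is left over no matter which model is chosen.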

Another way to represent this well is with a four-quadrant chart showing all combinations of high and low bias and variance. In the low bias/low variance quadrant, all of the results are gathered together in an accurate cluster. In a high bias/low variance result, all of the results are gathered together in an inaccurate cluster. In a low bias/high variance result, the results are scattered around a central point that would represent an accurate cluster, while in a high bias/high variance result, the data points are both scattered and collectively inaccurate.
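The four quadrants can be imitated with a small simulation. This is a minimal sketch with made-up numbers: results are modeled as points thrown at a target at `(0, 0)`, "bias" is an assumed horizontal offset of the aim point, and "variance" is an assumed Gaussian scatter around it. The function names and the specific offsets and scatter values are hypothetical.

```python
import math
import random

def throw_darts(rng, offset, scatter, n=2000):
    # Results are aimed at (offset, 0); the bull's-eye is at (0, 0).
    return [(offset + rng.gauss(0.0, scatter), rng.gauss(0.0, scatter))
            for _ in range(n)]

def summarize(points):
    # Bias: distance from the center of the cluster to the bull's-eye.
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    bias = math.hypot(cx, cy)
    # Spread: root-mean-square distance of points from their own center.
    spread = math.sqrt(
        sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in points) / len(points)
    )
    return bias, spread

rng = random.Random(0)
quadrants = {
    "low bias / low variance": (0.0, 0.1),
    "high bias / low variance": (2.0, 0.1),
    "low bias / high variance": (0.0, 1.0),
    "high bias / high variance": (2.0, 1.0),
}
results = {
    name: summarize(throw_darts(rng, offset, scatter))
    for name, (offset, scatter) in quadrants.items()
}
for name, (bias, spread) in results.items():
    print(f"{name:28s} bias={bias:.2f} spread={spread:.2f}")
```

Each printed row corresponds to one quadrant of the chart: the bias figure says how far the cluster as a whole sits from the target, and the spread figure says how tightly the results are gathered.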

Written by Justin Stoltzfus | Contributor, Reviewer

Justin Stoltzfus is a freelance writer for various Web and print publications. His work has appeared in online magazines including Preservation Online, a project of the National Historic Trust, and many other venues.
