Why is data annotation important in some machine learning projects?
Data annotation is important because many machine learning programs learn from labeled examples, and the labels tell the program what result is expected for each piece of training data.
This has to do with the difference between supervised and unsupervised machine learning. With supervised machine learning, the training data is already labeled, so the machine can understand more about the desired results. For example, if the purpose of the program is to identify cats in images, the system already has a large number of photos tagged as "cat" or "not cat." It then compares new data against those examples to produce its results.
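As a rough illustration of how labeled examples drive a supervised prediction, here is a minimal sketch using a one-nearest-neighbor rule. The feature values ("whisker score", "ear pointiness") and the names `LABELED_EXAMPLES` and `predict` are hypothetical and chosen only for this example; a real image classifier would learn from pixel data rather than two hand-picked numbers.

```python
import math

# Hypothetical annotated training set: each item pairs a feature vector
# (made-up "whisker score", "ear pointiness") with a human-supplied label.
LABELED_EXAMPLES = [
    ((0.9, 0.8), "cat"),
    ((0.8, 0.9), "cat"),
    ((0.1, 0.2), "not cat"),
    ((0.2, 0.1), "not cat"),
]

def predict(features, examples=LABELED_EXAMPLES):
    """Return the label of the closest labeled example (1-nearest-neighbor)."""
    nearest = min(examples, key=lambda ex: math.dist(ex[0], features))
    return nearest[1]

print(predict((0.85, 0.75)))  # closest to the examples tagged "cat"
print(predict((0.15, 0.25)))  # closest to the examples tagged "not cat"
```

Without the labels in the second position of each pair, `predict` would have nothing to return. That is the role annotation plays in this kind of setup.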
With unsupervised machine learning, there are no labels, so the system has to rely on attributes of the data itself and on other techniques, such as grouping similar images together, to identify the cats. Engineers can design the program to pick up on visual features of cats like whiskers or tails, but the process is hardly ever as straightforward as it would be in supervised machine learning, where those labels play a very important role.
Data annotation is the process of affixing labels to the training data sets. Labels can be applied in many different ways. Above we talked about binary data annotation (cat or not cat), but other kinds of data annotation are important as well. For example, in the medical field, data annotation may involve tagging specific biological images with tags identifying pathology, disease markers or other medical properties.
Data annotation takes work, and is often done by teams of people, but it is a fundamental part of what makes many machine learning projects function accurately. It provides the initial setup for teaching a program what it needs to learn and how to discriminate among various inputs to come up with accurate outputs.
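To make the contrast concrete, the sketch below groups unlabeled feature vectors with a very small k-means loop (k = 2). This is one common unsupervised technique, not necessarily what any particular system uses. The data, the function name `two_means`, and the naive choice of starting centers are all assumptions for illustration.

```python
import math

def two_means(points, iterations=10):
    """Split points into two clusters with a tiny k-means (k=2) loop."""
    # Naive initialization (an assumption): first and last points as centers.
    centers = [points[0], points[-1]]
    clusters = ([], [])
    for _ in range(iterations):
        clusters = ([], [])
        for p in points:
            # Assign each point to whichever center is nearer.
            idx = 0 if math.dist(p, centers[0]) <= math.dist(p, centers[1]) else 1
            clusters[idx].append(p)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster)) if cluster else centers[i]
            for i, cluster in enumerate(clusters)
        ]
    return clusters

# Same kind of hypothetical features as a labeled set would have, but with no labels.
UNLABELED = [(0.9, 0.8), (0.1, 0.2), (0.8, 0.9), (0.2, 0.1)]
group_a, group_b = two_means(UNLABELED)
print(group_a)
print(group_b)
```

The algorithm separates the points into two groups, but nothing in its output says which group, if either, contains cats. Someone still has to inspect the groups and name them, which is the information annotation supplies up front in the supervised case.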
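The difference between a binary label and a richer annotation can be sketched as a data structure. The record layout below is hypothetical (the class names `Region` and `AnnotatedImage`, the field names, the file names and the example tags are all invented for illustration) and does not follow any specific annotation tool's format.

```python
from collections import Counter
from dataclasses import dataclass, field

@dataclass
class Region:
    """A tagged area inside an image, e.g. a marked spot on a medical scan."""
    label: str
    box: tuple  # (x, y, width, height) in pixels

@dataclass
class AnnotatedImage:
    """One training image plus whatever annotations people attached to it."""
    filename: str
    label: str = ""                               # whole-image label, e.g. "cat"
    regions: list = field(default_factory=list)   # optional region-level tags

def label_counts(images):
    """Count how often each label appears, at image level and region level."""
    counts = Counter()
    for image in images:
        if image.label:
            counts[image.label] += 1
        for region in image.regions:
            counts[region.label] += 1
    return counts

dataset = [
    AnnotatedImage("photo_001.jpg", label="cat"),
    AnnotatedImage("photo_002.jpg", label="not cat"),
    AnnotatedImage("scan_001.png", regions=[Region("disease marker", (40, 60, 25, 25))]),
]
print(label_counts(dataset))
```

Counting labels this way is a simple check annotation teams might run to see whether one class is badly underrepresented before training starts.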
Written by Justin Stoltzfus | Contributor, Reviewer

Justin Stoltzfus is a freelance writer for various Web and print publications. His work has appeared in online magazines including Preservation Online, a project of the National Historic Trust, and many other venues.