Question

How might companies use random forest models for predictions?

Answer
By Justin Stoltzfus | Last updated: May 21, 2018

Companies often use random forest models in order to make predictions with machine learning processes. The random forest uses multiple decision trees to make a more holistic analysis of a given data set.

A single decision tree works on the basis of separating a certain variable or variables according to a binary process. For example, in assessing data sets related to a set of cars or vehicles, a single decision tree could sort and classify each individual vehicle by weight, separating them into heavy or light vehicles.

The random forest builds on the decision tree model, and makes it more sophisticated. Experts talk about random forests as representing “stochastic discrimination” or the “stochastic guessing” method on data applied to multidimensional spaces. Stochastic discrimination tends to be a way to enhance the analysis of data models beyond what a single decision tree can do.

Basically, a random forest creates many individual decision trees working on important variables with a certain data set applied. One key factor is that in a random forest, the data set and variable analysis of each decision tree will typically overlap. That's important to the model, because the random forest model takes the average results for each decision tree, and factors them into a weighted decision. In essence, the analysis is taking all of the votes of various decision trees and building a consensus to offer productive and logical results.

One example of using a random forest algorithm productively is available at the R-blogger site, where writer Teja Kodali takes the example of determining wine quality through factors such as acidity, sugar, sulfur dioxide levels, pH value and alcohol content. Kodali explains how a random forest algorithm uses a small random subset of features for each individual tree, and then utilizes resulting averages.

With this in mind, enterprises wanting to use random forest machine learning algorithms for predictive modeling will first isolate the predictive data that needs to be boiled down into a set of productions, and then apply it to the random forest model utilizing a certain set of training data. Machine learning algorithms take that training data and work with it to evolve beyond the constraints of their original programming. In the case of random forest models, the technology learns to form more sophisticated predictive results using those individual decision trees to build its random forest consensus.

One way that this could be applied to business is to take various product property variables and use a random forest to indicate potential customer interest. For example, if there are known customer interest factors such as color, size, durability, portability or anything else that customers have indicated interest in, those attributes can be fed into the data sets and analyzed on the basis of their own unique impact for multifactor analysis.

Share this Q&A

  • Facebook
  • LinkedIn
  • Twitter

Tags

Computer Science Artificial Intelligence Emerging Technology Machine Learning Data Science

Written by Justin Stoltzfus | Contributor, Reviewer

Profile Picture of Justin Stoltzfus

Justin Stoltzfus is a freelance writer for various Web and print publications. His work has appeared in online magazines including Preservation Online, a project of the National Historic Trust, and many other venues.

More Q&As from our experts

Related Terms

Related Articles

Term of the Day

Digital Divide

The digital divide is the gap in social and economic equality that occurs when some segments of a given population do not...
Read Full Term

Tech moves fast! Stay ahead of the curve with Techopedia!

Join nearly 200,000 subscribers who receive actionable tech insights from Techopedia.

Resources
Go back to top