Gentle Introduction to Predictive Modeling

阿新 • Published: 2019-01-12

When you’re an absolute beginner, machine learning can be very confusing. Frustratingly so.

Even ideas that seem so simple in retrospect are alien when you first encounter them. There’s a whole new language to learn.

I recently received this question:

So using the iris exercise as an example if I were to pluck a flower from my garden how would I use the algorithm to predict what it is?

It’s a great question.

In this post I want to give a gentle introduction to predictive modeling.

Basics of Predictive Modeling
Photo by Steve Jurvetson, some rights reserved.

1. Sample Data

Data is information about the problem that you are working on.

Imagine we want to identify the species of flower from the measurements of a flower.

The data comprises four flower measurements in centimeters; these are the columns of the data.

Each row of data is one example of a flower that has been measured, together with its known species.

The problem we are solving is to create a model from the sample data that can tell us which species a flower belongs to from its measurements alone.
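To make the rows and columns concrete, here is a minimal sketch of how such sample data might be held in Python. The column names and the handful of rows are assumptions chosen for illustration; the post itself only says there are four measurements in centimeters plus a known species.

```python
# Illustrative sample rows: four measurements in centimeters, then the known species.
# These are example values for this sketch, not a complete dataset.
rows = [
    (5.1, 3.5, 1.4, 0.2, "setosa"),
    (4.9, 3.0, 1.4, 0.2, "setosa"),
    (7.0, 3.2, 4.7, 1.4, "versicolor"),
    (6.4, 3.2, 4.5, 1.5, "versicolor"),
    (6.3, 3.3, 6.0, 2.5, "virginica"),
    (5.8, 2.7, 5.1, 1.9, "virginica"),
]

# Column names are an assumption; the post only mentions "four flower measurements".
columns = ["sepal_length", "sepal_width", "petal_length", "petal_width"]

# Split every row into its inputs (the measurements) and its output (the species).
inputs = [list(row[:4]) for row in rows]
outputs = [row[4] for row in rows]

for measurements, species in zip(inputs, outputs):
    print(dict(zip(columns, measurements)), "->", species)
```

Each printed line is one row: the inputs a model would see, and the known output it should learn to produce.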

Sample of Iris flower data

2. Learn a Model

The problem described above is called supervised learning.

The goal of a supervised learning algorithm is to take some data with a known relationship (actual flower measurements and the species of the flower) and to create a model of those relationships.

In this case the output is a category (flower species) and we call this type of problem a classification problem. If the output was a numerical value, we would call it a regression problem.

The algorithm does the learning. The model contains the learned relationships.

The model itself may be a handful of numbers and a way of using those numbers to relate an input (flower measurements in centimeters) to an output (the species of flower).

We want to keep the model after we have learned it from our sample data.
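As a rough illustration of "the algorithm does the learning, the model contains the learned relationships", the sketch below uses one very simple algorithm: average the measurements of each species. This nearest-centroid idea is an assumption chosen for its simplicity, not the method the post prescribes, and the function name `learn_model` and the sample rows are hypothetical.

```python
from collections import defaultdict


def learn_model(inputs, outputs):
    """Learn one average measurement vector (a centroid) per species."""
    grouped = defaultdict(list)
    for measurements, species in zip(inputs, outputs):
        grouped[species].append(measurements)

    model = {}
    for species, examples in grouped.items():
        count = len(examples)
        # zip(*examples) walks the data column by column.
        model[species] = [sum(column) / count for column in zip(*examples)]
    return model


# Illustrative training data: four measurements in centimeters per flower.
inputs = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [7.0, 3.2, 4.7, 1.4],
    [6.4, 3.2, 4.5, 1.5],
    [6.3, 3.3, 6.0, 2.5],
    [5.8, 2.7, 5.1, 1.9],
]
outputs = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]

model = learn_model(inputs, outputs)
for species, centroid in model.items():
    print(species, [round(value, 2) for value in centroid])
```

Here the learned model really is just a handful of numbers: four averages for each species.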

Create a predictive model from training data and an algorithm.

3. Make Predictions

We don’t need to keep the training data, as the model has summarized the relationships contained within it.

We keep the model learned from data because we want to use it to make predictions.

In this example, we use the model by taking measurements of specific flowers whose species we don’t know.

Our model will read the input (new measurements), perform a calculation of some kind with its internal numbers and make a prediction about which species of flower it happens to be.

The prediction may not be perfect, but if you have good sample data and a robust model learned from that data, it can be quite accurate.
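The sketch below shows one way a stored model could be used on a flower plucked from the garden. It assumes the model is a set of per-species average measurements (a simple nearest-centroid approach picked for illustration); the numbers in `model`, the `predict` function and the new flower's measurements are all hypothetical.

```python
import json
import math

# Hypothetical learned model: one average measurement vector (cm) per species.
# These values are made up for illustration.
model = {
    "setosa": [5.0, 3.4, 1.5, 0.2],
    "versicolor": [5.9, 2.8, 4.3, 1.3],
    "virginica": [6.6, 3.0, 5.6, 2.0],
}

# The model is only numbers, so it can be saved as text and reloaded later
# without keeping any of the training data.
saved = json.dumps(model)
loaded = json.loads(saved)


def predict(model, measurements):
    """Return the species whose stored averages are closest to the measurements."""
    return min(model, key=lambda species: math.dist(model[species], measurements))


# Measurements of a new flower whose species we don't know (assumed values).
new_flower = [6.1, 2.9, 4.6, 1.4]
print(predict(loaded, new_flower))  # prints "versicolor"
```

The calculation here is a distance to each species' averages. Other algorithms would store different numbers and use them differently, but the pattern of measurements in, prediction out stays the same.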

Use the model to make predictions on new data.

Summary

In this post we took a very gentle look at predictive modeling.

The three aspects of predictive modeling we looked at were:

Sample Data: the data that we collect that describes our problem with known relationships between inputs and outputs.
Learn a Model: the algorithm that we use on the sample data to create a model that we can later use over and over again.
Making Predictions: the use of our learned model on new data for which we don’t know the output.

We used the example of classifying plant species based on flower measurements.

This is in fact a famous example in machine learning because it’s a good clean dataset and the problem is easy to understand.

Action Step

Take a moment and really understand these concepts.

They are the foundation of any thinking or work that you might do in machine learning.

Your action step is to think through the three aspects (data, model, predictions) and relate them to a problem that you would like to work on.

Any questions at all, please ask in the comments. I’m here to help.
