How To Investigate Machine Learning Algorithm Behavior

Machine learning algorithms are complex systems that require study to understand.

Static descriptions of machine learning algorithms are a good starting point, but are insufficient to get a feeling for how the algorithm behaves. You need to see the algorithm in action.

Experimenting on a running machine learning algorithm will allow you to build an intuition for the cause-and-effect relationship between the algorithm parameters and the results you can achieve on different classes of problem.

In this post you will discover how to investigate a machine learning algorithm. You will learn about a simple 5-step process that you can use today to design and complete your first machine learning algorithm experiment.

You will discover that machine learning experiments are not just for academics, that you can do them too, and that experimentation is required on the path to mastery, because the empirical cause-and-effect knowledge that you will gain is simply not available anywhere else.

Investigate Machine Learning Algorithm Behavior
Photo by U.S. Army RDECOM, some rights reserved

What is Investigating Machine Learning Algorithms

Your objective when investigating a machine learning algorithm is to find behaviors that lead to good results that are generalizable across problems and classes of problems.

You investigate machine learning algorithms by performing systematic research into the algorithm's behavior. This is done by designing and executing controlled experiments.

Once you have completed an experiment, you can interpret and present the results. The results give you glimpses into the cause and effect between changes to the algorithm, its behaviors and the results you can achieve.

How to Investigate Machine Learning Algorithms

In this section we will look at a simple 5-step procedure that you can use to investigate a machine learning algorithm.

1. Select an Algorithm

Select an algorithm that you have questions about.

This may be an algorithm that you are using in earnest on a problem, or an algorithm that you see doing well in other contexts and may want to use in the future.

For the purposes of experimentation, it is useful to take an off-the-shelf implementation of the algorithm. This gives you a baseline that most likely has few if any bugs.

Implementing the algorithm yourself can be a great way to learn about the algorithm procedure, but can also introduce additional variables into the experiment such as bugs and the myriad of micro-decisions that must be made for each algorithm implementation.

2. Identify a Question

You must have a research question that you are seeking to answer. The more specific the question, the more useful the answer.

Some example questions include:

  • What is the effect of increasing k in kNN as a fraction of the training dataset size?
  • What is the effect of selecting different kernels in SVM on binary classification problems?
  • What are the effects of different attribute scaling on logistic regression on binary classification problems?
  • What is the effect of adding random attributes to the training dataset on classification accuracy in random forest?

Design the question that you want answered about your algorithm. Consider listing five variations of the question and home in on the one that is the most specific.

3. Design the Experiment

Pick the elements out of the question that will make up your experiment.

For example, take the following question from above: “What are the effects of different attribute scaling on logistic regression on binary classification problems?”

The elements you can pick out of this question for the design of your experiment are:

  • Attribute Scaling Methods. You could include methods like normalization, standardization, raising an attribute to a power, taking the logarithm, etc.
  • Logistic Regression. Which implementation of logistic regression you want to use.
  • Binary Classification Problems. Different standard binary classification problems that have numeric attributes. Multiple problems will be required, some with attributes all on the same scale (like ionosphere) and others that have attributes with a variety of scales (like diabetes).
  • Performance. A model performance score is required such as classification accuracy.

Take the time to select the elements of your experiment carefully so that they best answer your question.
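The elements above could be wired together roughly as follows. This is only a sketch under stated assumptions: it uses scikit-learn, it covers only normalization and standardization (plus a no-scaling baseline), and it swaps in the bundled breast cancer dataset and a synthetic dataset as stand-ins, because ionosphere and diabetes are not shipped with scikit-learn as classification problems. The helper names (load_problems, scaling_methods, run_design) are made up for illustration.

```python
from sklearn.datasets import load_breast_cancer, make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer, MinMaxScaler, StandardScaler


def load_problems():
    """Binary classification problems (stand-ins, not the datasets named in the post)."""
    X_bc, y_bc = load_breast_cancer(return_X_y=True)  # attributes on varied scales
    X_syn, y_syn = make_classification(
        n_samples=300, n_features=10, random_state=1
    )  # attributes on similar scales
    return {"breast_cancer": (X_bc, y_bc), "synthetic": (X_syn, y_syn)}


def scaling_methods():
    """Attribute scaling methods under study, plus an unscaled baseline."""
    return {
        "none": FunctionTransformer(),  # identity transform: data passes through unchanged
        "normalize": MinMaxScaler(),    # rescale each attribute to [0, 1]
        "standardize": StandardScaler() # zero mean, unit variance per attribute
    }


def run_design(cv=5):
    """Score every (problem, scaling method) pair with cross-validated accuracy."""
    results = {}
    for problem, (X, y) in load_problems().items():
        for name, scaler in scaling_methods().items():
            # Scaling sits inside the pipeline so it is fit on training folds only.
            model = make_pipeline(scaler, LogisticRegression(max_iter=5000))
            results[(problem, name)] = cross_val_score(
                model, X, y, cv=cv, scoring="accuracy"
            )
    return results


if __name__ == "__main__":
    for (problem, name), scores in run_design().items():
        print(f"{problem:14s} {name:12s} mean={scores.mean():.3f} std={scores.std():.3f}")
```

Keeping the scaler inside the pipeline is a deliberate choice: the only thing that changes between runs is the scaling method, which is the variable the question asks about.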

4. Execute the Experiment and Report Results

Complete your experiment.

If the algorithm is stochastic, you may need to repeat experimental runs multiple times and take a mean and standard deviation.
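As a rough illustration of repeated runs, the sketch below trains a random forest (a stochastic algorithm) several times with different seeds and reports the mean and standard deviation of test accuracy. The synthetic dataset, the number of trees, and the function name repeated_runs are assumptions chosen for the example, not part of any prescribed method.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def repeated_runs(n_runs=10):
    """Repeat a stochastic algorithm with different seeds; return mean, std, scores."""
    # Synthetic data as a stand-in problem (an assumption for illustration).
    X, y = make_classification(n_samples=400, n_features=20, random_state=7)
    # Hold the data split fixed so only the algorithm's own randomness varies.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=7, stratify=y
    )
    scores = []
    for seed in range(n_runs):
        model = RandomForestClassifier(n_estimators=50, random_state=seed)
        model.fit(X_train, y_train)
        scores.append(model.score(X_test, y_test))  # classification accuracy
    scores = np.array(scores)
    # ddof=1 gives the sample standard deviation across runs.
    return scores.mean(), scores.std(ddof=1), scores


if __name__ == "__main__":
    mean, std, _ = repeated_runs()
    print(f"accuracy: mean={mean:.3f} std={std:.3f}")
```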

If you are looking for differences in results between experimental runs (such as different parameters), you may want to use a statistical test to indicate whether the differences are statistically significant (such as Student's t-test).
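One possible way to run such a check is with SciPy's two-sample t-test, sketched below. The two configurations compared (random forests with 5 and 100 trees), the synthetic data, the 0.05 threshold, and the helper names forest_scores and compare are all assumptions made for the example. Note also that scores from runs sharing one train/test split are not fully independent, so treat the p-value as a rough guide rather than a verdict.

```python
from scipy import stats
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


def forest_scores(n_estimators, data, n_runs=10):
    """Test accuracy of a random forest over several random seeds."""
    X_train, X_test, y_train, y_test = data
    scores = []
    for seed in range(n_runs):
        model = RandomForestClassifier(n_estimators=n_estimators, random_state=seed)
        scores.append(model.fit(X_train, y_train).score(X_test, y_test))
    return scores


def compare(scores_a, scores_b, alpha=0.05):
    """Two-sample Student's t-test; returns (t statistic, p-value, significant?)."""
    t_stat, p_value = stats.ttest_ind(scores_a, scores_b)
    return float(t_stat), float(p_value), bool(p_value < alpha)


if __name__ == "__main__":
    # Synthetic stand-in problem (an assumption for illustration).
    X, y = make_classification(n_samples=400, n_features=20, random_state=7)
    data = train_test_split(X, y, test_size=0.3, random_state=7, stratify=y)
    small = forest_scores(5, data)
    large = forest_scores(100, data)
    t_stat, p_value, significant = compare(small, large)
    print(f"t={t_stat:.3f} p={p_value:.4f} significant={significant}")
```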

Environments like R and scikit-learn/SciPy provide the pieces needed to complete these types of experiments, but you will need to bring them together and script the experiment. Others, like Weka, build them into a graphical user interface (see this tutorial on running your first experiment in Weka). The tools you use matter less than the question and the rigor of your experimental design.

Summarize the results of your experiment. You may want to use tables and graphs. Presenting results alone is insufficient. They are just numbers. You must tie the numbers back to your question and filter their meaning through the design of your experiment.

What do the results indicate about your research question?

Put on your skeptical hat. What holes can you find in the results, and what limitations must you place on them? Do not shy away from this part. Knowing the limitations is just as important as knowing the outcomes of an experiment.

5. Repeat

Repeat the process.

Continue to investigate your selected algorithm. You may even want to repeat the same experiment with different parameters or different test datasets. You may want to address the limitations in your experiment.
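As one example of repeating an experiment across parameter values, the sketch below sweeps k in kNN as a fraction of the training set size, echoing one of the example questions from step 2. The synthetic dataset, the particular fractions, and the function name knn_sweep are assumptions for illustration; rerunning the same sweep on other datasets would be a natural next repetition.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier


def knn_sweep(fractions=(0.01, 0.05, 0.1, 0.25, 0.5), cv=5):
    """Cross-validated accuracy of kNN with k set as a fraction of the training size."""
    # Synthetic stand-in problem (an assumption for illustration).
    X, y = make_classification(n_samples=300, n_features=10, random_state=3)
    # Approximate number of training rows available in each fold.
    n_train = int(len(X) * (cv - 1) / cv)
    results = {}
    for frac in fractions:
        k = max(1, int(frac * n_train))  # at least one neighbor
        scores = cross_val_score(KNeighborsClassifier(n_neighbors=k), X, y, cv=cv)
        results[frac] = (k, scores.mean(), scores.std())
    return results


if __name__ == "__main__":
    for frac, (k, mean, std) in knn_sweep().items():
        print(f"fraction={frac:.2f} k={k:3d} mean={mean:.3f} std={std:.3f}")
```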

Don’t stop with one experiment. Start building up a knowledge base and an intuition for the algorithm.

With some simple tools, some good questions and a good splash of rigor and skepticism, you can very quickly start building a world-class understanding of the behavior of an algorithm.

Investigating Algorithms is Not Just for Academics

You can investigate the behaviors of machine learning algorithms.

You do not need a higher degree, you do not need to be trained in research methods, you do not need to be an academic.

Careful systematic investigation of machine learning algorithms is open to anyone with a computer and a deep interest. In fact, if you want to master machine learning, you must get comfortable with systematic investigations of machine learning algorithms. The knowledge is simply not out there. You must go out and collect it yourself, empirically.

You do need to be skeptical and to be careful when talking about the applicability of your findings.

You do not need to have unique questions. You will gain a lot by investigating the standard questions, such as the effect of one parameter generalized across a few standard datasets. You may very well find limitations of, or counterpoints to, common best-practice heuristics.

Action Steps

In this post you discovered the importance of investigating the behaviors of machine learning algorithms through controlled experimentation. You discovered a simple 5-step process that you can use to design and execute your first experiment on a machine learning algorithm.

Take action. Use the process you learned in this blog post and complete your first machine learning experiment. Once you have completed one, even a very small one, you will have the confidence, tools, and ability to complete a second and many more.

I would love to hear about your first experiment. Leave a comment and share your results or what you learned.

