
Practical Machine Learning Problems

What is Machine Learning? We can read authoritative definitions of machine learning, but really, machine learning is defined by the problem being solved. Therefore the best way to understand machine learning is to look at some example problems.

In this post we will first look at some well known and understood examples of machine learning problems in the real world. We will then look at a taxonomy (naming system) for standard machine learning problems and learn how to identify a problem as one of these standard cases. This is valuable, because knowing the type of problem we are facing allows us to think about the data we need and the types of algorithms to try.

10 Examples of Machine Learning Problems

Machine learning problems abound. They make up core or difficult parts of the software you use on the web or on your desktop every day. Think of the “do you want to follow” suggestions on Twitter and the speech understanding in Apple’s Siri.

Below are 10 examples of machine learning that really ground what machine learning is all about.

  • Spam Detection: Given email in an inbox, identify those email messages that are spam and those that are not. Having a model of this problem would allow a program to leave non-spam emails in the inbox and move spam emails to a spam folder. We should all be familiar with this example.
  • Credit Card Fraud Detection: Given credit card transactions for a customer in a month, identify those transactions that were made by the customer and those that were not. A program with a model of this decision could refund those transactions that were fraudulent.
  • Digit Recognition: Given zip codes handwritten on envelopes, identify the digit for each handwritten character. A model of this problem would allow a computer program to read and understand handwritten zip codes and sort envelopes by geographic region.
  • Speech Understanding: Given an utterance from a user, identify the specific request made by the user. A model of this problem would allow a program to understand and make an attempt to fulfil that request. The iPhone with Siri has this capability.
  • Face Detection: Given a digital photo album of many hundreds of digital photographs, identify those photos that include a given person. A model of this decision process would allow a program to organize photos by person. Some cameras and software like iPhoto have this capability.

Example of Face Detection in a Photo.
Photo by mr. ‘sto, licensed under an Attribution-ShareAlike 2.0 Generic Creative Commons License.

  • Product Recommendation: Given a purchase history for a customer and a large inventory of products, identify those products in which that customer will be interested and likely to purchase. A model of this decision process would allow a program to make recommendations to a customer and motivate product purchases. Amazon has this capability. Also think of Facebook, GooglePlus and LinkedIn, which recommend users to connect with after you sign up.
  • Medical Diagnosis: Given the symptoms exhibited in a patient and a database of anonymized patient records, predict whether the patient is likely to have an illness. A model of this decision problem could be used by a program to provide decision support to medical professionals.
  • Stock Trading: Given the current and past price movements for a stock, determine whether the stock should be bought, held or sold. A model of this decision problem could provide decision support to financial analysts.
  • Customer Segmentation: Given the pattern of behaviour by a user during a trial period and the past behaviours of all users, identify those users that will convert to the paid version of the product and those that will not. A model of this decision problem would allow a program to trigger customer interventions to persuade the customer to convert early or better engage in the trial.
  • Shape Detection: Given a user hand drawing a shape on a touch screen and a database of known shapes, determine which shape the user was trying to draw. A model of this decision would allow a program to show the platonic version of the shape the user drew, making for crisp diagrams. The Instaviz iPhone app does this.

These 10 examples give a good sense of what a machine learning problem looks like. In each there is a corpus of historic examples, a decision that needs to be modelled, and a business or domain benefit to having that decision modelled and made automatically.
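To make those three ingredients concrete, here is a minimal sketch of the spam example in plain Python. The tiny `HISTORY` corpus, the word-count scoring rule, and the function names `train`, `classify` and `route` are all assumptions made for illustration. A real spam filter would use far more data and a proper statistical model.

```python
from collections import Counter

# Hypothetical corpus of historic examples: (message text, label).
HISTORY = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday with the team", "ham"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "ham": Counter()}
    for text, label in examples:
        counts[label].update(text.lower().split())
    return counts

def classify(model, text):
    """The modelled decision: pick the label whose words match more often."""
    words = text.lower().split()
    spam_score = sum(model["spam"][w] for w in words)
    ham_score = sum(model["ham"][w] for w in words)
    # Ties (including messages with no known words) default to "ham".
    return "spam" if spam_score > ham_score else "ham"

def route(model, text):
    """The automated action that gives the model its practical benefit."""
    return "spam folder" if classify(model, text) == "spam" else "inbox"

model = train(HISTORY)
print(route(model, "claim your free prize"))
print(route(model, "agenda for the team meeting"))
```

The corpus feeds `train`, the decision lives in `classify`, and the benefit shows up in `route`, which moves a message without a human having to look at it.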

Some of these are among the hardest problems in Artificial Intelligence, such as Natural Language Processing and Machine Vision (doing things that humans do easily). Others are still difficult, but are classic examples of machine learning, such as spam detection and credit card fraud detection.

Think about some of your interactions with online and offline software in the last week. I’m sure you could easily guess at another ten or twenty examples of machine learning you have directly or indirectly used.

Types of Machine Learning Problems

Reading through the list of example machine learning problems above, I’m sure you can start to see similarities. This is a valuable skill, because being good at extracting the essence of a problem will allow you to think effectively about what data you need and what types of algorithms you should try.

There are common classes of problem in Machine Learning. The problem classes below are archetypes for most of the problems we refer to when we are doing Machine Learning.

  • Classification: Data is labelled, meaning it is assigned a class, for example spam/non-spam or fraud/non-fraud. The decision being modelled is to assign labels to new unlabelled pieces of data. This can be thought of as a discrimination problem, modelling the differences or similarities between groups.
  • Regression: Data is labelled with a real value (think floating point) rather than a label. Examples that are easy to understand are time series data like the price of a stock over time. The decision being modelled is what value to predict for new unpredicted data.
  • Clustering: Data is not labelled, but can be divided into groups based on similarity and other measures of natural structure in the data. An example from the above list would be organising pictures by faces without names, where the human user has to assign names to groups, like iPhoto on the Mac.
  • Rule Extraction: Data is used as the basis for the extraction of propositional rules (antecedent/consequent, aka if-then). Such rules may be, but typically are not, directed, meaning that the methods discover statistically supportable relationships between attributes in the data, not necessarily involving something that is being predicted. An example is the discovery of the relationship between the purchase of beer and diapers (this is data mining folklore; true or not, it’s illustrative of the desire and opportunity).
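The four problem classes can be sketched on toy data in a few lines of plain Python each. Everything below is an illustrative assumption rather than a recommended method: nearest-centroid stands in for classification, a least-squares line for regression, a two-group 1-D k-means for clustering, and support/confidence counting for rule extraction. The function names and the sample data are made up for this sketch.

```python
from statistics import mean

def nearest_centroid(train, x):
    """Classification: train maps each class label to a list of 1-D values."""
    centroids = {label: mean(values) for label, values in train.items()}
    return min(centroids, key=lambda label: abs(centroids[label] - x))

def fit_line(xs, ys):
    """Regression: ordinary least squares for y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    covariance = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    variance = sum((x - mx) ** 2 for x in xs)
    slope = covariance / variance
    return slope, my - slope * mx

def two_means(values, iterations=10):
    """Clustering: 1-D k-means with k=2, seeded with the min and max value.

    Assumes the values are not all identical, so neither group is empty.
    """
    lo, hi = min(values), max(values)
    for _ in range(iterations):
        near_lo = [v for v in values if abs(v - lo) <= abs(v - hi)]
        near_hi = [v for v in values if abs(v - lo) > abs(v - hi)]
        lo, hi = mean(near_lo), mean(near_hi)
    return sorted(near_lo), sorted(near_hi)

def rule_stats(baskets, antecedent, consequent):
    """Rule extraction: support and confidence of antecedent -> consequent.

    Assumes at least one basket contains the antecedent.
    """
    with_a = [b for b in baskets if antecedent in b]
    with_both = [b for b in with_a if consequent in b]
    return len(with_both) / len(baskets), len(with_both) / len(with_a)

# Hypothetical toy data for each problem class.
label = nearest_centroid({"small": [1.0, 2.0], "large": [9.0, 10.0]}, 3.0)
slope, intercept = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
groups = two_means([1.0, 1.5, 2.0, 9.0, 9.5, 10.0])
baskets = [{"beer", "diapers"}, {"beer", "diapers", "chips"}, {"beer"}, {"milk"}]
support, confidence = rule_stats(baskets, "beer", "diapers")
print(label, slope, intercept, groups, support, confidence)
```

Note the difference in inputs: the first two functions need labelled data (a class or a real value per example), while the last two work from unlabelled data alone.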

When you think a problem is a machine learning problem (a decision problem that needs to be modelled from data), think next about which type of problem you could most easily phrase it as, or start from the type of outcome the client or requirement is asking for and work backwards.
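As a rough first pass at that question, you can look at what the data is labelled with. The helper below is a hypothetical heuristic written for this post, not a rule from any library: it treats a missing target as an unlabelled problem, a purely floating point target as regression, and anything else as classification.

```python
def problem_type(targets):
    """Guess the problem class from the target values of a dataset.

    targets is None (or empty) when the data carries no labels at all,
    otherwise a list holding one target value per example.
    """
    if not targets:
        # No labels: look for natural groups or relationships instead.
        return "clustering or rule extraction"
    if all(isinstance(t, float) for t in targets):
        # Real-valued labels, such as a stock price.
        return "regression"
    # Discrete labels, such as spam/non-spam.
    return "classification"

print(problem_type(["spam", "non-spam", "spam"]))
print(problem_type([101.5, 99.2, 100.8]))
print(problem_type(None))
```

A heuristic like this only narrows the options. Working backwards from the outcome the client needs can still change the answer, for example when a real-valued price is turned into a buy/hold/sell decision.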

Resources

There are few resources that provide lists of real-world machine learning problems. They may be out there, but I can’t find them. I still found some cool resources for you though:

  • The Annual “Humies” Awards: These are prizes awarded for results achieved by algorithms that are competitive with results produced by humans. It’s exciting because the algorithms are working only from data or cost functions and are able to be creative and inventive enough to infringe on patents. Amazing!
  • The AI Effect: The notion that as soon as an Artificial Intelligence program achieves a good enough result it is no longer regarded as Artificial Intelligence; instead it is just technology and gets used in everyday things. This applies equally to Machine Learning.
  • AI-Complete: Refers to very difficult problems in Artificial Intelligence that, if solved, would be an example of Strong AI (AI as envisioned in science fiction, true AI). The problems of Computer Vision and Natural Language Processing are both examples of AI-Complete problems and may also be considered domain-specific categories of machine learning problems.
  • What are the Top 10 problems in Machine Learning for 2013? This Quora question has some excellent answers, and one that lists some broad categories of practical machine learning problems.

We have reviewed some common examples of real-world machine learning problems and a taxonomy of classes of machine learning problems. We now have some confidence to comment on whether a problem is a machine learning problem or not and to pick out the elements from a problem description and determine whether it is a classification, regression, clustering or rule extraction type of problem.

Do you know of some more real-world machine learning problems? Leave a comment and share your thoughts.
