Work on Machine Learning Problems That Matter To You

阿新 • Published: 2019-01-12

It is difficult to stay motivated when self-studying machine learning.

The standard test datasets can be quite abstract and disconnected from you and from your everyday life. Boring even. A trick that you might like to use is to find and work on a dataset that matters to you.

In this post, we will look at some ideas for datasets that you could use to motivate and even accelerate your journey into applied machine learning.

Problems with Impact

We have looked before at the need to work on problems that have an impact. The problems that have the biggest impact are the problems in which the outcome affects you directly.

These may be problems related to your personal life, hobbies or even your work. They are problems that may or may not already be addressed. The size and scope of the problem do not matter as long as you are invested in the outcome in some way. The results matter to you.

This is a powerful method for two reasons:

It gives you permission to treat the problem objectively and apply your rational problem-solving skills to it, which may produce some interesting results.
Caring about the outcome is more likely to motivate you to learn new and different methods, to go deep into the definition of the problem and to write up your findings. Because you care about the outcome, you will treat the project more seriously.

You can’t pick any old problem. There are some additional considerations:

Data: Machine learning algorithms model problems with data, and the quality of the modeling typically depends on the quality of the data. You need to have access to data for the problem, or be able to collect it.
Public: Can the data and/or the results be made public? This may matter to you if you want to use the project as a part of your machine learning portfolio, which I strongly encourage you to do.
Question: Start with a question to be sure that there is a problem to be solved. The question will clarify the data you need to collect and the impact the answer will have on you.

In the next sections, we will look at three areas of your life where you might find problems that you could investigate with machine learning.

Machine Learning at Home

Are there problems and sources of data in your personal life that you can model using machine learning methods?

Track and model your own fitness.
Photo by Phil Gradwell, some rights reserved.

Five examples that come to my mind are:

Personal Finance: You can model some aspect of your personal finance. This could be something like weekly expenditure prediction or large purchase prediction. It could also be something related to your investment portfolio if that is your thing.
Transport: You can model some aspect of your personal transport. This may be which train or bus you take on your commute on a given day, the commute time or some detail like work arrival time prediction or fuel consumption.
Food: You can model something about the food that you consume. This could be the quantity, calories, snack prediction or a model of what you need to purchase in a given week.
Media: You could model your media consumption, such as TV, movies, books, music or websites. An obvious approach would be to model it as a recommendation problem, but also consider models of consumption volume, such as how much you consume, when you consume it, and other related patterns you could predict.
Fitness: You could model some aspect of personal fitness. This could be weight, BMI, a body measurement, or an aspect of endurance like the number of sit-ups or time to complete your routine. How about modeling whether you will go to the gym or not on a given day (what would the inputs be?).

Remember, you have to have access to the data, which very likely means you have to spend some time measuring and collecting the data.
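To make the personal finance idea concrete, here is a minimal sketch of weekly expenditure prediction. It assumes you have logged a total for each week. The numbers, the function names and the four-week window are all made up for illustration. The model is a simple moving-average baseline rather than a machine learning algorithm, but a baseline like this gives you an error to beat once you start trying real models on your own data.

```python
from statistics import mean


def moving_average_forecast(history, window=4):
    """Predict the next value as the mean of the last `window` values."""
    if not history:
        raise ValueError("history must contain at least one value")
    return mean(history[-window:])


def walk_forward_mae(series, window=4):
    """Mean absolute error of the baseline, replaying the series week by week.

    Each week is predicted using only the weeks that came before it.
    """
    if len(series) <= window:
        raise ValueError("series must be longer than the window")
    errors = []
    for i in range(window, len(series)):
        prediction = moving_average_forecast(series[:i], window)
        errors.append(abs(series[i] - prediction))
    return mean(errors)


# Hypothetical weekly spending totals, invented for this example.
weekly_spend = [120.0, 135.0, 110.0, 150.0, 140.0, 125.0, 160.0, 130.0]

next_week = moving_average_forecast(weekly_spend)
baseline_error = walk_forward_mae(weekly_spend)
print(f"Forecast for next week: {next_week:.2f}")
print(f"Walk-forward MAE of the baseline: {baseline_error:.2f}")
```

The same walk-forward evaluation could be reused for commute times, calories or body weight. Only the data you collect changes.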

Machine Learning with a Hobby

Do you have a hobby other than machine learning? Consider what data related to your hobby you could collect and model.

Apply machine learning to your hobbies.
Photo by You As A Machine, some rights reserved.

Five examples of hobbies you might have or want to model include:

Sports: You can model the performance of a team or a league. You may be into fantasy sports teams and be interested in modeling the performance of individual players. There is also a gambling side to sports outcomes that might spark your interest (be careful). Maybe you have a child or family member who plays a sport on weekends, which might provide a problem and source of data a little more connected to you.
Games: You can model an aspect of a game you play. This may be a board game, card game or computer game. You could model and predict win/loss outcomes, specific outcome scores or specific moves within the game.
Arts/Crafts: Maybe you’re an amateur artist or craftsperson and post photos of your creations to a public social photo album. You could model and predict whether a given photo you post is liked by or interesting to third parties (in the form of views or comments). A similar approach could be used in person with control groups (family members?) and for various other art forms that may require a subjective assessment of interest or quality (painting, music, papier-mâché, etc.).
Language: You could model some aspect of a language you or a friend or family member is learning. If flash cards are being used, you could get into the interesting problem of modeling whether a given card’s contents will be remembered. You could also model other aspects of language learning, such as the rate of new words acquired and the frequency of errors. Collecting data may be an interesting challenge.
Photography: Maybe you’re a bird watcher, nature lover, or have some other reason to photograph nature in all of its variety. You could model the problem of classifying photos of leaves/birds/animals into their groups. You could also model the problem of whether a given photo includes an object of interest, like your pet dog or your own face.

Gravitate towards hobbies that have datasets readily available for you to draw upon and model.
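As a rough illustration of the flash card idea, the sketch below estimates how likely each card is to be remembered from a log of past reviews. The log format, the card names and the function name are assumptions made for this example. The estimate uses add-one (Laplace) smoothing on simple counts, which is a starting point rather than a proper memory model.

```python
from collections import defaultdict


def recall_estimates(review_log):
    """Return {card: smoothed probability of recall} from (card, remembered) pairs."""
    counts = defaultdict(lambda: [0, 0])  # card -> [times remembered, times reviewed]
    for card, remembered in review_log:
        counts[card][0] += int(remembered)
        counts[card][1] += 1
    # Add-one smoothing keeps estimates away from 0 and 1 for rarely seen cards.
    return {card: (hits + 1) / (total + 2) for card, (hits, total) in counts.items()}


# Hypothetical review log: (card, whether it was remembered on that review).
review_log = [
    ("gato", True),
    ("gato", True),
    ("gato", False),
    ("perro", False),
    ("perro", False),
    ("casa", True),
]

estimates = recall_estimates(review_log)
hardest_first = sorted(estimates, key=estimates.get)
for card in hardest_first:
    print(f"{card}: estimated recall probability {estimates[card]:.2f}")
```

Richer inputs, such as the time since the last review or the length of the word, would turn this into a genuine prediction problem.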

Machine Learning at Work

Do you have access to data at work or about the things you work on? This could be your blog or something else online, or it could be data on or related to something your work creates or releases.

Apply machine learning at work.
Photo by BiblioArchives / LibraryArchives, some rights reserved.

Visitors: Can you model something about the visits to your website (this could be your own blog or web property)? Perhaps a demographic feature of a visitor such as platform, browser, etc., or perhaps the source of visitors or volume of page views in a period based on content posted.
Customers: Like visitors, are there properties of customers that can be modeled? This might be purchase volumes, shopping cart contents, purchase times or similar demographic information. I like this area because it can flush out a lot of new knowledge (supported by data) about a business that was taken for granted.
Conversion: Are there qualities of conversion that can be modeled? This may be aspects of conversion such as time or customer demographics. It may be the prediction of conversion chains such as trial, paid, up-sell.
Churn: For service industries, churn is very important and is likely already being modeled. Is there some form of churn that is not being modeled? Churn from trials perhaps. Churn from email lists or from RSS subscriptions?
Proprietary data: Is there some unique or interesting data that your organization creates or has access to? What questions can you ask of the data that might be worth modeling? For example, meteorological data, manufacturing data, mining data, etc.

Be mindful of privacy concerns and data ownership. You may require permission before accessing the data and have to keep the results confidential or internal to your organization.
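For the churn idea, a sensible first step before any modeling is to look at churn rates by segment and to know what a trivial predictor would score. The sketch below assumes a hypothetical list of trial records with a `source` field and a `churned` flag. The field names, the data and the function names are invented for illustration.

```python
from collections import Counter


def churn_rate_by_segment(records, segment_key):
    """Return {segment: fraction of records in that segment that churned}."""
    totals = Counter()
    churned = Counter()
    for record in records:
        segment = record[segment_key]
        totals[segment] += 1
        churned[segment] += int(record["churned"])
    return {segment: churned[segment] / totals[segment] for segment in totals}


def majority_baseline_accuracy(records):
    """Accuracy of always predicting the more common outcome."""
    n_churned = sum(int(record["churned"]) for record in records)
    return max(n_churned, len(records) - n_churned) / len(records)


# Hypothetical trial sign-ups, invented for this example.
trials = [
    {"source": "email", "churned": True},
    {"source": "email", "churned": False},
    {"source": "search", "churned": True},
    {"source": "search", "churned": True},
    {"source": "search", "churned": False},
    {"source": "referral", "churned": False},
]

rates = churn_rate_by_segment(trials, "source")
baseline = majority_baseline_accuracy(trials)
print(rates)
print(f"Majority-class baseline accuracy: {baseline:.2f}")
```

Any churn model you build later should beat the majority-class baseline, otherwise it has learned nothing useful from the data.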

I hope you have found this useful and perhaps thought of a problem that you could investigate that will give you that push to dive deeper into applied machine learning.

If so, leave a comment. I’d love to hear what you came up with.
