Build a machine learning model for analyzing financial credit risk using Watson Studio

IBM Watson Studio is a data science platform that provides all of the tools needed to develop a data-centric solution on the cloud. It uses Apache Spark clusters to provide the computational power needed to develop complex machine learning models. You can choose to create assets in Python, Scala, and R, and use open source frameworks that are already installed on Watson Studio. You can also use Watson Studio to manage models and flows that you deploy without having to leave the project workspace.

Learning objectives

This tutorial shows how to analyze data and build prediction models without writing code. You can either choose the classification algorithm yourself (manual mode) or let Watson Studio pick a suitable one based on the data (automatic mode). You can create and deploy the model within Watson Studio, so no additional platform is required. The deployed model can also be called as a service from an external application.
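
As a rough illustration of calling the deployed model from an external application, the sketch below shows how a Python program might assemble a scoring request and read the reply. The `fields`/`values` JSON layout, the column names, the `prediction` field, and the sample response are all assumptions made for illustration; the Implementation tab of your deployment shows the actual endpoint and payload format. No network call is made here.

```python
import json

# Hypothetical subset of input columns; the real list must match the
# columns the model was trained on.
FIELDS = ["LIMIT_BAL", "AGE", "PAY_0", "BILL_AMT1", "PAY_AMT1"]


def build_payload(customers):
    """Turn a list of dicts into an assumed fields/values scoring body."""
    rows = [[c[name] for name in FIELDS] for c in customers]
    return {"fields": FIELDS, "values": rows}


def read_predictions(response):
    """Extract the 'prediction' column from an assumed fields/values reply."""
    idx = response["fields"].index("prediction")
    return [row[idx] for row in response["values"]]


# Made-up customer record used only to show the shape of the request.
customer = {"LIMIT_BAL": 20000, "AGE": 24, "PAY_0": 2,
            "BILL_AMT1": 3913, "PAY_AMT1": 0}
payload = build_payload([customer])
print(json.dumps(payload))

# Made-up reply in the same assumed layout: 1 = default, 0 = no default.
sample_response = {"fields": FIELDS + ["prediction"],
                   "values": [[20000, 24, 2, 3913, 0, 1]]}
print(read_predictions(sample_response))
```

Keeping payload construction and response parsing in small functions means only the HTTP call and credentials need to change once the real endpoint details are known.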

Prerequisites

Before beginning this tutorial, you’ll need:

  • Financial data related to credit card customers available from the open source database KAGGLE-UCIML

Estimated time

It should take you approximately 30 minutes to complete this tutorial.

Steps

  1. Log in to your IBM Cloud account, navigate to Catalog, and choose Object Storage.

  2. Navigate to Catalog and create a Watson Studio instance.

  3. After you have created the Watson Studio instance, click Get Started to launch the platform.

  4. Click New project to create a new project.

    Observe that the Object Storage created in step 1 is linked to your project.

  5. Because you want to use the model functions provided by Watson Studio, select Model.

  6. Associate a Machine Learning service instance with the project. If none is associated yet, click “Associate a Machine Learning service instance.”

  7. Choose New to create a new Machine Learning service instance or use Existing to reuse an existing one.

  8. Reload the Machine Learning instance by clicking Reload.

  9. Associate an Apache Spark instance with the project. If none is associated yet, click “Associate an IBM Analytics for Apache Spark instance.”

  10. Choose New to create a new Spark instance or use Existing to reuse an existing one.

  11. Reload the Apache Spark instance by clicking Reload.

  12. Define the model details: give the model a name and select whether Watson Studio should use Automatic or Manual mode. This tutorial uses Automatic.

  13. Add the data source to the project to build the model. Click Add Data Assets, then click Load. If you have not downloaded the data set already, you can download it from KAGGLE-UCIML. You can drag the downloaded file into the page or browse to it.

    The data set is used to build and train the model.

  14. Select the data set, and click Next to load the data.

  15. Select the field to be predicted from the Select Label Col drop-down. In this case, that is the “default payment next month” field. Leaving the other drop-down at All (default) means the remaining data fields are taken into account as inputs.

    Notice that Watson Studio suggests an algorithm to use for analyzing the data set. A “Suggested technique” label is shown against the classification technique.

  16. Click Next to continue training the model.

  17. Watson Studio trains the selected model with the data you loaded.

  18. After training completes, Watson Studio displays statistics about the model's performance, including the weighted true positive rate and other parameters.

  19. Click Save to save the model.

  20. Look at the details about the model trained on the Overview and Evaluation tabs.

  21. Deploy the model by clicking Deployments. Then click Add Deployment.

  22. Select Web Service as the deployment type and provide a deployment name.

  23. Look at the details about the deployed model on the Overview and Implementation tabs. The Implementation tab provides code snippets in various programming languages that you can embed in an application. The Test tab lets you provide input and observe the prediction results.

  24. In Test, provide the relevant data and click Predict to see whether the customer described by the test data is predicted to default on the credit card payment for the upcoming month.

  25. Select Graph to view the predicted result as a horizontal bar graph. A predicted default is indicated by 1, and no default by 0.
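
Step 18 reports a weighted true positive rate. As a minimal sketch of what that figure usually means, the code below assumes the common definition (the one used by Spark MLlib's `MulticlassMetrics.weightedTruePositiveRate`): the per-class recall averaged with each class weighted by its share of the data. Whether Watson Studio computes it exactly this way is an assumption, and the labels below are made up for illustration.

```python
from collections import Counter


def weighted_true_positive_rate(actual, predicted):
    """Average per-class recall, weighted by each class's share of the data."""
    support = Counter(actual)  # number of true examples per class
    # Correct predictions, counted per true class.
    hits = Counter(a for a, p in zip(actual, predicted) if a == p)
    total = len(actual)
    return sum((support[c] / total) * (hits[c] / support[c]) for c in support)


# Made-up labels: 1 = default next month, 0 = no default.
actual = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]
print(weighted_true_positive_rate(actual, predicted))
```

With single-label classification, this weighted average works out to the overall fraction of correct predictions, so a high value can still hide poor recall on the smaller class (here, the customers who default).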

Summary

This tutorial explained how to create and use the IBM Cloud services that are required by Watson Studio. It then showed how to use Watson Studio to predict whether a customer is going to default on their payment based on their historical data, all without writing any code.