Deep Neural Networks for Regression Problems
阿新 • Published: 2018-12-29
First: Processing the dataset
We will not go deep into processing the dataset; all we want is to get the dataset ready to be fed into our models.
We will drop any features with missing values, then encode the categorical features, and that's it.
Load the dataset:
- Load train and test data into pandas DataFrames
- Combine train and test data to process them together
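The two steps above can be sketched roughly as follows. This is a minimal sketch, not the original code: the tiny in-memory CSVs stand in for the real train and test files, and the column names `SalePrice` (target) and `Id` (row identifier) are assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the real train/test CSV files.
TRAIN_CSV = io.StringIO(
    "Id,LotArea,Street,SalePrice\n"
    "1,8450,Pave,208500\n"
    "2,9600,Pave,181500\n"
    "3,11250,Grvl,223500\n"
)
TEST_CSV = io.StringIO(
    "Id,LotArea,Street\n"
    "4,9550,Pave\n"
    "5,14260,Grvl\n"
)


def get_combined_data(train_source, test_source,
                      target_col="SalePrice", id_col="Id"):
    """Load train/test data and stack the features into one DataFrame."""
    train = pd.read_csv(train_source)
    test = pd.read_csv(test_source)

    # Keep the target aside; the test set has no target column.
    target = train[target_col]
    train = train.drop(columns=[target_col])

    # Stack train on top of test so both get the same preprocessing.
    combined = pd.concat([train, test], ignore_index=True)
    combined = combined.drop(columns=[id_col])

    # Remember how many rows came from train so we can split again later.
    return combined, target, len(train)


combined, target, n_train = get_combined_data(TRAIN_CSV, TEST_CSV)
print(combined.shape, n_train)
```

With real data you would pass file paths instead of the `StringIO` objects; `pd.read_csv` accepts both.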
combined.describe()
Let's define a function that returns the columns that don't have any missing values.
Get the columns that do not have any missing values.
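One way such a helper could look is sketched below. The function name `get_cols_with_no_nans`, its `col_type` argument, and the toy DataFrame are hypothetical and only serve as an illustration.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def get_cols_with_no_nans(df, col_type="all"):
    """Return the names of columns in df that contain no missing values.

    col_type: "num" for numerical columns only,
              "non_num" for non-numerical columns only,
              "all" for every column.
    """
    if col_type not in ("num", "non_num", "all"):
        raise ValueError("col_type must be 'num', 'non_num' or 'all'")

    selected = []
    for col in df.columns:
        if df[col].isnull().any():
            continue  # skip columns with at least one missing value
        numeric = is_numeric_dtype(df[col])
        if (col_type == "all"
                or (col_type == "num" and numeric)
                or (col_type == "non_num" and not numeric)):
            selected.append(col)
    return selected


# Toy data (hypothetical): two columns contain missing values.
df = pd.DataFrame({
    "LotArea": [8450, 9600, 11250],
    "LotFrontage": [65.0, None, 68.0],
    "Street": ["Pave", "Pave", "Grvl"],
    "Alley": [None, "Grvl", None],
})

num_cols = get_cols_with_no_nans(df, "num")
cat_cols = get_cols_with_no_nans(df, "non_num")
print("Numerical columns with no NaN values:", len(num_cols))
print("Non-numerical columns with no NaN values:", len(cat_cols))
```

Keeping only `num_cols + cat_cols` in the combined DataFrame then removes every feature that has a missing value.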
Let’s see how many columns we got
[out]:Number of numerical columns with no nan values : 25
Number of non-numerical columns with no nan values : 20
The correlation between the features
From the correlation heat map above, we see that about 15 features are highly correlated with the target.
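To read the same information without a plot, we can rank the features by their absolute correlation with the target. The sketch below uses made-up data, and the 0.5 cut-off for "highly correlated" is an assumption chosen only for illustration.

```python
import pandas as pd

# Toy data (hypothetical): "price" plays the role of the target.
df = pd.DataFrame({
    "area":  [50, 80, 100, 120, 150],
    "rooms": [2, 3, 3, 4, 5],
    "noise": [3, 1, 4, 1, 5],
    "price": [100, 160, 210, 240, 310],
})

# Pearson correlation between every pair of numerical columns.
# This matrix is what a heat map would visualize.
corr = df.corr()

# Absolute correlation of each feature with the target, strongest first.
target_corr = corr["price"].drop("price").abs().sort_values(ascending=False)

# Assumed threshold for calling a feature "highly correlated".
THRESHOLD = 0.5
strong_features = target_corr[target_corr > THRESHOLD].index.tolist()

print(target_corr)
print("Highly correlated features:", strong_features)
```

The `corr` matrix can be passed to a plotting library if you want to draw the heat map itself.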
One-Hot Encode the Categorical Features:
We will encode the categorical features using one hot encoding.
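A minimal sketch of this step with `pd.get_dummies` is shown below. The helper name `one_hot_encode` and the toy columns are hypothetical. Encoding the combined data in one pass means the train and test parts end up with exactly the same dummy columns.

```python
import pandas as pd


def one_hot_encode(df, col_names):
    """Replace each categorical column with one indicator column per category."""
    # Only encode columns that are actually present in df.
    present = [col for col in col_names if col in df.columns]
    return pd.get_dummies(df, columns=present)


# Toy data (hypothetical): two categorical columns, one numerical.
combined = pd.DataFrame({
    "LotArea": [8450, 9600, 11250],
    "Street": ["Pave", "Pave", "Grvl"],
    "Zone": ["RL", "RM", "RL"],
})

print("There were {} columns before encoding categorical features"
      .format(combined.shape[1]))
encoded = one_hot_encode(combined, ["Street", "Zone"])
print("There are {} columns after encoding categorical features"
      .format(encoded.shape[1]))
```

Each categorical column is dropped and replaced by columns named `<column>_<category>`, which is why the column count grows after encoding.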
[out]:There were 45 columns before encoding categorical features
There are 149 columns after encoding categorical features
Now, split the combined DataFrame back into training data and test data.
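Because the training rows were stacked on top of the test rows, the split is just a slice at the number of training rows. In the sketch below, `split_combined` and `n_train` are hypothetical names, and the data is a toy example.

```python
import pandas as pd


def split_combined(combined, n_train):
    """Split a stacked DataFrame back into its train and test parts.

    Assumes the first n_train rows came from the training set and
    the remaining rows came from the test set.
    """
    train = combined.iloc[:n_train].reset_index(drop=True)
    test = combined.iloc[n_train:].reset_index(drop=True)
    return train, test


# Toy data (hypothetical): 3 training rows followed by 2 test rows.
combined = pd.DataFrame({
    "LotArea": [8450, 9600, 11250, 9550, 14260],
    "Street_Pave": [1, 1, 0, 1, 0],
})

train, test = split_combined(combined, n_train=3)
print(train.shape, test.shape)
```

The row order of the combined DataFrame must not have been shuffled in between, otherwise the slice no longer separates the two sets.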