[Python] Normalize the data with Pandas

阿新 • Published: 2017-12-18


import os
import pandas as pd
import matplotlib.pyplot as plt

def test_run():
    start_date='2017-01-01'
    end_date='2017-12-15'
    dates=pd.date_range(start_date, end_date)

    # Create an empty data frame
    df=pd.DataFrame(index=dates)

    symbols=['SPY', 'AAPL', 'IBM', 'GOOG', 'GLD']
    for symbol in symbols:
        temp=getAdjCloseForSymbol(symbol)
        # an inner join keeps only the dates present in both frames (trading days)
        df=df.join(temp, how='inner')

    return df   


def normalize_data(df):
    """ Normalize stock prices using the first row of the dataframe """
    # .ix was removed in pandas 1.0; .iloc selects the first row by position
    df=df/df.iloc[0, :]
    return df


def getAdjCloseForSymbol(symbol):
    # Load csv file
    temp=pd.read_csv("data/{0}.csv".format(symbol),
        index_col="Date",
        parse_dates=True,
        usecols=['Date', 'Adj Close'],
        na_values=['nan'])
    # rename the column
    temp=temp.rename(columns={'Adj Close': symbol})
    return temp

def plot_data(df, title="Stock prices"):
    ax=df.plot(title=title, fontsize=10)
    ax.set_xlabel("Date")
    ax.set_ylabel("Price")
    plt.show()


if __name__ == '__main__':
    df=test_run()
    # Optional: slice a date range and a subset of columns (sample output below)
    # df=df.loc['2017-12-01':'2017-12-15', ['IBM', 'GOOG']]
    df=normalize_data(df)
    plot_data(df)
    """
                       IBM         GOOG
    2017-12-01  154.759995  1010.169983
    2017-12-04  156.460007   998.679993
    2017-12-05  155.350006  1005.150024
    2017-12-06  154.100006  1018.380005
    2017-12-07  153.570007  1030.930054
    2017-12-08  154.809998  1037.050049
    2017-12-11  155.410004  1041.099976
    2017-12-12  156.740005  1040.479980
    2017-12-13  153.910004  1040.609985
    2017-12-15  152.500000  1064.189941
    """

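Normalizing the data makes the stocks easy to compare: every column is divided by its first-row value, so each series starts at 1.0 and the plot shows relative change rather than absolute price.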
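The effect of dividing by the first row can be checked on a tiny frame. The sketch below assumes made-up prices (not real market data) and a version of normalize_data that uses .iloc, which should also work on current pandas releases:

```python
import pandas as pd

def normalize_data(df):
    """Divide every column by its own first-row value."""
    return df / df.iloc[0]

# Hypothetical prices for illustration only; not real market data.
prices = pd.DataFrame(
    {"IBM": [150.0, 153.0, 147.0], "GOOG": [1000.0, 1050.0, 1100.0]},
    index=pd.to_datetime(["2017-12-01", "2017-12-04", "2017-12-05"]),
)

normalized = normalize_data(prices)
print(normalized)
```

Although the two columns differ in price by roughly a factor of seven, both start at 1.0 after normalization, so a 2% move in IBM and a 5% move in GOOG appear on the same scale.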

