Data Visualization in Python

阿新 • Published: 2018-11-19

Overview of Chart Types

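Line plot: plt.plot(x, y, *args)

Bar chart: plt.bar(x, y); x and y should have the same length

Horizontal bar chart: plt.barh(x, y); the same as a bar chart, except that the categories run along the vertical axis and the bars extend horizontally

The height of a bar in a bar chart represents the number of data points in a category. Because binned data are continuous, the rectangles of a histogram are usually drawn side by side with no gaps, whereas the bars of a bar chart are drawn apart.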

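Histogram: plt.hist(x); shows how often (as a count or a relative frequency) each range of values occurs in the data set

2D histogram: plt.hist2d(x, y)

A histogram uses area to represent the frequency of each bin: the height of a rectangle is the bin's count or relative frequency, and its width is the bin width, so both the height and the width carry meaning.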

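Pie chart: plt.pie(a, labels=list('abcde'), autopct='%.2f%%')

Scatter plot: plt.scatter(x, y, *args)

Box plot: plt.boxplot(x)

Word cloud: wordcloud.WordCloud(*args)

A word cloud is generated from word frequencies and a background (mask) image.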

Histogram/distribution: sns.distplot() (deprecated since seaborn 0.11 in favor of sns.histplot() and sns.displot())
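The point that both the height and the width of a histogram rectangle carry meaning is easiest to see with unequal bins. The following is a minimal sketch in plain Python, not matplotlib's implementation: the helper name `density_histogram` and the sample values are made up for illustration. It scales each height to count / (n * bin width), so the rectangle areas add up to 1.

```python
def density_histogram(values, edges):
    """Return (counts, heights) for bins given by ascending `edges`.

    Bins are half-open [left, right), except the last one, which also
    includes its right edge. Values outside the edges are ignored.
    """
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    n = sum(counts)
    # Height is chosen so that height * width equals the bin's share of n.
    heights = [c / (n * (edges[i + 1] - edges[i])) for i, c in enumerate(counts)]
    return counts, heights


values = [1, 2, 2, 3, 3, 3, 6, 8]   # illustrative data
edges = [0, 2, 4, 10]               # unequal bin widths: 2, 2 and 6
counts, heights = density_histogram(values, edges)
areas = [h * (edges[i + 1] - edges[i]) for i, h in enumerate(heights)]
print(counts)      # counts per bin
print(heights)     # density per bin
print(sum(areas))  # total area
```

The last bin holds twice as many points as the first, yet its rectangle is lower because it is three times as wide; only the areas are comparable.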

Plotting Examples

In [79]: import numpy as np

In [80]: import pandas as pd

In [81]: import matplotlib.pyplot as plt

In [82]: import wordcloud

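In [83]: import seaborn as sns
plt.rcParams['font.serif'] = ['KaiTi']  # serif font that can render Chinese labels
plt.rcParams['axes.unicode_minus'] = False  # render the minus sign correctly with that font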

names = ['mpg','cylinders','displacement','horsepower','weight','acceleration','model_year','origin','car_name']
df = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", sep=r'\s+', names=names)

In [154]: df['maker'] = df.car_name.apply(lambda x: x.split()[0]).str.title()  # the first word of the name is the maker
     ...: df['origin'] = df.origin.map({1: 'America', 2: 'Europe', 3: 'Asia'})
     ...: df = df.replace('?', np.nan).dropna()  # '?' marks a missing value in this file
     ...: df['horsepower'] = df.horsepower.astype(float)
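To try the cleaning steps without downloading the file, a small in-memory sample can stand in for it. This is only a sketch: the three rows below are typed in by hand in the same whitespace-separated layout, and `na_values='?'` is an alternative way to handle the missing-value marker at read time, not what the code above does.

```python
import io

import pandas as pd

# Hand-typed sample rows in the same layout as the data file (illustrative only).
sample = io.StringIO(
    '18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"\n'
    '25.0 4 98.00 ? 2046. 19.0 71 1 "ford pinto"\n'
    '24.0 4 113.0 95.00 2372. 15.0 70 3 "toyota corona mark ii"\n'
)
names = ['mpg', 'cylinders', 'displacement', 'horsepower', 'weight',
         'acceleration', 'model_year', 'origin', 'car_name']

# na_values='?' turns the marker into NaN, so horsepower is parsed as float.
df = pd.read_csv(sample, sep=r'\s+', names=names, na_values='?')
df['maker'] = df['car_name'].str.split().str[0].str.title()
df['origin'] = df['origin'].map({1: 'America', 2: 'Europe', 3: 'Asia'})
df = df.dropna()  # drops the row whose horsepower is missing

print(df[['maker', 'origin', 'horsepower']])
```

Reading the marker as NaN up front avoids the separate replace and astype(float) steps, at the cost of hard-coding the marker in the read_csv call.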

1. Word cloud

names = ['mpg','cylinders','displacement','horsepower','weight','acceleration','model_year','origin','car_name']
df = pd.read_csv("http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data", sep=r'\s+', names=names)
word_dict = df['car_name'].value_counts().to_dict()  # car name -> number of occurrences
background = plt.imread('data/back.jpg')
wc = wordcloud.WordCloud(
    background_color='white',  # background color
    font_path='data/simhei.ttf',  # font file
    mask=background,  # background (mask) image
    max_words=1000,  # maximum number of words
    max_font_size=100,  # maximum font size
    colormap='hsv',  # color map
    random_state=100  # random seed
)
wc.generate_from_frequencies(word_dict)  # build the word cloud from the frequencies
plt.imshow(wc)  # draw the word cloud on the current axes
plt.axis('off')  # hide the axes
plt.savefig('image/DesriptionWordCloud.png', dpi=400, bbox_inches='tight')
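generate_from_frequencies expects a mapping from each word (or phrase) to its frequency. If individual words rather than whole car names should drive the cloud, the counts can be built with the standard library. A minimal sketch, with a made-up list standing in for the car_name column:

```python
from collections import Counter

# Made-up names standing in for df['car_name'] (illustrative only).
car_names = ["ford pinto", "ford maverick", "toyota corolla", "ford pinto"]

# Count every whitespace-separated word across all names.
word_freqs = Counter(word for name in car_names for word in name.split())

print(word_freqs.most_common(2))
# A plain dict of word -> count is the shape generate_from_frequencies takes.
word_dict = dict(word_freqs)
```

Counting words instead of full names makes the makers dominate the cloud, since the maker is the word that repeats most across models.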

2. Line plot

In [100]: plt.plot(df.displacement.index,df.displacement.values)
Out[100]: [<matplotlib.lines.Line2D at 0x7f1378501c50>]

In [101]: plt.show()

3. Bar chart

In [104]: plt.bar(df.displacement.index[:10],df.displacement.values[:10])
Out[104]: <BarContainer object of 10 artists>

In [105]: plt.show()

4. Horizontal bar chart

In [106]: plt.barh(df.displacement.index[:10],df.displacement.values[:10])
Out[106]: <BarContainer object of 10 artists>

In [107]: plt.show()

5. Histogram

In [116]: a=pd.Series([1,2,3,1,2,3,3,4,2,1])

In [117]: plt.hist(a)
Out[117]: 
(array([3., 0., 0., 3., 0., 0., 3., 0., 0., 1.]),
 array([1. , 1.3, 1.6, 1.9, 2.2, 2.5, 2.8, 3.1, 3.4, 3.7, 4. ]),
 <a list of 10 Patch objects>)

In [118]: plt.show()
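The two arrays in Out[117] are the per-bin counts and the bin edges. By default plt.hist splits the range from the minimum to the maximum into 10 equal-width bins. The plain-Python sketch below mimics that binning for this small series; the helper name `equal_width_histogram` is made up, and it assumes the data contain at least two distinct values.

```python
def equal_width_histogram(values, bins=10):
    """Count values in `bins` equal-width bins spanning min..max."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins  # assumes hi > lo
    edges = [lo + i * width for i in range(bins + 1)]
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin, not a new one.
        i = min(int((v - lo) / width), bins - 1)
        counts[i] += 1
    return counts, edges


a = [1, 2, 3, 1, 2, 3, 3, 4, 2, 1]
counts, edges = equal_width_histogram(a)
print(counts)                         # [3, 0, 0, 3, 0, 0, 3, 0, 0, 1]
print([round(e, 1) for e in edges])
```

With only four distinct values, most of the 10 bins stay empty, which is why the plot shows isolated bars; passing a smaller bins value to plt.hist would give a more compact picture.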

6. Pie chart

In [124]: data=[0.2,0.1,0.33,0.27,0.1]

In [125]: plt.pie(data,autopct='%.2f%%',labels=list('abcde'))
Out[125]: 
([<matplotlib.patches.Wedge at 0x7f136fe5af28>,
  <matplotlib.patches.Wedge at 0x7f136fe636a0>,
  <matplotlib.patches.Wedge at 0x7f136fe63da0>,
  <matplotlib.patches.Wedge at 0x7f136fe6c4e0>,
  <matplotlib.patches.Wedge at 0x7f136fe6cbe0>],
 [Text(0.889919,0.646564,'a'),
  Text(-2.57474e-08,1.1,'b'),
  Text(-1.07351,0.239957,'c'),
  Text(0.103519,-1.09512,'d'),
  Text(1.04616,-0.339919,'e')],
 [Text(0.48541,0.352671,'20.00%'),
  Text(-1.4044e-08,0.6,'10.00%'),
  Text(-0.58555,0.130886,'33.00%'),
  Text(0.0564651,-0.597337,'27.00%'),
  Text(0.570634,-0.18541,'10.00%')])

In [126]: plt.show()
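The percentage texts in Out[125] come from applying the autopct format string to each value's share of the total. A small sketch of that arithmetic in plain Python, reusing the same data; the wedge angles are included as an illustration of how the shares map onto the circle.

```python
data = [0.2, 0.1, 0.33, 0.27, 0.1]
labels = list('abcde')

total = sum(data)
# Same format string as autopct='%.2f%%', applied to each share in percent.
percent_labels = ['%.2f%%' % (100 * value / total) for value in data]
# Each wedge spans a fraction of the full 360 degrees.
angles = [360 * value / total for value in data]

for label, text, angle in zip(labels, percent_labels, angles):
    print(label, text, round(angle, 1))
```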

7. Scatter plot

In [130]: plt.scatter(df.displacement.index,df.displacement.values,color='red')
Out[130]: <matplotlib.collections.PathCollection at 0x7f136faf9470>

In [131]: plt.show()

8. Box plot

In [147]: plt.boxplot(df.iloc[[1,2,3],[1,6]])
Out[147]: 
{'whiskers': [<matplotlib.lines.Line2D at 0x7f136f0d9e48>,
  <matplotlib.lines.Line2D at 0x7f136f0d9f60>,
  <matplotlib.lines.Line2D at 0x7f136f0e8d68>,
  <matplotlib.lines.Line2D at 0x7f136f0e8e80>,
  <matplotlib.lines.Line2D at 0x7f136f0f8c88>,
  <matplotlib.lines.Line2D at 0x7f136f0f8da0>],
 'caps': [<matplotlib.lines.Line2D at 0x7f136f0e0748>,
  <matplotlib.lines.Line2D at 0x7f136f0e0ba8>,
  <matplotlib.lines.Line2D at 0x7f136f0f1668>,
  <matplotlib.lines.Line2D at 0x7f136f0f1ac8>,
  <matplotlib.lines.Line2D at 0x7f136f100588>,
  <matplotlib.lines.Line2D at 0x7f136f1009e8>],
 'boxes': [<matplotlib.lines.Line2D at 0x7f136f0d9898>,
  <matplotlib.lines.Line2D at 0x7f136f0e8908>,
  <matplotlib.lines.Line2D at 0x7f136f0f8828>],
 'medians': [<matplotlib.lines.Line2D at 0x7f136f0e0cc0>,
  <matplotlib.lines.Line2D at 0x7f136f0f1f28>,
  <matplotlib.lines.Line2D at 0x7f136f100e48>],
 'fliers': [<matplotlib.lines.Line2D at 0x7f136f0e84a8>,
  <matplotlib.lines.Line2D at 0x7f136f0f83c8>,
  <matplotlib.lines.Line2D at 0x7f136f100f60>],
 'means': []}

In [148]: plt.show()
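The dictionary returned by plt.boxplot holds the line objects for each part of the box: the box spans the first to third quartile, the median is the line inside it, the whiskers reach to the most extreme points within 1.5 times the interquartile range, and anything beyond is a flier. The sketch below computes those numbers with the standard library. It assumes linear-interpolation quartiles (the `inclusive` method of statistics.quantiles) and the usual 1.5 whisker factor; the helper name `box_stats` and the sample data are made up.

```python
import statistics


def box_stats(values, whis=1.5):
    """Quartiles, whisker ends and fliers for one box."""
    data = sorted(values)
    q1, med, q3 = statistics.quantiles(data, n=4, method='inclusive')
    iqr = q3 - q1
    lo_fence, hi_fence = q1 - whis * iqr, q3 + whis * iqr
    inside = [v for v in data if lo_fence <= v <= hi_fence]
    return {
        'q1': q1, 'med': med, 'q3': q3,
        'whislo': inside[0],    # lowest point within the lower fence
        'whishi': inside[-1],   # highest point within the upper fence
        'fliers': [v for v in data if v < lo_fence or v > hi_fence],
    }


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]   # illustrative data with one outlier
stats = box_stats(sample)
print(stats)
```

Here the value 30 lies above the upper fence (7.75 + 1.5 * 4.5 = 14.5), so it would be drawn as a separate flier point and the upper whisker would stop at 9.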

8. Histogram / distribution plot

# Method 1
In [150]: sns.distplot(df.displacement.values)
/home/zelin/anaconda3/lib/python3.7/site-packages/scipy/stats/stats.py:1713: FutureWarning: Using a non-tuple sequence for multidimensional indexing is deprecated; use `arr[tuple(seq)]` instead of `arr[seq]`. In the future this will be interpreted as an array index, `arr[np.array(seq)]`, which will result either in an error or a different result.
  return np.add.reduce(sorted[indexer] * weights, axis=axis) / sumval
Out[150]: <matplotlib.axes._subplots.AxesSubplot at 0x7f136f0c7668>

# Method 2
In [166]: g = sns.FacetGrid(df, col="origin")
     ...: g.map(sns.distplot, "mpg")
     ...: 
     ...: 
Out[166]: <seaborn.axisgrid.FacetGrid at 0x7f136e0e7f98>



In [151]: plt.show()

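9. Relational plot

# Plot the relationship between two dimensions, i.e. two DataFrame columns
In [155]: sns.catplot(data=df, x='model_year', y='mpg', kind='point')  # factorplot was renamed to catplot in seaborn 0.9
# Plot the relationship across three dimensions
sns.catplot(data=df, x='model_year', y='mpg', col='origin', kind='point')
# Switch from a line plot to a bar chart
sns.catplot(data=df, x="model_year", y="mpg", col="origin", kind='bar')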

10. Plotting with a regression fit

In [168]: g = sns.FacetGrid(df, col="origin")
     ...: g.map(sns.regplot, "horsepower", "mpg")
     ...: plt.xlim(0, 250)  # x-axis limits
     ...: plt.ylim(0, 60)  # y-axis limits
     ...: 
     ...: 
Out[168]: (0, 60)
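By default sns.regplot overlays a straight line fitted by ordinary least squares on the scatter points. The sketch below shows that fit in plain Python so the arithmetic is visible. The helper name `fit_line` and the four data points are made up for illustration and are not taken from the dataset.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


horsepower = [50, 100, 150, 200]   # illustrative values
mpg = [40, 30, 20, 10]

slope, intercept = fit_line(horsepower, mpg)
# Residuals are the vertical distances from each point to the fitted line.
residuals = [y - (slope * x + intercept) for x, y in zip(horsepower, mpg)]
print(slope, intercept)
print(residuals)
```

The residuals computed at the end are what a residual plot such as sns.residplot displays: if the straight line is a good model, they scatter around zero without a visible pattern.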

11. Contour plot

In [170]: df['tons'] = (df.weight/2000).astype(int)
     ...: g = sns.FacetGrid(df, col="origin", row="tons")
     ...: g.map(sns.kdeplot, "horsepower", "mpg")
     ...: plt.xlim(0, 250)
     ...: plt.ylim(0, 60)

12. Faceting plots along two dimensions

g = sns.FacetGrid(df, col="origin", row="tons")
g.map(plt.hist, "mpg", bins=np.linspace(0, 50, 11))
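Conceptually, FacetGrid splits the rows into one subset per (row, col) combination and g.map draws the same plot once per subset. A rough stand-in for that grouping step in plain Python follows. The records are made up, their field names mirror the DataFrame columns, and `tons` is derived the same way as above (weight divided by 2000, truncated to an integer).

```python
from collections import defaultdict

# Made-up records standing in for DataFrame rows (illustrative only).
records = [
    {'origin': 'America', 'weight': 3504, 'mpg': 18.0},
    {'origin': 'America', 'weight': 4341, 'mpg': 15.0},
    {'origin': 'Asia', 'weight': 2372, 'mpg': 24.0},
    {'origin': 'Europe', 'weight': 1835, 'mpg': 26.0},
    {'origin': 'Asia', 'weight': 2130, 'mpg': 27.0},
]

# One list of mpg values per (tons, origin) panel.
panels = defaultdict(list)
for rec in records:
    tons = int(rec['weight'] / 2000)
    panels[(tons, rec['origin'])].append(rec['mpg'])

for key in sorted(panels):
    print(key, panels[key])
```

Each key corresponds to one panel of the grid; the histogram in that panel is drawn from the values collected under the key.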

13. Pairwise plots across multiple dimensions

g = sns.pairplot(df[["mpg", "horsepower", "weight", "origin"]], hue="origin", diag_kind="hist")
for ax in g.axes.flat:
    plt.setp(ax.get_xticklabels(), rotation=45)

14. Pairwise plots with regression

g = sns.PairGrid(df[["mpg", "horsepower", "weight", "origin"]], hue="origin")
g.map_upper(sns.regplot)
g.map_lower(sns.residplot)
g.map_diag(plt.hist)
for ax in g.axes.flat:
    plt.setp(ax.get_xticklabels(), rotation=45)
g.add_legend()
g.set(alpha=0.5)

15. Joint plot (contour)

sns.jointplot(x="mpg", y="horsepower", data=df, kind='kde')

16. Joint plot with regression (scatter)

sns.jointplot(x="horsepower", y="mpg", data=df, kind="reg")
