Python Programming for Beginners: Study Notes (9)

阿新 • Published: 2018-12-25

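## Python Lesson 4

### A new data format: CSV

- Plain text in some character set, such as ASCII, Unicode, EBCDIC, or GB2312 (used in Simplified Chinese environments);
- Made up of records (typically one record per line);
- Each record is split into fields by a delimiter (typically a comma, semicolon, or tab; some variants allow optional spaces around the delimiter);
- Every record has the same sequence of fields.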
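
As a rough illustration of the structure described above, the sketch below parses a small in-memory CSV with Python's standard `csv` module. The sample text and its column names (`id`, `name`, `score`) are made up for this example and are not taken from the course's grade sheet.

```python
import csv
import io

# Hypothetical sample data: one header record followed by two data records
text = "id,name,score\n1,wang,90\n2,li,70\n"

# Each line is one record; the delimiter splits it into fields
reader = csv.reader(io.StringIO(text), delimiter=",")
header = next(reader)
rows = list(reader)

print(header)  # ['id', 'name', 'score']
print(rows)    # [['1', 'wang', '90'], ['2', 'li', '70']]
```

Note that the `csv` module returns every field as a string; turning `'90'` into a number is left to the caller, which is one reason pandas is convenient for this kind of data.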

#### pandas


```python
import pandas as pd
import numpy as np
```


```python
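# Open the file first, then hand the file object to pandas; the with block closes it afterwards
with open('K:/Code/jupyter-notebook/Python Study/成績表.csv') as f:
    df = pd.read_csv(f)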
```


```python
# head() returns the first 5 rows by default
df.head()
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
    </tr>
    <tr>
      <th>2</th>
      <td>3</td>
      <td>孫明</td>
      <td>男</td>
      <td>19</td>
      <td>1003</td>
      <td>74</td>
      <td>85</td>
      <td>80</td>
      <td>84</td>
      <td>86</td>
      <td>91</td>
    </tr>
    <tr>
      <th>3</th>
      <td>4</td>
      <td>陳平</td>
      <td>男</td>
      <td>8</td>
      <td>1003</td>
      <td>85</td>
      <td>75</td>
      <td>78</td>
      <td>73</td>
      <td>86</td>
      <td>81</td>
    </tr>
    <tr>
      <th>4</th>
      <td>5</td>
      <td>劉東</td>
      <td>男</td>
      <td>20</td>
      <td>1001</td>
      <td>88</td>
      <td>74</td>
      <td>77</td>
      <td>65</td>
      <td>85</td>
      <td>71</td>
    </tr>
  </tbody>
</table>
</div>




```python
type(df)
```




    pandas.core.frame.DataFrame



### DataFrame


```python
# Column names
print(df.columns)
# Row index
print(df.index)
```

    Index(['學號', '姓名', '性別', '年齡', '班級', '計算機', '英語', '數學', '語文', '物理', '化學'], dtype='object')
    RangeIndex(start=0, stop=8, step=1)
    


```python
df.loc[0]
```




    學號        1
    姓名      張小文
    性別        男
    年齡       20
    班級     1002
    計算機      56
    英語       62
    數學       86
    語文       85
    物理       86
    化學       75
    Name: 0, dtype: object




```python
# Select the rows whose 數學 (math) score is greater than 80
df[df.數學 > 80]
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
    </tr>
  </tbody>
</table>
</div>




```python
df[df.數學 < 70]
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>7</th>
      <td>8</td>
      <td>黃佳</td>
      <td>女</td>
      <td>20</td>
      <td>1002</td>
      <td>81</td>
      <td>78</td>
      <td>58</td>
      <td>84</td>
      <td>90</td>
      <td>82</td>
    </tr>
  </tbody>
</table>
</div>




```python
# Combined filter: wrap each condition in parentheses and join them with &
df[(df.語文 >= 80) & (df.數學 >= 80) & (df.英語 >= 80)]
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>2</th>
      <td>3</td>
      <td>孫明</td>
      <td>男</td>
      <td>19</td>
      <td>1003</td>
      <td>74</td>
      <td>85</td>
      <td>80</td>
      <td>84</td>
      <td>86</td>
      <td>91</td>
    </tr>
  </tbody>
</table>
</div>
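
The filters above work because a comparison on a column yields a boolean Series, which is then used to pick rows. The sketch below shows that mechanism on a small made-up DataFrame; the column names `math` and `chinese` and the index labels are hypothetical stand-ins, not the course data.

```python
import pandas as pd

# Hypothetical scores, not the course's grade sheet
demo = pd.DataFrame({
    "math": [86, 85, 80, 58],
    "chinese": [85, 90, 84, 84],
}, index=["a", "b", "c", "d"])

# A comparison yields a boolean Series aligned with the rows
mask = (demo["math"] >= 80) & (demo["chinese"] >= 85)

# Indexing with the mask keeps only the rows where it is True
selected = demo[mask]
print(selected.index.tolist())  # ['a', 'b']

# | means "or" and ~ means "not"; each condition needs its own parentheses
either = demo[(demo["math"] < 60) | ~(demo["chinese"] >= 85)]
print(either.index.tolist())    # ['c', 'd']
```

The parentheses matter because `&` and `|` bind more tightly than comparison operators in Python.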



### Sorting


```python
df.sort_values(['數學','語文']).head()
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>7</th>
      <td>8</td>
      <td>黃佳</td>
      <td>女</td>
      <td>20</td>
      <td>1002</td>
      <td>81</td>
      <td>78</td>
      <td>58</td>
      <td>84</td>
      <td>90</td>
      <td>82</td>
    </tr>
    <tr>
      <th>6</th>
      <td>7</td>
      <td>王大力</td>
      <td>男</td>
      <td>18</td>
      <td>1003</td>
      <td>85</td>
      <td>85</td>
      <td>75</td>
      <td>78</td>
      <td>84</td>
      <td>69</td>
    </tr>
    <tr>
      <th>4</th>
      <td>5</td>
      <td>劉東</td>
      <td>男</td>
      <td>20</td>
      <td>1001</td>
      <td>88</td>
      <td>74</td>
      <td>77</td>
      <td>65</td>
      <td>85</td>
      <td>71</td>
    </tr>
    <tr>
      <th>5</th>
      <td>6</td>
      <td>嚴雲峰</td>
      <td>男</td>
      <td>19</td>
      <td>1001</td>
      <td>84</td>
      <td>87</td>
      <td>77</td>
      <td>80</td>
      <td>70</td>
      <td>81</td>
    </tr>
    <tr>
      <th>3</th>
      <td>4</td>
      <td>陳平</td>
      <td>男</td>
      <td>8</td>
      <td>1003</td>
      <td>85</td>
      <td>75</td>
      <td>78</td>
      <td>73</td>
      <td>86</td>
      <td>81</td>
    </tr>
  </tbody>
</table>
</div>
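
When several sort keys are given, later keys only break ties in earlier ones, and each key can have its own direction. A minimal sketch on made-up data (the column names and index labels are hypothetical):

```python
import pandas as pd

# Hypothetical scores with a tie in the math column
demo = pd.DataFrame({
    "math": [77, 58, 77, 86],
    "chinese": [65, 84, 80, 85],
}, index=["a", "b", "c", "d"])

# math descending; ties in math are broken by chinese ascending
ranked = demo.sort_values(["math", "chinese"], ascending=[False, True])
print(ranked.index.tolist())  # ['d', 'a', 'c', 'b']

# sort_values returns a new DataFrame; the original order is unchanged
print(demo.index.tolist())    # ['a', 'b', 'c', 'd']
```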



### Accessing rows


```python
# Select a row by its index label
df.loc[1]
```




    學號        2
    姓名       李清
    性別        女
    年齡       19
    班級     1001
    計算機      94
    英語       65
    數學       85
    語文       90
    物理       84
    化學       75
    Name: 1, dtype: object



### Index


```python
scores = {
    '英語': [90,70,89],
    '數學': [64,78,48],
    '姓名': ['wang','li','sun']
}
df = pd.DataFrame(scores, index = ['one','two','three'])
df
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>英語</th>
      <th>數學</th>
      <th>姓名</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>one</th>
      <td>90</td>
      <td>64</td>
      <td>wang</td>
    </tr>
    <tr>
      <th>two</th>
      <td>70</td>
      <td>78</td>
      <td>li</td>
    </tr>
    <tr>
      <th>three</th>
      <td>89</td>
      <td>48</td>
      <td>sun</td>
    </tr>
  </tbody>
</table>
</div>




```python
df.index
```




    Index(['one', 'two', 'three'], dtype='object')




```python
df.loc['one']
```




    英語      90
    數學      64
    姓名    wang
    Name: one, dtype: object




```python
# iloc selects by integer position (the actual n-th row); use it when the index labels are not integers
df.iloc[0]
```




    英語      90
    數學      64
    姓名    wang
    Name: one, dtype: object




```python
# .ix combined loc and iloc; it was deprecated in pandas 0.20 and removed in 1.0, so use .loc or .iloc instead
df.ix[0]
```

    c:\python\python36\lib\site-packages\ipykernel_launcher.py:1: DeprecationWarning: 
    .ix is deprecated. Please use
    .loc for label based indexing or
    .iloc for positional indexing
    
    See the documentation here:
    http://pandas.pydata.org/pandas-docs/stable/indexing.html#ix-indexer-is-deprecated
      """Entry point for launching an IPython kernel.
    




    英語      90
    數學      64
    姓名    wang
    Name: one, dtype: object




```python
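df = pd.read_csv(open('K:/Code/jupyter-notebook/Python Study/成績表.csv'))  # reload the grade sheet; df was reassigned to the small example above
df.loc[:2]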
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
    </tr>
    <tr>
      <th>2</th>
      <td>3</td>
      <td>孫明</td>
      <td>男</td>
      <td>19</td>
      <td>1003</td>
      <td>74</td>
      <td>85</td>
      <td>80</td>
      <td>84</td>
      <td>86</td>
      <td>91</td>
    </tr>
  </tbody>
</table>
</div>




```python
df.iloc[:3]
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
    </tr>
    <tr>
      <th>2</th>
      <td>3</td>
      <td>孫明</td>
      <td>男</td>
      <td>19</td>
      <td>1003</td>
      <td>74</td>
      <td>85</td>
      <td>80</td>
      <td>84</td>
      <td>86</td>
      <td>91</td>
    </tr>
  </tbody>
</table>
</div>
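
`df.loc[:2]` and `df.iloc[:3]` return the same three rows because a `loc` slice includes its end label, while an `iloc` slice excludes its end position, as ordinary Python slices do. The sketch below shows the difference on a small made-up DataFrame (the `score` column and the string labels are hypothetical):

```python
import pandas as pd

demo = pd.DataFrame({"score": [10, 20, 30, 40]})  # default index 0..3

by_label = demo.loc[:2]      # label slice: end label 2 is included
by_position = demo.iloc[:2]  # position slice: end position 2 is excluded
print(len(by_label), len(by_position))  # 3 2

# With a non-numeric index, loc takes labels and iloc still takes positions
named = demo.set_index(pd.Index(["one", "two", "three", "four"]))
print(named.loc["two", "score"])   # 20
print(named.iloc[1]["score"])      # 20
```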




```python
# df[0] does not select the first row: it looks up a column named 0 and raises a KeyError
# df[0]

# A slice inside [] does select rows
df[:2]
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
    </tr>
  </tbody>
</table>
</div>




```python
# The DataFrame's underlying data as a NumPy array
df.values
```




    array([[1, '張小文', '男', 20, 1002, 56, 62, 86, 85, 86, 75],
           [2, '李清', '女', 19, 1001, 94, 65, 85, 90, 84, 75],
           [3, '孫明', '男', 19, 1003, 74, 85, 80, 84, 86, 91],
           [4, '陳平', '男', 8, 1003, 85, 75, 78, 73, 86, 81],
           [5, '劉東', '男', 20, 1001, 88, 74, 77, 65, 85, 71],
           [6, '嚴雲峰', '男', 19, 1001, 84, 87, 77, 80, 70, 81],
           [7, '王大力', '男', 18, 1003, 85, 85, 75, 78, 84, 69],
           [8, '黃佳', '女', 20, 1002, 81, 78, 58, 84, 90, 82]], dtype=object)




```python
df.數學.values
```




    array([86, 85, 80, 78, 77, 77, 75, 58], dtype=int64)




```python
# Simple statistics: count how often each value occurs
df.數學.value_counts()
```




    77    2
    78    1
    75    1
    58    1
    86    1
    85    1
    80    1
    Name: 數學, dtype: int64




```python
new = df[['數學','語文']].head()
new
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>數學</th>
      <th>語文</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>86</td>
      <td>85</td>
    </tr>
    <tr>
      <th>1</th>
      <td>85</td>
      <td>90</td>
    </tr>
    <tr>
      <th>2</th>
      <td>80</td>
      <td>84</td>
    </tr>
    <tr>
      <th>3</th>
      <td>78</td>
      <td>73</td>
    </tr>
    <tr>
      <th>4</th>
      <td>77</td>
      <td>65</td>
    </tr>
  </tbody>
</table>
</div>




```python
new * 2
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>數學</th>
      <th>語文</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>172</td>
      <td>170</td>
    </tr>
    <tr>
      <th>1</th>
      <td>170</td>
      <td>180</td>
    </tr>
    <tr>
      <th>2</th>
      <td>160</td>
      <td>168</td>
    </tr>
    <tr>
      <th>3</th>
      <td>156</td>
      <td>146</td>
    </tr>
    <tr>
      <th>4</th>
      <td>154</td>
      <td>130</td>
    </tr>
  </tbody>
</table>
</div>



### Key points


```python
def func(score):
    if score>=80:
        return '優秀'
    elif score>=70:
        return '良'
    elif score>=60:
        return '及格'
    else:
        return '不及格'
df['數學分類'] = df.數學.map(func)
```


```python
df.head()
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
      <th>數學分類</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
      <td>優秀</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
      <td>優秀</td>
    </tr>
    <tr>
      <th>2</th>
      <td>3</td>
      <td>孫明</td>
      <td>男</td>
      <td>19</td>
      <td>1003</td>
      <td>74</td>
      <td>85</td>
      <td>80</td>
      <td>84</td>
      <td>86</td>
      <td>91</td>
      <td>優秀</td>
    </tr>
    <tr>
      <th>3</th>
      <td>4</td>
      <td>陳平</td>
      <td>男</td>
      <td>8</td>
      <td>1003</td>
      <td>85</td>
      <td>75</td>
      <td>78</td>
      <td>73</td>
      <td>86</td>
      <td>81</td>
      <td>良</td>
    </tr>
    <tr>
      <th>4</th>
      <td>5</td>
      <td>劉東</td>
      <td>男</td>
      <td>20</td>
      <td>1001</td>
      <td>88</td>
      <td>74</td>
      <td>77</td>
      <td>65</td>
      <td>85</td>
      <td>71</td>
      <td>良</td>
    </tr>
  </tbody>
</table>
</div>
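
For fixed score bands like these, `pd.cut` is a possible alternative to `map` with a hand-written function. This is only a sketch of that swapped-in technique, not what the notes use; it assumes scores lie between 0 and 100 (values outside the bins become NaN), and the sample scores are made up.

```python
import pandas as pd

# Hypothetical scores chosen to hit every band and its lower boundary
scores = pd.Series([86, 80, 78, 70, 60, 58])

# Left-closed bins [0, 60), [60, 70), [70, 80), [80, 101) mirror the >= tests in func
labels = ["不及格", "及格", "良", "優秀"]
grades = pd.cut(scores, bins=[0, 60, 70, 80, 101], right=False, labels=labels)

print(grades.tolist())  # ['優秀', '優秀', '良', '良', '及格', '不及格']
```

The result is a categorical Series, so the bands keep their order when sorted or counted.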




```python
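# applymap applies a function to every element of the DataFrame; very useful
# (pandas 2.1 renamed it to DataFrame.map and deprecated applymap)
def func(number):
    return number + 10
# Equivalent lambda form
func = lambda number: number + 10

df.applymap(lambda x: str(x) + ' -').head(2)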
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
      <th>數學分類</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1 -</td>
      <td>張小文 -</td>
      <td>男 -</td>
      <td>20 -</td>
      <td>1002 -</td>
      <td>56 -</td>
      <td>62 -</td>
      <td>86 -</td>
      <td>85 -</td>
      <td>86 -</td>
      <td>75 -</td>
      <td>優秀 -</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2 -</td>
      <td>李清 -</td>
      <td>女 -</td>
      <td>19 -</td>
      <td>1001 -</td>
      <td>94 -</td>
      <td>65 -</td>
      <td>85 -</td>
      <td>90 -</td>
      <td>84 -</td>
      <td>75 -</td>
      <td>優秀 -</td>
    </tr>
  </tbody>
</table>
</div>
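
Because the element-wise method is called `applymap` in older pandas and `DataFrame.map` from pandas 2.1 on, code that has to run on both can pick whichever exists. The sketch below is one way to do that; the helper name `add_ten` and the sample data are hypothetical.

```python
import pandas as pd

# Hypothetical numeric data
demo = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

def add_ten(number):
    return number + 10

# Pick whichever element-wise method this pandas version provides
elementwise = demo.map if hasattr(pd.DataFrame, "map") else demo.applymap
shifted = elementwise(add_ten)

print(shifted.values.tolist())  # [[11, 13], [12, 14]]
```

For plain arithmetic like this, `demo + 10` gives the same result and is faster; the element-wise form is mainly useful for functions that cannot be vectorized, such as the string formatting above.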



### Anonymous functions


```python
[i+ 100 for i in range(10)]
```




    [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]




```python
def func(x):
    return x + 100
```


```python
list(map(func,range(10)))
# When a function is this simple, rarely reused, or not worth naming, use an anonymous function (lambda)
list(map(lambda x: x + 100,range(10)))
```




    [100, 101, 102, 103, 104, 105, 106, 107, 108, 109]
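
Lambdas are most common as short arguments to other functions. The sketch below shows two such uses with the built-ins `sorted` and `filter`; the word list is made up for illustration.

```python
words = ["pandas", "numpy", "matplotlib", "csv"]

# key= takes a function; a lambda avoids naming a one-off helper
by_length = sorted(words, key=lambda w: len(w))

# filter keeps the items for which the lambda returns True
short = list(filter(lambda w: len(w) <= 5, words))

print(by_length)  # ['csv', 'numpy', 'pandas', 'matplotlib']
print(short)      # ['numpy', 'csv']
```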




```python
# To build a new column from several columns, use apply with axis=1 (row by row)
df['new_score'] = df.apply(lambda x: x.數學 + x.語文, axis = 1)
```


```python
# First rows (not displayed: a notebook cell only shows its last expression)
df.head(2)
# Last rows
df.tail(2)
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
      <th>數學分類</th>
      <th>new_score</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>6</th>
      <td>7</td>
      <td>王大力</td>
      <td>男</td>
      <td>18</td>
      <td>1003</td>
      <td>85</td>
      <td>85</td>
      <td>75</td>
      <td>78</td>
      <td>84</td>
      <td>69</td>
      <td>良</td>
      <td>153</td>
    </tr>
    <tr>
      <th>7</th>
      <td>8</td>
      <td>黃佳</td>
      <td>女</td>
      <td>20</td>
      <td>1002</td>
      <td>81</td>
      <td>78</td>
      <td>58</td>
      <td>84</td>
      <td>90</td>
      <td>82</td>
      <td>不及格</td>
      <td>142</td>
    </tr>
  </tbody>
</table>
</div>
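
For simple arithmetic between columns, adding the columns directly usually gives the same values as `apply` with `axis=1` and avoids calling a Python function once per row. A small sketch on made-up data (the column names are hypothetical):

```python
import pandas as pd

# Hypothetical two-row score table
demo = pd.DataFrame({"math": [86, 85], "chinese": [85, 90]})

# Row-by-row: the lambda receives each row as a Series
row_wise = demo.apply(lambda row: row["math"] + row["chinese"], axis=1)

# Vectorized: the whole columns are added at once
vectorized = demo["math"] + demo["chinese"]

print(vectorized.tolist())  # [171, 175]
print(row_wise.tolist() == vectorized.tolist())
```

`apply` with `axis=1` remains the more general tool when the per-row logic involves branching or calls that have no vectorized equivalent.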



### Many DataFrame operations in pandas closely mirror operations on two-dimensional NumPy arrays

<h1 style="text-align:center">Plotting with matplotlib</h1>


```python
df = df.drop(['new_score'],axis = 1)
```


```python
df.head(2)
```




<div>
<style scoped>
    .dataframe tbody tr th:only-of-type {
        vertical-align: middle;
    }

    .dataframe tbody tr th {
        vertical-align: top;
    }

    .dataframe thead th {
        text-align: right;
    }
</style>
<table border="1" class="dataframe">
  <thead>
    <tr style="text-align: right;">
      <th></th>
      <th>學號</th>
      <th>姓名</th>
      <th>性別</th>
      <th>年齡</th>
      <th>班級</th>
      <th>計算機</th>
      <th>英語</th>
      <th>數學</th>
      <th>語文</th>
      <th>物理</th>
      <th>化學</th>
      <th>數學分類</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <th>0</th>
      <td>1</td>
      <td>張小文</td>
      <td>男</td>
      <td>20</td>
      <td>1002</td>
      <td>56</td>
      <td>62</td>
      <td>86</td>
      <td>85</td>
      <td>86</td>
      <td>75</td>
      <td>優秀</td>
    </tr>
    <tr>
      <th>1</th>
      <td>2</td>
      <td>李清</td>
      <td>女</td>
      <td>19</td>
      <td>1001</td>
      <td>94</td>
      <td>65</td>
      <td>85</td>
      <td>90</td>
      <td>84</td>
      <td>75</td>
      <td>優秀</td>
    </tr>
  </tbody>
</table>
</div>



### Plotting


```python
import numpy as np
import matplotlib.pyplot as plt
# Jupyter magic that renders figures inline in the notebook (required on older setups)
%matplotlib inline
```


```python
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.plot(x, y)
plt.plot(x, np.cos(x))
```




    [<matplotlib.lines.Line2D at 0x1b3061cc7f0>]




![png](output_48_1.png)



```python
plt.plot(x, y, '--')
```




    [<matplotlib.lines.Line2D at 0x1b3082c71d0>]




![png](output_49_1.png)



```python
fig = plt.figure()
plt.plot(x, y, '--')
```




    [<matplotlib.lines.Line2D at 0x1b30832ca58>]




![png](output_50_1.png)



```python
fig.savefig('K:/Code/jupyter-notebook/Python Study/first_figure.png')
```
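
`savefig` infers the file format from the extension and accepts options such as `dpi` and `bbox_inches`. The sketch below writes to a temporary directory instead of a fixed drive path so that it runs anywhere; the `Agg` backend and the output location are assumptions made for this example only.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), "--")

# Hypothetical output location: a temporary directory rather than a fixed path
out_path = os.path.join(tempfile.mkdtemp(), "first_figure.png")
fig.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close(fig)  # free the figure once it has been written

print(os.path.exists(out_path))  # True
```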


```python
# Two stacked subplots: dashed sine on top, solid cosine below
plt.subplot(2,1,1)
plt.plot(x, np.sin(x),'--')

plt.subplot(2,1,2)
plt.plot(x, np.cos(x))
```




    [<matplotlib.lines.Line2D at 0x1b308395198>]




![png](output_52_1.png)
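
The same layout can also be built with matplotlib's object-oriented interface, where `plt.subplots` returns the figure and its axes up front. This is a sketch of that alternative style, not the approach the notes use; the variable names and the `Agg` backend are choices made for this example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# Two rows, one column; sharex keeps the x axes aligned
fig, (ax_top, ax_bottom) = plt.subplots(2, 1, sharex=True)
ax_top.plot(x, np.sin(x), "--")
ax_top.set_title("sin(x)")
ax_bottom.plot(x, np.cos(x))
ax_bottom.set_title("cos(x)")
fig.tight_layout()  # avoid overlapping titles and tick labels
```

Holding explicit axes objects makes it clearer which subplot each call affects, which helps once a figure has more than a couple of panels.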



```python
# Dot markers
x = np.linspace(0,10,20)
plt.plot(x, np.sin(x),'o')
```




    [<matplotlib.lines.Line2D at 0x1b3084f4940>]




![png](output_53_1.png)



```python
# color sets the marker color
x = np.linspace(0,10,20)
plt.plot(x, np.sin(x),'o',color= 'red')
```




    [<matplotlib.lines.Line2D at 0x1b30855bef0>]




![png](output_54_1.png)



```python
# Give each curve a label
x = np.linspace(0, 10, 100)
y = np.sin(x)

plt.plot(x, y,'--',label='sin(x)')
plt.plot(x, np.cos(x),'o',label='cos(x)')
# legend displays the labels; loc sets the legend position (1 means upper right)
plt.legend(loc=1)
```




    <matplotlib.legend.Legend at 0x1b309907198>




![png](output_55_1.png)
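
The numeric `loc` codes have string equivalents that are easier to read; `'upper right'` corresponds to `loc=1`. A short sketch, using the object-oriented interface and the `Agg` backend as assumptions for this example:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)
fig, ax = plt.subplots()
ax.plot(x, np.sin(x), "--", label="sin(x)")
ax.plot(x, np.cos(x), "o", label="cos(x)")

# 'upper right' is the string form of loc=1
legend = ax.legend(loc="upper right")
print([t.get_text() for t in legend.get_texts()])  # ['sin(x)', 'cos(x)']
```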



```python
plt.legend?
# When you meet an unfamiliar function, append ? in IPython/Jupyter to view its documentation
```


```python
# plot accepts many optional parameters
x = np.linspace(0, 10, 20)
y = np.sin(x)
plt.plot(x,y,'-p',color = 'green',
        markersize = 10,linewidth = 4,
        markeredgecolor = 'orange',
        markeredgewidth=2)
plt.ylim(-0.5,0.8)
```




    (-0.5, 0.8)




![png](output_57_1.png)



```python
# See the documentation for the full parameter list
plt.plot?
```


```python
# ylim and xlim restrict the visible axis ranges
plt.plot(x,y,'-p',color = 'green',
        markersize = 10,linewidth = 4,
        markeredgecolor = 'orange',
        markeredgewidth=2)
plt.ylim(-0.5,1.2)
plt.xlim(2,8)
```




    (2, 8)




![png](output_59_1.png)



```python
# Scatter plot
plt.scatter(x,y,s=100,c='red')
```




    <matplotlib.collections.PathCollection at 0x1b309da0c88>




![png](output_60_1.png)



```python
plt.style.use('classic')

x = np.random.randn(100)
y = np.random.randn(100)
colors = np.random.randn(100)
# Marker sizes must be non-negative, so draw them from rand (uniform on [0, 1)) rather than randn
sizes = 1000 * np.random.rand(100)
plt.scatter(x,y,c=colors,s=sizes,alpha=0.4)
plt.colorbar()
```




    <matplotlib.colorbar.Colorbar at 0x1b309fe4f98>




![png](output_61_2.png)


### Plotting built into pandas

### Line plots


```python
import pandas as pd
df = pd.DataFrame(np.random.randn(100,4).cumsum(0),columns=['A','B','C','D'])
df.plot()
```




    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c0c88d0>




![png](output_64_1.png)
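
The `cumsum(0)` call is what turns independent random steps into the wandering curves in the plot: it accumulates values down the rows, separately for each column. A tiny sketch with made-up numbers shows the effect:

```python
import numpy as np

# Hypothetical steps: 3 rows (time steps), 2 columns (series)
steps = np.array([[1, 10],
                  [2, 20],
                  [3, 30]])

# cumsum(0) accumulates down the rows, separately for each column
walk = steps.cumsum(0)
print(walk.tolist())  # [[1, 10], [3, 30], [6, 60]]
```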


### Bar charts


```python
df = pd.DataFrame(np.random.randint(10,50,(3,4)),columns=['A','B','C','D'],index = ['one','two','three'])
df.plot.bar()
```




    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c284898>




![png](output_66_1.png)



```python
df.B.plot.bar()
```




    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c16c9b0>




![png](output_67_1.png)



```python
# Equivalent to df.plot.bar() above
df.plot(kind = 'bar')
```




    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c190898>




![png](output_68_1.png)



```python
# Stack the bars of each row on top of one another
df.plot(kind = 'bar',stacked = True)
```




    <matplotlib.axes._subplots.AxesSubplot at 0x1b30c223978>




![png](output_69_1.png)


### Histogram


```python
df = pd.DataFrame(np.random.randn(100,4),columns=['A','B','C','D'])
df.hist(column='A',grid=True,figsize=(10,5))
```




    array([[<matplotlib.axes._subplots.AxesSubplot object at 0x000001B30DE24DD8>]],
          dtype=object)




![png](output_71_1.png)
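
A histogram is a count of how many values fall into each bin. The sketch below computes those counts directly with `np.histogram`, using a seeded generator so the run is repeatable; the seed and variable names are choices made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded so the run is repeatable
values = rng.standard_normal(100)

# 10 equal-width bins is the default for both np.histogram and DataFrame.hist
counts, edges = np.histogram(values, bins=10)

print(counts.sum())  # 100: every value falls into exactly one bin
print(len(edges))    # 11: 10 bins have 11 edges
```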


### Density plot


```python
# Equivalent to df.plot(kind = 'kde')
# Note: this requires scipy; install it with pip install scipy, otherwise you get ModuleNotFoundError: No module named 'scipy'
df.plot.kde()
```




    <matplotlib.axes._subplots.AxesSubplot at 0x1b30e082d30>




![png](output_73_1.png)


### 3D plots with matplotlib


```python
from mpl_toolkits.mplot3d import Axes3D  # registers the 3d projection on older matplotlib versions
from matplotlib import cm
from matplotlib.ticker import LinearLocator, FormatStrFormatter
import matplotlib.pyplot as plt
import numpy as np

fig = plt.figure()
# fig.gca(projection='3d') no longer works on recent matplotlib; add_subplot does
ax = fig.add_subplot(111, projection='3d')
# x-axis sample points (values must not repeat)
X = np.arange(-5, 5, 0.25)
# y-axis sample points (values must not repeat)
Y = np.arange(-5, 5, 0.25)
# Build the coordinate grid
X, Y = np.meshgrid(X, Y)
R = np.sqrt(X**2 + Y**2)
Z = np.sin(R)

# Plot the surface
surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, cmap=cm.jet,
        linewidth=0, antialiased=False)

# Customize the z axis
ax.set_zlim(-1.01, 1.01)
ax.zaxis.set_major_locator(LinearLocator(10))
ax.zaxis.set_major_formatter(FormatStrFormatter('%.02f'))

# Add a color bar which maps values to colors
fig.colorbar(surf, shrink=0.5, aspect=5)

plt.show()
```


![png](output_75_0.png)
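
The surface plot relies on `np.meshgrid` to turn two one-dimensional coordinate arrays into a pair of two-dimensional grids. The sketch below uses deliberately small, made-up ranges so the shapes are easy to check by hand:

```python
import numpy as np

x = np.arange(-1, 1, 0.5)   # 4 values: -1.0, -0.5, 0.0, 0.5
y = np.arange(0, 3, 1)      # 3 values: 0, 1, 2

# Each output has one row per y value and one column per x value
X, Y = np.meshgrid(x, y)
print(X.shape, Y.shape)     # (3, 4) (3, 4)

# Every grid point gets one z value, computed element-wise
Z = np.sin(np.sqrt(X**2 + Y**2))
print(Z.shape)              # (3, 4)
```

Each row of `X` repeats the x values and each column of `Y` repeats the y values, so `(X[i, j], Y[i, j])` enumerates every point of the grid.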
