Python Study Notes Day 18: Downloading Data and Web APIs
阿新 • Published: 2018-12-21
Day 18: Downloading Data and Web APIs
- Summary of commonly used Python modules
- Accessing and analyzing CSV data files
- Using the csv module

    import csv

    filename = 'sitka_weather_07-2014.csv'
    with open(filename) as f:
        reader = csv.reader(f)
        header_row = next(reader)
- The enumerate() function: enumerate() turns an iterable (such as a list, tuple, or string) into a sequence of (index, value) pairs, so each item is listed together with its index. It is typically used in a for loop.
Sample: enumerate

    with open(filename) as f:
        reader = csv.reader(f)
        header_row = next(reader)

        for index, column_header in enumerate(header_row):
            print(index, column_header)
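To try enumerate() without the weather file, here is a minimal sketch that reads a few rows from an in-memory string. The column names and values in SAMPLE_CSV and the helper list_headers() are made up for illustration; they are not taken from sitka_weather_07-2014.csv.

```python
import csv
import io

# Hypothetical sample data standing in for the real weather file.
SAMPLE_CSV = "date,max_temp,min_temp\n2014-07-01,64,50\n2014-07-02,71,55\n"


def list_headers(text):
    """Return (index, header) pairs for the first row of CSV text."""
    reader = csv.reader(io.StringIO(text))
    header_row = next(reader)
    return list(enumerate(header_row))


header_pairs = list_headers(SAMPLE_CSV)
for index, column_header in header_pairs:
    print(index, column_header)
```

Knowing the index of each column is what makes it possible to pick out a single field from every row later on.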
- Iterating over the CSV file and extracting data: for + append

    with open(filename) as f:
        reader = csv.reader(f)
        header_row = next(reader)
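The for + append step itself does not appear in the snippet above. A minimal sketch of the idea follows, assuming the value of interest sits in column 1 and using made-up in-memory rows; the list name highs and the sample data are assumptions, not taken from the original file.

```python
import csv
import io

# Hypothetical rows; the real file's columns may differ.
SAMPLE_CSV = "date,max_temp\n2014-07-01,64\n2014-07-02,71\n2014-07-03,66\n"

reader = csv.reader(io.StringIO(SAMPLE_CSV))
header_row = next(reader)  # consume the header line so the loop sees data only

highs = []
for row in reader:
    # csv.reader yields strings, so convert before storing.
    highs.append(int(row[1]))

print(highs)
```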
- Error handling

    with open(filename) as f:
        reader = csv.reader(f)
        header_row = next(reader)

        dates,
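The snippet above is cut off after "dates,", so the original error handling is not recoverable. One common approach, sketched here under assumptions, is to wrap the conversions in try/except ValueError so that a row with missing values is reported and skipped. The list names dates, highs, and lows, the date format, and the sample rows are all hypothetical.

```python
import csv
import io
from datetime import datetime

# Hypothetical rows; the second data row has empty temperature fields.
SAMPLE_CSV = (
    "date,max_temp,min_temp\n"
    "2014-07-01,64,50\n"
    "2014-07-02,,\n"
    "2014-07-03,66,52\n"
)

reader = csv.reader(io.StringIO(SAMPLE_CSV))
header_row = next(reader)

dates, highs, lows = [], [], []
for row in reader:
    try:
        current_date = datetime.strptime(row[0], "%Y-%m-%d")
        high = int(row[1])
        low = int(row[2])
    except ValueError:
        # int('') raises ValueError; report the row and move on.
        print(row[0], "missing data")
    else:
        # Only append when every conversion succeeded, so the lists stay aligned.
        dates.append(current_date)
        highs.append(high)
        lows.append(low)

print(highs, lows)
```

Appending in the else branch keeps the three lists the same length, which matters if they are plotted against each other afterwards.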
- JSON format
- pygal.i18n does not exist: the "No module named 'pygal.i18n'" error:
- Use pygal_maps_world.i18n instead:
- OS X
    $ pip install pygal_maps_world
- Windows
    > python -m pip install pygal_maps_world
- Change 'from pygal.i18n import COUNTRIES' to

    from pygal_maps_world.i18n import COUNTRIES
- The "module 'pygal' has no attribute 'Worldmap'" error
- Use pygal_maps_world instead:

    import pygal_maps_world.maps

    wm = pygal_maps_world.maps.World()
- Web API
- A Web API is used to interact with a website and request data (returned as JSON or CSV).
- The requests package lets Python request information from a website and inspect the response that comes back.
- Installing the requests package
- OS X
    $ pip install --user requests
- Windows
    > python -m pip install --user requests
- Processing the response dictionary

    import requests

    # Make the API call and store the response
    url = "https://api.github.com/search/repositories?q=language:python&sort=stars"
    r = requests.get(url)
    print("Status code: ", r.status_code)

    # Store the API response in a dictionary
    response_dict = r.json()
    print("Total repositories: ", response_dict['total_count'])

    # Explore information about the repositories
    repo_dicts = response_dict['items']
    print("Repositories returned: ", len(repo_dicts))

    # Examine the first repository
    repo_dict = repo_dicts[0]
    print("\nKeys:", len(repo_dict))
    for key in repo_dict.keys():
        print(key)
- Exploring the repositories further

    # Examine every repository returned (repo_dicts comes from the previous snippet)
    for repo_dict in repo_dicts:
        print("\nSelected information about the repository: ")
        print('Name: ' + repo_dict['name'])
        print('Owner: ', repo_dict['owner']['login'])
        print('Stars: ', repo_dict['stargazers_count'])
        print('Repository: ', repo_dict['html_url'])
        print('Created: ', repo_dict['created_at'])
        print('Updated: ', repo_dict['updated_at'])
        print('Description: ', repo_dict['description'])
- The "'NoneType' object has no attribute 'decode'" error: running the code below raises this error:

    import pygal
    from pygal.style import LightColorizedStyle as LCS, LightenStyle as LS

    # repo_dicts comes from the GitHub API call shown earlier
    names, plot_dicts = [], []
    for repo_dict in repo_dicts:
        names.append(repo_dict['name'])
        plot_dict = {
            'value': repo_dict['stargazers_count'],
            # description can be None, which is what triggers the error
            'label': repo_dict['description'],
        }
        plot_dicts.append(plot_dict)

    # Visualization
    my_style = LS('#333366', base_style=LCS)

    my_config = pygal.Config()
    my_config.x_label_rotation = 45
    my_config.show_legend = False
    my_config.title_font_size = 24
    my_config.label_font_size = 14
    my_config.major_label_font_size = 18
    my_config.truncate_label = 15
    my_config.show_y_guides = False
    my_config.width = 1000

    chart = pygal.Bar(my_config, style=my_style)
    chart.title = 'Most-starred Python Projects on GitHub'
    chart.x_labels = names

    chart.add('', plot_dicts)
    chart.render_to_file('python_repos.svg')
There are two ways to fix this.
The first is to change:

    'label': repo_dict['description'],

to:

    'label': str(repo_dict['description']),

which is simple and convenient.
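The error most likely appears because some repositories have no description, in which case the API returns null and repo_dict['description'] is None. With str(), such a repository ends up with the literal label 'None'. The sketch below shows a variant that falls back to an empty string instead. It is an illustration under assumptions, not necessarily the second fix these notes refer to, and the two sample repositories are made up.

```python
# Hypothetical repository entries; the second has no description,
# which arrives from the JSON response as None.
repo_dicts = [
    {"name": "alpha", "stargazers_count": 120, "description": "A demo project"},
    {"name": "beta", "stargazers_count": 95, "description": None},
]

names, plot_dicts = [], []
for repo_dict in repo_dicts:
    names.append(repo_dict["name"])
    plot_dicts.append({
        "value": repo_dict["stargazers_count"],
        # str(None) would give the text 'None'; `or ""` gives an empty label instead.
        "label": repo_dict["description"] or "",
    })

print(plot_dicts)
```

Either way the label is always a string, which is the point of the fix.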
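- Hacker News API, which illustrates the following three points:
- Using the list returned by one Web API call to build further API URLs dynamically, then calling the Web API again to fetch the data;
- The dictionary method dict.get(): when you are not sure whether a key is in the dictionary, use dict.get(). It returns the value associated with the key if the key exists, and the value given as the second argument if it does not.
- The function itemgetter() from the operator module, used together with sorted(). Passing the key 'comments' to itemgetter() makes it extract the value associated with 'comments' from each dictionary in the list, and sorted() then orders the list by that value.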

    import requests
    from operator import itemgetter

    # Make the API call and store the response
    url = 'https://hacker-news.firebaseio.com/v0/topstories.json'
    r = requests.get(url)
    print('Status code: ', r.status_code)

    # Process information about each submission
    submission_ids = r.json()
    # Empty list that will hold one dictionary per top story
    submission_dicts = []
    # Take the IDs of the first 30 top stories
    for submission_id in submission_ids[:30]:
        # Make a separate API call for each submission,
        # building the URL from the ID stored in submission_ids
        url = ('https://hacker-news.firebaseio.com/v0/item/' +
               str(submission_id) + '.json')
        submission_r = requests.get(url)
        print(submission_r.status_code)
        response_dict = submission_r.json()

        # Build a dictionary for the submission currently being processed
        submission_dict = {
            'title': response_dict['title'],
            'link': 'http://news.ycombinator.com/item?id=' + str(submission_id),
            'comments': response_dict.get('descendants', 0)
        }
        submission_dicts.append(submission_dict)

    submission_dicts = sorted(submission_dicts, key=itemgetter('comments'),
                              reverse=True)

    for submission_dict in submission_dicts:
        print('\nTitle: ', submission_dict['title'])
        print('Discussion link: ', submission_dict['link'])
        print('Comments: ', submission_dict['comments'])
Data returned by the code above:

    [
      {"title": "Glitter bomb tricks parcel thieves", "link": "http://news.ycombinator.com/item?id=18706193", "comments": 304},
      {"title": "Stop Learning Frameworks", "link": "http://news.ycombinator.com/item?id=18706785", "comments": 175},
      {"title": "Reasons Python Sucks", "link": "http://news.ycombinator.com/item?id=18706174", "comments": 175},
      {"title": "I need to copy 2000+ DVDs in 3 days. What are my options?", "link": "http://news.ycombinator.com/item?id=18690587", "comments": 167},
      {"title": "SpaceX Is Raising $500M at a $30.5B Valuation", "link": "http://news.ycombinator.com/item?id=18706506", "comments": 139},
      .........
    ]
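The dict.get() and itemgetter() behavior can also be checked offline. The sketch below uses three made-up responses, one of which lacks the 'descendants' key, to show the default value and the sort order. The titles and counts are hypothetical, and no network call is made.

```python
from operator import itemgetter

# Hypothetical API responses; the second one has no 'descendants' key.
responses = [
    {"id": 1, "title": "First story", "descendants": 12},
    {"id": 2, "title": "Second story"},
    {"id": 3, "title": "Third story", "descendants": 40},
]

submission_dicts = []
for response_dict in responses:
    submission_dicts.append({
        "title": response_dict["title"],
        # get() returns 0 when 'descendants' is absent instead of raising KeyError.
        "comments": response_dict.get("descendants", 0),
    })

# itemgetter("comments") pulls the sort key out of each dictionary.
submission_dicts = sorted(submission_dicts, key=itemgetter("comments"), reverse=True)

titles_in_order = [d["title"] for d in submission_dicts]
print(titles_in_order)
```

Using response_dict["descendants"] directly would raise KeyError on the second entry, which is why the notes recommend get() whenever a key may be missing.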