Python Test Tool Development Interview Guide 3: Web Scraping
Print a website's response headers with requests
Print the response headers of https://china-testing.github.io/.
- Reference answer
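    In [1]: import requests

    In [2]: url = 'https://china-testing.github.io/'

    In [3]: response = requests.get(url)

    In [4]: response.request.headers
    Out[4]: {'User-Agent': 'python-requests/2.18.4', 'Accept-Encoding': 'gzip, deflate', 'Accept': '*/*', 'Connection': 'keep-alive'}

Note that response.request.headers holds the headers requests sent with the request. The headers returned by the server are in response.headers.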
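requests is one of the most important libraries for HTTP access. Its most commonly used attributes include response.status_code and response.text.
Further reading: python工具庫介紹-requests:人性化的HTTP (an introduction to the requests library, "HTTP for Humans")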
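A response object carries both the headers the server returned (response.headers) and the headers that were sent (response.request.headers), alongside status_code and text. The sketch below gathers them in one place. The helper names summarize_response and fetch_summary are hypothetical and not part of requests, and the 10-second timeout is an assumed value. The stand-in object uses made-up values so the helper can be tried without network access.

```python
from types import SimpleNamespace


def summarize_response(response):
    """Collect commonly used attributes from a requests-style response."""
    return {
        "status_code": response.status_code,
        # headers returned by the server
        "response_headers": dict(response.headers),
        # headers that were sent with the request
        "request_headers": dict(response.request.headers),
        "text_preview": response.text[:40],
    }


def fetch_summary(url, timeout=10):
    """Fetch url with requests and summarize the response (needs network)."""
    import requests  # imported lazily so summarize_response runs without it

    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raise on 4xx/5xx instead of reading an error page
    return summarize_response(response)


# Stand-in object with made-up values, used to try the helper offline.
fake = SimpleNamespace(
    status_code=200,
    headers={"Content-Type": "text/html; charset=utf-8"},
    request=SimpleNamespace(headers={"User-Agent": "python-requests/2.18.4"}),
    text="<html><body>hello</body></html>",
)
summary = summarize_response(fake)
print(summary["status_code"])
print(summary["response_headers"])
print(summary["request_headers"])
```

With network access, fetch_summary('https://china-testing.github.io/') should return the same kind of dictionary for the live site.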
Scrape blog titles with Requests and BeautifulSoup
Scrape the blog titles on the home page of https://china-testing.github.io/ (10 in total).
- Reference answer
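    # -*- coding: utf-8 -*-
    # Discussion: DingTalk free group 21745728; QQ groups 144081101, 567351477
    # CreateDate: 2018-10-16

    import requests
    from bs4 import BeautifulSoup


    def get_upcoming_events(url):
        """Print the title of every <article> on the page at url."""
        response = requests.get(url)
        # lxml copes with large pages better than the built-in html.parser
        soup = BeautifulSoup(response.text, 'lxml')
        # find_all is the current spelling of the legacy findAll alias
        for event in soup.find_all('article'):
            # each post title is the link text inside the article's <h1>
            event_details = {'name': event.find('h1').find('a').text}
            print(event_details)


    get_upcoming_events('https://china-testing.github.io/')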
Output:

    $ python3 blogs.py
    {'name': '介面自動化效能測試線上培訓大綱'}
    {'name': '2018最佳人工智慧影象處理工具OpenCV書籍下載'}
    {'name': 'IBM開發社群python精品文章彙總'}
    {'name': 'python工具庫介紹-requests:人性化的HTTP'}
    {'name': '中草藥的故事-金銀花(標準中藥)- 清熱解毒,疏散風熱'}
    {'name': '中草藥的故事-合歡花(標準中藥)'}
    {'name': '中草藥的故事-吳茱萸(標準中藥)'}
    {'name': '[雪峰磁針石部落格]python3快速入門教程9重要的標準庫-高階篇'}
    {'name': '[雪峰磁針石部落格]python3快速入門教程11命令列自動化工具與pexpect'}
    {'name': '[雪峰磁針石部落格]python3快速入門教程9重要的標準庫-基礎篇'}
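html.parser is the parser that ships with Python's standard library (when no parser is named, BeautifulSoup picks the best one installed). It struggles with large pages, which is why lxml is used here. The html5lib parser behaves much like a browser.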
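To show what a parser is doing underneath, here is a rough sketch of the same "article, then h1, then a" title extraction written directly against the standard library's html.parser.HTMLParser instead of BeautifulSoup. This is a swapped-in technique for illustration, not the answer's method. The TitleParser class, the extract_titles helper, and the sample HTML are all made up for this example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text of <a> elements nested in <article><h1>."""

    TRACKED = ("article", "h1", "a")

    def __init__(self):
        super().__init__()
        self.stack = []      # open tags among TRACKED, outermost first
        self.titles = []     # collected title strings
        self._buffer = None  # text pieces of the title being read, if any

    def handle_starttag(self, tag, attrs):
        if tag in self.TRACKED:
            self.stack.append(tag)
            # start collecting only for <a> directly under <article><h1>
            if self.stack[-3:] == ["article", "h1", "a"]:
                self._buffer = []

    def handle_endtag(self, tag):
        if tag in self.TRACKED and self.stack and self.stack[-1] == tag:
            if tag == "a" and self._buffer is not None:
                self.titles.append("".join(self._buffer).strip())
                self._buffer = None
            self.stack.pop()

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)


def extract_titles(html):
    """Return the title link texts found in html."""
    parser = TitleParser()
    parser.feed(html)
    parser.close()
    return parser.titles


# Made-up sample page; the real home page markup may differ.
SAMPLE_HTML = """
<article><h1><a href="/post-1">First post</a></h1>
<p>body <a href="#">more</a></p></article>
<article><h1><a href="/post-2">Second post</a></h1></article>
"""

titles = extract_titles(SAMPLE_HTML)
print(titles)
```

BeautifulSoup hides this bookkeeping behind find_all and find, which is the main reason to prefer it for scraping tasks like the one above.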
Latest code:
https://github.com/china-testing/python-api-tesing/blob/master/python-automation-cook/ch3/blogs.py
Access https://httpbin.org/forms/post with selenium
Use selenium to open https://httpbin.org/forms/post and fill in the form.

- Reference answer
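    # Discussion: DingTalk free group 21745728; QQ groups 144081101, 567351477
    # CreateDate: 2018-10-16

    import time

    from selenium import webdriver
    from selenium.webdriver.common.by import By

    browser = webdriver.Chrome()
    browser.get('https://httpbin.org/forms/post')

    # find_element(By.NAME, ...) replaces find_element_by_name,
    # which newer Selenium 4 releases removed.
    custname = browser.find_element(By.NAME, 'custname')
    custname.clear()
    custname.send_keys('python测试开发')
    time.sleep(2)  # pause so the change is visible in the browser

    # Select the "medium" radio button
    for size_element in browser.find_elements(By.NAME, 'size'):
        if size_element.get_attribute('value') == 'medium':
            size_element.click()
    time.sleep(2)

    # Tick the "bacon" and "cheese" checkboxes
    for topping in browser.find_elements(By.NAME, 'topping'):
        if topping.get_attribute('value') in ['bacon', 'cheese']:
            topping.click()
    time.sleep(2)

    # Submit the form
    browser.find_element(By.TAG_NAME, 'form').submit()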
Output:

    {
      "args": {},
      "data": "",
      "files": {},
      "form": {
        "comments": "",
        "custemail": "",
        "custname": "python\u6d4b\u8bd5\u5f00\u53d1",
        "custtel": "",
        "delivery": "",
        "size": "medium",
        "topping": [
          "bacon",
          "cheese"
        ]
      },
      "headers": {
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8",
        "Accept-Encoding": "gzip, deflate, br",
        "Accept-Language": "zh-CN,zh;q=0.9",
        "Cache-Control": "max-age=0",
        "Connection": "close",
        "Content-Length": "132",
        "Content-Type": "application/x-www-form-urlencoded",
        "Host": "httpbin.org",
        "Origin": "https://httpbin.org",
        "Referer": "https://httpbin.org/forms/post",
        "Upgrade-Insecure-Requests": "1",
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36"
      },
      "json": null,
      "origin": "183.62.236.90",
      "url": "https://httpbin.org/post"
    }
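The Content-Type and Content-Length headers in this output describe the form body the browser sent. As a small worked check, the sketch below rebuilds that body with the standard library's urllib.parse.urlencode from the field names and values shown under "form". The field order is an assumption (it does not affect the length), and the customer name is written with the same \u escapes as the output.

```python
from urllib.parse import urlencode

# Field names and values taken from the "form" section of the output.
# The order is assumed; a repeated key encodes a multi-valued checkbox.
fields = [
    ("custname", "python\u6d4b\u8bd5\u5f00\u53d1"),
    ("custtel", ""),
    ("custemail", ""),
    ("size", "medium"),
    ("topping", "bacon"),
    ("topping", "cheese"),
    ("delivery", ""),
    ("comments", ""),
]

# application/x-www-form-urlencoded: key=value pairs joined by "&",
# with non-ASCII characters percent-encoded as UTF-8 bytes.
body = urlencode(fields)
print(body)
print(len(body))  # 132, the Content-Length reported above
```

requests accepts the same list of pairs through its data parameter, so something like requests.post('https://httpbin.org/post', data=fields) should replay the submission without a browser.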