Python Crawlers: GET and POST Requests in Scrapy
阿新 • Published: 2018-11-08
Scrapy request class hierarchy
Request
|-- FormRequest
The examples below are tested against these endpoints:
GET: https://httpbin.org/get
POST: https://httpbin.org/post
GET requests
Approach: send with Request
import json

from scrapy import Spider, Request, cmdline


class SpiderRequest(Spider):
    name = "spider_request"

    def start_requests(self):
        # One parameter goes in the query string, another in the request body.
        url = "https://httpbin.org/get?name=tom"
        yield Request(url, body=json.dumps({"age": "23"}))

    def parse(self, response):
        # httpbin echoes what it received back as JSON.
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_request".split())
The response echoes the name parameter from the URL query string, but not the age parameter placed in the body:
"args": {
    "name": "tom"
},
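Since only the query string shows up under args, it can help to build that query string programmatically rather than by hand. A minimal sketch using only the standard library; `build_get_url` is a hypothetical helper name introduced here for illustration, not part of Scrapy:

```python
from urllib.parse import urlencode


def build_get_url(base_url, params):
    """Append params to base_url as a URL-encoded query string."""
    # Use "&" if base_url already carries a query string, "?" otherwise.
    separator = "&" if "?" in base_url else "?"
    return base_url + separator + urlencode(params)


# The resulting URL can be passed to Request(url) in start_requests.
url = build_get_url("https://httpbin.org/get", {"name": "tom"})
```

Values are percent-encoded by `urlencode`, so parameters containing spaces or non-ASCII characters stay valid in the URL.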
POST requests
Approach 1: send with FormRequest
from scrapy import Spider, cmdline, FormRequest


class SpiderFormData(Spider):
    name = "spider_form_data"

    def start_requests(self):
        url = "https://httpbin.org/post"
        # formdata is URL-encoded into the request body.
        yield FormRequest(url, formdata={"name": "Tom"})

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_form_data".split())
The server receives the parameter under form:
"form": {
    "name": "Tom"
},
and the request headers include this entry:
"headers": {
    "Content-Type": "application/x-www-form-urlencoded",
},
Approach 2: send with Request
This requires passing the argument method="POST".
import json

from scrapy import Spider, Request, cmdline


class SpiderPost(Spider):
    name = "spider_post"

    def start_requests(self):
        url = "https://httpbin.org/post"
        # Request defaults to GET, so the method must be set explicitly.
        yield Request(url, method="POST", body=json.dumps({"name": "Tom"}))

    def parse(self, response):
        print(response.text)


if __name__ == '__main__':
    cmdline.execute("scrapy crawl spider_post".split())
1. If the POST request is sent as-is, the server receives the parameters under data and json:
"data": "{\"name\": \"Tom\"}",
"form": {},
"json": {
    "name": "Tom"
},
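2. If this headers argument is added:
"headers": {
    "Content-Type": "application/x-www-form-urlencoded",
},
the server treats the body as form data, which is the submission type FormRequest uses. Because the body is still a JSON string, the whole string ends up as a single form key with an empty value:
"data": "",
"form": {
    "{\"name\": \"Tom\"}": ""
},
"json": null,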
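3. If this headers argument is added:
"headers": {
    "Content-Type": "application/json",
},
the server receives the parameters under data and json, the same as in case 1. Note, however, that requests sometimes fail when this header is left out.
"data": "{\"name\": \"Tom\"}",
"form": {},
"json": {
    "name": "Tom"
},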
Summary
Request method | Class used | headers argument | Parameters | Where the server receives them |
---|---|---|---|---|
GET | Request | - | ?name=tom | args |
POST | FormRequest | set by default | formdata={"name": "Tom"} | form |
POST | Request | - | body=json.dumps({"name": "Tom"}) | data, json |
POST | Request | "Content-Type": "application/x-www-form-urlencoded" | body=json.dumps({"name": "Tom"}) | form |
POST | Request | "Content-Type": "application/json" | body=json.dumps({"name": "Tom"}) | data, json |
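The three POST rows that use Request can be reproduced offline with a rough simulation of how the echo endpoint classifies a body. This is only a sketch of the behaviour observed in the responses above, not httpbin's actual code; `interpret_body` and `FORM_TYPE` are hypothetical names introduced for illustration:

```python
import json
from urllib.parse import parse_qsl

FORM_TYPE = "application/x-www-form-urlencoded"


def interpret_body(body, content_type=None):
    """Roughly mimic how the echo endpoint classified the POST bodies above."""
    if content_type == FORM_TYPE:
        # The body is split on "&" and "="; a string without "=" becomes
        # a single key with an empty value.
        form = dict(parse_qsl(body, keep_blank_values=True))
        return {"data": "", "form": form, "json": None}
    try:
        parsed = json.loads(body)
    except ValueError:
        parsed = None
    return {"data": body, "form": {}, "json": parsed}


body = json.dumps({"name": "Tom"})
no_header = interpret_body(body)                    # row 3: data, json
as_form = interpret_body(body, FORM_TYPE)           # row 4: form
as_json = interpret_body(body, "application/json")  # row 5: data, json
```

The form case shows why a JSON string sent with the form Content-Type appears as one odd-looking key: form decoding looks for `key=value` pairs and finds none.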