
Python Data Crawler Study Notes (21): Fetching and Parsing JD.com Product JSON

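1. Requirement: given a JD.com product JSON link obtained by packet capture, parse the JSON content and extract the price p of the product with a specific id. The response, which wraps the JSON in a jQuery923933(...) callback, looks like this: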

jQuery923933([{"op":"7599.00","m":"9999.00","id":"J_5089253","p":"7099.00"},
{"op":"48.00","m":"96.00","id":"J_16463451903","p":"38.00"},
{"op":"59.00","m":"229.00","id":"J_33440061157","p":"59.00"},
{"op":"79.00","m":"80.00","id":"J_6027746","p":"79.00"},
{"op":"32.90","m":"59.00","id":"J_33183063203","p":"32.90"},
{"op":"169.00","m":"699.00","id":"J_33341525798","p":"169.00"},
{"op":"228.00","m":"399.00","id":"J_30639439257","p":"228.00"},
{"op":"188.00","m":"199.00","id":"J_25539002541","tpp":"130.00","up":"tpp","p":"138.00"},
{"op":"55.00","m":"99.00","id":"J_3136674","p":"39.90"},
{"op":"25.90","m":"55.90","id":"J_5338456","p":"22.50"},
{"op":"50.00","m":"50.00","id":"J_11170365589","p":"50.00"}]);

     Note that this JSON content is an array, enclosed in square brackets [ ], not an object enclosed in curly braces { }.
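The difference matters once the text is parsed: json.loads returns a Python list for a JSON array and a dict for a JSON object. A minimal sketch using rows copied from the sample above (the variable names are illustrative, not from the original notes):

```python
import json

# Two rows copied from the sample payload, as an array and as a single object.
array_text = (
    '[{"op":"7599.00","m":"9999.00","id":"J_5089253","p":"7099.00"},'
    '{"op":"48.00","m":"96.00","id":"J_16463451903","p":"38.00"}]'
)
object_text = '{"op":"7599.00","m":"9999.00","id":"J_5089253","p":"7099.00"}'

parsed_array = json.loads(array_text)    # JSON array  -> Python list
parsed_object = json.loads(object_text)  # JSON object -> Python dict

print(type(parsed_array).__name__)   # list
print(type(parsed_object).__name__)  # dict
# A list has to be indexed by position before a key lookup is possible.
print(parsed_array[0]["p"])          # 7099.00
```

Because the top level is a list, the price of one product cannot be read with a single key lookup; the items have to be scanned, or indexed by id first.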

2. Writing the code:

import urllib.request
import re
import json

# Fetch the JSON data (the response body is returned as bytes)
data=urllib.request.urlopen("https://p.3.cn/prices/mgets?callback=jQuery923933&type=1&area=1&pdtk=&pduid=15374502312291140901533&pdpin=&pin=null&pdbp=0&skuIds=J_5089253%2CJ_16463451903%2CJ_33440061157%2CJ_6027746%2CJ_33183063203%2CJ_33341525798%2CJ_30639439257%2CJ_25539002541%2CJ_3136674%2CJ_5338456%2CJ_11170365589&ext=11100000&source=item-pc").read()
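# Decode the bytes into a string; str(data) would keep the b'...' wrapper and escape sequences
str1 = data.decode("utf-8")
# Strip the useless parts of the string: here, everything outside the first '(' and the last ')'
str1 = str1[(str1.find('(') + 1):str1.rfind(')')]
# Convert the JSON text into Python data; here jdata is a list of dicts
jdata = json.loads(str1)
# Walk through the data and print the value of p for the specified id
for jdataObj in jdata:
    if jdataObj["id"] == "J_5089253":
        print(jdataObj["p"])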
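
The same parsing steps can be tried offline, without calling the JD.com endpoint. The sketch below is one possible variant, not the code from these notes: it removes the callback wrapper with the re module instead of find/rfind, and builds an id-to-price dict for direct lookup. The helper names parse_jsonp and prices_by_id are hypothetical, and SAMPLE is a shortened copy of the response shown in section 1.

```python
import json
import re

# Shortened copy of the sample response from section 1 (three of the eleven rows).
SAMPLE = (
    'jQuery923933([{"op":"7599.00","m":"9999.00","id":"J_5089253","p":"7099.00"},'
    '{"op":"48.00","m":"96.00","id":"J_16463451903","p":"38.00"},'
    '{"op":"55.00","m":"99.00","id":"J_3136674","p":"39.90"}]);\n'
)


def parse_jsonp(text):
    """Hypothetical helper: parse the JSON found inside callback(...)."""
    # Capture everything between the first '(' after the callback name
    # and the last ')' before the optional trailing semicolon.
    match = re.search(r"^\s*[\w$.]+\s*\((.*)\)\s*;?\s*$", text, re.DOTALL)
    if match is None:
        raise ValueError("text does not look like a JSONP response")
    return json.loads(match.group(1))


def prices_by_id(items):
    """Hypothetical helper: map each item's id to its price p."""
    return {item["id"]: item["p"] for item in items}


prices = prices_by_id(parse_jsonp(SAMPLE))
print(prices["J_5089253"])  # 7099.00
```

Building the dict once is convenient when several ids need to be looked up; for a single id, the loop in the original code is just as good.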

3. Additional notes: