
Python: Scraping Population Migration Data (Using Tencent Migration as an Example)


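Notes:

1. The migration figures are values that Tencent has adjusted, so their accuracy cannot be verified.

2. While this code was being run, Tencent Migration had not set up IP blocking or browser detection, so the code below is only guaranteed to work around the time of publication.

3. What the code does: it scrapes the migration volume (both inbound and outbound) of roughly forty cities for one specified day.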
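Note 3 amounts to one request per (city, direction) pair for the chosen date. The following is a minimal sketch of how those URLs can be enumerated, using the URL pattern from the script in this post. The helper names `build_url` and `iter_urls` are hypothetical and not part of the original code, and the two city codes are just the examples shown in the script's comments.

```python
from itertools import product

# Endpoint taken from the script in this post; it may no longer respond.
BASE_URL = "https://lbs.gtimg.com/maplbs/qianxi/"


def build_url(date, city_code, direction):
    """Return the data URL for one city and one direction.

    direction is "0" for inbound or "1" for outbound, as in the script.
    """
    if direction not in ("0", "1"):
        raise ValueError("direction must be '0' or '1'")
    return f"{BASE_URL}{date}/{city_code}{direction}6.js"


def iter_urls(date, city_codes):
    """Yield inbound and outbound URLs for every city code."""
    for code, direction in product(city_codes, ("0", "1")):
        yield build_url(date, code, direction)


urls = list(iter_urls("20171016", ["360100", "360200"]))
for u in urls:
    print(u)
```

With about forty cities this produces about eighty URLs, one output file per URL in the script.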

import re
import urllib.request
import xlwt
import xlrd

date = "20171016"
cityList = xlrd.open_workbook("E:/city.xls").sheet_by_index(0).col_values(0)  # ['city', '南昌', '景德鎮', '萍鄉', ...
cityCodeList = xlrd.open_workbook("E:/city.xls").sheet_by_index(0).col_values(1)  # ['cityCode', '360100', '360200', ...
direction = ["0", "1"]  # "0" = inbound (result -> city), "1" = outbound (city -> result)
header = ["from", "to", "number", "car", "train", "plane"]
ptRow = re.compile(r'(\[".*?\])')  # matches one row, e.g. ["重慶",198867,0.000,0.300,0.700]

for cityIndex in range(1, len(cityCodeList)):  # index 0 holds the column title
    for dInd in range(2):
        url = "https://lbs.gtimg.com/maplbs/qianxi/" + date + "/" + cityCodeList[cityIndex] + direction[dInd] + "6.js"
        # One workbook per city and direction, with a header row
        workbook = xlwt.Workbook()
        sheet = workbook.add_sheet("result")
        for i in range(len(header)):
            sheet.write(0, i, header[i])
        try:
            data = urllib.request.urlopen(url).read().decode("utf8")  # JSONP_LOADER&&JSONP_LOADER([["重慶",198867,0.000,0.300,0.700],["上海",174152,0.160,0.390,0.450],[...
            dataList = re.findall(ptRow, data)  # ['["重慶",198867,0.000,0.300,0.700]', '["上海",174152,0.160,0.390,0.450]', ...
            for i in range(len(dataList)):
                colList = str(dataList[i]).split(",")  # colList[4] = 0.700]
                otherCity = str(colList[0]).replace("[", "").replace('"', "")  # strip the leading [ and the quotes
                if direction[dInd] == "0":
                    sheet.write(i + 1, 0, otherCity)            # from
                    sheet.write(i + 1, 1, cityList[cityIndex])  # to
                else:
                    sheet.write(i + 1, 0, cityList[cityIndex])  # from
                    sheet.write(i + 1, 1, otherCity)            # to
                sheet.write(i + 1, 2, colList[1])  # number
                sheet.write(i + 1, 3, colList[2])  # car
                sheet.write(i + 1, 4, colList[3])  # train
                sheet.write(i + 1, 5, str(colList[4]).replace("]", ""))  # plane
        except Exception as e:
            print(e)
        workbook.save("E:/qianxi/" + str(cityList[cityIndex]) + direction[dInd] + date + ".xls")
print("Done!")
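The script splits each row with a regular expression and string replacement, which leaves every value as a string. As an alternative, and not the author's method, the payload can be parsed with the standard `json` module once the `JSONP_LOADER&&JSONP_LOADER(` wrapper is stripped. The sketch below assumes the response body is a JSON array of rows closed by `])`; the post only shows a truncated sample, so the closing of `SAMPLE` is an assumption. The helper names `parse_jsonp_rows` and `to_records` are hypothetical.

```python
import json

# Sample built from the response excerpt in the script's comments.
# The closing "])" is an assumption; the post truncates the sample.
SAMPLE = ('JSONP_LOADER&&JSONP_LOADER([["重慶",198867,0.000,0.300,0.700],'
          '["上海",174152,0.160,0.390,0.450]])')


def parse_jsonp_rows(text):
    """Extract the outermost JSON array from a JSONP response and decode it."""
    start = text.find("[")
    end = text.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in response")
    return json.loads(text[start:end + 1])


def to_records(rows, city, direction):
    """Turn raw rows into (from, to, number, car, train, plane) tuples.

    direction "0" is inbound (other city -> city);
    direction "1" is outbound (city -> other city).
    """
    records = []
    for other, number, car, train, plane in rows:
        origin, dest = (other, city) if direction == "0" else (city, other)
        records.append((origin, dest, number, car, train, plane))
    return records


rows = parse_jsonp_rows(SAMPLE)
inbound = to_records(rows, "南昌", "0")
outbound = to_records(rows, "南昌", "1")
print(inbound[0])
```

Parsing this way keeps `number` as an integer and the three shares as floats, so they can be written to the sheet as numeric cells rather than text.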

Results:

[Two screenshots of the output spreadsheets; images not preserved]
