
Simple Data Scraping with Python


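It failed. Even with headers and parameters identical to what Firefox shows, it doesn't work: the request returns a page, but the page contains no data. I tried disabling JavaScript and then saw a cookie (with JavaScript enabled there is no cookie). Scraping with that cookie still failed, and when I checked again some time later the cookie's contents had not changed. A bit discouraging, but I'm posting this anyway as a small to-do for myself.

If any expert passes by, I'd appreciate your advice.

Also, the scripts for the other two sites work. I'll post that code shortly and skip the walkthrough.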


Target site: http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml

First, handle the conversion from a date to decimal years. This is a rough conversion with day-level precision only.

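def date2dy(year, month, day):
    # Days in each month of a common year
    months = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]
    oneyear = 365
    # Leap-year rule: century years must be divisible by 400, other years by 4
    if year % 100 == 0:
        if year % 400 == 0:
            months[1] = 29
            oneyear = 366
    else:
        if year % 4 == 0:
            months[1] = 29
            oneyear = 366

    # Count the whole days elapsed since January 1 of the given year
    days = 0
    i = 1
    while i < month:
        days = days + months[i - 1]  # months is 0-indexed, month numbers are 1-based
        i = i + 1
    days = days + day - 1
    return year + days / oneyear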
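As a cross-check on the conversion above, here is a minimal sketch that leans on the standard library instead of a hand-written month table. The function name `decimal_year` is my own and not part of the original script; like the version above, it only resolves to whole days.

```python
import calendar
from datetime import date


def decimal_year(year, month, day):
    """Convert a calendar date to a decimal year, to day precision."""
    days_in_year = 366 if calendar.isleap(year) else 365
    # tm_yday is 1-based, so January 1 maps to an offset of 0
    day_offset = date(year, month, day).timetuple().tm_yday - 1
    return year + day_offset / days_in_year


print(round(decimal_year(2016, 12, 1), 2))  # 2016.92
```

For December 1, 2016 this gives 2016 + 335/366, which rounds to the 2016.92 used as the date parameter later on.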

The first small goal is to fetch the data for December 1, 2016.

Open Firefox's developer tools with F12 and switch to the Network tab.


Submit the form and inspect the resulting request.


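The useful pieces of information are the request headers, the request URL, and the parameters. Copy them into a program and try it out.

I spent more than a day on this part and still couldn't fetch the data. (I'm such a noob.jpg)

Here is the code for now. If any expert passes by, advice is very welcome.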

#!/usr/bin/python

import requests
import sys

web_url = r'http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml'
request_url = r'http://www.geomag.bgs.ac.uk/cgi-bin/igrfsynth'

# Save the raw response next to the script
filepath = sys.path[0] + '\\data_igrf_raw_' + '.html'
fid = open(filepath, 'w', encoding='utf-8')

# Request headers copied from the Firefox network panel
headers = {
    'Host': 'www.geomag.bgs.ac.uk',
    'User-Agent': 'Mozilla/5.0 (Windows NT 6.1; rv:53.0) Gecko/20100101 Firefox/53.0',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'zh-CN,zh;q=0.8,en-US;q=0.5,en;q=0.3',
    'Accept-Encoding': 'gzip, deflate',
    'Content-Type': 'application/x-www-form-urlencoded',
    'Content-Length': '136',
    'Referer': 'http://www.geomag.bgs.ac.uk/data_service/models_compass/igrf_form.shtml',
    'Connection': 'keep-alive',
    'Upgrade-Insecure-Requests': '1'
}

# Form parameters copied from the submitted request
payload = {
    'name': '-',      # your name and email address
    'coord': '1',     # '1': Geodetic  '2': Geocentric
    'date': '2016.92',  # decimal years
    'alt': '150',     # Altitude
    'place': '',
    'degmin': 'y',    # Position Coordinates: 'y': In Degrees and Minutes  'n': In Decimal Degrees
    'latd': '60',     # latitude degrees (degrees negative for south)
    'latm': '0',      # latitude minutes
    'lond': '120',    # longitude degrees (degrees negative for west)
    'lonm': '0',      # longitude minutes
    'tot': 'y',       # Total Intensity (F)
    'dec': 'y',       # Declination (D)
    'inc': 'y',       # Inclination (I)
    'hor': 'y',       # Horizontal Intensity (H)
    'nor': 'y',       # North Component (X)
    'eas': 'y',       # East Component (Y)
    'ver': 'y',       # Vertical Component (Z)
    'map': '0',       # Include a Map of the Location: '0': NO  '1': YES
    'sv': 'n'
}
# If Secular Variation (rate of change) is needed, set 'sv': 'y'

r = requests.post(request_url, data=payload, headers=headers)
fid.write(r.text)
fid.close()
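One way to debug a request like this is to look at the encoded form body before anything is sent and compare it, byte for byte, with what the browser's network panel shows. The sketch below uses only the standard library and a shortened, illustrative payload (three of the fields above). It builds the request object without sending it. It is a debugging aid under those assumptions, not a fix for the problem described in this post.

```python
from urllib.parse import urlencode
from urllib.request import Request

request_url = 'http://www.geomag.bgs.ac.uk/cgi-bin/igrfsynth'

# Shortened, illustrative payload; the real form has more fields
payload = {'name': '-', 'coord': '1', 'date': '2016.92'}

# Encode the fields as application/x-www-form-urlencoded, as a browser would
body = urlencode(payload).encode('ascii')

# Build the request object only; nothing is sent over the network here
req = Request(
    request_url,
    data=body,
    headers={'Content-Type': 'application/x-www-form-urlencoded'},
)

print(req.get_method())  # POST, because data is not None
print(body.decode())     # the exact string that would be sent as the body
print(len(body))         # compare with the Content-Length seen in the browser
```

If the length or the field order differs from what the browser sends, that difference is a reasonable first thing to investigate.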
