Scraping full-resolution original images from Baidu Images with Python

阿新 • Published: 2018-11-13

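# coding=utf-8
"""
Scrape full-resolution original images from Baidu Images
Author          : MirrorMan
Created         : 2017-11-10
"""
import os
import re
import urllib.parse  # import the submodule explicitly; quote() lives here
import requests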


def get_onepage_urls(onepageurl):
    """Return (image URLs found on this page, URL of the next page or '')."""
    if not onepageurl:
        print('Finished: no more pages')
        return [], ''
    try:
        # Fetch the page once and reuse the text for both patterns
        resp = requests.get(onepageurl, timeout=15)
        resp.encoding = 'utf-8'
        html = resp.text
    except Exception as e:
        print(e)
        return [], ''
    # "objURL" holds the link to the original, full-size image
    pic_urls = re.findall(r'"objURL":"(.*?)",', html, re.S)
    # The link text is Simplified Chinese for "next page", as served by Baidu;
    # [^"]* keeps the match inside a single href attribute
    fanye_urls = re.findall(r'<a href="([^"]*)" class="n">下一页</a>', html)
    fanye_url = 'http://image.baidu.com' + fanye_urls[0] if fanye_urls else ''
    return pic_urls, fanye_url

def down_pic(pic_urls, localPath):
    """Given a list of image links, download each image into localPath."""
    os.makedirs(localPath, exist_ok=True)  # create the folder if it is missing
    for i, pic_url in enumerate(pic_urls, start=1):
        try:
            pic = requests.get(pic_url, timeout=15)
            pic.raise_for_status()  # do not save an error page as an image
            with open(os.path.join(localPath, '%d.jpg' % i), 'wb') as f:
                f.write(pic.content)
            print('Downloaded image %d: %s' % (i, pic_url))
        except Exception as e:
            print('Failed to download image %d: %s' % (i, pic_url))
            print(e)
            continue

if __name__ == '__main__':
    keyword = '鳴人'  # search keyword; change it to whatever you would type into Baidu Images
    url_init_first = r'http://image.baidu.com/search/flip?tn=baiduimage&ipn=r&ct=201326592&cl=2&lm=-1&st=-1&fm=result&fr=&sf=1&fmq=1497491098685_R&pv=&ic=0&nc=1&z=&se=1&showtab=0&fb=0&width=&height=&face=0&istype=2&ie=utf-8&ctd=1497491098685%5E00_1519X735&word='
    url_init = url_init_first + urllib.parse.quote(keyword, safe='/')
    all_pic_urls = []
    onepage_urls, fanye_url = get_onepage_urls(url_init)
    all_pic_urls.extend(onepage_urls)

    fanye_count = 1  # page the images are on; adjust this after a download finishes
    while True:
        onepage_urls, fanye_url = get_onepage_urls(fanye_url)
        fanye_count += 1
        print('Page %s' % fanye_count)
        # Stop once a page yields neither images nor a next-page link
        if fanye_url == '' and onepage_urls == []:
            break
        all_pic_urls.extend(onepage_urls)

    # set() removes duplicate links; the save location can be changed as well
    down_pic(list(set(all_pic_urls)), r'C:\Users\41174\AppData\Local\Temp\change.py\shrinkImage\\')
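
The page parsing can be tried offline before hitting the network. The following is only a minimal sketch: `SAMPLE_HTML` is a made-up snippet (a real Baidu response is far larger and its markup may have changed since the script was written), and `parse_page` is a hypothetical helper name. It assumes the next-page link text is the Simplified Chinese "下一页" and uses `[^"]*` to capture a single `href` value.

```python
import re

# Hypothetical sample for illustration only, not a real Baidu response.
SAMPLE_HTML = (
    '{"objURL":"http://example.com/a.jpg","fromURL":"x"},'
    '{"objURL":"http://example.com/b.png","fromURL":"y"}'
    '<a href="/search/flip?word=test&pn=20" class="n">下一页</a>'
)


def parse_page(html):
    """Return (image URLs, next-page URL or '') for one page of HTML."""
    # Each "objURL" value is taken to be a link to an original image.
    pic_urls = re.findall(r'"objURL":"(.*?)",', html, re.S)
    # Capture the href of the anchor whose text means "next page".
    next_links = re.findall(r'<a href="([^"]*)" class="n">下一页</a>', html)
    next_url = 'http://image.baidu.com' + next_links[0] if next_links else ''
    return pic_urls, next_url


if __name__ == '__main__':
    urls, next_url = parse_page(SAMPLE_HTML)
    print(urls)
    print(next_url)
```

If either pattern comes back empty on a saved copy of a real results page, the page markup has probably changed and the regular expressions need updating before the downloader is run.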

Reference: https://blog.csdn.net/xiligey1/article/details/73321152
