Scraping 5i5j (我愛我家) Rental Listings with BeautifulSoup

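I had never been very comfortable with BeautifulSoup, and some friends and colleagues happened to be looking for a place to rent, so I wondered whether I could write a crawler to collect the listings myself. This script is the result. I mostly wrote it while following along with a book, and there is not much to explain, so the code is pasted below as-is.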

# coding=utf-8
import time

import pymysql
import requests
from bs4 import BeautifulSoup

BASE_URL = "https://sh.5i5j.com"

db = pymysql.connect(host="127.0.0.1", port=3306, user="root", password="root",
                     database="woaiwojia", charset="utf8")
cursor = db.cursor()

# Parameterized statement: the driver escapes the values, so quotes in the
# scraped text cannot break the query.
sql = ("insert into zufang(area,xiaoqu,community,hot,price,lxrphone) "
       "VALUES (%s,%s,%s,%s,%s,%s)")

# Walk result pages 1 through 80 of the Shanghai rental listings.
for num in range(1, 81):
    url = BASE_URL + "/zufang/o8r1u1n" + str(num) + "/"
    time.sleep(10)  # pause between list pages
    strhtml = requests.get(url)
    fanlist = BeautifulSoup(strhtml.text, "lxml")
    for ul in fanlist.find_all("ul", {"class": "pList"}):
        for li in ul.find_all(name="li"):
            for div in li.find_all("div", {"class": "listCon"}):
                xiaoqu = div.h3.a.string
                # Fetch the detail page, which holds the div#housebroker block.
                detailUrl = BASE_URL + div.h3.a.attrs['href']
                detailhtml = requests.get(detailUrl)
                detail = BeautifulSoup(detailhtml.text, "lxml")
                for div1 in div.find_all("div", {"class": "listX"}):
                    paragraphs = div1.find_all("p")
                    area = paragraphs[0].text
                    community = paragraphs[1].text
                    hot = paragraphs[2].text
                    price = div1.find_all("div", {"class": "jia"})[0].p.strong.string
                    for uldiv in detail.find_all("div", {"id": "housebroker"}):
                        # Renamed from "ul" so it no longer shadows the outer loop variable.
                        for broker_ul in uldiv.find_all("ul"):
                            lxrphone = broker_ul.h3.string + broker_ul.label.string
                            try:
                                cursor.execute(sql, (area, xiaoqu, community, hot, price, lxrphone))
                                db.commit()
                            except pymysql.MySQLError as e:
                                db.rollback()
                                print('Insert failed:', e)

db.close()
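
The parsing logic in the script above can be tried without hitting the site or a database. The sketch below runs the same selectors against a small inline HTML sample. The sample markup and its values are made up, inferred only from the class names the script looks for (`pList`, `listCon`, `listX`, `jia`), and `parse_listings` is a hypothetical helper name, so treat this as an illustration rather than a description of the real page.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: a guess at the structure implied by the selectors
# used in the script. The text values are invented for illustration.
SAMPLE_HTML = """
<ul class="pList">
  <li>
    <div class="listCon">
      <h3><a href="/zufang/12345.html">Sample Court 2BR</a></h3>
      <div class="listX">
        <p>2 rooms, 60 sqm</p>
        <p>Sample Court</p>
        <p>3 followers</p>
        <div class="jia"><p><strong>5000</strong> yuan/month</p></div>
      </div>
    </div>
  </li>
</ul>
"""

BASE_URL = "https://sh.5i5j.com"


def parse_listings(html):
    """Return one dict per listing card found in a result page."""
    # "html.parser" ships with Python, so lxml is not required for this demo.
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for ul in soup.find_all("ul", {"class": "pList"}):
        for card in ul.find_all("div", {"class": "listCon"}):
            link = card.h3.a
            info = card.find("div", {"class": "listX"})
            # find_all returns <p> tags in document order, so the first three
            # are the description lines and the price <p> comes last.
            paragraphs = info.find_all("p")
            price_tag = info.find("div", {"class": "jia"}).p.strong
            rows.append({
                "xiaoqu": link.get_text(strip=True),
                "detail_url": BASE_URL + link["href"],
                "area": paragraphs[0].get_text(strip=True),
                "community": paragraphs[1].get_text(strip=True),
                "hot": paragraphs[2].get_text(strip=True),
                "price": price_tag.get_text(strip=True),
            })
    return rows


rows = parse_listings(SAMPLE_HTML)
for row in rows:
    print(row)
```

Keeping the parsing in a function that takes an HTML string makes it easy to check the selectors against a saved page before running the full 80-page crawl.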

If you have any questions or suggestions, feel free to leave a comment.