
urllib Library: Parsing the Robots Protocol

from urllib.robotparser import RobotFileParser
import ssl
from urllib.request import urlopen

# Skip HTTPS certificate verification (insecure; acceptable only for quick tests)
ssl._create_default_https_context = ssl._create_unverified_context

rp = RobotFileParser()
rp.set_url('http://www.jianshu.com/robots.txt')
rp.read()  # download robots.txt and parse it

# can_fetch(useragent, url) takes two separate arguments
print(rp.can_fetch('*', 'http://www.jianshu.com/p/b6755402d7d'))
print(rp.can_fetch('*', 'http://www.jianshu.com/search?q=python&page=1&type=note'))

Reading and parsing with parse()

rp = RobotFileParser()
# parse() takes the lines of robots.txt, so fetch the file and split it manually
rp.parse(urlopen('http://www.jianshu.com/robots.txt').read().decode('utf-8').split('\n'))
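
Because parse() only needs an iterable of lines, it can also be tried without any network access. The following is a minimal sketch under stated assumptions: the robots.txt rules and the www.example.com URLs are made up for illustration and are not the real rules served by jianshu.com.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, invented for this example only
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Allow: /p/
"""

rp = RobotFileParser()
# parse() expects an iterable of lines, so split the text first
rp.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs: one article page and one search page
allowed_article = rp.can_fetch('*', 'http://www.example.com/p/abc123')
allowed_search = rp.can_fetch('*', 'http://www.example.com/search?q=python')
print(allowed_article, allowed_search)
```

With these sample rules, the article URL should be reported as fetchable and the search URL as blocked, since only paths starting with /search are disallowed.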
