Parsing URLs with Python's urllib.parse

1. urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)

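  • Purpose: splits a URL into 6 components and returns them as a named tuple (ParseResult);
  • The components are scheme (protocol), netloc (server address, host:port), path (file path), params, query (query string), and fragment (anchor within the page).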
from urllib import parse
url = 'https://www.baidu.com/s?ie=utf-8&f=8&rsv_bp=0&rsv_idx=1&tn=baidu&wd=hello&rsv_pq=d0f841b10001fab6&rsv_t=2d43603JgfgVkvPtTiNX%2FIYssE6lWfmSKxVCtgi0Ix5w1mnjks2eEMG%2F0Gw&rqlang=cn&rsv_enter=1&rsv_sug3=6&rsv_sug1=4&rsv_sug7=101&rsv_sug2=0&inputT=838&rsv_sug4=1460'
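# urlparse returns a ParseResult named tuple with six fields
parsed_tuple = parse.urlparse(url)
print(parsed_tuple)
# netloc is the server address, here 'www.baidu.com'
print(parsed_tuple.netloc)
# path is the resource path, here '/s'
print(parsed_tuple.path)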
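
The example above only prints two of the six fields. As a minimal sketch, the following shows all six on a made-up URL (example.com, the port, and the path are assumptions chosen so that no field is empty; they are not from the original post):

```python
from urllib.parse import urlparse

# Hypothetical URL chosen so that all six components are non-empty.
result = urlparse('https://example.com:8080/docs/index.html;v=1?q=python&lang=en#intro')

print(result.scheme)    # https
print(result.netloc)    # example.com:8080
print(result.path)      # /docs/index.html
print(result.params)    # v=1
print(result.query)     # q=python&lang=en
print(result.fragment)  # intro

# hostname and port are convenience attributes derived from netloc.
print(result.hostname, result.port)  # example.com 8080
```

Because the result is a named tuple, it can also be unpacked positionally, for example `scheme, netloc, path, params, query, fragment = result`.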

urlencode: converts a dict of parameters into a URL query string

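from urllib.parse import urlencode

# Query parameters to append to the URL
params = {
    'name': 'westos',
    'age': 20
}
base_url = 'http://www.baidu.com?'
# urlencode joins the pairs as key=value separated by '&'
url = base_url + urlencode(params)
print(url)  # http://www.baidu.com?name=westos&age=20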
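
urlencode also escapes characters that are not safe in a query string. A small sketch, assuming made-up parameter values that contain a space, `&`, and `=`, and using parse_qs (also in urllib.parse) to decode the result again:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical values containing characters that must be escaped.
params = {'name': 'west os', 'q': 'a&b=c', 'age': 20}

query = urlencode(params)
print(query)  # name=west+os&q=a%26b%3Dc&age=20

# Round trip: parse the query string back into a dict.
url = 'http://www.baidu.com/s?' + query
decoded = parse_qs(urlparse(url).query)
print(decoded)  # {'name': ['west os'], 'q': ['a&b=c'], 'age': ['20']}
```

Note that parse_qs returns every value as a list of strings, so the integer 20 comes back as `['20']`.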

URL exception handling

- Exceptions
     exception urllib.error.URLError
     exception urllib.error.HTTPError
     exception urllib.error.ContentTooShortError(msg, content)
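
These three exceptions form a small hierarchy, which matters for the order of except clauses. The following sketch only inspects the classes and makes no network request:

```python
from urllib import error

# HTTPError and ContentTooShortError both derive from URLError,
# and URLError itself derives from OSError.
is_http_sub = issubclass(error.HTTPError, error.URLError)
is_short_sub = issubclass(error.ContentTooShortError, error.URLError)
is_os_sub = issubclass(error.URLError, OSError)

print(is_http_sub, is_short_sub, is_os_sub)  # True True True
```

Since `except error.URLError` would also match an HTTPError, the more specific HTTPError clause should come first.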

** Handling timeout exceptions
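from urllib import request, error
import socket

try:
    url = 'https://www.baidu.com'
    # A very short timeout (0.01 s) is used to provoke a timeout
    response = request.urlopen(url, timeout=0.01)
    print(response.read().decode('utf-8'))
except error.HTTPError as e:
    # HTTPError is a subclass of URLError, so it must be caught first
    print(e.reason, e.code, e.headers, sep='\n')
except error.URLError as e:
    print(e.reason)
    if isinstance(e.reason, socket.timeout):
        print("Timed out")
except socket.timeout:
    # A timeout while reading the body is raised directly, not wrapped in URLError
    print("Timed out")
else:
    print("Success")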
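
The branching logic above can be tried without a network connection by building the exceptions by hand. This is only a sketch: `describe` is a hypothetical helper (not part of urllib), and the URL, status code, and messages are made-up values for illustration.

```python
import io
import socket
from urllib import error


def describe(exc):
    """Hypothetical helper: classify an exception that urlopen() might raise."""
    # Check HTTPError first because it is a subclass of URLError.
    if isinstance(exc, error.HTTPError):
        return f'HTTP error {exc.code}: {exc.reason}'
    if isinstance(exc, error.URLError):
        # For a connection timeout, reason holds a socket.timeout instance.
        if isinstance(exc.reason, socket.timeout):
            return 'timed out'
        return f'URL error: {exc.reason}'
    return f'other error: {exc}'


# Hand-built exceptions standing in for real failures (illustrative values).
http_exc = error.HTTPError('http://example.com', 404, 'Not Found', {}, io.BytesIO())
timeout_exc = error.URLError(socket.timeout('timed out'))
other_exc = error.URLError('no host')

print(describe(http_exc))     # HTTP error 404: Not Found
print(describe(timeout_exc))  # timed out
print(describe(other_exc))    # URL error: no host
```

The checks run in the same order as the except clauses in the example, from most specific to least specific.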
