
Installing and Using the BeautifulSoup Library on macOS

Introduction to BeautifulSoup


BeautifulSoup is a powerful third-party Python library that parses HTML and extracts information from it.

Installing BeautifulSoup


  • Open a terminal and enter the command:
pip3 install beautifulsoup4
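
To confirm the install worked, one option is to check whether Python can locate the module. The sketch below uses only the standard library; `is_installed` is a hypothetical helper name introduced here for illustration. Note that the package is installed as `beautifulsoup4` but imported as `bs4`.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found on the current Python path."""
    return importlib.util.find_spec(module_name) is not None


# The pip package is "beautifulsoup4", but the importable module is "bs4".
print("bs4 available:", is_installed("bs4"))
```

If this prints `False`, the likely cause is that `pip3` installed into a different Python interpreter than the one running the check.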

A Quick Test of the BeautifulSoup Library


  • View the demo page's source code:

  • Use the requests library to fetch the source code (stored in the variable demo):
>>> import requests
>>> r = requests.get("http://python123.io/ws/demo.html")
>>> r.text
'<html><head><title>This is a python demo page</title></head>\r\n<body>\r\n<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:\r\n<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and <a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n</body></html>'
>>> demo = r.text
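
The interactive session above assumes the request succeeds. A slightly more defensive version might look like the sketch below. `fetch_html` is a hypothetical helper name, and the 10-second timeout is an arbitrary choice made for this example, not something the original session used.

```python
import requests


def fetch_html(url: str, timeout: float = 10.0) -> str:
    """Fetch a page and return its body as text (hypothetical helper)."""
    r = requests.get(url, timeout=timeout)
    r.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    return r.text
```

Usage would be `demo = fetch_html("http://python123.io/ws/demo.html")`, which yields the same string as `r.text` above when the request succeeds.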
  • Import the BeautifulSoup library:
>>> from bs4 import BeautifulSoup
>>> 
  • Parse the HTML with the BeautifulSoup library:
>>> demo = r.text
>>> soup = BeautifulSoup(demo,'html.parser')
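>>> # Note: prettify is a method. Without parentheses, print() shows the bound-method repr below; call soup.prettify() to get the indented HTML string.
>>> print(soup.prettify)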
<bound method Tag.prettify of <html><head><title>This is a python demo page</title></head>
<body>
<p class="title"><b>The demo python introduces several python courses.</b></p>
<p class="course">Python is a wonderful general-purpose programming language. You can learn Python from novice to professional by tracking the following courses:
<a class="py1" href="http://www.icourse163.org/course/BIT-268001" id="link1">Basic Python</a> and <a class="py2" href="http://www.icourse163.org/course/BIT-1001870001" id="link2">Advanced Python</a>.</p>
</body></html>>
>>> 
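
Once the page is parsed, information can be pulled out of the `soup` object. The following is a minimal sketch rather than a complete tour of the API: it embeds the demo page's HTML as a string (copied from the `r.text` output above) so that it runs without network access, then reads the title and the two links. The variable names `title_text`, `links`, and `pretty` are introduced here for illustration.

```python
from bs4 import BeautifulSoup

# HTML of the demo page, copied from the r.text output so no request is needed.
demo = (
    '<html><head><title>This is a python demo page</title></head>\r\n'
    '<body>\r\n'
    '<p class="title"><b>The demo python introduces several python courses.</b></p>\r\n'
    '<p class="course">Python is a wonderful general-purpose programming language. '
    'You can learn Python from novice to professional by tracking the following courses:\r\n'
    '<a href="http://www.icourse163.org/course/BIT-268001" class="py1" id="link1">Basic Python</a> and '
    '<a href="http://www.icourse163.org/course/BIT-1001870001" class="py2" id="link2">Advanced Python</a>.</p>\r\n'
    '</body></html>'
)

soup = BeautifulSoup(demo, "html.parser")

# Text inside the <title> tag.
title_text = soup.title.string

# (link text, href) for every <a> tag, in document order.
links = [(a.get_text(), a["href"]) for a in soup.find_all("a")]

# prettify() must be called (with parentheses) to get the indented HTML as a string.
pretty = soup.prettify()

print(title_text)
for text, href in links:
    print(text, "->", href)
```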

How Do You Use the BeautifulSoup Library?

  • Code skeleton:
from bs4 import BeautifulSoup
soup = BeautifulSoup('<p>data</p>','html.parser')
  • The BeautifulSoup constructor takes two arguments here:
    • The first is the HTML-formatted content to parse.
    • The second is the name of the parser to use.
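
As a small illustration of the two arguments, the sketch below parses the same `<p>data</p>` snippet and inspects the resulting tag. `'html.parser'` is the parser that ships with Python's standard library; other parsers (for example `lxml`) have to be installed separately and are not assumed here. The variable names `markup` and `tag` are chosen for this example.

```python
from bs4 import BeautifulSoup

markup = "<p>data</p>"                       # first argument: the HTML to parse
soup = BeautifulSoup(markup, "html.parser")  # second argument: the parser name

tag = soup.p        # the first <p> tag in the document
print(tag.name)     # the tag's name
print(tag.string)   # the text inside the tag
```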