
Selenium example: logging in to a website and getting its cookies

Test site (CASICloud, 航天雲網):

http://cas.casicloud.com/login?service=http%3A%2F%2Fin.casicloud.com%2Floginc%3Fservice%3D%252Fsso%252Flogin.jsp%253Fredirect%253Dhttp%25253A%25252F%25252Fwww.casicloud.com%25252Floginc%25253Fret%25253Dhttp%2525253A%2525252F%2525252Fwww.casicloud.com%2525252F

[Figure: screenshot of the login page]


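First, the CAPTCHA:

Fortunately, analysis shows that this site's CAPTCHA does not need OCR. Once the page's JavaScript has run, the CAPTCHA value is stored in the value attribute of a hidden input of the form <input type="hidden" id="randomString" value="...">. So we only need to open the login page in a real browser, read the JS-rendered page, and pull the value out with a regular expression or XPath.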
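As an alternative to a regular expression or XPath, the hidden value can also be read with an HTML parser. The following is only a minimal sketch using the standard library; the class name `HiddenInputFinder`, the helper `find_hidden_value`, and the sample markup are made up for illustration and are not taken from the real site.

```python
from html.parser import HTMLParser


class HiddenInputFinder(HTMLParser):
    """Records the value attribute of the <input> whose id equals target_id."""

    def __init__(self, target_id):
        super().__init__()
        self.target_id = target_id
        self.value = None

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag just opened
        attributes = dict(attrs)
        if tag == "input" and attributes.get("id") == self.target_id:
            self.value = attributes.get("value")


def find_hidden_value(html, target_id="randomString"):
    """Return the value of the matching input, or None if it is absent."""
    finder = HiddenInputFinder(target_id)
    finder.feed(html)
    return finder.value


# Made-up sample markup, not copied from the real site
sample_html = '<form><input type="hidden" id="randomString" value="4821"></form>'
print(find_hidden_value(sample_html))
```

Unlike a fixed regular expression, this does not depend on the order in which the attributes appear in the tag.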

Now for the steps:

1. Set up the browser and open the login page:

from selenium import webdriver

url = '***'

driver = webdriver.Chrome()
driver.get(url)

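2. I recommend giving the page's JS time to load (I usually use 3-5 seconds). Note that implicitly_wait does not pause the script: it sets how long each element lookup keeps polling before it fails.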

driver.implicitly_wait(5)
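If the hidden value is filled in a little after the page loads, polling for it is more reliable than a fixed delay. Selenium ships its own `WebDriverWait` class for this purpose; the sketch below only illustrates the underlying idea in plain Python. The helper name `wait_until` and its defaults are assumptions, and the driver call in the final comment is not executed here.

```python
import time


def wait_until(condition, timeout=5.0, interval=0.2):
    """Call condition() repeatedly until it returns a truthy value.

    Raises TimeoutError if no truthy value is seen before timeout elapses.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("condition not met within %.1f seconds" % timeout)
        time.sleep(interval)


# Hypothetical usage with a Selenium driver (not executed here):
# value = wait_until(
#     lambda: driver.find_element("id", "randomString").get_attribute("value")
# )
```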

3. Fill in the account name and password in the corresponding form fields:

from selenium.webdriver.common.by import By

# find_element_by_xpath was removed in Selenium 4; find_element(By.XPATH, ...) works in old and new versions
driver.find_element(By.XPATH, '//*[@id="shortAccount"]').send_keys('your_account')
driver.find_element(By.XPATH, '//*[@id="password"]').send_keys('your_password')

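4. Get the CAPTCHA value from the JS-rendered page:

import re

html = driver.page_source
# Capture the four digits stored in the hidden randomString input
check_value = re.search(r'<input type="hidden" id="randomString" value="(\d\d\d\d)"', html).group(1)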
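One caveat with the one-liner above: `re.search` returns `None` when nothing matches, so `.group(1)` then fails with an `AttributeError`. A slightly more defensive variant might look like the sketch below. The function name `extract_captcha` and the sample markup are hypothetical, and the pattern still assumes the attribute order shown above (type, id, value).

```python
import re

# Matches the hidden input and captures its four digits; assumes the
# attribute order type, id, value as in the pattern used in the article.
CAPTCHA_PATTERN = re.compile(
    r'<input\s+type="hidden"\s+id="randomString"\s+value="(\d{4})"'
)


def extract_captcha(html):
    """Return the captured digits, or raise ValueError with a clear message."""
    match = CAPTCHA_PATTERN.search(html)
    if match is None:
        raise ValueError("randomString input not found or its value is not set yet")
    return match.group(1)


# Made-up sample markup for illustration
sample_html = '<input type="hidden" id="randomString" value="4821">'
print(extract_captcha(sample_html))
```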

5. Fill in the CAPTCHA, log in, and get the cookies:

# Requires: from selenium.webdriver.common.by import By
driver.find_element(By.XPATH, '//*[@id="code0"]').send_keys(check_value)
driver.find_element(By.XPATH, '//*[@id="loginForm"]/div[6]/input').click()

driver.refresh()
cookies = driver.get_cookies()
ret = ''
for cookie in cookies:
    # Build "name=value; " pairs in the form used by a Cookie header
    ret += cookie['name'] + '=' + cookie['value'] + '; '
print(ret)
driver.quit()

The page refresh (refresh) above is just a personal habit.
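The cookie loop above leaves a trailing "; " at the end of the string. Joining the pairs avoids that. This is a small sketch; the helper name `cookies_to_header` and the sample cookies are invented for illustration, and the dicts only mimic the shape returned by `driver.get_cookies()`.

```python
def cookies_to_header(cookies):
    """Join Selenium-style cookie dicts into a single Cookie header value."""
    return "; ".join("%s=%s" % (c["name"], c["value"]) for c in cookies)


# Invented sample data in the shape of driver.get_cookies() output
sample_cookies = [
    {"name": "JSESSIONID", "value": "abc123", "domain": "example.com"},
    {"name": "token", "value": "xyz", "domain": "example.com"},
]
print(cookies_to_header(sample_cookies))  # JSESSIONID=abc123; token=xyz
```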

Tidied up, the complete code looks like this:

# coding: utf-8
import re

from selenium import webdriver
from selenium.webdriver.common.by import By


def login_get_cookie(url):
    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Element lookups poll for up to 5 seconds before failing
        driver.implicitly_wait(5)
        driver.find_element(By.XPATH, '//*[@id="shortAccount"]').send_keys('your_account')
        driver.find_element(By.XPATH, '//*[@id="password"]').send_keys('your_password')
        # Read the CAPTCHA value from the JS-rendered page source
        html = driver.page_source
        check_value = re.search(r'<input type="hidden" id="randomString" value="(\d\d\d\d)"', html).group(1)
        driver.find_element(By.XPATH, '//*[@id="code0"]').send_keys(check_value)
        driver.find_element(By.XPATH, '//*[@id="loginForm"]/div[6]/input').click()
        driver.refresh()
        # Build "name=value; " pairs in the form used by a Cookie header
        ret = ''
        for cookie in driver.get_cookies():
            ret += cookie['name'] + '=' + cookie['value'] + '; '
        print(ret)
        return ret
    finally:
        # Close the browser even if one of the steps above fails
        driver.quit()


url = 'http://cas.casicloud.com/login?service=http%3A%2F%2Fin.casicloud.com%2Floginc%3Fservice%3D%252Fsso%252Flogin.jsp%253Fredirect%253Dhttp%25253A%25252F%25252Fwww.casicloud.com%25252Floginc%25253Fret%25253Dhttp%2525253A%2525252F%2525252Fwww.casicloud.com%2525252F'
cookies = login_get_cookie(url)
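Once the cookie string is available, it can be attached to ordinary HTTP requests made without a browser. The sketch below uses only the standard library; the helper name `build_request`, the target URL, and the cookie values are placeholders chosen for illustration, and no request is actually sent.

```python
import urllib.request


def build_request(url, cookie_header):
    """Create a Request carrying the cookie string; nothing is sent until urlopen()."""
    request = urllib.request.Request(url)
    request.add_header("Cookie", cookie_header)
    return request


# Placeholder values for illustration only
request = build_request("http://www.casicloud.com/", "JSESSIONID=abc123; token=xyz")
print(request.get_header("Cookie"))
# To actually fetch the page: urllib.request.urlopen(request).read()
```

Whether the site accepts the session this way depends on its server-side checks, which the article does not cover.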