Python requests: log in to a certain finance BBS automatically, check in to collect copper coins, then add a plist so it runs every day

阿新 • Published: 2019-02-11

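Users of this finance site probably know that its “簽到有禮” (check-in rewards) board gets one new thread every day. Reply to it with the keyword given in the thread and you earn copper coins. Supposedly the coins can be exchanged for cash, and for books too.

Great, exactly what someone who quit without a new job lined up and now sits at home unemployed needs. I collect them every day.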

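The basic approach:

First, use a packet-capture tool to work out exactly which data is posted on login and on reply, and where that data comes from (I used Firefox + Firebug, which works well: turn on Persist and select All, and every request stays visible even across page redirects);
Use the Python requests library: requests.Session().post to log in and to reply, and get to read page content;
After logging in, fetch the HTML source of the BBS home page and use regex + BeautifulSoup to find the relative URL of the “簽到有禮” sub-board and its forum ID, then join the relative URL with the base URL to get the sub-board URL;
get the sub-board URL to fetch the HTML source of its first page, and use the same method to find the relative URL and thread ID of the day's check-in thread;
get the check-in thread URL, find the keyword in the thread, and post it as a reply;
Then follow the links to my own profile page and read off my copper coin count;
Finally, write the intermediate steps and states along the way to a log file, so problems can be traced afterwards.
Last of all, write a .sh script that runs this Python program and configure a matching plist so it runs automatically every day (Mac OS).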
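
The link-finding steps above can be tried offline with a small sketch. Treat it as an illustration only: the HTML fragment is made up, the Discuz-style `forum-<fid>-<page>.html` URL shape is an assumption inferred from how the script below slices the href, and `find_board` is a hypothetical helper that uses only the standard `re` module rather than regex plus BeautifulSoup (written for Python 3).

```python
# -*- coding: utf-8 -*-
import re

# Made-up fragment standing in for the BBS home page source.
SAMPLE_HOME = u'<li><a href="forum-16-1.html">簽到有禮</a></li>'


def find_board(html, board_name, base_url):
    """Return (absolute URL, forum ID) for the link whose text is board_name."""
    pattern = u'<a[^>]+href="([^"]+)"[^>]*>' + re.escape(board_name) + u'</a>'
    match = re.search(pattern, html)
    if match is None:
        return None
    href = match.group(1)
    # Assumed URL shape: forum-<fid>-<page>.html
    id_match = re.search(r'forum-(\d+)-', href)
    forum_id = id_match.group(1) if id_match else ''
    return base_url + href, forum_id


board_url, forum_id = find_board(SAMPLE_HOME, u'簽到有禮', 'http://bbs.wacai.com/')
print(board_url, forum_id)  # http://bbs.wacai.com/forum-16-1.html 16
```

The same shape of helper would apply to the thread link, with the thread title in place of the board name.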

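First, the pitfall I fell into:

After the login post the response status was 200, i.e. success, but the reply post always complained that I was not logged in. It had to be a cookie problem, yet requests keeps cookies and the session automatically. After struggling for most of a day, the guru crab set me straight with one remark: look closely at the relevant packets in the capture tool, and look closely at the content of the response to the post!

When this site receives the login post, the content of the response is a script containing two links that at first glance look like some kind of API. The capture tool also shows clearly that the login post is followed immediately by two gets, and the URLs they request are exactly the two found in the response content. Both gets come back with Set-Cookie headers. Yes, these two URLs are the ones that "plant the cookies". So after the login post, get these two URLs as well and everything afterwards works. Whatever site you crawl, analyzing the request and response packets carefully is the foundation of everything!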
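
A minimal sketch of pulling those two URLs out of the login response, assuming the body is a run of `<script src="...">` tags as described above. The sample body and its example.com URLs are invented placeholders, and `extract_script_urls` is a hypothetical helper rather than the script's own code, which slices the string with find/rfind (written for Python 3).

```python
import re

# Invented stand-in for login_r.content; the real URLs come from the site.
SAMPLE_LOGIN_BODY = (
    '<script src="http://example.com/api/a?token=1"></script>'
    '<script src="http://example.com/api/b?token=2"></script>'
    '<script>location.href="/";</script>'
)


def extract_script_urls(body):
    """Return the src attribute of every <script src="..."> tag, in order."""
    return re.findall(r'<script[^>]*\ssrc="([^"]+)"', body)


urls = extract_script_urls(SAMPLE_LOGIN_BODY)
for url in urls:
    # With a live requests.Session s, this is where s.get(url) would be called
    # so that the Set-Cookie responses land in the session's cookie jar.
    print(url)
```

A regex over the whole body does not depend on there being exactly two tags, which the find/rfind slicing does.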

A small thing I picked up:

f = open(file,'r+')
f = open(file,'w+')
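At first glance r+ and w+ look the same, but they differ a lot. With r+ the file must exist or an error is raised, and writing starts at the beginning and overwrites the existing content only as far as the new data reaches. With w+ the file is created if it does not exist, and it is emptied completely before writing. a+ is also readable and writable, and it writes by appending to the end.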
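
The difference is easy to check in a scratch directory. The sketch below is only a demonstration; `demo_modes` and the file name are made up for it.

```python
import os
import tempfile


def demo_modes():
    """Show how r+, a+ and w+ treat a missing file and existing content."""
    path = os.path.join(tempfile.mkdtemp(), 'demo.txt')

    # r+ refuses to open a file that does not exist yet.
    try:
        open(path, 'r+').close()
        opened_missing = True
    except IOError:
        opened_missing = False

    with open(path, 'w+') as f:   # w+ creates the file
        f.write('abcdef')
    with open(path, 'r+') as f:   # r+ overwrites from the start, keeps the rest
        f.write('XY')
    with open(path) as f:
        after_r_plus = f.read()

    with open(path, 'a+') as f:   # a+ appends to the end
        f.write('!')
    with open(path) as f:
        after_a_plus = f.read()

    with open(path, 'w+') as f:   # w+ empties the file before writing
        f.write('Z')
    with open(path) as f:
        after_w_plus = f.read()

    return opened_missing, after_r_plus, after_a_plus, after_w_plus


print(demo_modes())  # (False, 'XYcdef', 'XYcdef!', 'Z')
```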

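To run the program below: python path/Auto_Login_Reply.py username/password. The advantage of this form is that the username and password do not have to be edited inside the .py file; they are passed as an argument.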
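
As a sketch of how that single argument can be taken apart, here is a variant that uses `str.partition` and rejects malformed input. `split_credentials` is a hypothetical helper; the script itself slices with `find('/')`.

```python
def split_credentials(arg):
    """Split 'username/password' at the first slash."""
    username, sep, password = arg.partition('/')
    if not sep or not username or not password:
        raise ValueError('expected an argument of the form username/password')
    return username, password


print(split_credentials('alice/secret'))  # ('alice', 'secret')
```

Splitting at the first slash keeps any later slashes as part of the password, and an argument without a slash raises an error instead of silently losing its last character.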

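This program is procedural: it runs from top to bottom in one go.

Python is great. It handles object-oriented and procedural styles alike, flexibly and neatly. Thumbs up!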

#!/usr/bin/env python
#-*- coding:utf-8 -*-

__author__ = 'Sophie2805'

import re
import os.path

import requests
from bs4 import BeautifulSoup

import time
import sys

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
if log.txt does not exist under the current working directory, create it.
if the log file is larger than 1 MB, empty it and start writing from the beginning
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''

file = os.path.abspath('.')+'/log.txt'#log.txt in the current working directory
if not os.path.isfile(file):#not exist, create a new one
    f = open(file,'w')
    f.close()

if os.path.getsize(file) > 1024*1024:#larger than 1MB
    f = open(file,'w')#opening with 'w' truncates the file
    f.close()

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
python Auto_Login_Reply.py user/pwd
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''

args = sys.argv[1]
#print args
username = args[0:args.find('/')]
pwd = args[args.find('/')+1:len(args)]
#print username , pwd

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
using log_list[] to log the whole process
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''

#print os.path.abspath('.')
log_list = []
log_list.append('+++++++++++++++++++++++++++++++++++++++++++++\n')
log_list.append('++++挖財簽到有禮'+(time.strftime("%m.%d %T"))+' 每天簽到得銅錢++++\n')
log_list.append('+++++++++++++++++++++++++++++++++++++++++++++\n')

s = requests.Session()

agent = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.10; rv:38.0) Gecko/20100101 Firefox/38.0'
connection = 'keep-alive'

s.headers.update({'User-Agent':agent,
                  'Connection':connection})

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
post login request to this URL, observed in Firebug
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''

login_url = 'https://www.wacai.com/user/user!login.action?cmd=null'

login_post_data ={
    'user.account':username,
    'user.pwd':pwd
}

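try:
    login_r = s.post(login_url,login_post_data)
except Exception,e:
    #str(e): an exception object cannot be concatenated to a str directly
    log_list.append(time.strftime("%m.%d %T") + '--Login Exception: '+ str(e) + '.\n')
    login_r = None

f = open(file,'a')#append
try:
    f.writelines(log_list)
finally:
    f.close()
log_list=[]

if login_r is None:#nothing below can work without a login response
    sys.exit(1)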

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
these two get() calls are very important!!!
login_r.content contains these 2 API URLs.
Without requesting them, the BBS will not treat our session as logged in.
I assume that requesting these 2 URLs returns some critical cookies.
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''

src1 = login_r.content[login_r.content.find('src')+5:login_r.content.find('"></script>')]
src2 = login_r.content[login_r.content.rfind('src')+5:login_r.content.rfind('"></script><script>')]
#print src1
#print src2
s.get(src1)
s.get(src2)

base_url = 'http://bbs.wacai.com/'
homepage_r = s.get(base_url)
if '我的挖財' in homepage_r.content:
    log_list.append(time.strftime("%m.%d %T") + '--Successfully login.\n')
#print homepage_r.content
'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
find the checkin forum URL and ID, which is used as fid parameter in the reply post URL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''
pattern = '<.+>簽到有禮<.+>'
p = re.compile(pattern)
soup = BeautifulSoup(p.findall(homepage_r.content)[0])
checkin_postfix = soup.a['href']
checkin_forum_url = checkin_postfix
#print checkin_postfix
forum_id = checkin_postfix[checkin_postfix.find('-')+1:checkin_postfix.rfind('-')]
#print forum_id
if forum_id != '':
    log_list.append(time.strftime("%m.%d %T") + '--Successfully find the checkin forum ID.\n')
    print '--Successfully find the checkin forum ID'
    f = open(file,'a')#append
    try:
        f.writelines(log_list)
    finally:
        f.close()
    log_list=[]

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
get the checkin forum portal page and find today's thread URL and ID, which is used as tid parameter in the reply post URL
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''
checkin_forum_page=s.get(checkin_forum_url)
#print checkin_forum_page.content
#print checkin_forum_page.status_code
title = '簽到有禮'+(time.strftime("%m.%d")).lstrip('0')+'每天簽到得銅錢，每人限回一次'
print title
pattern_1 = '<.+>'+title + '<.+>'
p_1 = re.compile(pattern_1)
soup = BeautifulSoup(p_1.findall(checkin_forum_page.content)[0])
thread_postfix = soup.a['href']
thread_url = base_url + thread_postfix
thread_id= thread_postfix[thread_postfix.find('-')+1:thread_postfix.rfind('-')-2]
#print thread_id

if thread_id != '':
    log_list.append(time.strftime("%m.%d %T") + '--Successfully find the thread ID.\n')
    f = open(file,'a')#append
    try:
        f.writelines(log_list)
    finally:
        f.close()
    log_list=[]
t = s.get(thread_url)

'''~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
formhash is a must in the post data, observed in Firebug.
So get the formhash from the html of the page
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~'''
pattern_2 = '<input type="hidden" name="formhash" .+/>'
p_2 = re.compile(pattern_2)
soup = BeautifulSoup(p_2.findall(t.content)[0])
formhash = soup.input['value']

pattern_3 = '回帖內容必須為'+'.+'+'</font>非此內容將收回銅錢獎勵'
result_3 = re.compile(pattern_3).findall(t.content)
#print result_3
key = result_3[0][result_3[0].find('>')+1:result_3[0].rfind('<')-1]
if key != '':
    log_list.append(time.strftime("%m.%d %T") + '--Successfully find the key word.\n')
    f = open(file,'a')#append
    try:
        f.writelines(log_list)
    finally:
        f.close()
    log_list=[]

'''~~~~~~~
auto reply
~~~~~~~~~~'''

host='bbs.wacai.com'
s.headers.update({'Referer':thread_url})
s.headers.update({'Host':host})
reply_data={
    'formhash':formhash,
    'message':key,
    'subject':'',
    'usesig':''
}
reply_post_url = 'http://bbs.wacai.com/forum.php?mod=post&action=reply&fid='+forum_id+'&tid='+thread_id+'&extra=&replysubmit=yes&infloat=yes&handlekey=fastpost&inajax=1'
try:
    reply_r = s.post(reply_post_url,data=reply_data)
except Exception,e:
    #str(e): an exception object cannot be concatenated to a str directly
    log_list.append(time.strftime("%m.%d %T") + '--Reply exception: '+ str(e) +'.\n' )
    reply_r = None
if reply_r is None:
    pass#the exception has already been logged above
elif '非常感謝，回覆釋出成功，現在將轉入主題頁，請稍候……' in reply_r.content:#success
    log_list.append(time.strftime("%m.%d %T") + '--Successfully auto reply.\n')
else:
    log_list.append(time.strftime("%m.%d %T") + '--Fail to reply: '+ reply_r.content + '.\n')
f = open(file,'a')#append
try:
    f.writelines(log_list)
finally:
    f.close()
log_list=[]
'''~~~~~~~~~~~~~~
find my WaCai URL
~~~~~~~~~~~~~~~~~'''
pattern_4 = '<.+訪問我的空間.+</a>'
p_4 = re.compile(pattern_4)
soup = BeautifulSoup(p_4.findall(t.content)[0])
if soup.a['href'] != '':
    log_list.append(time.strftime("%m.%d %T") + '--Successfully find my WaCai link.\n' )
    f = open(file,'a')#append
    try:
        f.writelines(log_list)
    finally:
        f.close()
    log_list=[]
mywacai_url = soup.a['href']
mywacai_page = s.get(mywacai_url)

'''~~~~~~~~~~~~~
find my info URL
~~~~~~~~~~~~~~~~'''
pattern_5 = '<.+個人資料</a>'
p_5 = re.compile(pattern_5)
soup = BeautifulSoup(p_5.findall(mywacai_page.content)[0])
if soup.a['href'] != '':
    log_list.append(time.strftime("%m.%d %T") + '--Successfully find my info link.\n' )
    f = open(file,'a')#append
    try:
        f.writelines(log_list)
    finally:
        f.close()
    log_list=[]
myinfo_url = base_url+ soup.a['href']
myinfo_page = s.get(myinfo_url)

'''~~~~~~~~~~~~~~
find my coin info
~~~~~~~~~~~~~~~~~'''
pattern_6 = '<em>銅錢.+\n.+\n'
p_6 = re.compile(pattern_6)
coin = p_6.findall(myinfo_page.content)[0]
coin = coin[coin.find('</em>')+5:coin.find('</li>')]
if int(coin.strip()) != 0:
    log_list.append(time.strftime("%m.%d %T") + '--Successfully get my coin amount: %s.\n'% int(coin.strip()))
    f = open(file,'a')#append
    try:
        f.writelines(log_list)
    finally:
        f.close()
    log_list=[]
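
The script repeats the same open/append/close block after every step. One way to fold that into a single function is sketched below. `write_log` and `MAX_LOG_BYTES` are hypothetical names, the temporary directory is used only so the example can run anywhere, and the code is written for Python 3, so it is a possible refactoring rather than part of the original script.

```python
import os
import tempfile
import time

MAX_LOG_BYTES = 1024 * 1024  # same 1 MB limit the script applies to log.txt


def write_log(path, message):
    """Append one timestamped line, emptying the file first if it is too large."""
    if os.path.isfile(path) and os.path.getsize(path) > MAX_LOG_BYTES:
        open(path, 'w').close()  # opening with 'w' truncates the file
    with open(path, 'a') as f:   # 'a' creates the file when it is missing
        f.write(time.strftime('%m.%d %H:%M:%S') + '--' + message + '\n')


log_path = os.path.join(tempfile.mkdtemp(), 'log.txt')
write_log(log_path, 'Successfully login.')
write_log(log_path, 'Successfully find the thread ID.')
with open(log_path) as f:
    print(f.read())
```

Writing each line as soon as the step finishes also means the log still shows how far the run got if a later step raises.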

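Last comes the plist, which is how a Mac configures scheduled tasks. On Windows you can apparently write a .bat file and schedule that instead.

First write a test.sh script. It needs execute permission; I used chmod 777 test.sh, though chmod +x test.sh is enough: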

cd /Users/Sophie/PycharmProjects/Auto_Login_Reply_BBS_WaCai

python Auto_Login_Reply.py username/password

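Then go to the following path

~/Library/LaunchAgents and create a plist file there named wacai.bbs.auto.login.reply.plist.

Make sure the Label does not clash with any other job, so pick something distinctive. ProgramArguments holds the absolute path of test.sh, and StartCalendarInterval sets the hour and minute at which the job runs automatically. The final StandardOutPath and StandardErrorPath are optional but worth having, since they let you read the error output when something goes wrong.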

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
	<key>Label</key>
	<string>wacai.bbs.auto.login.reply</string>
	<key>ProgramArguments</key>
	<array>
		<string>/Users/Sophie/PycharmProjects/Auto_Login_Reply_BBS_WaCai/test.sh</string>
	</array>
	<key>StartCalendarInterval</key>
	<dict>
		<key>Minute</key>
		<integer>30</integer>
		<key>Hour</key>
		<integer>1</integer>
	</dict>
	<key>StandardOutPath</key>
	<string>/Users/Sophie/PycharmProjects/Auto_Login_Reply_BBS_WaCai/run.log</string>
	<key>StandardErrorPath</key>
	<string>/Users/Sophie/PycharmProjects/Auto_Login_Reply_BBS_WaCai/runerror.log</string>
</dict>
</plist>
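
If hand-editing XML feels error-prone, the same job description can be generated from Python. This is a sketch that assumes Python 3.4 or later, where the standard `plistlib` module provides `dumps` and `loads`; the keys and paths are the ones from the plist above, and `PROJECT_DIR` and `job` are names introduced only for this example.

```python
import plistlib

PROJECT_DIR = '/Users/Sophie/PycharmProjects/Auto_Login_Reply_BBS_WaCai'

# Same keys and values as the hand-written plist.
job = {
    'Label': 'wacai.bbs.auto.login.reply',
    'ProgramArguments': [PROJECT_DIR + '/test.sh'],
    'StartCalendarInterval': {'Hour': 1, 'Minute': 30},
    'StandardOutPath': PROJECT_DIR + '/run.log',
    'StandardErrorPath': PROJECT_DIR + '/runerror.log',
}

xml_bytes = plistlib.dumps(job)  # XML is the default output format
print(xml_bytes.decode('utf-8'))
```

Writing `xml_bytes` to ~/Library/LaunchAgents/wacai.bbs.auto.login.reply.plist in binary mode would produce an equivalent file; the keys come out sorted, which launchd does not care about.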

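Once the file is edited:

launchctl load wacai.bbs.auto.login.reply.plist enables this plist

launchctl start wacai.bbs.auto.login.reply runs it once immediately. Note that this takes the Label value, without the .plist suffix

After changing the plist, run launchctl unload .... and then launchctl load... to reload it

You can also check the job with launchctl list | grep wacai. Generally, a PID together with a status of 0 means everything is fine; otherwise something went wrong during the run.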

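Finally, here is the log (I have so few copper coins it brings me to tears!). I hope the site's pages do not change often, otherwise I will have to debug and fix the script = =#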

+++++++++++++++++++++++++++++++++++++++++++++
++++挖財簽到有禮06.29 02:57:03 每天簽到得銅錢++++
+++++++++++++++++++++++++++++++++++++++++++++
06.29 02:57:19--Successfully login.
06.29 02:57:19--Successfully find the checkin forum ID.
06.29 02:57:19--Successfully find the thread ID.
06.29 02:57:19--Successfully find the key word.
06.29 02:57:19--Successfully auto reply.
06.29 02:57:19--Successfully find my WaCai link.
06.29 02:57:19--Successfully find my info link.
06.29 02:57:20--Successfully get my coin amount: 463.
