
Scraping Hangzhou housing price data with python3

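I am about to move to Hangzhou. Housing prices there rose sharply last year, and with so many policies now in play it is hard to tell where the market is heading, so I decided to write a crawler that keeps a record of Hangzhou's property transaction data.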

Preparation

My server runs CentOS with python3 installed. The crawler has to connect to a PostgreSQL database, so psycopg2 needs to be installed:

python3 -m pip install psycopg2
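Once psycopg2 is installed, opening a connection could look roughly like the sketch below. The original post does not show its connection code, so the `HZ_DB_*` environment variable names, the defaults, and both helper functions are hypothetical; only `psycopg2.connect` with keyword arguments is the library's own call.

```python
import os


def connection_params(env=os.environ):
    """Collect connection settings from environment variables.

    The HZ_DB_* names and the defaults are assumptions made for this sketch.
    """
    params = {
        "host": env.get("HZ_DB_HOST", "localhost"),
        "port": int(env.get("HZ_DB_PORT", "5432")),
        "dbname": env.get("HZ_DB_NAME", "postgres"),
        "user": env.get("HZ_DB_USER", "postgres"),
    }
    password = env.get("HZ_DB_PASSWORD")
    if password:
        # Only pass a password when one is configured.
        params["password"] = password
    return params


def connect(env=os.environ):
    """Open a PostgreSQL connection using the collected settings."""
    # Imported lazily so connection_params() can be used without psycopg2.
    import psycopg2

    return psycopg2.connect(**connection_params(env))
```

Keeping credentials in the environment rather than in the script means the same file can be run by hand and from cron without editing it.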

Design the database schema

----------------------------------------
-- create table for new house transactions each day
----------------------------------------
CREATE TABLE hangzhou.trans_daily_info (
    trans_date            DATE     NOT NULL,
    downtown_new_trans    SMALLINT NOT NULL,
    downtown_new_vol      INTEGER  NOT NULL,
    xiaoshan_new_trans    SMALLINT NOT NULL,
    xiaoshan_new_vol      INTEGER  NOT NULL,
    yuhang_new_trans      SMALLINT NOT NULL,
    yuhang_new_vol        INTEGER  NOT NULL,
    fuyang_new_trans      SMALLINT NOT NULL,
    fuyang_new_vol        INTEGER  NOT NULL,
    djd_new_trans         SMALLINT NOT NULL,
    djd_new_vol           INTEGER  NOT NULL,
    urban_new_daily_trans SMALLINT NOT NULL,
    urban_new_daily_vol   INTEGER  NOT NULL,
    other4county_new_qty  SMALLINT NOT NULL,
    other4country_new_vol INTEGER  NOT NULL,
    downtown_old_qty      SMALLINT NOT NULL,
    PRIMARY KEY (trans_date)
);

----------------------------------------
-- create table for weekly hot residence area
----------------------------------------
CREATE TABLE hangzhou.old_weekly_hot_residence (
    id             SERIAL PRIMARY KEY,
    start_time     DATE        NOT NULL,
    end_time       DATE        NOT NULL,
    residence_name VARCHAR(50) NOT NULL
);

----------------------------------------
-- create table for weekly hottest residence
----------------------------------------
CREATE TABLE hangzhou.old_weekly_hotest_residence (
    start_date     DATE        NOT NULL,
    end_date       DATE        NOT NULL,
    week           SMALLINT    NOT NULL,
    residence_name VARCHAR(50) NOT NULL,
    comment        TEXT        NOT NULL,
    PRIMARY KEY (start_date, end_date)
);

----------------------------------------
-- create table for second hand residence transaction info
----------------------------------------
CREATE TABLE hangzhou.old_trans_weekly_info (
    start_date                 DATE     NOT NULL,
    end_date                   DATE     NOT NULL,
    week                       SMALLINT NOT NULL,
    city_commercial_house_qty  INTEGER  NOT NULL,
    city_residence_qty         INTEGER  NOT NULL,
    urban_commerical_house_qty INTEGER  NOT NULL,
    urban_residence_qty        INTEGER  NOT NULL,
    shangcheng_qty             INTEGER  DEFAULT 0,
    xiacheng_qty               INTEGER  DEFAULT 0,
    jianggan_qty               INTEGER  DEFAULT 0,
    gongshu_qty                INTEGER  DEFAULT 0,
    xihu_qty                   INTEGER  DEFAULT 0,
    bingjiang_qty              INTEGER  DEFAULT 0,
    zhijiang_qty               INTEGER  DEFAULT 0,
    xiasha                     INTEGER  DEFAULT 0,
    PRIMARY KEY (start_date, end_date)
);
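To show how the crawler might write one day's figures into hangzhou.trans_daily_info, here is a minimal sketch. The post does not include its insert code, so `build_daily_insert` and `save_daily` are hypothetical helpers; the column names are copied from the table definition, and the `ON CONFLICT` clause (PostgreSQL 9.5 or later) is an added assumption so that re-running the job on the same day does not fail on the primary key.

```python
from datetime import date

# Column order mirrors hangzhou.trans_daily_info.
DAILY_COLUMNS = (
    "trans_date",
    "downtown_new_trans", "downtown_new_vol",
    "xiaoshan_new_trans", "xiaoshan_new_vol",
    "yuhang_new_trans", "yuhang_new_vol",
    "fuyang_new_trans", "fuyang_new_vol",
    "djd_new_trans", "djd_new_vol",
    "urban_new_daily_trans", "urban_new_daily_vol",
    "other4county_new_qty", "other4country_new_vol",
    "downtown_old_qty",
)


def build_daily_insert(record):
    """Return (sql, params) for one daily row; record maps column name to value."""
    missing = [c for c in DAILY_COLUMNS if c not in record]
    if missing:
        raise ValueError("missing columns: " + ", ".join(missing))
    placeholders = ", ".join(["%s"] * len(DAILY_COLUMNS))
    sql = (
        "INSERT INTO hangzhou.trans_daily_info ("
        + ", ".join(DAILY_COLUMNS)
        + ") VALUES (" + placeholders + ") "
        # Assumption: silently skip a day that was already recorded.
        "ON CONFLICT (trans_date) DO NOTHING"
    )
    params = tuple(record[c] for c in DAILY_COLUMNS)
    return sql, params


def save_daily(conn, record):
    """Insert one row using an open psycopg2 connection."""
    sql, params = build_daily_insert(record)
    # "with conn" commits on success and rolls back on an exception.
    with conn:
        with conn.cursor() as cur:
            cur.execute(sql, params)
```

Passing the values as parameters instead of formatting them into the SQL string leaves quoting and type conversion to psycopg2.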

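Later I found that the commands in crontab were not being executed. /var/log/cron had no new entries either, so I checked crond, found the problem there, and restarted it: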

service crond status
service crond restart

Once the crawler was running, the week column turned out to be redundant, so I dropped it:

 alter table hangzhou.old_trans_weekly_info drop week;
 alter table hangzhou.old_weekly_hotest_residence drop week;
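The post does not say why the column was redundant. One plausible reading is that the week number can be derived from start_date whenever it is needed. The sketch below assumes Monday-to-Sunday weeks and ISO week numbering, neither of which is confirmed by the original; `week_bounds` is a hypothetical helper.

```python
from datetime import date, timedelta


def week_bounds(day):
    """Return (start_date, end_date, iso_week) for the week containing day.

    Assumes weeks run Monday through Sunday and are numbered per ISO 8601.
    """
    start = day - timedelta(days=day.weekday())  # back up to Monday
    end = start + timedelta(days=6)              # the following Sunday
    return start, end, start.isocalendar()[1]


start, end, week = week_bounds(date(2017, 3, 8))
print(start, end, week)  # 2017-03-06 2017-03-12 10
```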

After that I found that a comment2 column was needed, so:

alter table hangzhou.old_weekly_hotest_residence ADD comment2 TEXT ;
alter table hangzhou.old_weekly_hotest_residence ALTER comment2 SET NOT NULL;
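One caveat: a column added without a default is NULL in every existing row, and PostgreSQL rejects SET NOT NULL while any NULL remains, so on a table that already holds data the second statement fails. A minimal sketch of a safer sequence, run from Python in a single transaction, is shown below. The helper names and the empty-string fill value are assumptions for illustration, not part of the original post.

```python
def add_not_null_text_column(table, column, fill=""):
    """Build the steps for adding a NOT NULL TEXT column to a populated table.

    table and column are interpolated directly into the SQL, so they must be
    trusted constants and never user input.
    """
    return [
        ("ALTER TABLE " + table + " ADD COLUMN " + column + " TEXT", None),
        # Backfill existing rows so the NOT NULL constraint can be applied.
        ("UPDATE " + table + " SET " + column + " = %s WHERE " + column + " IS NULL",
         (fill,)),
        ("ALTER TABLE " + table + " ALTER COLUMN " + column + " SET NOT NULL", None),
    ]


def apply_steps(conn, steps):
    """Run all steps in one transaction on an open psycopg2 connection."""
    # PostgreSQL DDL is transactional: a failure rolls back every step.
    with conn:
        with conn.cursor() as cur:
            for sql, params in steps:
                cur.execute(sql, params)


steps = add_not_null_text_column("hangzhou.old_weekly_hotest_residence", "comment2")
```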

To be continued.