
The Query String module, a small http crawler, and the events, fs, and stream modules


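## querystring module

1. Purpose: the Node.js module for parsing and formatting URL query strings
2. Core methods
   - parse: string -> object, called as `parse(str, sep, eq)`
     - str: the query string to process (the part after `?`, not the whole URL)
     - sep: the separator between key/value pairs, `&` by default
     - eq: the separator between a key and its value, `=` by default. The string is split on `sep` first, and only then is each pair split on `eq`, which is how `a=1` becomes `a: '1'`

   ```javascript
   var qs = require('querystring');
   var url = require('url');
   var str = 'http://www.baidu.com/001?a=1&b=2#hash=20';
   // Pass only the query part. Parsing the full URL with '?' and '&' as the
   // separators would produce keys such as 'http://www.baidu.com/001'.
   var query = url.parse(str).query; // 'a=1&b=2'
   var obj = qs.parse(query, '&', '=');
   console.log(obj) // keys a and b with the string values '1' and '2'
   ```

   - stringify: object -> string

   ```javascript
   qs.stringify(obj) // 'a=1&b=2'
   ```

   - escape: percent-encodes characters that are not URL-safe, Chinese characters included

   ```javascript
   var charStr = 'http://www.baidu.com/001?city=杭州';
   var url = require('url');
   var charurl = url.parse(charStr).query;
   // '=' is escaped as well, so escape is normally applied to a single key or value
   console.log(qs.escape(charurl));
   ```

   - unescape: decodes a percent-encoded string

   ```javascript
   qs.unescape(qs.escape(charurl))
   ```

## http

- Core methods: get and request, used here to build a small crawler
- http crawler: request a piece of content, clean the returned data, then have the backend server send the result to the front-end page
- Anti-crawling: a site can block the request, obscure the content, or make the data hard to clean

http crawler example, requesting http://stu.1000phone.net/student.php/Index/index

1. Go to the node.js website, find the http module, and require http
2. Use the get method of http: `http.get(url/options, callback)`
3. Define an options object
4. The get call returns the response to the request, which is a web page
5. Clean the data with a third-party package (tool: cheerio), which can be found on npmjs
6. Create package.json with `$ npm init -y` and install cheerio with `$ npm i cheerio -S`
7. Require cheerio
8. Send the result to the front end

```javascript
var http = require('http');
var cheerio = require('cheerio');

const options = {
  hostname: 'stu.1000phone.net',
  port: 80,
  path: '/student.php/Index/index',
  method: 'GET',
  headers: {
    // These values are copied from the request headers of a logged-in browser session
    'Host': 'stu.1000phone.net',
    'Connection': 'keep-alive',
    'Cache-Control': 'max-age=0',
    'Upgrade-Insecure-Requests': '1',
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.131 Safari/537.36',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3',
    'Referer': 'http://stu.1000phone.net/student.php/Index/moneyDetail',
    // 'Accept-Encoding': 'gzip, deflate' is left out on purpose: a compressed
    // body cannot be decoded by res.setEncoding('utf8') below
    'Accept-Language': 'zh-CN,zh;q=0.9,en;q=0.8',
    // Session cookie; replace it with the one from your own session
    'Cookie': 'PHPSESSID=ouvnaju74a3be6lcb6updgvb95; StuInfo=think%3A%7B%22StuId%22%3A%22133717%22%2C%22StuNumber%22%3A%22HZ190213065%22%2C%22IDcard%22%3A%22QFEDU_tEubnTherf1jqmBcsdhx9KlB0Nk%252BSeEjdkl%252F9W44GGI%253D%22%2C%22StuName%22%3A%22%25E8%25B5%25B5%25E8%258B%25B1%25E5%25A7%25BF%22%2C%22Cid%22%3A%222237%22%7D',
    'Content-Type': 'application/x-www-form-urlencoded',
  }
};

http.createServer(function (request, response) {
  response.writeHead(200, { 'Content-Type': 'text/html;charset=utf8' })
  // http.get sends the request by itself, so no req.end() call is needed
  http.get(options, function (res) {
    res.setEncoding('utf8');
    let rawData = '';
    res.on('data', (chunk) => { rawData += chunk; });
    res.on('end', () => {
      try {
        // console.log(rawData) // prints the whole page
        var $ = cheerio.load(rawData);
        response.write($('.inline .user-title-label span').text());
      } catch (e) {
        console.error(e.message);
      }
      response.end() // always finish the response, even when cleaning fails
    })
  }).on('error', (e) => {
    console.error(`problem with request: ${e.message}`)
    response.end()
  })
}).listen(8002, 'localhost', function () {
  console.log(`Server running at: http://localhost:8002`)
})
```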
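Step 3 above writes the options object by hand. As a rough sketch that ties the querystring and http sections together, the same object can be derived from a URL and a plain object of query parameters. The helpers `toRequestOptions` and `fetchPage` are hypothetical names introduced for illustration, not part of Node.js, and the sketch assumes a plain `http:` URL and an uncompressed response.

```javascript
const http = require('http');
const qs = require('querystring');

// Hypothetical helper: build an options object for http.get from a URL string
// and an optional object of query parameters.
function toRequestOptions(pageUrl, query = {}, headers = {}) {
  const u = new URL(pageUrl);
  const search = qs.stringify(query); // '' when query has no keys
  return {
    hostname: u.hostname,
    port: Number(u.port) || 80, // URL.port is '' when the default port is used
    path: u.pathname + (search ? `?${search}` : u.search),
    method: 'GET',
    headers,
  };
}

// Hypothetical helper: request a page and resolve with its body as a string.
// It is only defined here; calling it performs a real network request.
function fetchPage(options) {
  return new Promise((resolve, reject) => {
    http.get(options, (res) => {
      res.setEncoding('utf8');
      let rawData = '';
      res.on('data', (chunk) => { rawData += chunk; });
      res.on('end', () => resolve(rawData));
    }).on('error', reject);
  });
}

const options = toRequestOptions(
  'http://stu.1000phone.net/student.php/Index/index',
  {},
  { 'User-Agent': 'Mozilla/5.0' }
);
console.log(options.hostname, options.port, options.path);
```

Returning a Promise from `fetchPage` keeps the cleaning step (for example `cheerio.load`) out of the request callback, so a server handler can write `fetchPage(options).then(...)` and end its response in a single place.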
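## events module

1. Purpose: the event module of Node.js
2. Usage

```javascript
// Load the events module
var Events = require('events');
// Define a class that inherits from it
class MyEvents extends Events {};
// Instantiate the class; the instance inherits the event methods
var myEvents = new MyEvents();
// The instance has on and emit: on registers a listener, emit triggers the event
// Register the listener
myEvents.on('a', () => {
  console.log('hello')
})
// Trigger the event
myEvents.emit('a')
```

## fs

- Concept: the Node.js module for working with files
- Usage

1. Working with directories

```javascript
var fs = require('fs')
// Create
fs.mkdir('./dist', function (error) {
  if (error) throw error
  console.log('Directory created')
})
```

```javascript
// Rename
fs.rename('./dist', './fs_dist', function (error) {
  if (error) throw error
  console.log('Directory renamed')
})
```

```javascript
// Read: list the files in a directory
// Create ten files first. let (not var) gives each callback its own i, and
// the content must be a string, so the number is converted.
for (let i = 0; i < 10; i++) {
  fs.writeFile(`./fs_dist/${i}.txt`, String(i), function (err) {
    if (err) throw err
    console.log(`File ${i} created`)
  })
}

// writeFile is asynchronous, so run this only after the writes have finished
fs.readdir('./fs_dist', 'utf-8', function (error, data) {
  if (error) throw error
  // console.log( data ) // an array whose elements are the file names
  for (var i = 0; i < data.length; i++) {
    fs.readFile(`./fs_dist/${data[i]}`, 'utf8', function (error, content) {
      if (error) throw error
      console.log(content)
    })
  }
})
```

```javascript
// Delete
// fs.rmdir(path, callback) can only delete an empty directory,
// so the files inside must be removed first
fs.rmdir('./fs_dist', function (error) {
  if (error) throw error
  console.log('Directory deleted')
})
```

2. Working with files

```javascript
// Create: writeFile(path, content, error-first callback)
// The ./dist directory must already exist
fs.writeFile('./dist/1.txt', 'hello yyb', function (error) {
  if (error) throw error
})
```

```javascript
// Modify: append to the end of the file
fs.appendFile('./dist/1.txt', '\nhello 千鋒~~~', 'utf8', function (error) {
  if (error) throw error
  console.log('File modified')
})
```

```javascript
// Read
fs.readFile('./dist/1.txt', 'utf8', function (error, data) {
  if (error) throw error
  // With 'utf8' data is already a string. Without an encoding it is a
  // Buffer of binary data and needs data.toString().
  console.log(data)
  console.log('File read')
})
```

```javascript
// Delete
fs.unlink('./dist/1.txt', function (error) {
  if (error) throw error
  console.log('File deleted')
})
```

## stream

1. Concept: a stream processes data chunk by chunk instead of loading all of it into memory at once, which reduces memory use and improves efficiency
2. Terms: pipe (connects one stream to the next), readable stream, writable stream

```javascript
// Example: creating a gzip archive
var fs = require('fs')
var zlib = require('zlib')
// Read the source file and write the compressed file
var readStream = fs.createReadStream('./dist/1.txt')
var writeStream = fs.createWriteStream('./dist/1.txt.gz')
var gzip = zlib.createGzip() // a transform stream that gzips whatever passes through it
readStream
  .pipe(gzip)
  .pipe(writeStream)
```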
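A chain of `.pipe()` calls does not forward errors from one stream to the next and gives no signal when the output file is complete. The sketch below is one possible variant that uses `stream.pipeline` from the Node.js standard library to get both. The helper name `gzipFile` and the temporary directory are assumptions made for illustration so that no existing files are touched.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');
const zlib = require('zlib');
const { pipeline } = require('stream');

// Hypothetical helper: gzip src into dest and resolve once everything is written.
function gzipFile(src, dest) {
  return new Promise((resolve, reject) => {
    pipeline(
      fs.createReadStream(src),   // readable stream
      zlib.createGzip(),          // transform stream
      fs.createWriteStream(dest), // writable stream
      (error) => (error ? reject(error) : resolve(dest))
    );
  });
}

// Usage: write a small file into a fresh temporary directory and compress it
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'gzip-demo-'));
const src = path.join(dir, '1.txt');
fs.writeFileSync(src, 'hello stream\n');

const done = gzipFile(src, src + '.gz').then((dest) => {
  // Decompress in one shot to check the round trip
  const restored = zlib.gunzipSync(fs.readFileSync(dest)).toString('utf8');
  console.log('round trip ok:', restored === 'hello stream\n');
  return restored;
});
```

The one-shot `gunzipSync` call is acceptable here only because the test file is tiny; for large files the decompression would also go through a stream.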
