
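Ali Trial (阿里試用): My Girlfriend Made Me Sort It for Her

Ali Trial sorting

Sorry, I had somehow put the config file in the ignore list; it is fixed now. Sorry again.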

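Background

Honestly, this is a total disgrace for a no-nonsense straight guy like me. That's right: last night I was happily working on a freelance job (a China Mobile mini program; freelancing, hey), and it was past 11 when my girlfriend suddenly had a brainwave: "Can you sort this thing for me so I can use it? I want to grab some freebies. Is that technically doable?"
I took a somewhat helpless look. Ali Trial? What on earth... oh, oh, just this thing. A crawler can scrape it. Though I'm a front-end developer...
I replied: "No problem, just write a crawler."
She: "Wow, how long will it take?"
Me: "I'm busy right now. 1-2 hours, I guess."
She: "Fine, stop what you're doing and make it for me right now!"
I looked at her face and, to my shame, minimized WeChat DevTools...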

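The page

(Screenshot: Ali Trial)

If you think even this counts as an ad, you're giving me far too much credit.

Getting the crawler going

Search Baidu for "NodeJS crawler" and ready-made code is everywhere, so I won't analyze it piece by piece. Here is a snippet from Jianshu, by 埃米莉Emily: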

const express = require('express');
const superagent = require('superagent');
const cheerio = require('cheerio');
// express is a function; called with no arguments it returns an express instance, which we assign to app.
const app = express();

app.get('/', (req, res, next) => {
  console.log(req)
  superagent.get('https://www.v2ex.com/')
    .end((err, sres) => {
      // Standard error handling
      if (err) {
        return next(err);
      }
      // sres.text holds the page's HTML. Passing it to cheerio.load gives us
      // a variable that implements the jQuery interface, conventionally named `$`.
      // Everything after that is plain jQuery.
      let $ = cheerio.load(sres.text);
      let items = [];
      $('.item_title a').each((idx, element) => {
        let $element = $(element);
        items.push({
          title: $element.text(),
          href: $element.attr('href')
        });
      });

      res.send(items);
    });
});

app.listen(3000, function () {
  console.log('app is listening at port 3000');
});

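Well, anyone using NodeJS is bound to know express; think of superagent as a way to make outbound requests from Node; and cheerio, well, is jQuery for Node.

First crawl

Change the request URL above to https://try.taobao.com/, inspect the page's tag structure, and find the selector you want:

(Screenshot: tag structure)

That gives .tb-try-wd-item-info > .detail. Swap it in for the .item_title a selector above and off we go:

...I'd rather not show the result, because there were only six items while the page actually displays 10. After digging for a while, I found two problems:

(Screenshot: recommendations)

(Screenshot: data fetched by a POST request)

As shown above, first, the 6 items I scraped are the recommendations, damn it, not the list below them;
second, the list below is data fetched afterwards by a separate POST request, which looks every bit like the handiwork of some framework's SSR.

So scraping the HTML won't work; time to change strategy.

Simulating the POST

OK, since it's a POST, this is easy: dig out the URL and its parameters, then simulate the request with superagent: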

superagent
  .post(
    `https://try.taobao.com/api3/call?what=show&page=${payload.page}&pageSize&api=x%2Fsearch`
  )
  .set('content-type', 'application/x-www-form-urlencoded; charset=UTF-8')
  .end((err, sres) => {
    // Standard error handling
    if (err) {
      return next(err)
    }
    const result = JSON.parse(sres.text).result // the returned structure tree
    resolve(result)
  })

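The content-type comes from:

(Screenshot: Content-Type)

Heh, you guessed it: it failed, like this:

(Screenshot: failure page)

Come to think of it, that was inevitable; why would they let you make arbitrary requests? So what next? Research? No, no, no, I'll just unload the whole magazine. It's only a Content-Type, isn't it!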

superagent
  .post(
    `https://try.taobao.com/api3/call?what=show&page=${payload.page}&pageSize&api=x%2Fsearch`
  )
  .set(
    'user-agent',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36'
  )
  .set('accept', 'application/json, text/javascript, */*; q=0.01')
  .set('accept-encoding', 'gzip, deflate, br')
  .set(
    'accept-language',
    'zh-CN,zh;q=0.9,en;q=0.8,la;q=0.7,zh-TW;q=0.6,da;q=0.5'
  )
  // .set('content-length', '8')
  .set('content-type', 'application/x-www-form-urlencoded; charset=UTF-8')
  .set(
    'cookie',
    'your cookie'
  )
  .set('origin', 'https://try.taobao.com')
  .set('referer', 'https://try.taobao.com')
  .set('x-csrf-token', 'f0b8e7443eb7e')
  .set('x-requested-with', 'XMLHttpRequest')
  .end((err, sres) => {
    // Standard error handling
    if (err) {
      return next(err)
    }
    const result = JSON.parse(sres.text).result
    resolve(result)
  })

It's based on this:

(Screenshot: content-type 2)

It's just headers, just the origin, just the user agent. Did you think using HTTPS would leave me with no way in?
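The long chain of .set() calls above can be hard to scan, so here is a minimal sketch that gathers the same headers into one object. The helper name browserHeaders and its two parameters are hypothetical, and the cookie and CSRF token are placeholders you would copy from your own logged-in browser session.

```javascript
// Hypothetical helper: collects the browser-like headers from the request
// above into one plain object. `cookie` and `csrfToken` are placeholders.
function browserHeaders(cookie, csrfToken) {
  return {
    'user-agent':
      'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/69.0.3497.100 Safari/537.36',
    accept: 'application/json, text/javascript, */*; q=0.01',
    'accept-encoding': 'gzip, deflate, br',
    'accept-language': 'zh-CN,zh;q=0.9,en;q=0.8,la;q=0.7,zh-TW;q=0.6,da;q=0.5',
    'content-type': 'application/x-www-form-urlencoded; charset=UTF-8',
    cookie,
    origin: 'https://try.taobao.com',
    referer: 'https://try.taobao.com',
    'x-csrf-token': csrfToken,
    'x-requested-with': 'XMLHttpRequest'
  }
}

// content-length is deliberately left out, matching the commented-out line above.
const headers = browserHeaders('your cookie', 'your csrf token')
console.log(Object.keys(headers))
```

superagent's .set() also accepts an object, so passing the result of this helper to a single .set() call should be equivalent to the individual calls.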

Note the .set('content-length', '8') line above: I don't know how the other side handles it, but adding it makes the request time out...
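One plausible explanation for the timeout, offered as a guess rather than something verified here: the header was copied from a browser request whose form body was 8 bytes long, while this script sends no body at all, so a server that trusts the header may keep waiting for bytes that never arrive. The body string below is an assumption (219 happens to be the page count in the response that follows); the original form body is not shown.

```javascript
// Assumption: the browser's form body looked like "page=219".
const formBody = 'page=219'
const declaredLength = 8 // value from the commented-out content-length header

// Buffer.byteLength counts the bytes that would actually be sent.
const actualLength = Buffer.byteLength(formBody, 'utf8')
console.log(actualLength === declaredLength) // true

// The script sends an empty body, so its real length is 0, not 8.
const emptyLength = Buffer.byteLength('', 'utf8')
console.log(emptyLength) // 0
```

If that reading is right, leaving content-length unset and letting the HTTP client compute it is the safer choice.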

And so it coughed up the data:

{
    "pages": {
        "paging": {
            "n": 2182,
            "page": 1,
            "pages": 219
        },
        "items": [
            {
                "shopUserId": "2450112357",
                "title": "凱度高階款嵌入式蒸烤箱",
                "status": 1,
                "totalNum": 1,
                "requestNum": 15530,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "casdon凱度旗艦店",
                "showId": "2561626",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34530215",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1ycS2eMDqK1RjSZSyXXaxEVXa.jpg",
                "shopItemId": "559771706359",
                "price": 13850
            },
            {
                "shopUserId": "3189770892",
                "title": "皇家美素佳兒老包裝2段400g",
                "status": 1,
                "totalNum": 50,
                "requestNum": 2079,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "皇家美素佳兒旗艦店",
                "showId": "2551240",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34396042",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1YrSZaVYqK1RjSZLeXXbXppXa.jpg",
                "shopItemId": "547114874458",
                "price": 189
            },
            {
                "shopUserId": "1077716829",
                "title": "關注店鋪優先審水密碼幻彩隔離",
                "status": 1,
                "totalNum": 10,
                "requestNum": 6907,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "水密碼旗艦店",
                "showId": "2568391",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34784086",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB16_4ChmzqK1RjSZPxXXc4tVXa.jpg",
                "shopItemId": "559005882880",
                "price": 599
            },
            {
                "shopUserId": "725786863",
                "title": "精品皮草派克大衣",
                "status": 1,
                "totalNum": 1,
                "requestNum": 11793,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "美瑞蓓特",
                "showId": "2557886",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34574078",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1zVLMdCrqK1RjSZK9XXXyypXa.jpg",
                "shopItemId": "577418950477",
                "price": 5980
            },
            {
                "shopUserId": "3000840351",
                "title": "保友智慧新品Pofit電腦椅",
                "status": 1,
                "totalNum": 1,
                "requestNum": 12895,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "保友辦公傢俱旗艦店",
                "showId": "2557100",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34528042",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1bYZEg6TpK1RjSZKPXXa3UpXa.png",
                "shopItemId": "577598687971",
                "price": 5408
            },
            {
                "shopUserId": "791732485",
                "title": "TEK手持吸塵器A8",
                "status": 1,
                "totalNum": 1,
                "requestNum": 17195,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "泰怡凱旗艦店",
                "showId": "2552265",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34444014",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1D6bWbhTpK1RjSZFGXXcHqFXa.jpg",
                "shopItemId": "547653053965",
                "price": 5199
            },
            {
                "shopUserId": "3229583972",
                "title": "椰富海南冷炸椰子油食用油1L",
                "status": 1,
                "totalNum": 20,
                "requestNum": 4451,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "椰富食品專營店",
                "showId": "2561698",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34532250",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1VjLSePDpK1RjSZFrXXa78VXa.jpg",
                "shopItemId": "578653506446",
                "price": 256
            },
            {
                "shopUserId": "855223948",
                "title": "卡西歐立式家用電鋼琴PX770",
                "status": 1,
                "totalNum": 1,
                "requestNum": 16762,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "世紀音緣樂器專營店",
                "showId": "2551326",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34420041",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1CC6aa9zqK1RjSZFpXXakSXXa.jpg",
                "shopItemId": "562405126383",
                "price": 4838
            },
            {
                "shopUserId": "4065939832",
                "title": "關注寶貝送輕奢沙發床",
                "status": 1,
                "totalNum": 1,
                "requestNum": 17436,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "貝兮旗艦店",
                "showId": "2559904",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34532170",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1AzxYegHqK1RjSZFPXXcwapXa.jpg",
                "shopItemId": "577798067313",
                "price": 4399
            },
            {
                "shopUserId": "807974445",
                "title": "森海塞爾CX6藍芽耳機",
                "status": 1,
                "totalNum": 4,
                "requestNum": 22557,
                "acceptNum": 0,
                "reportNum": 0,
                "isApplied": false,
                "shopName": "sennheiser旗艦店",
                "showId": "2559701",
                "startTime": 1539619200000,
                "endTime": 1540220400000,
                "id": "34532161",
                "type": 1,
                "pic": "//img.alicdn.com/bao/uploaded/TB1HET6d7voK1RjSZFwXXciCFXa.jpg",
                "shopItemId": "564408956766",
                "price": 999
            }
        ]
    }
}

Attentive readers will have noticed that I never sent a form body, yet I still got the data I needed: page is carried in the query string...
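Since the page number travels in the query string, the URL can be assembled with Node's built-in URL and URLSearchParams instead of a template string. This is only a sketch: buildSearchUrl is a hypothetical name, and the parameter names are copied from the request above. The original URL carries pageSize with no value, which serializes here as an empty `pageSize=`.

```javascript
// Hypothetical helper: builds the search URL for a given page number.
function buildSearchUrl(page) {
  const url = new URL('https://try.taobao.com/api3/call')
  url.search = new URLSearchParams({
    what: 'show',
    page: String(page),
    pageSize: '', // left empty, as in the original request
    api: 'x/search' // URLSearchParams percent-encodes the slash as %2F
  }).toString()
  return url.toString()
}

console.log(buildSearchUrl(219))
// https://try.taobao.com/api3/call?what=show&page=219&pageSize=&api=x%2Fsearch
```

Whether the server treats `pageSize=` the same as the bare `pageSize` in the original URL is untested.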

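Display

With the data in hand, the rest is easy. This one API really does power everything else. That's right, remember I'm a front-end developer.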

<!DOCTYPE html>
<html lang="en">

<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <meta http-equiv="X-UA-Compatible" content="ie=edge">
  <title>tb try</title>
  <style>
    .warning {
      color: red;
    }

    button {
      width: 100px;
      height: 44px;
      margin-right: 44px;
    }

    table {
      border: 1px solid #d8d8d8;
      border-collapse: collapse;
    }

    tr {
      border-bottom: 1px solid #d8d8d8;
      cursor: pointer;
    }

    tr:last-child {
      border: 0;
    }
  </style>
</head>

<body>
  <button onclick="postPage()">Next page</button>
  <span id="currentPage"></span>
  <table>
    <tbody>
      <tr>
        <th>No. (descending)</th>
        <th>Probability</th>
        <th>Name</th>
      </tr>
    </tbody>
    <tbody id="results"></tbody>
  </table>

  <script>
    let currentPage = 0 // current page number
    let allItems = [] // all items fetched so far
    let currentTime = 0 // timestamp of the last request, used for throttling
    const xhr = new XMLHttpRequest()
    const loopInterval = 2 // minimum interval between requests, in seconds
    const results = document.querySelector('#results')
    const currentPageText = document.querySelector('#currentPage')
    // Re-render the whole table body from the given (already sorted) items
    const reFullTBody = arr => {
      let innerHtml = ''
      arr.forEach((item, i) => {
        let tr = `
          <tr onclick="window.open('https://try.taobao.com/item.htm?id=${item.id}')">
            <td>${i + 1}</td>
            <td>${item.rate.toFixed(3) + '%'}</td>
            <td>${item.title}</td>
          </tr>
          `
        if (item.rate > 5) tr = tr.replace('<tr', '<tr class="warning"')
        innerHtml += tr
      })
      currentPageText.innerText = `Current page: ${currentPage}`
      results.innerHTML = innerHtml
    }

    const postPage = () => {
      // Refuse to send another request within the throttle interval
      const newTime = new Date().getTime()
      const shouldBack = newTime - currentTime < loopInterval * 1000
      if (shouldBack) {
        alert('Please do not click more than once within ' + loopInterval + ' seconds.')
        return
      }
      currentTime = newTime
      xhr.onreadystatechange = function() {
        if (this.readyState === 4 && this.status === 200) {
          const res = JSON.parse(this.response)
          if (res.length < 1) {
            alert('All trials ending today have been filtered.')
            return
          }
          // Compute each rate (in percent) before sorting, so new items have a value to sort by
          res.forEach(item => {
            item.rate = item.totalNum / item.requestNum * 100
          })
          allItems = [...allItems, ...res]
          allItems.sort((a, b) => b.rate - a.rate)
          reFullTBody(allItems)
          currentPage--
        }
      }
      xhr.open('post', '/table')
      xhr.setRequestHeader("Content-Type", "application/x-www-form-urlencoded; charset=UTF-8");
      // Send the request
      xhr.send("page=" + currentPage)
    }

    // Fetch the total page count first, then start loading from the last page
    xhr.onreadystatechange = function() {
      if (this.readyState === 4 && this.status === 200) {
        currentPage = JSON.parse(this.response).pages
        postPage()
      }
    }
    xhr.open('get', '/total')
    xhr.send()
  </script>
</body>

</html>

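It looks like this:

(Screenshot: the result page)

Look how user-friendly I am: rows are clickable and open the item, probabilities above 5% are shown in red, it tells you which page you're on, and it even warns you if you click too fast...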
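The ranking behind that table boils down to one ratio: totalNum divided by requestNum, as a percentage, sorted from high to low. Here is a minimal standalone sketch using three rows from the response shown earlier. rankByRate and the highlight field are hypothetical names, and treating an item with no applicants as 100% is an assumption of this sketch, not what the page does.

```javascript
// Sample rows from the JSON response shown earlier (only the fields used here).
const items = [
  { id: '34530215', title: '凱度高階款嵌入式蒸烤箱', totalNum: 1, requestNum: 15530 },
  { id: '34396042', title: '皇家美素佳兒老包裝2段400g', totalNum: 50, requestNum: 2079 },
  { id: '34532250', title: '椰富海南冷炸椰子油食用油1L', totalNum: 20, requestNum: 4451 }
]

// Hypothetical helper: returns a new array with `rate` (percent) and
// `highlight` attached, sorted from best to worst odds.
function rankByRate(list, threshold = 5) {
  return list
    .map(item => {
      // Assumption: no applicants yet counts as 100% to avoid dividing by zero.
      const rate = item.requestNum > 0 ? (item.totalNum / item.requestNum) * 100 : 100
      return { ...item, rate, highlight: rate > threshold }
    })
    .sort((a, b) => b.rate - a.rate)
}

const ranked = rankByRate(items)
for (const item of ranked) {
  console.log(item.rate.toFixed(3) + '%', item.title)
}
```

With these three rows the milk formula (50 units for 2079 applicants, about 2.4%) ranks first and the oven (1 unit for 15530 applicants) last; none crosses the 5% threshold.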

It's just that handy. If you like it, go try it out!

Online: click here to try it

If you find it useful, don't be stingy with a star.