
VBA: Using Regular Expressions to Fetch Data from the Web

Example: Maoyan Movies

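1. We want to collect information on the films listed on the Maoyan Movies board: the film title, the lead actors, and the link to each film's viewing and ticket page. The result is shown in the figure below.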

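2. The approach is straightforward: inspect the page source, split it at the tags that mark each piece of information, and then use regular expressions to extract the values we want. The code, with explanatory comments, follows.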

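Option Explicit
' Fetch the Maoyan Movies board
Sub getdy()
    ' Clear the active sheet before writing new results
    Cells.Clear
    ' Create the regular expression object
    Dim ret As Object
    Set ret = CreateObject("VBScript.RegExp")
    With ret
        .Global = False                  ' only the first match is needed
        .Pattern = "[\u4e00-\u9fa5]+"    ' one or more Chinese characters
    End With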
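    ' Create the XMLHTTP request object and send the request
    Dim ht As Object
    Set ht = CreateObject("MSXML2.XMLHTTP")
    Dim strurl As String
    strurl = "http://maoyan.com/board"
    With ht
        .Open "get", strurl, False    ' False = synchronous request
        .send
        ' Wait until the response has fully arrived (readyState 4)
        Do While .readystate <> 4
            DoEvents
        Loop
    End With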
    ' Declare the arrays that receive the results
    Dim s() As String
    Dim ming() As String
    ' Write the header row
    Cells(1, 1) = "Title"
    Cells(1, 2) = "Starring"
    Cells(1, 3) = "Link"
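    ' Split the page at each title tag; s(0) holds the text before the first tag
    s = Split(ht.responsetext, "<p class=""name")
    ReDim ming(0 To UBound(s))
    ' Write the titles to the sheet
    Dim i As Integer
    For i = 1 To UBound(s)
        ' The first run of Chinese characters in each chunk is taken as the title;
        ' the Test guard avoids a runtime error when a chunk has no match
        If ret.Test(s(i)) Then ming(i) = ret.Execute(s(i))(0)
        Cells(i + 1, 1) = ming(i)
    Next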
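    ' Switch the pattern: a run of Chinese characters followed by non-whitespace characters
    ret.Pattern = "([\u4e00-\u9fa5]+\S+)"
    ' Extract the lead actors
    s = Split(ht.responsetext, "<p class=""star")
    Dim star() As String
    ReDim star(0 To UBound(s))
    For i = 1 To UBound(s)
        If ret.Test(s(i)) Then star(i) = ret.Execute(s(i))(0)
        ' Strip the "主演" (starring) label; both colon forms are handled
        ' because the page may use a full-width colon
        Cells(i + 1, 2) = Replace(Replace(star(i), "主演:", ""), "主演:", "")
    Next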
    ' Build the link to each film's viewing and ticket page
    ret.Pattern = "\d+"
    s = Split(ht.responsetext, "<p class=""name")
    Dim lianjie() As String
    ReDim lianjie(0 To UBound(s))
    For i = 1 To UBound(s)
        ' The first run of digits in each chunk is taken as the film ID
        If ret.Test(s(i)) Then lianjie(i) = ret.Execute(s(i))(0)
        Cells(i + 1, 3) = "http://maoyan.com/films/" & lianjie(i)
    Next

End Sub
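
To check the Split-then-match approach without touching the network, it can help to run it against a hard-coded fragment first. The following is only a minimal sketch: the procedure name TestExtract, the sample HTML, the title 示例電影, and the ID 12345 are all made up for illustration, and the real page markup may differ.

```vba
' Offline check of the Split + RegExp approach on a hard-coded fragment.
' The sample HTML, title, and film ID below are hypothetical.
Sub TestExtract()
    Dim html As String
    html = "<dd><p class=""name""><a href=""/films/12345"" " & _
           "title=""示例電影"">示例電影</a></p></dd>"

    Dim re As Object
    Set re = CreateObject("VBScript.RegExp")
    re.Global = False

    ' Everything after the delimiter lands in element 1 of the array
    Dim parts() As String
    parts = Split(html, "<p class=""name")

    ' First run of Chinese characters: the title
    re.Pattern = "[\u4e00-\u9fa5]+"
    If re.Test(parts(1)) Then Debug.Print re.Execute(parts(1))(0).Value

    ' First run of digits: the film ID, appended to the base URL
    re.Pattern = "\d+"
    If re.Test(parts(1)) Then
        Debug.Print "http://maoyan.com/films/" & re.Execute(parts(1))(0).Value
    End If
End Sub
```

Running TestExtract and looking at the Immediate window (Ctrl+G in the VBA editor) shows what each pattern picks up from a chunk. Note that a title containing Latin letters or digits would be cut off at the first non-Chinese character, since the title pattern matches Chinese characters only.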