
Reading a Stream with HttpWebRequest and Downloading Files with WebClient



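Yesterday I was writing a downloader and fell into a trap: while reading a network stream, I used stream.Length to get the length of the stream, and the call threw an exception. After checking the documentation, the reason turned out to be that some streams cannot report the length of their data (non-seekable streams such as network streams throw NotSupportedException from Length), so the length cannot be read directly. Anyone who works with streams regularly would have avoided this. In fact, reading with a do-while loop is all that is needed. The code is as follows: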

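// Allocate the buffer once, outside the loop.
byte[] buffer = new byte[1024];
int i = 0;
do
{
    // Read returns the number of bytes actually read; 0 means end of stream.
    i = stream.Read(buffer, 0, buffer.Length);

    // Write only the bytes that were read, not the whole buffer.
    fs.Write(buffer, 0, i);

} while (i > 0);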

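The while condition must be i > 0; it must not be written as i >= 1024. The reason is given in this passage from the MSDN documentation for Stream.Read: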

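Read returns 0 only when there is no more data in the stream and no more is expected (such as a closed socket or end of file). An implementation is free to return fewer bytes than requested even if the end of the stream has not been reached.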
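To make the difference between the two loop conditions concrete, here is a minimal sketch. It assumes a made-up TrickleStream class (not part of .NET) that is non-seekable and hands back at most 3 bytes per Read call, roughly the way a slow network stream might. The names ShortReadDemo, CopyUntilZero and CopyWhileFull are likewise hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical stream for this demo only: not seekable, and it returns
// at most 3 bytes per Read call even when more were requested.
class TrickleStream : Stream
{
    private readonly MemoryStream inner;
    public TrickleStream(byte[] data) { inner = new MemoryStream(data); }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override int Read(byte[] buffer, int offset, int count)
    {
        // Deliberately return fewer bytes than requested.
        return inner.Read(buffer, offset, Math.Min(count, 3));
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

static class ShortReadDemo
{
    // Copies until Read returns 0 (the condition used in the article).
    public static long CopyUntilZero(Stream source, Stream target)
    {
        byte[] buffer = new byte[1024];
        long total = 0;
        int i;
        do
        {
            i = source.Read(buffer, 0, buffer.Length);
            target.Write(buffer, 0, i);
            total += i;
        } while (i > 0);
        return total;
    }

    // Copies only while Read fills the whole buffer (the incorrect condition).
    public static long CopyWhileFull(Stream source, Stream target)
    {
        byte[] buffer = new byte[1024];
        long total = 0;
        int i;
        do
        {
            i = source.Read(buffer, 0, buffer.Length);
            target.Write(buffer, 0, i);
            total += i;
        } while (i >= buffer.Length);
        return total;
    }

    static void Main()
    {
        byte[] data = new byte[10];
        Console.WriteLine(CopyUntilZero(new TrickleStream(data), new MemoryStream())); // 10
        Console.WriteLine(CopyWhileFull(new TrickleStream(data), new MemoryStream())); // 3
    }
}
```

With the i > 0 condition all 10 bytes are copied. With the i >= 1024 style condition the loop stops after the first short read and the output is silently truncated.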

Below is some short code for fetching data with HttpWebRequest and with WebClient:

HttpWebRequest

/// <summary>
/// Downloads a URL to a file using HttpWebRequest.
/// </summary>
/// <param name="url">URL to fetch</param>
/// <param name="filePath">File name to save to</param>
/// <param name="oldurl">Referring URL (sent as the Referer header)</param>
/// <returns>true on success, false if any exception occurred</returns>
public static bool HttpDown(string url, string filePath, string oldurl)
{
    try
    {
        HttpWebRequest req = (HttpWebRequest)WebRequest.Create(url);

        // Header values are kept on a single line (no embedded line breaks).
        req.Accept = "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8";
        req.Referer = oldurl;
        req.UserAgent = "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36";
        req.ContentType = "application/octet-stream";

        // The using blocks close the response, the stream and the file
        // even if a read or write throws.
        using (HttpWebResponse response = (HttpWebResponse)req.GetResponse())
        using (Stream stream = response.GetResponseStream())
        using (FileStream fs = File.Create(filePath))
        {
            // response.ContentLength can be -1 when the server does not send it,
            // so the loop relies on Read returning 0 instead of a known length.
            byte[] buffer = new byte[1024];
            int i = 0;
            do
            {
                i = stream.Read(buffer, 0, buffer.Length);

                fs.Write(buffer, 0, i);

            } while (i > 0);
        }

        return true;
    }
    catch (Exception)
    {
        // Any failure (network, I/O, bad URL) is reported as false.
        return false;
    }
}

WebClient

public static bool Down(string url, string desc, string oldurl)
{
    try
    {
        // WebClient is IDisposable, so it is wrapped in a using block.
        using (WebClient wc = new WebClient())
        {
            // Header values are kept on a single line (no embedded line breaks).
            wc.Headers.Add(HttpRequestHeader.Accept, "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8");
            wc.Headers.Add(HttpRequestHeader.Referer, oldurl);
            wc.Headers.Add(HttpRequestHeader.UserAgent, "Mozilla/5.0 (Windows NT 6.3; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36");
            wc.Headers.Add(HttpRequestHeader.ContentType, "application/octet-stream");

            // desc is the local file path to save to.
            wc.DownloadFile(new Uri(url), desc);
        }

        Console.WriteLine(url);
        Console.WriteLine("    " + desc + "   yes!");
        return true;
    }
    catch (Exception)
    {
        // Any failure (network, I/O, bad URL) is reported as false.
        return false;
    }
}
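As a possible simplification of the manual read loop in HttpDown, Stream.CopyTo (available since .NET Framework 4) performs the same "read until Read returns 0" copy internally. The sketch below is only an illustration: the CopyToSketch class and SaveStream helper are hypothetical names, and a MemoryStream stands in for the response stream so the example runs without a network connection.

```csharp
using System;
using System.IO;

static class CopyToSketch
{
    // Hypothetical helper: writes everything from source into filePath.
    public static void SaveStream(Stream source, string filePath)
    {
        using (FileStream fs = File.Create(filePath))
        {
            // CopyTo keeps reading until Read returns 0, so it also
            // works for streams that cannot report their Length.
            source.CopyTo(fs);
        }
    }

    static void Main()
    {
        byte[] data = { 1, 2, 3, 4, 5 };
        string path = Path.GetTempFileName();

        // A MemoryStream stands in for response.GetResponseStream() here.
        using (var source = new MemoryStream(data))
        {
            SaveStream(source, path);
        }

        Console.WriteLine(new FileInfo(path).Length); // 5
        File.Delete(path);
    }
}
```

In HttpDown, the same call would be stream.CopyTo(fs) in place of the do-while loop, assuming the target framework is .NET 4 or later.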
