kevin_xiang's column, mail: [email protec

http://www.tuicool.com/articles/bUnMfu

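When Django serves a file download and the file is small, the usual approach is to build the whole content in memory and pass it to the response object in one go:

    from django.http import HttpResponse

    def simple_file_download(request):
        # do something...
        with open("simplefile", "rb") as f:
            content = f.read()
        return HttpResponse(content)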

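When the file is very large, the simplest solution is to let a static file server such as Apache or Nginx handle the download. Sometimes, however, we need to restrict access by user permission, or do not want to expose the file's real location to the user, or the large content is generated on the fly (for example by merging several files). In those cases a static file server alone cannot be used.

The Django documentation mentions that an iterator can be passed to HttpResponse so that data is streamed to the client (this held for Django versions before 1.5).

To write the iterator yourself, use yield: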

    from django.http import HttpResponse

    def read_file(filename, buf_size=8192):
        # Yield the file in blocks of buf_size bytes until it is exhausted.
        with open(filename, "rb") as f:
            while True:
                content = f.read(buf_size)
                if content:
                    yield content
                else:
                    break

    def big_file_download(request):
        filename = "filename"
        response = HttpResponse(read_file(filename))
        return response
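
As a side note, the chunked read loop above can also be written with the two-argument form of `iter`, which keeps calling a function until it returns a sentinel value. The following is a minimal sketch under that assumption; the name `read_in_chunks` and the throwaway demo file are hypothetical and only serve the illustration.

```python
import os
import tempfile
from functools import partial


def read_in_chunks(filename, buf_size=8192):
    """Yield the file's content in blocks of at most buf_size bytes."""
    with open(filename, "rb") as f:
        # iter(callable, sentinel) calls f.read(buf_size) repeatedly and
        # stops as soon as it returns b"" (end of file).
        for chunk in iter(partial(f.read, buf_size), b""):
            yield chunk


# Demo on a throwaway 20000-byte file (hypothetical data).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"a" * 20000)
chunk_sizes = [len(chunk) for chunk in read_in_chunks(tmp.name)]
os.remove(tmp.name)
print(chunk_sizes)  # [8192, 8192, 3616]
```

Either form keeps at most one block in memory at a time; which one reads better is largely a matter of taste.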

Alternatively, use a generator expression. The following example of streaming a large CSV download comes from the Django documentation:

    import csv
    from django.utils.six.moves import range  # Python 2 compatibility; django.utils.six was removed in Django 3.0
    from django.http import StreamingHttpResponse

    class Echo(object):
        """An object that implements just the write method of the file-like
        interface.
        """
        def write(self, value):
            """Write the value by returning it, instead of storing in a buffer."""
            return value

    def some_streaming_csv_view(request):
        """A view that streams a large CSV file."""
        # Generate a sequence of rows. The range is based on the maximum number of
        # rows that can be handled by a single sheet in most spreadsheet
        # applications.
        rows = (["Row {0}".format(idx), str(idx)] for idx in range(65536))
        pseudo_buffer = Echo()
        writer = csv.writer(pseudo_buffer)
        response = StreamingHttpResponse((writer.writerow(row) for row in rows),
                                         content_type="text/csv")
        response['Content-Disposition'] = 'attachment; filename="somefilename.csv"'
        return response
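
The trick in the example above is that `csv.writer.writerow` returns whatever the underlying `write` method returns, so an `Echo`-style object turns each row into one formatted CSV line without buffering anything. A small stand-alone sketch (no Django needed; the two sample rows are made up for illustration):

```python
import csv


class Echo:
    """File-like object whose write() returns the value instead of storing it."""

    def write(self, value):
        return value


writer = csv.writer(Echo())
# writerow() hands back the return value of Echo.write(), i.e. the
# formatted CSV line for that row.
lines = [writer.writerow(row) for row in (["Row 0", "0"], ["a,b", "1"])]
print(lines)  # ['Row 0,0\r\n', '"a,b",1\r\n']
```

Because every line is produced on demand, a generator expression over `writer.writerow(row)` yields the CSV body piece by piece.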

Python also provides a file wrapper (wsgiref.util.FileWrapper) that wraps a file-like object as an iterator:

    class FileWrapper:
        """Wrapper to convert file-like objects to iterables"""

        def __init__(self, filelike, blksize=8192):
            self.filelike = filelike
            self.blksize = blksize
            if hasattr(filelike, 'close'):
                self.close = filelike.close

        def __getitem__(self, key):
            data = self.filelike.read(self.blksize)
            if data:
                return data
            raise IndexError

        def __iter__(self):
            return self

        def next(self):
            data = self.filelike.read(self.blksize)
            if data:
                return data
            raise StopIteration

        __next__ = next  # Python 3 looks up __next__ instead of next
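
To see the wrapper in action without a web framework, here is a short sketch that uses the standard library's `wsgiref.util.FileWrapper` on an in-memory file. The `io.BytesIO` object and the tiny block size of 4 are assumptions chosen to keep the output readable.

```python
import io
from wsgiref.util import FileWrapper

source = io.BytesIO(b"0123456789")        # stand-in for an open file
wrapper = FileWrapper(source, blksize=4)  # read 4 bytes per iteration step
chunks = list(wrapper)                    # calls read(4) until it returns b""
wrapper.close()                           # close() is taken over from the wrapped object
print(chunks)  # [b'0123', b'4567', b'89']
```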

Usage:

    import os
    from wsgiref.util import FileWrapper  # django.core.servers.basehttp.FileWrapper was removed in Django 1.9
    from django.http import HttpResponse

    def file_download(request, filename):
        path = 'filepath'  # placeholder for the file's location on disk
        wrapper = FileWrapper(open(path, 'rb'))
        response = HttpResponse(wrapper, content_type='application/octet-stream')
        response['Content-Length'] = os.path.getsize(path)
        response['Content-Disposition'] = 'attachment; filename=%s' % filename
        return response

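Django also provides the StreamingHttpResponse class as a replacement for HttpResponse when handling streamed data. Since Django 1.5, HttpResponse consumes an iterator immediately, so StreamingHttpResponse is the class to use for actual streaming.

Compressing files into a zip archive for download: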

    import os, tempfile, zipfile
    from wsgiref.util import FileWrapper  # django.core.servers.basehttp.FileWrapper was removed in Django 1.9
    from django.http import HttpResponse

    def send_zipfile(request):
        """
        Create a ZIP file on disk and transmit it in chunks of 8KB,
        without loading the whole file into memory. A similar approach can
        be used for large dynamic PDF files.
        """
        temp = tempfile.TemporaryFile()
        archive = zipfile.ZipFile(temp, 'w', zipfile.ZIP_DEFLATED)
        for index in range(10):
            filename = __file__  # Select your files here.
            archive.write(filename, 'file%d.txt' % index)
        archive.close()
        wrapper = FileWrapper(temp)
        response = HttpResponse(wrapper, content_type='application/zip')
        response['Content-Disposition'] = 'attachment; filename=test.zip'
        response['Content-Length'] = temp.tell()
        temp.seek(0)
        return response
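
The archive-building part of the view above can be tried on its own. The sketch below is only an illustration under stated assumptions: the helper name `build_zip_stream` and the sample file contents are hypothetical, and `archive.writestr` is used in place of `archive.write` so that no files on disk are needed.

```python
import io
import tempfile
import zipfile
from wsgiref.util import FileWrapper


def build_zip_stream(files, blksize=8192):
    """Write files (name -> bytes) into a temporary ZIP; return (size, chunk iterator)."""
    temp = tempfile.TemporaryFile()
    with zipfile.ZipFile(temp, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    size = temp.tell()  # after the archive is closed, the position is the archive size
    temp.seek(0)        # rewind so iteration starts at the first byte
    return size, FileWrapper(temp, blksize)


size, stream = build_zip_stream({"a.txt": b"hello", "b.txt": b"world" * 1000})
body = b"".join(stream)  # a web framework would send these chunks one by one
stream.close()           # closes the temporary file, which also deletes it

# Check that the streamed bytes form a valid archive.
names = zipfile.ZipFile(io.BytesIO(body)).namelist()
print(size == len(body), names)
```

The size is recorded before rewinding, which mirrors the order of `temp.tell()` and `temp.seek(0)` in the view.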

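Either way, serving large file downloads through Django itself is not a good idea. The best approach is to let Django do the permission check and then have the static server handle the download.

This requires the sendfile mechanism: "When handling a file download, a traditional web server first reads the file content into application memory and then sends that in-memory content to the client browser. Under the heavy load of today's websites, this approach consumes more server resources. sendfile is a high-performance network I/O mechanism supported by modern operating systems: the kernel's sendfile call pushes the file content directly into the network card's buffer, which spares the web server the overhead of reading and writing the file and achieves a 'zero-copy' mode."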
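
Python exposes the same operating system facility through `socket.sendfile`, which makes the idea easy to try locally. The sketch below is only an illustration of the call, not a description of what Nginx or Apache do internally; the helper names `send_file` and `demo`, the local socket pair, and the 4096-byte payload are all assumptions made for the demonstration.

```python
import os
import socket
import tempfile


def send_file(sock, path):
    """Send the file at path over a connected stream socket; return bytes sent."""
    with open(path, "rb") as f:
        # socket.sendfile() uses os.sendfile() where the platform supports it
        # and falls back to plain send() otherwise.
        return sock.sendfile(f)


def demo(payload=b"x" * 4096):
    """Push a small temporary file through a local socket pair."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
    sender, receiver = socket.socketpair()
    try:
        sent = send_file(sender, tmp.name)
        sender.close()  # signals end of stream to the receiver
        received = b""
        while True:
            chunk = receiver.recv(65536)
            if not chunk:
                break
            received += chunk
    finally:
        sender.close()
        receiver.close()
        os.remove(tmp.name)
    return sent, received


sent, received = demo()
print(sent, len(received))
```

The application code never reads the file content into a Python bytes object on the sending side; it only passes the open file to the socket.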

On Apache this is implemented by the mod_xsendfile module, while Nginx implements it through a feature called X-Accel-Redirect.

nginx configuration file:

    # Will serve /var/www/files/myfile.tar.gz
    # When passed URI /protected_files/myfile.tar.gz
    location /protected_files {
        internal;
        alias /var/www/files;
    }

Or:

    # Will serve /var/www/protected_files/myfile.tar.gz
    # When passed URI /protected_files/myfile.tar.gz
    location /protected_files {
        internal;
        root /var/www;
    }

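Note the difference between alias and root: alias replaces the part of the URI that matched the location with the given path, whereas root prepends the given path to the full URI.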
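
The path mapping of the two directives can be modeled in a few lines of Python. This is a simplified sketch of the behavior described in the configuration comments above, not nginx's actual implementation; the function names are hypothetical.

```python
def resolve_with_root(root, uri):
    """root: the full request URI is appended to the root directory."""
    return root.rstrip("/") + uri


def resolve_with_alias(location, alias, uri):
    """alias: the part of the URI that matched the location is replaced by the alias path."""
    if not uri.startswith(location):
        raise ValueError("URI does not match the location prefix")
    return alias + uri[len(location):]


uri = "/protected_files/myfile.tar.gz"
root_path = resolve_with_root("/var/www", uri)
alias_path = resolve_with_alias("/protected_files", "/var/www/files", uri)
print(root_path)   # /var/www/protected_files/myfile.tar.gz
print(alias_path)  # /var/www/files/myfile.tar.gz
```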

In Django:

    response['X-Accel-Redirect'] = '/protected_files/%s' % filename

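This way, when a request reaches the Django view, Django checks the user's permissions or does whatever else is needed, then returns a response whose X-Accel-Redirect header is /protected_files/filename. nginx redirects internally to that URL and serves the file /var/www/protected_files/filename (with the root variant of the configuration above):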

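    import os
    from django.contrib.auth.decorators import login_required
    from django.http import HttpResponse
    # Book is the application's own model; import it from your models module.

    @login_required
    def document_view(request, document_id):
        book = Book.objects.get(id=document_id)
        response = HttpResponse()
        name = book.myBook.name.split('/')[-1]
        response['Content-Type'] = 'application/octet-stream'
        # On Python 3, name is already a str, so the original .encode('utf-8') is dropped.
        response["Content-Disposition"] = "attachment; filename={0}".format(name)
        response['Content-Length'] = os.path.getsize(book.myBook.path)
        # The URL prefix must match an internal location in the nginx configuration.
        response['X-Accel-Redirect'] = "/protected/{0}".format(book.myBook.name)
        return response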