Inserting hundreds of millions of rows into MySQL with Python on Linux

阿新 • Published: 2019-01-08

## Installing MySQLdb
First, install MySQLdb. The install command is:

yum install MySQL-python

After the installation finishes, enter the following at the command line:

$ python
# what follows is Python code
>>> import MySQLdb
>>> # getting the prompt back without an ImportError means the install succeeded
>>> exit()  # quit Python
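
The same check can be scripted, for instance at the top of a loader script. This is a minimal sketch using only the standard library; the helper name `module_available` is hypothetical and not part of MySQLdb.

```python
import importlib

def module_available(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False

if __name__ == "__main__":
    # "MySQLdb" is the module name provided by the MySQL-python package.
    print("MySQLdb available: %s" % module_available("MySQLdb"))
```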

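## Inserting hundreds of millions of rows
When writing SQL, I know of only two ways to insert data (experts may know more): (1) one row per statement, and (2) multiple rows per statement.
I had read in a book that inserting multiple rows per statement makes the SQL run faster, so the experiment below uses approach (2). I set out to insert 10 million rows; note that the script as listed runs 10,000 batches of 10,000 rows, which is 100 million rows in total, so lower the outer loop count for a smaller run.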
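
To make the difference between the two approaches concrete, here is a small sketch that only builds the SQL text and does not touch a database. The function names are hypothetical, and the two-column subset of the `good` table is a simplification for illustration. Plain string formatting like this is only reasonable for trusted, generated data, not for user input.

```python
def single_row_inserts(rows):
    """Approach (1): one INSERT statement per row."""
    return ["insert into good(name,price) values('%s',%d);" % row for row in rows]

def multi_row_insert(rows):
    """Approach (2): one INSERT statement that carries every row."""
    values = ",".join("('%s',%d)" % row for row in rows)
    return "insert into good(name,price) values" + values + ";"

# Each row is a (name, price) tuple.
rows = [("abc", 5200), ("def", 5300)]
print(single_row_inserts(rows))
print(multi_row_insert(rows))
```

With approach (2) the server parses and executes one statement per batch instead of one per row, which is where the speed-up is expected to come from.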
First, create a simple table:

CREATE TABLE `good` (
  `id` int(10) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) DEFAULT NULL,
  `price` double DEFAULT NULL,
  `color` varchar(255) DEFAULT NULL,
  `goodNum` int(11) DEFAULT NULL,
  `brandName` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`id`)
);

The code, run on Linux, is as follows:

import time
import random as rd
import MySQLdb as md

def test():
  # Connection settings: change them to match your own environment.
  con=md.connect(host="localhost",user="root",passwd="admin123",db="test")
  cursor=con.cursor()
  tm1=time.time()
  oriName="sujaloiushtegsk"  # letter pool for the random strings
  oriPrice=5000              # base price
  oriPid=1831098             # base value for goodNum
  # Outer loop: 10000 INSERT statements; inner loop: 10000 rows per statement.
  for i in range(10000):
    tm=time.time()
    rows=[]
    for j in range(10000):
      N1=rd.randint(1,14)
      N2=rd.randint(1,14)
      N3=rd.randint(1,14)
      PP=rd.randint(200,1500)
      ppid=rd.randint(1,10000)

      name=oriName[N1]+oriName[N2]+oriName[N3]
      brandName=oriName[N3]+oriName[N1]
      color=oriName[N1]+oriName[N3]
      goodNum=oriPid+ppid
      price=oriPrice+PP
      rows.append("('%s',%d,'%s',%d,'%s')"%(name,price,color,goodNum,brandName))
    # Join the rows into a single multi-row INSERT statement.
    sql="insert into good(name,price,color,goodNum,brandName) values"+",".join(rows)+";"
    cursor.execute(sql)
    con.commit()
    a=time.time()
    # Time taken by this batch.
    print("the"+str(i+1)+"'s time is :"+str(a-tm))
  tm2=time.time()
  # Total elapsed time in seconds.
  print(str(tm2-tm1))
  con.close()


if __name__=="__main__":
  test()

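Change the table name and similar details to suit your own setup. My coding skills are limited, so please bear with me.
In my tests, a 20-million-row data set took about 400 s, and 100 million rows took 1860 s = 31 minutes. Because the inserted data is fairly simple and each row has only a few columns, the load is still quite fast.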
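
For reference, those timings can be converted into throughput. The short sketch below only redoes the arithmetic on the two figures reported above; the helper name `rows_per_second` is hypothetical.

```python
def rows_per_second(rows, seconds):
    """Average insert throughput for a run."""
    return rows / float(seconds)

# (rows inserted, elapsed seconds) as reported in the text
runs = [(20000000, 400), (100000000, 1860)]
for rows, seconds in runs:
    print("%d rows in %d s: about %.0f rows/s"
          % (rows, seconds, rows_per_second(rows, seconds)))
```

Both runs work out to roughly 50,000 rows per second (50,000 and about 53,800 respectively), so throughput stayed steady as the row count grew.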
If you have other suggestions for improvement, I would be glad to hear them.
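
One variation that may be worth trying is to pass the rows as parameters through the DB-API `cursor.executemany` call instead of concatenating them into the SQL string, which leaves quoting to the driver. The sketch below is not the method measured above and has not been benchmarked here. The function names `random_row` and `insert_in_batches` are hypothetical. As far as I know, MySQLdb sends a simple `insert ... values` statement given to `executemany` as a multi-row insert, but verify that for your version.

```python
import random

LETTERS = "sujaloiushtegsk"  # same letter pool as the script above

def random_row():
    """Build one (name, price, color, goodNum, brandName) tuple."""
    a, b, c = (LETTERS[random.randint(1, 14)] for _ in range(3))
    return (a + b + c, 5000 + random.randint(200, 1500), a + c,
            1831098 + random.randint(1, 10000), c + a)

def insert_in_batches(con, total_rows, batch_size):
    """Insert total_rows rows, committing once per batch of batch_size rows."""
    sql = ("insert into good(name,price,color,goodNum,brandName) "
           "values (%s,%s,%s,%s,%s)")
    cursor = con.cursor()
    inserted = 0
    while inserted < total_rows:
        n = min(batch_size, total_rows - inserted)
        # The driver substitutes and escapes the parameters for each row.
        cursor.executemany(sql, [random_row() for _ in range(n)])
        con.commit()
        inserted += n
    return inserted

# Usage sketch (needs a reachable MySQL server, so it is left commented out):
# import MySQLdb
# con = MySQLdb.connect(host="localhost", user="root", passwd="admin123", db="test")
# insert_in_batches(con, 1000000, 10000)
# con.close()
```

Taking the connection as an argument keeps the batching logic separate from the connection settings, so it can be exercised with a stand-in connection object before it is pointed at a real server.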
