How can I quickly insert a large Pandas data frame into MSSQL?

Asked 2 years ago, Updated 2 years ago, 119 views

I create a data frame with Python's Pandas and want to insert it into MSSQL. The data frames have 14 columns and about 250,000 rows on average.

Inserting the whole frame at once through sqlalchemy.create_engine took 45 minutes locally. Iterating with dataframe.iterrows() and inserting row by row with cursor.execute() in a loop also takes a long time and is still running.

Building the data frame itself takes less than 10 seconds — how can I insert it into MSSQL quickly as well?

The largest of the source JSON files is 316,650 KB.
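For data of this size, one approach worth testing is to write the frame to CSV and load it with SQL Server's BULK INSERT statement. This is a sketch, not from the original post; the table name, file path, and connection object below are placeholders.

```python
def bulk_insert_sql(table, csv_path):
    # Build a T-SQL BULK INSERT statement; FIRSTROW = 2 skips the CSV header.
    return (
        f"BULK INSERT {table} "
        f"FROM '{csv_path}' "
        "WITH (FIELDTERMINATOR = ',', ROWTERMINATOR = '\\n', FIRSTROW = 2)"
    )

# Usage sketch (needs a live SQL Server and a path visible to the server):
# df.to_csv(r'C:\temp\frame.csv', index=False)
# cursor.execute(bulk_insert_sql('dbo.my_table', r'C:\temp\frame.csv'))
```

Note that the CSV path must be readable by the SQL Server process, not just by the Python client.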

python mssql sql-server

2022-09-20 21:52

2 Answers

import pandas as pd
import pymysql
import time

def makefile(name, q):
    # Generate a sample CSV with q rows of dummy data.
    with open(name, 'w') as f:
        f.write('ls,num,id,pass,name\n')
        for i in range(q):
            f.write('%s,%s,id%s,pw%s,nm%s\n' % (i, i, i, i, i))

def conn():
    db = pymysql.connect(host='localhost', port=3306,
                         user='howoni', passwd='#', db='howoni')
    c = db.cursor()
    c.execute('truncate TABLES')  # empty the table before the benchmark
    return db

makefile('hi2.csv', 500000)
df = pd.read_csv('hi2.csv', index_col='ls')

b = conn()
a = b.cursor()

# 1) Multi-row insert: accumulate 20 rows per statement.
s = time.time()
sql = 'insert into TABLES values '
data = []
for idx, row in enumerate(df.values):
    sql += '(%s, %s, %s, %s),'
    data.extend(row)
    if (idx + 1) % 20 == 0:          # flush every 20 rows
        a.execute(sql.rstrip(','), tuple(data))
        sql = 'insert into TABLES values '
        data = []
if data:                             # flush the remaining partial batch
    a.execute(sql.rstrip(','), tuple(data))
print(time.time() - s)

# 2) Row-by-row insert for comparison.
s = time.time()
sql = 'insert into TABLES values (%s, %s, %s, %s)'
for row in df.values:
    a.execute(sql, tuple(row))
print(time.time() - s)

a.execute('select count(*) from TABLES')
print(a.fetchone())

b.commit()
a.close()

1) "insert into TABLES values (%s, %s, %s, %s), ..." with 20 rows per statement × 25,000 statements: 500,000 rows in about 15 seconds

2) "insert into TABLES values (%s, %s, %s, %s)" × 500,000 statements: 500,000 rows in about 120 seconds

I think you can use this as a reference when inserting with pymssql by a method other than bulk insert.
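The manual string-building in the answer above can be factored into a small helper that yields one multi-row statement per batch. This is an illustrative sketch: build_batches and the default table name are not from the answer. (pymysql's cursor.executemany also rewrites simple INSERT ... VALUES statements into this multi-row form internally, so it achieves a similar speedup with less code.)

```python
def build_batches(rows, ncols, batch_size=20, table="TABLES"):
    # Yield (sql, params) pairs, each inserting up to batch_size rows
    # with a single multi-row INSERT statement.
    placeholder = "(" + ", ".join(["%s"] * ncols) + ")"
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        sql = (f"insert into {table} values "
               + ", ".join([placeholder] * len(chunk)))
        params = [value for row in chunk for value in row]
        yield sql, params

# Usage sketch with a live cursor:
# for sql, params in build_batches(df.values.tolist(), ncols=4):
#     cursor.execute(sql, params)
```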


2022-09-20 21:52

Stack Overflow has the same question. There are various answers, so please test them.

https://stackoverflow.com/questions/29706278/python-pandas-to-sql-with-sqlalchemy-how-to-speed-up-exporting-to-ms-sql
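A commonly cited suggestion in that thread is pyodbc's fast_executemany option, passed through SQLAlchemy, combined with DataFrame.to_sql. A hedged sketch, assuming pyodbc with Windows authentication; the server, database, and table names are placeholders:

```python
import urllib.parse

def make_mssql_url(server, database,
                   driver="ODBC Driver 17 for SQL Server"):
    # Build a SQLAlchemy URL for the mssql+pyodbc dialect.
    params = urllib.parse.quote_plus(
        f"DRIVER={{{driver}}};SERVER={server};DATABASE={database};"
        "Trusted_Connection=yes"
    )
    return "mssql+pyodbc:///?odbc_connect=" + params

# Usage sketch (needs sqlalchemy + pyodbc and a live SQL Server):
# from sqlalchemy import create_engine
# engine = create_engine(make_mssql_url("localhost", "mydb"),
#                        fast_executemany=True)
# df.to_sql("my_table", engine, if_exists="append",
#           index=False, chunksize=10_000)
```

With fast_executemany=True, pyodbc sends parameter batches to the server instead of one round trip per row, which is typically the largest speedup available without resorting to BULK INSERT or bcp.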


2022-09-20 21:52


© 2024 OneMinuteCode. All rights reserved.