March 18th, 2005, 08:13 PM
Hi there all,
For a small project I want to rewrite an old PHP project in Python. I have read a little about Python and MySQL, and I'm convinced that I can achieve what I want. Now the problem.
I have a script that reads a directory and reads the lines from the logfiles in that directory. The next step is to insert the content, line by line and file by file, into a database.
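To make the reading step concrete, here is a minimal sketch of it on its own, without any database involved. The function names parse_line and read_log_rows are made up for illustration, and it assumes every logfile is plain text with comma-separated fields, one record per line.

```python
import os


def parse_line(line):
    """Split one comma-separated log line into a tuple of stripped fields."""
    return tuple(field.strip() for field in line.rstrip("\n").split(","))


def read_log_rows(directory):
    """Collect one tuple per non-empty line from every file in `directory`.

    The list is created once, before the loop over files, so rows from
    all files end up in the same list.
    """
    rows = []
    for name in sorted(os.listdir(directory)):
        full_path = os.path.join(directory, name)
        if not os.path.isfile(full_path):
            continue  # skip subdirectories
        with open(full_path, "r") as handle:
            for line in handle:
                if line.strip():  # ignore blank lines
                    rows.append(parse_line(line))
    return rows
```

Comparing len(read_log_rows(path)) with the expected total number of log lines is a quick way to check that nothing gets lost before the insert step.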
I first created a script that used cursor.execute(). Unfortunately the script was very slow (900,000 rows in 600 secs). Then I read something about the executemany command, so I rewrote the script (it's not OOP yet, just some testing) to use executemany, but for some reason this doesn't work the way I want.
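For reference, this is roughly the shape of an executemany call: one SQL statement plus a list of tuples, one tuple per row. The sketch below is not the real setup. It uses the sqlite3 module from the standard library as a stand-in so it runs without a MySQL server, the block table is cut down to three columns, and the helper names are invented for the example. Note that sqlite3 uses "?" placeholders where MySQLdb uses "%s".

```python
import sqlite3


def make_demo_connection():
    """Create an in-memory database with a reduced, hypothetical block table."""
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "create table block (block_time text, block_ip text, block_size integer)"
    )
    return connection


def insert_rows(connection, rows):
    """Insert every tuple in `rows` with a single executemany call."""
    cursor = connection.cursor()
    # One statement, many parameter tuples. With MySQLdb the
    # placeholders would be written as %s instead of ?.
    cursor.executemany(
        "insert into block (block_time, block_ip, block_size) values (?, ?, ?)",
        rows,
    )
    connection.commit()
```

Usage would look like insert_rows(make_demo_connection(), rows), where rows is a list such as [("2005-03-18 20:13", "10.0.0.1", 512)].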
First of all, for some reason the list is never filled with more than 1440 tuples, which results in just 1440 rows being inserted into the database (leaving approx. 890,000 rows still to be done).
Next, I sometimes get an OperationalError: (2013, 'Lost connection to MySQL server during query') error. How can that happen?
Could someone give me a hint why this is happening? For those who want to have it, here is the source of the script:
and a line from a logfile:
# this file can be seen as the core of the GZ2 script
# it inserts the contents from the logfiles into the database
# and creates the xml needed for the website to show statistics
import MySQLdb, os, string, time
connection = MySQLdb.connect(host="localhost", user="root", passwd="ez2aohh", db="gz2")
cursor = connection.cursor()
time1 = time.time()
path = 'e:/GZ2/proxylogs/'
loglist = os.listdir(path)
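for filename in loglist:
    data = []  # rows collected for this file
    fileopen = open(path + filename, "r")
    for line in fileopen.readlines():
        pattern = line.split(",")
        # NOTE: the list indices were lost when this was posted;
        # the indices used below are assumed, not the original ones.
        date = pattern[0].split(" ")
        temptup = (date[0], pattern[1], pattern[2], pattern[3], pattern[4], pattern[5], pattern[6], pattern[7])
        data.append(temptup)
    fileopen.close()
    # one executemany call per file, after all of its lines are collected
    cursor.executemany("insert into block( block_time, block_ip, block_email, block_packet, block_size, block_os, block_cpu, block_client) values(%s, %s, %s, %s, %s, %s, %s, %s )", data)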
time2 = time.time()
print time2 - time1
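One possible explanation for the 2013 error, stated here as an assumption and not verified against this setup: executemany may send a very large multi-row insert to the server in one go, and a statement bigger than the server's max_allowed_packet setting can make the server drop the connection. A common workaround is to send the rows in smaller batches. The sketch below shows that idea; chunked and insert_in_batches are hypothetical helper names, the batch size of 1000 is an arbitrary starting point, and the cursor can be any DB-API cursor with an executemany method.

```python
def chunked(rows, size):
    """Yield consecutive slices of `rows` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def insert_in_batches(cursor, sql, rows, batch_size=1000):
    """Call cursor.executemany once per batch.

    Returns the number of batches that were sent, which is handy
    for logging progress on large imports.
    """
    batches = 0
    for batch in chunked(rows, batch_size):
        cursor.executemany(sql, batch)
        batches += 1
    return batches
```

With the script above this would replace the single executemany call, for example insert_in_batches(cursor, sql, data), where sql holds the insert statement.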