Hi there all,

For a small project I want to rewrite an old PHP project in Python. I have read a little about Python and MySQL, and I'm convinced I can achieve what I want. Now for the problem.

I have a script that reads a directory and reads the lines from the logfiles in that directory. The next step is to insert the content, line by line and file by file, into a database.

I first wrote a script that used cursor.execute(). Unfortunately that script was very slow (900,000 rows in 600 seconds). Then I read about executemany(), so I rewrote the script to use it (it's not OOP yet, just some testing), but for some reason it doesn't work the way I want.

First of all, for some reason the list is never filled with more than 1440 tuples, which results in just 1440 rows being inserted into the database (leaving approximately 890,000 rows still to be done).
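One thing that may be worth checking, assuming the list is created anew for every logfile and its length is only printed after the outer loop has finished: in that case the printed number is the size of the last file's batch, not the total. The sketch below illustrates this with made-up in-memory "files" (the names `fake_logs` and `total_rows` are hypothetical and not part of the script); it is written to run on current Python versions.

```python
# Hypothetical stand-in for the log directory: file name -> list of lines.
fake_logs = {
    "day1.log": ["a,1", "b,2", "c,3"],
    "day2.log": ["d,4", "e,5"],
}

total_rows = 0
for name in sorted(fake_logs):
    data = []  # reset for every file
    for line in fake_logs[name]:
        data.append(tuple(line.split(",")))
    total_rows += len(data)  # running total across all files

# After the loop, `data` still holds only the last file's tuples.
last_file_rows = len(data)
print(last_file_rows)
print(total_rows)
```

If the two numbers differ in the real script as well, the list itself is probably fine and only the reported count is misleading.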

Next, I sometimes get this error: OperationalError: (2013, 'Lost connection to MySQL server during query'). How can that happen?

Could someone give me a hint as to why this is happening? For those who want to see it, here is the source of the script:
# This file can be seen as the core of the GZ2 script.
# It inserts the contents of the logfiles into the database
# and creates the XML needed for the website to show statistics.

import MySQLdb, os, time

connection = MySQLdb.connect(host="localhost", user="root", passwd="ez2aohh", db="gz2")
cursor = connection.cursor()

time1 = time.time()

path = 'e:/GZ2/proxylogs/'

loglist = os.listdir(path)

for file in loglist:
    data = []  # one batch of row tuples per logfile
    fileopen = open(path+file, "r")

    for line in fileopen.readlines():
        pattern = line.split(",")
        date = pattern[0].split(" ")  # keep only the date part of the timestamp

        temptup = (date[0], pattern[1], pattern[2], pattern[3], pattern[4], pattern[5], pattern[6], pattern[7])
        data.append(temptup)  # collect the row so executemany has something to insert

    fileopen.close()
    cursor.executemany("insert into block( block_time, block_ip, block_email, block_packet, block_size, block_os, block_cpu, block_client) values(%s, %s, %s, %s, %s, %s, %s, %s  )",data)

connection.commit()  # needed for transactional tables
print len(data)  # length of the last file's batch only
time2 = time.time()
print time2 - time1

And a line from a logfile:

2003-10-23 15:00:06,,teambvd@teambvd.com,CA:65449BBC:00000000,1,4,1,90050483,1
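
For reference, here is a minimal sketch of how a line like the one above could be turned into the 8-value tuple and handed to executemany() in smaller batches. This is only an illustration under assumptions: the helper names (`parse_line`, `batches`, `insert_rows`) and the batch size of 1000 are hypothetical, and `connection` is assumed to be an open MySQLdb connection like the one in the script. Whether smaller batches avoid the lost-connection error is something I have not verified.

```python
def parse_line(line):
    """Split one comma-separated log line into the 8 column values."""
    fields = line.rstrip("\r\n").split(",")
    # Keep only the date part of "YYYY-MM-DD HH:MM:SS", as the script does.
    date_part = fields[0].split(" ")[0]
    return (date_part,) + tuple(fields[1:8])


def batches(rows, size):
    """Yield successive lists of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


INSERT_SQL = (
    "insert into block (block_time, block_ip, block_email, block_packet, "
    "block_size, block_os, block_cpu, block_client) "
    "values (%s, %s, %s, %s, %s, %s, %s, %s)"
)


def insert_rows(connection, rows, batch_size=1000):
    """Insert rows in batches via executemany, committing after each batch.

    batch_size is a hypothetical default, not a tuned value.
    """
    cursor = connection.cursor()
    for batch in batches(rows, batch_size):
        cursor.executemany(INSERT_SQL, batch)
        connection.commit()
    cursor.close()


sample = "2003-10-23 15:00:06,,teambvd@teambvd.com,CA:65449BBC:00000000,1,4,1,90050483,1"
row = parse_line(sample)
print(row)
```

Note that the sample line has nine comma-separated fields, so the last one is dropped here, just as in the script.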