    #1
  1. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    40
    Rep Power
    11

    Question Splitting a list by order number


    Hi all,

    I need to know the best way to split a large text file containing many orders into individual orders. The text file can contain multiple lines for each order, although the number of lines per order isn't fixed. The order number is the first 5 characters of each line, and I need to export each order into a new text file.

    Any help is much appreciated
  2. #2
  3. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    London, England
    Posts
    1,585
    Rep Power
    1373
    How do you know where one order ends and the next one begins? Is it only by the fact that a line starts with a 5-digit number, or is there some other marker?

    Without more information and a small sample of data it would be difficult to write any code for this.

    Dave - The Developers' Coach
  4. #3
  5. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2005
    Posts
    174
    Rep Power
    11
    check out the examples on this page: http://www.ug.cs.usyd.edu.au/~comp5315/lec-04.html

    This is something similar. It takes blog codes at each line and then continues with the rest of the line or block. It also takes each new section and saves it as a new HTML file.

    I'm sure you could play with it a bit and fit it into your issue.

    cheers
    sf2k
  6. #4
  7. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    40
    Rep Power
    11
    Each order line is on a new line; here is a sample of the text file. The order number is the first 5 characters of the line.

    J1234|0100|040|01295|0775001|L|T|O|W|04495
    J1234|0100|050|01295|0775001|L|T|O|W|04495
    J1234|0100|060|01295|0775001|L|T|O|W|04495
    J1234|0100|070|01295|0775001|L|T|O|W|04495
    J1238|0100|030|01295|0775001|L|T|O|U|04495
    J1238|0100|040|01295|0775001|L|T|O|U|04495
    J1238|0100|050|01295|0775001|L|T|O|U|04495
    L1236|0100|040|01295|0775001|T|T|L|L|04495
    L1236|0100|050|01295|0775001|T|T|L|L|04495
  8. #5
  9. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2004
    Location
    Regensburg, Germany
    Posts
    147
    Rep Power
    17
    You could read the lines of your file, create a dictionary of the orders and write each order to another file.
    Try something like this:
    Code:
    file = open("my_file", "r")
    orders = {}
    for line in file:
        order = line[:5]
        try:
            orders[order].append(line)
        except KeyError:
            # new order
            orders[order] = [line]
    
    file.close()
    ...
    ...
    # write orders to files:
    for order in orders:
        # open file with name containing the order number
        file = open(order + ".ord", "w")
        # write order to file
        lines = orders[order]
        file.writelines(lines)
        file.close()
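    The grouping step above can also be sketched without the try/except, assuming a Python version that has collections.defaultdict (2.5 or later, so newer than the Python used in this thread). The function name group_orders and the in-memory sample list are made up for illustration; the sample lines are taken from the data posted earlier.

```python
from collections import defaultdict


def group_orders(lines):
    """Group lines by their first five characters (the order number)."""
    orders = defaultdict(list)
    for line in lines:
        # A missing key starts out as an empty list, so no KeyError handling is needed.
        orders[line[:5]].append(line)
    return dict(orders)


# A few lines from the sample data, held in memory instead of read from a file.
sample = [
    "J1234|0100|040|01295|0775001|L|T|O|W|04495\n",
    "J1234|0100|050|01295|0775001|L|T|O|W|04495\n",
    "J1238|0100|030|01295|0775001|L|T|O|U|04495\n",
    "L1236|0100|040|01295|0775001|T|T|L|L|04495\n",
]

grouped = group_orders(sample)
for order_number in sorted(grouped):
    print(order_number, len(grouped[order_number]))
```

    Passing an open file object instead of the list should behave the same way, since both yield one line at a time.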
  10. #6
  11. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    40
    Rep Power
    11
    worked a treat, many thanks
  12. #7
  13. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    40
    Rep Power
    11
    Thanks for the previous help, it worked great, but now I need to split the files into two separate orders depending on the content of line[11:14].

    If this section reads 666, 777, 888 or 999, I need to put these lines into a separate file with a different extension.

    I have tried the code below with little success, so I would appreciate someone pointing me in the right direction.

    J1234|0100|040|01295|0775001|L|T|O|W|04495
    J1234|0100|666|01295|0775001|L|T|O|W|04495
    J1234|0100|666|01295|0775001|L|T|O|W|04495
    J1234|0100|777|01295|0775001|L|T|O|W|04495
    J1238|0100|888|01295|0775001|L|T|O|U|04495
    J1238|0100|888|01295|0775001|L|T|O|U|04495
    J1238|0100|999|01295|0775001|L|T|O|U|04495
    L1236|0100|040|01295|0775001|T|T|L|L|04495
    L1236|0100|050|01295|0775001|T|T|L|L|04495

    Code:
    file = open ('C:\\myfolder\\output8.txt', 'r')
    orders = {}
    for line in file:
    	order = line[:5]
    	try:
    		orders[order].append(line)
    	except KeyError:
    		orders[order] = [line]
    	##file.close()
    	for order in orders:
    		if line[11:14] == '666' or '777' or '888' or '999':
    			file = open('C:\\myfolder\\DATAOUT\\' + order + ".CS2", "a")
    			lines = orders[order]
    			file.writelines(line)
    		else:
    			file = open('C:\\myfolder\\DATAOUT\\' + order + ".CS1", "w")
    			lines = orders[order]
    			file.writelines(lines)
    			file.close()
  14. #8
  15. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2005
    Posts
    37
    Rep Power
    10
    You can't say:

    if line[11:14] == '666' or '777' or '888' or '999':

    and have it do what you want. That condition is always true: Python reads it as (line[11:14] == '666') or '777' or '888' or '999', and a non-empty string such as '777' always evaluates to true. You need something along the lines of:

    if line[11:14] == '666' or line[11:14] == '777' or etc......

    Give that a try.
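    To see the difference for yourself, here is a small sketch comparing the two conditions. The function names buggy_check and fixed_check are made up for this example; only the conditions themselves come from the posts above.

```python
def buggy_check(code):
    # Parsed as: (code == '666') or '777' or '888' or '999'
    # When the comparison is False, the expression yields the string '777',
    # which counts as true because it is non-empty.
    return bool(code == '666' or '777' or '888' or '999')


def fixed_check(code):
    # Every alternative is compared against code explicitly.
    return code == '666' or code == '777' or code == '888' or code == '999'


# What the buggy expression actually evaluates to for a non-matching code.
leftover = ('040' == '666' or '777')

for code in ('040', '666', '999'):
    print(code, buggy_check(code), fixed_check(code))
```

    The buggy version reports a match for every code, including '040', while the fixed version only matches the four special codes.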
  16. #9
  17. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2005
    Posts
    174
    Rep Power
    11
    Maybe replace it with this? It's probably a bit cleaner
    Code:
    if '666' or '777' or '888' or '999' in line[11:14]:
         do_stuff()
  18. #10
  19. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Posts
    624
    Rep Power
    35
    But it's also "always true":

    Code:
    >>> if '666' or '777' or '888' or '999' in "oi": print "found"
    ... 
    found
  20. #11
  21. Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    London, England
    Posts
    1,585
    Rep Power
    1373
    Originally Posted by sf2k
    Maybe replace it with this? It's probably a bit cleaner
    Code:
    if '666' or '777' or '888' or '999' in line[11:14]:
         do_stuff()
    That won't work either. Python evaluates '666' first, treats that non-empty string as true, and the or expression stops there, so the rest of the line (including the in test) is never evaluated.

    The Pythonic way to do it is:

    Code:
    if line[11:14] in ('666', '777', '888', '999'):
         do_stuff()
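    As a quick check against the posted sample data, the membership test might be wrapped up like this. The names SPECIAL_CODES and is_special are made up for the example; the slice line[11:14] and the four codes come from the thread.

```python
SPECIAL_CODES = ('666', '777', '888', '999')


def is_special(line):
    """Return True when characters 11-13 of the line are one of the special codes."""
    return line[11:14] in SPECIAL_CODES


sample = [
    "J1234|0100|040|01295|0775001|L|T|O|W|04495",
    "J1234|0100|666|01295|0775001|L|T|O|W|04495",
    "J1238|0100|999|01295|0775001|L|T|O|U|04495",
    "L1236|0100|050|01295|0775001|T|T|L|L|04495",
]

flags = [is_special(line) for line in sample]
print(flags)
```

    Because the right-hand side is a tuple, the slice has to equal one of the elements exactly; a partial match such as '66' does not count.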
    Dave - The Developers' Coach

    Comments on this post

    • sfb agrees : Quite
    • sf2k agrees : doh...thanks
    • jacktasia agrees
  22. #12
  23. Contributing User
    Devshed Novice (500 - 999 posts)

    Join Date
    Nov 2003
    Posts
    624
    Rep Power
    35
    Part of the reason this isn't working is that the 'orders' dictionary groups whole orders by order number, while line[11:14] is tested on a single line; the inner loop then writes to a file for every order collected so far instead of handling just the current line.

    I think I would try something like this: (untested)
    Code:
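    # open() is used instead of file() so a variable named 'file' can't break it
    for line in open('C:\\myfolder\\output8.txt'):
        fields = line.split('|')
        orderCode, otherThing = fields[0], fields[2]

        # lines coded 666/777/888/999 go to .CS2, everything else to .CS1
        if otherThing in ('666', '777', '888', '999'):
            output = open('C:\\myfolder\\DATAOUT\\' + orderCode + ".CS2", "a")
        else:
            output = open('C:\\myfolder\\DATAOUT\\' + orderCode + ".CS1", "a")

        # each line already ends with a newline, so write it unchanged
        output.write(line)
        output.close()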
    It does a lot of file opening and closing, but I wouldn't worry unless it becomes a performance issue.
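    If the opening and closing ever does become a problem, one possible variation is to collect the lines per output file in memory first and then write each file once. This is only a sketch assuming a modern Python (with statement, conditional expressions); the names split_orders, SPECIAL_CODES and buckets are made up, and a temporary directory stands in for the real DATAOUT folder.

```python
import os
import tempfile
from collections import defaultdict

SPECIAL_CODES = ('666', '777', '888', '999')


def split_orders(lines, out_dir):
    """Write each order's lines to <order>.CS1 or <order>.CS2 inside out_dir."""
    buckets = defaultdict(list)
    for line in lines:
        fields = line.split('|')
        extension = '.CS2' if fields[2] in SPECIAL_CODES else '.CS1'
        buckets[fields[0] + extension].append(line)

    # Each output file is opened exactly once, after all lines are sorted into buckets.
    for name, bucket in buckets.items():
        with open(os.path.join(out_dir, name), 'w') as output:
            output.writelines(bucket)
    return sorted(buckets)


sample = [
    "J1234|0100|040|01295|0775001|L|T|O|W|04495\n",
    "J1234|0100|666|01295|0775001|L|T|O|W|04495\n",
    "J1238|0100|888|01295|0775001|L|T|O|U|04495\n",
    "L1236|0100|040|01295|0775001|T|T|L|L|04495\n",
]

out_dir = tempfile.mkdtemp()   # stand-in for the real output folder
written = split_orders(sample, out_dir)
print(written)
```

    Note that this version opens the files with "w", so it replaces any existing output rather than appending to it, and it holds the whole input in memory while it runs.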
    Last edited by sfb; March 21st, 2005 at 05:49 PM.
  24. #13
  25. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    I'm not sure if you need the values assigned to the dictionary for use elsewhere in the program, so I've left that in place, but it can easily be removed if you don't.

    Anyway, here is a working example that comes in at only 12 lines (not counting comments), or 7 lines without copying the values to the dictionary. There are more comments in here than I would normally use, but they're for clarity.

    I've attached a Zip Archive containing the program and resource files.

    Hope this helps,

    Mark.
    Attached Files
    programming language development: www.netytan.com Hula

  26. #14
  27. Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2004
    Posts
    40
    Rep Power
    11
    Thanks guys, that worked great using your orders.py after a little tweaking for file locations.

    However, when I copied the code into my program I keep getting this error. It works fine if I run it from a separate file, but errors if run as part of my program.


    Code:
    Traceback (most recent call last):
      File "C:\Python24\Lib\MyCode\Courtesy.py", line 280, in ?
        SplitOrders()
      File "C:\Python24\Lib\MyCode\Courtesy.py", line 199, in SplitOrders
        for order in file('C:\\VDATA\\Courtesy Shoes\\output8.txt'):
    TypeError: 'file' object is not callable
  28. #15
  29. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
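    I am using the file() object directly rather than going through open() like you do in your program. It looks like you're using the name file for something else in your program, so by the time this code runs, file refers to an open file object rather than the built-in type.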

    The simplest way to fix this would be to replace file() with open().
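    The same kind of error can be reproduced in a few lines. This sketch runs on current Python, where file is no longer a built-in, so io.StringIO stands in for an open file; the function name and the path "some_path.txt" are made up, and the guess that your program contains something like file = open(...) earlier on is only an assumption.

```python
import io


def call_shadowed_name():
    # Rebinding the name to a file-like object hides whatever it referred to before.
    file = io.StringIO("J1234|0100|040\n")
    try:
        # This tries to call the object itself, not a function that opens files.
        file("some_path.txt")
    except TypeError as error:
        return str(error)
    return None


message = call_shadowed_name()
print(message)
```

    Renaming the variable (for example to infile) would be another way around it, since the clash only exists while the name file points at the open file.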

    Hope this helps,

    Mark.
    programming language development: www.netytan.com Hula
