Thread: Need help

    #1
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    4
    Rep Power
    0

    Need help


    I have something like this:

    10280341|2012-10-03 19:11:06.390|Sami|abc|Crossword|70
    10280343|2012-10-03 19:15:32.173|Sami|aaa|Sudoku|30
    10280355|2012-10-04 19:15:32.173|miami|bbb|Chaircar|15
    10280366|2012-10-04 19:15:32.173|miami|bob|Avista|35

    And I want output like this, grouped by name and date:


    2012-10-03
    Sami|2|100

    2012-10-04
    miami|2|50
    #2
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    3

    text file


    It would probably help your chances at responses if you were a little more personable.

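    For opening your text and output files, I would use:

    Code:
    with open(input_filename, mode='r') as input_stream:
      Data = input_stream.readlines()
    # strip() works on one string at a time, so apply it to each line
    SData = [Line.strip() for Line in Data]
    # use a different file name for the output, or the input gets overwritten
    with open(output_filename, mode='w') as output_stream:
      pass  # the write calls go inside this block
    Then find the pipes, and pull out the data between them: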

    Code:
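    for Line in SData :
      # Pipe1 = index of the first '|'
      for Cnt1 in range(len(Line)) :
        if Line[Cnt1] == '|' :
          Pipe1 = Cnt1
          break

      # Pipe2 = index of the next '|', searching from just past Pipe1
      for Cnt2 in range(Pipe1 + 1, len(Line)) :
        if Line[Cnt2] == '|' :
          Pipe2 = Cnt2
          break
      .
      .
      .
      for Cnt5 in range(Pipe4 + 1, len(Line)) :
        if Line[Cnt5] == '|' :
          Pipe5 = Cnt5
          break

      # write inside the same loop (and inside the output 'with' block),
      # while the pipe positions still belong to this line
      output_stream.write(Line[Pipe1+1:Pipe1+11])  # the 10-character date
      output_stream.write(Line[Pipe2+1:Pipe3]+'whatever else that is'+'\n')
    I wasn't able to test this, but the general idea should come across.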
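    The same pipe search can lean on str.find, which returns the index of the next match, or -1 when there is none. A minimal sketch, assuming one record per string; pipe_positions is a made-up helper name for illustration, not something from the posts here:

```python
def pipe_positions(line):
    """Return the index of every '|' in line, left to right."""
    positions = []
    start = line.find('|')
    while start != -1:
        positions.append(start)
        # resume the search just past the pipe we found
        start = line.find('|', start + 1)
    return positions


sample = '10280341|2012-10-03 19:11:06.390|Sami|abc|Crossword|70'
pipes = pipe_positions(sample)

# the date is the first 10 characters after the first pipe
date = sample[pipes[0] + 1:pipes[0] + 11]
# the name sits between the second and third pipes
name = sample[pipes[1] + 1:pipes[2]]

print(date, name)
```

    Slicing by position like this breaks if the timestamp format ever changes, so splitting on the pipe is probably the safer route.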

    #3
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    San Francisco Bay
    Posts
    1,939
    Rep Power
    1313
    Originally Posted by WynnDeezl
    Code:
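    for Line in SData :
      # Pipe1 = index of the first '|'
      for Cnt1 in range(len(Line)) :
        if Line[Cnt1] == '|' :
          Pipe1 = Cnt1
          break

      # Pipe2 = index of the next '|', searching from just past Pipe1
      for Cnt2 in range(Pipe1 + 1, len(Line)) :
        if Line[Cnt2] == '|' :
          Pipe2 = Cnt2
          break
      .
      .
      .
      for Cnt5 in range(Pipe4 + 1, len(Line)) :
        if Line[Cnt5] == '|' :
          Pipe5 = Cnt5
          break

      # write inside the same loop (and inside the output 'with' block),
      # while the pipe positions still belong to this line
      output_stream.write(Line[Pipe1+1:Pipe1+11])  # the 10-character date
      output_stream.write(Line[Pipe2+1:Pipe3]+'whatever else that is'+'\n')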
    There's a much easier way to do all this:
    python Code:
    # For compatibility with Python 2.x
    from __future__ import print_function
     
    with open(input_filename) as input_file, open(output_filename, 'w') as output_file:
        for line in input_file:
            fields = line.strip().split('|')
     
            name = fields[2]
            (date, time) = fields[1].split(' ')
     
            print(date, file=output_file)
            # etc.
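    One possible way to fill in the rest is a small helper that turns each line into a (date, name, value) tuple. This is only a sketch: parse_line is a hypothetical name, and it assumes every valid record has exactly six pipe-separated fields with an integer in the last one, as in the sample data.

```python
def parse_line(line):
    """Return (date, name, value) for one record, or None if malformed."""
    fields = line.strip().split('|')
    if len(fields) != 6:
        # blank or malformed line: skip it rather than crash
        return None
    date = fields[1].split(' ')[0]   # drop the time part
    name = fields[2]
    value = int(fields[5])           # assumed to be an integer
    return date, name, value


print(parse_line('10280343|2012-10-03 19:15:32.173|Sami|aaa|Sudoku|30\n'))
```

    With the values parsed as integers, counting and summing per (date, name) becomes a matter of keeping a dictionary keyed on those two fields.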
    #4
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    3

    nice


    That is good, compact code, but it looks like it's only writing the date, not the rest of the data required.
    #5
    Contributing User
    Devshed Intermediate (1500 - 1999 posts)

    Join Date
    Feb 2004
    Location
    San Francisco Bay
    Posts
    1,939
    Rep Power
    1313
    Originally Posted by WynnDeezl
    That is good, compact code, but it looks like it's only writing the date, not the rest of the data required.
    Obviously, hence the "# etc." at the end. I assume the OP can figure out what's going on or ask further questions if there's still trouble.
    #6
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2012
    Posts
    4
    Rep Power
    0

    Need To Group It


    Hey, thanks a lot for your suggestions, but I'm having difficulty grouping it. As I am new to Python, I need help with this output.

    Group by DATE, then for each NAME on that date output NAME|Count(NAME)|Sum(Values), i.e.:

    2012-10-03
    Sami|2|100
    2012-10-04
    miami|2|50
    2012-10-05
    Sami|1|20
    #7
    Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,711
    Rep Power
    480
    From:
    10280341|2012-10-03 19:11:06.390|Sami|abc|Crossword|70
    10280343|2012-10-03 19:15:32.173|Sami|aaa|Sudoku|30

    You'd like this report:
    2012-10-03
    Sami|2|100

    There are 2 Sami entries on 2012-10-03, and the sum of the last field equals 100.

    This accounts for the field values.

    Next, instead of doing "the simplest thing that could possibly work", I assume you chose over-simplified input.

    Code:
    garbage_in = '''
        10280341|2012-10-03 19:11:06.390|Sami|abc|Crossword|70
        10280355|2012-10-03 19:15:32.173|miami|bbb|Chaircar|15
        10280343|2012-10-03 19:15:32.173|Sami|aaa|Sudoku|30
        xxxxxxxx|2012-10-04 19:15:32.173|Salami|aaa|Sudoku|3
        10280355|2012-10-04 19:15:32.173|miami|bbb|Chaircar|1
        10280366|2012-10-04 19:15:32.173|miami|bob|Avista|2
        10280366|2012-10-04 19:15:32.173|miami|bob|Avista|3
        10280366|2012-10-04 19:15:32.173|miami|bob|Avista|4
    '''
    
    d = {}
    
    for line in garbage_in.split('\n'):
        fields = line.split('|')
        if 6 == len(fields):
            date = fields[1].split()[0]
            name = fields[2]
            key = (date,name)
            if key not in d:
                d[key] = [0,0]
            a = d[key]
            a[0] += 1               # occurrence count
            a[1] += int(fields[-1]) # conversion to integer, sum of last field
    
    L = []
    for key in sorted(d):
        (count,Sum) = d[key]
        L.extend(['',key[0],'%s|%s|%s'%(key[1],count,Sum)])
    
    garbage_out = '\n'.join(L)
    
    print(garbage_out)
    
    
    
    
    '''
    
    2012-10-03
    Sami|2|100
    
    2012-10-03
    miami|1|15
    
    2012-10-04
    Salami|1|3
    
    2012-10-04
    miami|4|10
    '''
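    The report above repeats the date line for every name. If each date should appear only once with its names underneath, as in the requested output, a nested dictionary is one way to do it. A sketch under the same assumptions (six fields per record, integer last field); grouped_report is a hypothetical helper name:

```python
from collections import defaultdict


def grouped_report(lines):
    """Build a report with one heading per date and one row per name."""
    # totals[date][name] -> [occurrence count, sum of last field]
    totals = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for line in lines:
        fields = line.strip().split('|')
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        date = fields[1].split()[0]
        entry = totals[date][fields[2]]
        entry[0] += 1
        entry[1] += int(fields[5])

    out = []
    for date in sorted(totals):
        out.append(date)
        for name in sorted(totals[date]):
            count, total = totals[date][name]
            out.append('%s|%d|%d' % (name, count, total))
        out.append('')  # blank line between dates
    return '\n'.join(out).rstrip('\n')


sample = [
    '10280341|2012-10-03 19:11:06.390|Sami|abc|Crossword|70',
    '10280343|2012-10-03 19:15:32.173|Sami|aaa|Sudoku|30',
    '10280355|2012-10-04 19:15:32.173|miami|bbb|Chaircar|15',
    '10280366|2012-10-04 19:15:32.173|miami|bob|Avista|35',
]
print(grouped_report(sample))
```

    Names are sorted with the default string ordering, so capitalized names come before lowercase ones within a date.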
    [code]Code tags[/code] are essential for python code and Makefiles!
