#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    20
    Rep Power
    0

    Help needed in line extraction from file


    Since I didn't get a satisfactory answer, I'm repeating my question.

    I have a large file (input.txt) with values in the following format:

    OG1: or10|1345 or10|387 or10|474 or11|1203 or11|182 or10|2158 or12|637
    OG2: or10|1562 or10|1584 or10|1977 or11|2263 or11|43
    OG3: or12|2400 or12|2401 or13|2697 or13|2698 or16|2 or16|914 or27|1355
    OG4: or10|108 or20|2713 or25|2315 or25|2754 or2|1411

    …………..
    ………

    From this file, I want to find how many times ‘or10’ appears in each OG (along with the corresponding number after the pipe) and write the result to an output file (output.txt). For the sample above, the output would be:

    OG1: or10|1345 or10|387 or10|474 or10|2158
    OG2: or10|1562 or10|1584 or10|1977
    OG3:
    OG4: or10|108

    Anyone? Thanks for your consideration.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    4

    try:




    Is there a way to delete a post you've made?

    Comments on this post

    • b49P23TIvg agrees : Believe me, I've contributed to systems where I wish I knew how to retract.
    Last edited by WynnDeezl; April 5th, 2013 at 04:00 PM.
  4. #3
  5. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,997
    Rep Power
    481
    I've put your data into the file Oh.Gee in the current directory. In j, a dialect of APL (executable Iverson notation), assign the verb f as shown, then compose (&) f with fread. This works and is tested.
    Code:
       f =: (({.~ >:@:i.&' '),>@(;L:1@:(<@((<@({.~>:@:i.&' '));.1~'or10'&E.))));.1
    
       f&fread'Oh.Gee'
    OG1: or10|1345 or10|387 or10|474 or10|2158 
    OG2: or10|1562 or10|1584 or10|1977         
    OG3:                                       
    OG4: or10|108


    [edit]({.~>:@:i.&' '), meaning "take through the first space", is duplicated code which I should have named as a separate verb.[/edit]

    Comments on this post

    • WynnDeezl agrees : You're awesome, dude
    Last edited by b49P23TIvg; April 5th, 2013 at 04:20 PM.
    [code]Code tags[/code] are essential for python code and Makefiles!
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    20
    Rep Power
    0
    Thanks, but it shows:

    File "code.py", line 4, in <module>
    for Line in Lines :
    TypeError: 'builtin_function_or_method' object is not iterable



    Originally Posted by WynnDeezl


    This is untested, but should be close:

    Code:
    Lines=open('input.txt','r').readlines
    Output=open('output.txt','w')
    
    for Line in Lines :
      Dummy=0
      Line=Line.strip().split('|')
      for Cnt in range(len(Line)) :
        if 'or10' in Line[Cnt] :
          Dummy=99
          Output.write('or10|'+Line[Cnt+1]+' ')
      if Dummy==0 : Output.write('\t')
      Output.write('\n')
    
    Output.close()
    Oops, forgot the 'OGx' stuff. I'll work on it and repost.
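The TypeError reported above comes from `readlines` being written without parentheses, so `Lines` is bound to the method itself rather than to a list of lines. Below is a minimal sketch of one way the quoted attempt could be corrected, not the original poster's final code. It also keeps the 'OGx:' label and splits on whitespace rather than on '|'. The helper names `filter_line` and `filter_file` are made up for illustration.

```python
def filter_line(line, prefix="or10|"):
    """Return the 'OGx:' label followed by only the tokens that start with prefix."""
    tokens = line.split()
    if not tokens:
        return ""
    label, rest = tokens[0], tokens[1:]
    kept = [t for t in rest if t.startswith(prefix)]
    return " ".join([label] + kept)


def filter_file(in_path, out_path, prefix="or10|"):
    """Write the filtered version of every non-blank line of in_path to out_path."""
    with open(in_path, "r") as src, open(out_path, "w") as dst:
        # readlines() with parentheses returns a list; without them the
        # name refers to the method object, which cannot be iterated.
        for line in src.readlines():
            if line.strip():
                dst.write(filter_line(line, prefix) + "\n")
```

With the file names from the question, this would be called as `filter_file('input.txt', 'output.txt')`.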
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    4
    Originally Posted by abhijit.bose
    Thanks.. but it shows:

    File "code.py", line 4, in <module>
    for Line in Lines :
    TypeError: 'builtin_function_or_method' object is not iterable
    I was still messing with this code, but I'd use the second post's code. It's way better. Incredible, actually.
    Last edited by WynnDeezl; April 5th, 2013 at 04:08 PM.
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    20
    Rep Power
    0
    Thanks for that information, but I don't get it. I changed the directory and ran your command as:

    jpath '~C:\Users\admin\Desktop\jprogram'
    f =: (({.~ >:@:i.&' '),>@(;L:1@<@((<@({.~>:@:i.&' '));.1~'or10'&E.))));.1
    f&fread'groups.txt'

    but it gives:
    _1 0

    Or maybe I'm missing something?

    Originally Posted by b49P23TIvg
    I've put your data into the file Oh.Gee in the current directory. In j, a dialect of APL (executable Iverson notation), assign the verb f as shown, then compose (&) f with fread. This works and is tested.
    Code:
       f =: (({.~ >:@:i.&' '),>@(;L:1@:(<@((<@({.~>:@:i.&' '));.1~'or10'&E.))));.1
    
       f&fread'Oh.Gee'
    OG1: or10|1345 or10|387 or10|474 or10|2158 
    OG2: or10|1562 or10|1584 or10|1977         
    OG3:                                       
    OG4: or10|108
  12. #7
  13. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,997
    Rep Power
    481
    The file was not found, but you did start j and run the program.
    Code:
       f&:fread 'noSuchFile'
    _1 0
    Where you have 'groups.txt', use the full path name. Character vector constants in j (that's the file name in single quotes) don't interpret backslashes in any special way, so you'd just use

    f&:fread 'c:\some\path\can have spaces\groups.txt'
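For anyone doing the same thing in Python rather than j: ordinary Python string literals do treat backslashes as escape characters, so a Windows path is usually written as a raw string. The sketch below checks that the file exists before reading it, which gives a clearer message than a failed read. The path is the placeholder from this post, and the helper name `read_text_or_explain` is made up for illustration.

```python
from pathlib import Path


def read_text_or_explain(path_str):
    """Return the file's text, or None after printing that it was not found."""
    path = Path(path_str)
    if not path.is_file():
        print("file not found:", path)
        return None
    return path.read_text()


# Raw string literal: the backslashes are kept literally, not treated as escapes.
example = r'c:\some\path\can have spaces\groups.txt'
text = read_text_or_explain(example)
```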

    For expert j advice, and especially for help with j on Windows systems, please write to
    programming@jsoftware.com

    Congratulations!

    Comments on this post

    • abhijit.bose agrees : Thanks
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2012
    Posts
    32
    Rep Power
    3
    Since this was posted in Python programming, here's a solution I came up with in Python:
    Code:
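    with open('txt_test.txt', 'r') as f:
        lines = f.readlines()
    out = open('out_test.txt', 'w')
    
    for i in range(0, len(lines)):
        # trailing space so that an 'or10|...' token at the end of the line is also terminated
        line = lines[i].rstrip('\n') + ' '
        # the label is derived from the line number, so the OGs must be in order, one per line
        if i == 0: out.write('OG%d: ' % (i+1))
        else: out.write('\nOG%d: ' % (i+1))
        for x in range(0, len(line)):
            for y in range(x, len(line)):
                # compare 'or10|' (with the pipe) so that e.g. 'or100|...' is not matched
                if line[x:x+5] == 'or10|' and line[y] == ' ':
                    out.write(line[x:y] + ' ')
                else:
                    continue # no match at x, or no space yet: keep scanning for the space char
                break # token written: move on to the next 'or10'
    out.close()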
    Replace the filenames with your own and it works fine. It could probably be done with a list comprehension too, though that one-liner from b49P23TIvg is still the way to go. I definitely have to learn that language.
    Good luck
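The list-comprehension idea mentioned above might look something like the following. This is only a sketch, using `re.findall` and assuming that the value after the pipe is always a run of digits. It reads the label from the line itself instead of deriving it from the line number, and it prints the per-OG count that the original question asked about. The names `TOKEN` and `extract` are made up for illustration.

```python
import re

# 'or10|' followed by digits, only at the start of a whitespace-separated token
TOKEN = re.compile(r'(?<!\S)or10\|\d+')


def extract(lines):
    """Return (label, matches) pairs, one per line that contains a ':'."""
    return [(line.split(':', 1)[0], TOKEN.findall(line))
            for line in lines if ':' in line]


sample = [
    "OG1: or10|1345 or10|387 or10|474 or11|1203 or11|182 or10|2158 or12|637",
    "OG3: or12|2400 or12|2401 or13|2697 or13|2698 or16|2 or16|914 or27|1355",
    "OG4: or10|108 or20|2713 or25|2315 or25|2754 or2|1411",
]
for label, hits in extract(sample):
    print('%s: %s (count: %d)' % (label, ' '.join(hits), len(hits)))
```

To run it on a real file, pass the open file object (or `f.readlines()`) to `extract` in place of `sample`.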

    Comments on this post

    • abhijit.bose agrees : Thank you
