Page 1 of 2 12 Last
  • Jump to page:
    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0

    Data processing script


    I am rather new to programming and am currently trying to write a script to help me process some data. The data file is a little lengthy, so I have uploaded it to my Dropbox as a text document here. What I am trying to do is have the script search for the labels "surface 101", "surface 105", etc. and then store the data located on the very next line so that I can eventually graph it. I first made a working script that reads one entry, but now I want to expand it so that it continues reading all of the data. The first script looks like this:
    Code:
    def reports(output):
            with open(output) as f:
                   
             #r is a variable used to help find the point where the results start.
             #two lines after 'cell 1' is the energy and intensity values
    
             r = str(' surface  101')
    
             c = 0 #initializes the c variable used in the while loop
             #The following while loop skips each line until it comes across the line defined in r
    
             while c != r:
                     line = f.readline()
                     c = line.strip()
             
             line = f.readline() #reads the energy, intensity and SEM
             tally = []
             SEM = []
             numbers = line.strip().split() 
             tally.append(numbers[0])
             SEM.append(numbers[1])
          
             print(tally)
    I have now added a second while loop to hopefully continue running the script until all "surface XX" variables are recorded and it looks like this:
    Code:
    def reports(output):
            with open(output) as f:
                   
             #r is a variable used to help find the point where the results start.
             count = 1
             x = 101
             r = str('surface  101')
    
             c = 0 #initializes the c variable used in the while loop
             #The following while loop skips each line until it comes across the line defined in r
             while count < 11:
                
                while c != r:
                     line = f.readline()
                     c = line.strip()
                
                count += 1
                x += 4
                line = f.readline() #reads the energy and SEM
                tally = []
                SEM = []
                numbers = line.strip().split() 
                tally.append(numbers[0])
                SEM.append(numbers[1])
             
    
             print(tally)
             print(SEM)

    I have two questions regarding this second piece of code.
    1. the r variable is currently defined as
    Code:
    r = str('surface  101')
    I believe I need to set it so that the "101" becomes whatever x is but I am not sure how to reference the variable x

    2. Does it look like I am on the right track for what I am trying to accomplish? I hope this all makes sense. I am so new to programming it is difficult to explain what I am trying to do. Thanks!
  2. #2
  3. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    Just as I continue to work on this I think I MIGHT have figured out how to reference the x variable inside of my r variable. When I run the script however I do come up with the following error:
    Code:
    >>> from report_test import reports
    >>> reports('test.txt')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "report_test.py", line 22, in reports
        tally.append(numbers[0])
    IndexError: list index out of range
    This is my new updated code:
    Code:
    def reports(output):
            with open(output) as f:
                   
             #r is a variable used to help find the point where the results start.
             count = 1
             x = 101
             r = str('surface  %d' % x)
    
             c = 0 #initializes the c variable used in the while loop
             #The following while loop skips each line until it comes across the line defined in r
             while count < 11:
                
                    while c != r:
                        line = f.readline()
                        c = line.strip()
                
                    
                    line = f.readline() #reads the energy and SEM
                    tally = []
                    SEM = []
                    numbers = line.strip().split() 
                    tally.append(numbers[0])
                    SEM.append(numbers[1])
                    count += 1
                    x += 4
             
    
             print(tally)
             print(SEM)
    Thanks!
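One possible reading of the IndexError above, assuming the line that follows a data line in the input file is blank (the file itself is not shown in the thread): because `r` is never updated, `c` still equals `r` on the second pass, the inner loop is skipped, and the next `readline()` result splits into an empty list. The sketch below reproduces that with made-up lines and adds a guard; `parse_pair` is a hypothetical helper name.

```python
def parse_pair(line):
    """Return the first two fields of a data line, or None if it has fewer than two."""
    numbers = line.strip().split()
    if len(numbers) < 2:
        # A blank line splits into [], so numbers[0] would raise IndexError.
        return None
    return numbers[0], numbers[1]

print(parse_pair('   2.25607E-04 0.0123\n'))  # hypothetical data line
print(parse_pair('\n'))                        # blank line
```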
  4. #3
  5. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,931
    Rep Power
    481
    Sorry, where's your input file please?

    Usually gawk is a better language to use for this sort of task. Python will work. Because the string is already a string,

    r = str('surface %d' % x)

    is effectively the same as

    r = 'surface %d'%x
    [code]Code tags[/code] are essential for python code and Makefiles!
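To illustrate the point above: the `%` operator already yields a string, so the `str()` wrapper adds nothing. A minimal sketch, assuming the surface numbers start at 101 and step by 4 as in the posted script (the two-space separator is copied from that code; `surface_label` is a hypothetical helper name):

```python
def surface_label(x):
    # '%d' substitutes the integer x; the result is already a str.
    return 'surface  %d' % x

# Ten labels: 101, 105, ..., 137
labels = [surface_label(x) for x in range(101, 101 + 4 * 10, 4)]
print(labels[0])
print(labels[-1])
```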
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    Thanks for your reply. I have no idea what happened to the input file link. Would it have been removed because I don't have enough posts yet? Gawk would be better but the end goal of this is to graph all the data which I believe python will be better for. I have this section working now with your help and this is what it looks like:
    Code:
    def reports(output):
            with open(output) as f:
                   
             
             count = 1                  #this defines the number of tallies in the output file
             x = 101                    #this is the first surface number
             r = str('surface  %d' % x) #r is a variable used to help find the point where the results start.
    
             c = 0 #initializes the c variable used in the while loop
             #The following while loop skips each line until it comes across the line defined in r
             tally = []
             SEM = []
             while count < 11:
                
                    while c != r:
                        line = f.readline()
                        c = line.strip()
                
                    
                    line = f.readline() #reads the energy and SEM
                    numbers = line.strip().split() 
                    tally.append(numbers[0])
                    SEM.append(numbers[1])
                    count += 1
                    x += 4
                    r = str('surface  %d' %x)
             
             print(tally)
             print(SEM)
    I THINK with what I have here, building the rest of the data should be fairly straightforward, and I have a good example to build off for graphing it, but there's a good chance I'll come running back here soon for more help :P Thanks again for your quick help!
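The loop above assumes every label is present. If one is missing, `readline()` returns `''` at end of file and `while c != r` never exits. A sketch of a variant that stops cleanly, assuming each `surface  NNN` line is immediately followed by a line holding the value and its SEM (the sample text and the name `read_tallies` are made up for illustration):

```python
import io

def read_tallies(f, first=101, step=4, n=10):
    """Collect (value, SEM) strings for surfaces first, first+step, ..."""
    tally, sem = [], []
    for x in range(first, first + step * n, step):
        label = 'surface  %d' % x
        for line in f:                  # iteration ends at EOF instead of spinning forever
            if line.strip() == label:
                break
        else:                           # runs only if the loop never hit 'break'
            raise ValueError('label not found: %r' % label)
        numbers = next(f).split()       # the line right after the label
        tally.append(numbers[0])
        sem.append(numbers[1])
    return tally, sem

sample = io.StringIO(
    "header\n surface  101\n 2.0E-04 0.01\njunk\n surface  105\n 1.5E-04 0.02\n"
)
print(read_tallies(sample, n=2))
```

The `for ... else` form keeps the "label missing" case explicit rather than letting the script hang.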
  8. #5
  9. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,931
    Rep Power
    481
    post dot your dot link dot cleverly dot com

    I usually write data files and then plot them with gnuplot. Many steps, many processes, nice graphs.
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    OK this is the last issue before I start graphing the data. Here is what I have:
    Code:
    def reports(output):
            with open(output) as f:
                   
             
             count = 1                  #this defines the number of tallies in the output file
             x = 101                    #this is the first surface number
             r = str('surface  %d' % x) #r is a variable used to help find the point where the results start.
             d = 0
             c = 0 #initializes the c variable used in the while loop
             p = 0
             y = 0
             #The following while loop skips each line until it comes across the line defined in r
             Flux = []
             SEM = []
             Depth = []
             Change = []
             while count < 11:
                
                    while c != r:
                        line = f.readline()
                        c = line.strip()
                
                    
                    line = f.readline() #reads the energy and SEM
                    numbers = line.strip().split() 
                    Flux.append(numbers[0])
                    SEM.append(numbers[1])
                    Depth.append(d)
                    p = ((Flux[0]-Flux[%s]) / Flux[0]) * 100 %y 
                    Change.append(p)
                    d += 4
                    count += 1
                    x += 4
                    r = str('surface  %d' %x)
             
          
             print(SEM)
             print(Depth)
    On this section:
    Code:
    p = ((Flux[0]-Flux[%s]) / Flux[0]) * 100 %y
    I am trying to subtract each Flux entry in turn from Flux[0], with the index starting at 0 and increasing by 1 on each pass. What I have listed here does not work.
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    Here is the input file:

    https://dl [dot] dropbox [dot] com/u/14334980/test.txt
  14. #8
  15. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,931
    Rep Power
    481

    Warning! Alchemist converts gold to lead.


    Code:
    p = ((Flux[0]-Flux[-1]) / Flux[0]) * 100
    
    
    LIST[-1] # is the object at greatest index.
    
    Flux[y]  # would work, if you also update y
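To make the two indexing options above concrete, a small sketch with made-up numbers: `Flux[-1]` is always the most recently appended element, while `Flux[y]` refers to the same element only if `y` is kept in step with the list.

```python
Flux = []
y = 0
for value in [2.5, 2.0, 1.6]:   # hypothetical values, already numeric
    Flux.append(value)
    assert Flux[-1] == Flux[y]  # same element, as long as y tracks the last index
    y += 1

print(Flux[-1], Flux[len(Flux) - 1])
```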
    Solution with gnuplot and gawk.
    File named /tmp/SURF
    Code:
    gnuplot> plot "<gawk 'a{print;a=0}($1==\"surface\")&&(2==NF){a=1}' /tmp/SURF" u ($0):1 w l
    Last edited by b49P23TIvg; January 7th, 2013 at 10:18 PM.
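For readers unfamiliar with gawk, the program in the one-liner above sets a flag when a line has exactly two fields and the first is `surface`, then prints the following line and clears the flag. A rough Python rendering of that logic, with made-up sample lines (`lines_after_surface` is a hypothetical name, and the gnuplot plotting step is not reproduced):

```python
def lines_after_surface(lines):
    """Yield each line that directly follows a two-field 'surface N' line."""
    grab = False
    for line in lines:
        fields = line.split()
        if grab:
            yield line.rstrip('\n')
            grab = False
        # Same test as ($1=="surface")&&(2==NF) in the gawk program.
        if len(fields) == 2 and fields[0] == 'surface':
            grab = True

sample = [' surface  101\n', ' 2.0E-04 0.01\n', 'other text\n',
          ' surface  105\n', ' 1.5E-04 0.02\n']   # made-up input
print(list(lines_after_surface(sample)))
```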
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    So this is output using the -1 method:

    Code:
    >>> from report_test import reports
    >>> reports('o')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "report_test.py", line 29, in reports
        p = ((Flux[0]-Flux[-1]) / Flux[0]) * 100 
    TypeError: unsupported operand type(s) for -: 'str' and 'str'
    So it looks like it doesn't like me subtracting a string from a string right? Can I convert these to integers or is there another syntax for subtracting strings?
  18. #10
  19. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,931
    Rep Power
    481
    oops. My programs work better if I test them first.

    You'll need float(string)

    Code:
                p = '%g'%(100*((float(Flux[0])-float(Flux[-1])) / float(Flux[0])))
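A sketch of the same calculation with the conversion done once, when the values are read, so that later arithmetic needs no further `float()` calls. The sample strings are made up, and `percent_change` is a hypothetical helper name.

```python
def percent_change(first, current):
    """Percent reduction of current relative to first."""
    return 100.0 * (first - current) / first

raw = ['2.0E-04', '1.5E-04', '1.0E-04']   # strings, as split() returns them
flux = [float(s) for s in raw]            # float() understands the E notation
change = [percent_change(flux[0], v) for v in flux]
print(['%g' % c for c in change])
```

Keeping the lists numeric also means they can be handed to a plotting routine later without another conversion pass.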
  20. #11
  21. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    Originally Posted by b49P23TIvg
    oops. My programs work better if I test them first.

    You'll need float(string)

    Code:
                p = '%g'%(100*((float(Flux[0])-float(Flux[-1])) / float(Flux[0])))
    Thanks it works!

    So what does the '%g'% do? I understand adding the floats but not the other one.
  22. #12
  23. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,931
    Rep Power
    481
    Converts the number back to a string.

    new string format

    old string format

    Must be a better description in the tutorial.
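As a short illustration of the answer above: `'%g' % value` turns a number into a string, using six significant digits by default and switching to exponent notation for very small or very large magnitudes. The values below are arbitrary.

```python
value = 18.6588234
old_style = '%g' % value              # printf-style ("old") formatting
new_style = '{:g}'.format(value)      # str.format ("new") formatting
print(old_style, new_style)
print('%g' % 0.0000123)
print('%g' % 100.0)
```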

    Code:
       NB. www.jsoftware.com  j session
    
       NB. find best fit  y = a exp(b x)
    
       NB. A are the data
       A=:0".'2.25607E-04    2.51727E-04    2.04769E-04    1.60019E-04    1.22105E-04    9.10189E-05    6.63603E-05    4.74191E-05    3.32884E-05    2.30198E-05 '
    
       A=: }. A  NB. behead the vector to remove the bad point.
    
       [COEF=:(^. %. i.@:# ^/ 0 1"_)A  NB. find coefficients of linear fit to log data
    _8.17403 _0.301005
    
       fit=: ^@:(COEF&p.)  NB. verb fit
    
       # A  NB. tally A
    9
       i. 9  NB. integers
    0 1 2 3 4 5 6 7 8
    
       fit i. # A
    0.00028188 0.000208612 0.000154388 0.000114259 8.456e_5 6.25807e_5 4.63144e_5 3.4276e_5 2.53668e_5
    
    
       <.0.5+100*(%~ (- fit@:i.@:#)) A  NB. percent error of the fit
    _12 _2 4 6 7 6 2 _3 _10
    Last edited by b49P23TIvg; January 7th, 2013 at 10:15 PM.
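For readers who do not use J, a rough Python counterpart of the session above: drop the first point, fit a straight line to the natural log of the data against the indices 0..8, and exponentiate. This is a plain least-squares sketch written for illustration, not a line-by-line translation of the J code; the variable names are chosen here.

```python
import math

A = [2.25607E-04, 2.51727E-04, 2.04769E-04, 1.60019E-04, 1.22105E-04,
     9.10189E-05, 6.63603E-05, 4.74191E-05, 3.32884E-05, 2.30198E-05]
A = A[1:]                                  # drop the first (bad) point

x = list(range(len(A)))
y = [math.log(v) for v in A]
x_mean = sum(x) / len(x)
y_mean = sum(y) / len(y)

# Ordinary least squares for y = intercept + slope * x
slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
intercept = y_mean - slope * x_mean

fit = [math.exp(intercept + slope * xi) for xi in x]
print('%g %g' % (intercept, slope))
```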
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    Here is my latest trouble. I am trying to get the program to write a report of all of my data. Here is what it looks like right now:
    Code:
    def reports(output):
            with open(output) as f:
                   
             
             count = 1                  #this defines the number of tallies in the output file
             x = 101                    #this is the first surface number
             r = str('surface  %d' % x) #r is a variable used to help find the point where the results start.
             d = 0 #initializes depth variable
             c = 0 #initializes the c variable used in the while loop
             p = 0 #initializes variable for calculating percent change
             #The following while loop skips each line until it comes across the line defined in r
             Flux = []
             SEM = []
             Depth = []
             Change = []
             while count < 10:
                
                    while c != r:
                        line = f.readline()
                        c = line.strip()
                
                    
                    line = f.readline() #reads the energy and SEM
                    numbers = line.strip().split() 
                    Flux.append(numbers[0])
                    SEM.append(numbers[1])
                    Depth.append(d)
                    p = '%g'%(100*((float(Flux[0])-float(Flux[-1])) / float(Flux[0]))) #calculates percent change
                    Change.append(p)
                    d += 4
                    count += 1
                    x += 4
                    r = str('surface  %d' %x)
                    
                    
                    
             table = open('Report.txt', 'w')
             table.write('================================================================ \n')
             table.write('MCNPX Simulation For XXXXX                                       \n')
             table.write('================================================================ \n')
             table.write('================================================================ \n')	 
             table.write('Depth          Flux          %Reduction          SEM             \n')
             table.write('================================================================ \n')
             a=0
             while a < 100: #this value is equal to the number of tallies
                    table.write(str(Depth[a].ljust(20)) + '     ' +
                             str(Flux[a].ljust(20)) + '     ' +
                             str(Change[a].ljust(20)) + '     ' +
                             str(SEM[a]) + '\n')
                    a = a + 1
    
             print('Report created')
             table.close()
    This is what happens when I run it:

    Code:
    >>> from report_test import reports
    >>> reports('o')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "report_test.py", line 49, in reports
        float(SEM[a]) + '\n')
    AttributeError: 'int' object has no attribute 'ljust'
    I have tried adding on the .ljust to the SEM variable as well but the error is exactly the same with that. The section I added that is causing the error is this:

    Code:
    table = open('Report.txt', 'w')
             table.write('================================================================ \n')
             table.write('MCNPX Simulation For XXXXX                                       \n')
             table.write('================================================================ \n')
             table.write('================================================================ \n')	 
             table.write('Depth          Flux          %Reduction          SEM             \n')
             table.write('================================================================ \n')
             a=0
             while a < 100: #this value is equal to the number of tallies
                    table.write(str(Depth[a].ljust(20)) + '     ' +
                             str(Flux[a].ljust(20)) + '     ' +
                             str(Change[a].ljust(20)) + '     ' +
                             str(SEM[a]) + '\n')
                    a = a + 1
    
             print('Report created')
             table.close()
  26. #14
  27. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,931
    Rep Power
    481
    Try this program. Mostly I only moved some right parentheses to the left a bit. Your report headings should include dimensions. % is the only one you show.
    Code:
    def reports(output):
        with open(output) as f:
            count = 1                  #this defines the number of tallies in the output file
            x = 101                    #this is the first surface number
            r = str('surface  %d' % x) #r is a variable used to help find the point where the results start.
            d = 0 #initializes depth variable
            c = 0 #initializes the c variable used in the while loop
            p = 0 #initializes variable for calculating percent change
            #The following while loop skips each line until it comes across the line defined in r
            Flux = []
            SEM = []
            Depth = []
            Change = []
            while count < 10:
                   while c != r:
                       line = f.readline()
                       c = line.strip()
                   line = f.readline() #reads the energy and SEM
                   numbers = line.strip().split()
                   Flux.append(numbers[0])
                   SEM.append(numbers[1])
                   Depth.append(d)
                   p = '%g'%(100*((float(Flux[0])-float(Flux[-1])) / float(Flux[0]))) #calculates percent change
                   Change.append(p)
                   d += 4
                   count += 1
                   x += 4
                   r = str('surface  %d' %x)
            table = open('Report.txt', 'w')
            table.write('================================================================ \n')
            table.write('MCNPX Simulation For XXXXX                                       \n')
            table.write('================================================================ \n')
            table.write('================================================================ \n')
            table.write('Depth          Flux          %Reduction          SEM             \n')
            table.write('================================================================ \n')
            for (D,F,C,S,) in zip(Depth,Flux,Change,SEM,):    ####### zip is roughly a matrix transposition.
                table.write(str(D).ljust(20) + '     ' +
                            str(F).ljust(20) + '     ' +
                            str(C).ljust(20) + '     ' +
                            str(S) + '\n')
    
            print('Report created')
            table.close()
    Last edited by b49P23TIvg; January 8th, 2013 at 06:03 PM.
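On the earlier AttributeError: in `str(Depth[a].ljust(20))` the `.ljust` is applied to `Depth[a]`, an int, before `str()` is called, and ints have no `ljust` method. Moving the parenthesis gives `str(Depth[a]).ljust(20)`. A small sketch with made-up rows, showing `zip` pairing up the columns and a format specification as an alternative to `ljust`:

```python
Depth = [0, 4]                       # ints
Flux = ['2.0E-04', '1.5E-04']        # strings from the file (hypothetical values)

rows = []
for d, f in zip(Depth, Flux):        # zip yields (0, '2.0E-04'), then (4, '1.5E-04')
    with_ljust = str(d).ljust(20) + f
    with_format = '{:<20}{}'.format(d, f)   # '<20' left-aligns in 20 columns
    assert with_ljust == with_format
    rows.append(with_ljust)

print(rows)
```

Because `zip` stops at the shortest list, the loop also avoids the hard-coded row count that the `while a < 100` version relied on.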
  28. #15
  29. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    25
    Rep Power
    0
    Originally Posted by b49P23TIvg
    Try this program. Mostly I only moved some right parentheses to the left a bit. Your report headings should include dimensions. % is the only one you show.
    Well that works nicely. I really appreciate your help. You're making my life too easy! So did the error have something to do with the variables being strings when it wanted them to be integers? I like your approach of putting the data in a matrix and using a for loop as opposed to a while loop. It certainly makes the code more robust. I do know I need to put units on the report; I was going to research whether there is another way to go about it. For example, the units of flux are neutrons/(cm^2*s), which looks a little messy. I was going to see if there is a way to output the report as a .pdf file rather than a .txt and somehow use special characters like subscripts and superscripts. I have no idea if that is possible at this point, or if it would involve more coding than I want to do. The next step is getting it to graph, and then I can easily modify this to create a couple of different types of reports I need.