#1
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2012
    Posts
    4
    Rep Power
    0

    Noob questions on reading textual data into lists/database etc.


    Hello, I am a 40+ year old noob who wants to learn to program using Python.

    I have watched online tutorials (e.g. Bucky & jase) but they only go so far. I have tried looking at other tutorials but need to know what I'm looking for.

    I learnt C & Pascal at college over 25 years ago but have done nothing since then, as I have not needed to.

    Now I would like to learn to write small programs with Python that read in some text files, sort them and output some differences etc. I want to do specifically this because I need a goal for my learning, one that applies to me rather than displaying a picture of my cat etc. Without this goal (which, if I can get it to work, would help me with my job) I would not have anything to aim for and would probably idle away and play Xbox instead.

    I also have a very bad memory and unless I actually do something I struggle to remember it - hence why I need to have a practical use for my learning.

    I could try VB for this but would prefer to learn Python - again that would be useful at work.

    So let me explain my goal.

    What I want to do is read in a text file - maybe up to 5 MB in size.
    Then read in a couple of other text files, compare the contents of them all and output some lists based on those comparisons. I don't want to program a lunar lander just yet.

    As an example of the sort of data I want to read in - the first file would contain multiple entries (could be hundreds of them) like this
    (after some bumf at the top of the file):

    .SN54120D :1 ;D_DUAL PULSE SYNCHRONIZERS
    SO16 (reflow)
    *STM U
    *SYM SYNC_120
    *EXT 4 2 3 5 1 7 6
    *EXT 12 13 14 11 15 9 10
    *DFN SN54120D
    ~thm_power_diss (0.1785)
    @Value (SN54120D)
    ~Manufacturer (TEXAS)
    SYNC_120 (ANSI)
    4.1!I 2.1!I 3.1!I 5.1!I 1.1!I 7.0!N 6.0!N
    SYNC_120 (ANSI)
    12.1!I 13.1!I 14.1!I 11.1!I 15.1!I 9.0!N 10.0!N
    /GND 8.0!G
    /VCC 16.0!P
    Where the info that I am primarily interested in is:
    .SN54120D :1 ;D_DUAL PULSE SYNCHRONIZERS
    This is found at the start of the above section.
    This is a part.

    Each new part/section starts with a full stop.
    e.g. .SN54120D
    Although there could be several similar lines such as:
    .SN54120D
    .SN54120B
    .SN54120C
    at the top of a part. (this is the part name (minus the full stop))

    I want to ignore the
    :1 ;D_DUAL PULSE SYNCHRONIZERS
    bit that may or may not be there.

    And then the line I am looking for is immediately after them.
    SO16 (reflow)
    although there may or may not be the (reflow) in brackets.

    This is the component name.

    Then I want to extract from the data the symbol, which will be the line that starts:

    SYNC_120 (ANSI)
    This as you can see may have more than 1 entry, although I am only interested in identifying 1 of them if they are the same, but need to know if there are others if they are different.

    So having read all that in I want a "xxxx" that contains a part name, the component it uses and the symbols it uses.

    All the other bits I am not interested in and all the other lines may or may not be in each section.

    I.E.
    *STM U
    *SYM SYNC_120
    *EXT 4 2 3 5 1 7 6
    *EXT 12 13 14 11 15 9 10
    *DFN SN54120D
    ~thm_power_diss (0.1785)
    @Value (SN54120D)
    ~Manufacturer (TEXAS)
    I want to ignore.
    Also the lines at the end:
    /GND 8.0!G
    /VCC 16.0!P
    So I can see I am going to have to do some IF checks for lines starting with *~@/ characters and not enter them in the xxxx.

    What I want to know to start me off is: what am I going to read that data into? Is it a list, an array, a database or what?
    Can I get any pointers on this please?

    Once I have that data read in, what I then want to be able to do is read in another text file with similar information in it, compare both sets of data and output a list of parts that use each symbol or component, and also a list of symbols and components from the 2nd set of data that are not used in the first.
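The comparison described here could be sketched with dictionaries and sets. This is only a rough illustration: it assumes each file has already been parsed into a dictionary mapping part name to symbol name, and the sample data, the `first`/`second` names and the `parts_by_symbol` helper are all made up for the example.

```python
from collections import defaultdict

# Hypothetical parsed data: part name -> symbol name (invented examples).
first = {"SN54120D": "SYNC_120", "SN54120B": "SYNC_120", "SN7400N": "NAND_2"}
second = {"SN7404N": "INV_1", "SN7400D": "NAND_2"}


def parts_by_symbol(parts):
    """Invert a part -> symbol mapping into symbol -> sorted list of parts."""
    grouped = defaultdict(list)
    for part, symbol in parts.items():
        grouped[symbol].append(part)
    return {symbol: sorted(names) for symbol, names in grouped.items()}


usage = parts_by_symbol(first)

# Symbols that appear in the second set but are not used in the first.
unused = sorted(set(second.values()) - set(first.values()))

for symbol in sorted(usage):
    print(symbol, "->", ", ".join(usage[symbol]))
print("Only in second set:", unused)
```

The same idea would apply to components: build a part -> component dictionary and reuse the helper.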

    I am pretty sure that this is basic file reading in, storing, sorting/parsing, comparing and outputting.

    I can learn to wrap it in fancy graphics later, do more etc but I need to know where to look, what to look at to start?

    Then I can learn to do more with it all.

    As much as it would be nice, I don't want anyone to write the code for me - just guide me for where to look and what to look at to achieve my goal - your assistance and advice would be appreciated.

    Thank you - Matt.
  2. #2
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Aug 2011
    Posts
    139
    Rep Power
    3

    Goodness


    Well. That's quite a description, and I have to admit I couldn't force myself to soak it all in. However, I may be able to help you with some general things that you could use.

    To read/write data from files, you can try:
    Code:
    NewData = open('output_filename' , 'w') # a different file from the two inputs
    Data1 = open('filename' , 'r')
    Data1 = Data1.readlines()

    Data2 = open('other_filename' , 'r')
    Data2 = Data2.readlines()

    for Line1 in Data1 : # compare each line in Data1 against lines in Data2
      Line1 = Line1.strip() # get rid of tricky things that can lead to a mismatch
      for Line2 in Data2 :
        Line2 = Line2.strip()
        if Line1 and Line1 in Line2 : # skip blank lines, which would match everything
          # do something phenomenal of your choosing here, like build
          # a String variable
          String = Line1 + '\n'
          # and then if you want to write some results:
          NewData.write(String)

    NewData.close()

    Last edited by WynnDeezl; September 25th, 2012 at 04:58 PM.
  4. #3
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2012
    Posts
    4
    Rep Power
    0
    Thank you, but what am I reading Data1 into?

    This is the xxxx I referred to above.

    What sort of structure is it? What's it called?
    I need to know this so I can go away and read up on it.

    Will the
    Data1 = Data1.readlines()
    read it into a similar format one line at a time? (which I can then compare each line and then ignore some lines and copy interesting lines into a new one).

    Cheers,
    Matt
  6. #4
  7. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,709
    Rep Power
    480
    Yes, quite a description. And yes, most people will agree that you have to use something regularly else you'll tend to forget. That's why people retell stories, I suppose.

    Is a "full stop" a line that begins with a period?

    There's a string method called startswith:
    Code:
    >>> '.full stop'.startswith('skrunk')
    False
    >>> '.full stop'.startswith('.')
    True
    >>>
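startswith also accepts a tuple of prefixes, which may suit the check for lines beginning with *, ~, @ or / mentioned in the first post. A minimal sketch; `IGNORED_PREFIXES` and `is_ignored` are names invented for illustration, and the sample lines are copied from the first post:

```python
# Prefixes of lines the first post wants to ignore (hypothetical constant name).
IGNORED_PREFIXES = ("*", "~", "@", "/")


def is_ignored(line):
    """Return True if the line starts with one of the ignored prefixes."""
    return line.startswith(IGNORED_PREFIXES)


sample = [
    ".SN54120D :1 ;D_DUAL PULSE SYNCHRONIZERS",
    "SO16 (reflow)",
    "*STM U",
    "~Manufacturer (TEXAS)",
    "@Value (SN54120D)",
    "/GND 8.0!G",
]

# Keep only the lines that do not start with an ignored prefix.
kept = [line for line in sample if not is_ignored(line)]
print(kept)
```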
    [code]Code tags[/code] are essential for python code and Makefiles!
  8. #5
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2012
    Posts
    4
    Rep Power
    0
    Is a "full stop" a line that begins with a period?
    yes that's the word....

    I thought I'd better explain fully what I am aiming for so that the response would be aimed in the right direction, rather than me feeding in "oh then I want to do this" bits that may change the original response. If you get my drift.

    It's my initial spec for the program lol
  10. #6
  11. Contributing User
    Devshed Demi-God (4500 - 4999 posts)

    Join Date
    Aug 2011
    Posts
    4,709
    Rep Power
    480
    Data1 = open('filename' , 'r')

    On this line Data1 is, roughly speaking in C terms, a FILE*.
    It's an object with a bunch of methods, including readlines.


    Data1 = Data1.readlines()

    This statement invokes the readlines() method without an argument, which causes it to read the entire file into a list. Each list item is a line of the file. Furthermore, the lines are in order! That list is assigned to Data1 which replaces the prior value.
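A small illustration of that behaviour, using io.StringIO as a stand-in for a real file so the sketch runs on its own (the sample text is borrowed from the first post; the variable names are just for the example). Each list item keeps its trailing newline, which is why strip() is often applied afterwards.

```python
import io

# io.StringIO behaves like a text file opened for reading.
fake_file = io.StringIO(".SN54120D\nSO16 (reflow)\n*STM U\n")

lines = fake_file.readlines()                # a list of str, one item per line
stripped = [line.strip() for line in lines]  # remove the trailing newlines

print(type(lines).__name__)
print(lines)
print(stripped)
```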


    Reusing the variable name to mean 2 things is a poor programming practice. Leaving the file open is also a poor idea.


    GVR (Guido van Rossum, benevolent Python dictator for life and creator of Python) would, I'm certain, recommend that you open the file using a with statement, which closes the file for you even if an error occurs.
    Code:
    with open('file.name','r') as instream:
        lines_of_input_file = instream.readlines()
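Once the lines are in a list, one possible shape for the "xxxx" asked about earlier is a list of dictionaries, one per section. The following is a rough sketch, not a definitive parser. It assumes that pin lines always contain "!", that any other unprefixed line after the component line names a symbol, and that part, component and symbol names contain no spaces. The function name `parse_parts` and the dictionary keys are made up for illustration.

```python
IGNORED_PREFIXES = ("*", "~", "@", "/")


def parse_parts(lines):
    """Group lines into records holding part names, component and symbols.

    Assumptions (guesses about the file format, not confirmed):
    pin lines always contain "!", and any other unprefixed line after
    the component line names a symbol.
    """
    records = []
    current = None
    in_header = False  # True while reading consecutive ".PART" lines
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith("."):
            if not in_header:
                # First "." line of a new section: start a new record.
                current = {"parts": [], "component": None, "symbols": set()}
                records.append(current)
                in_header = True
            # Keep the name only: drop the "." and anything after a space.
            fields = line[1:].split()
            if fields:
                current["parts"].append(fields[0])
            continue
        in_header = False
        if current is None or line.startswith(IGNORED_PREFIXES) or "!" in line:
            continue  # preamble, ignored attribute line, or pin line
        name = line.split()[0]  # drop "(reflow)" / "(ANSI)" if present
        if current["component"] is None:
            current["component"] = name
        else:
            current["symbols"].add(name)  # a set keeps one copy of repeats
    return records


sample = """header bumf
.SN54120D :1 ;D_DUAL PULSE SYNCHRONIZERS
SO16 (reflow)
*STM U
*SYM SYNC_120
~Manufacturer (TEXAS)
SYNC_120 (ANSI)
4.1!I 2.1!I 3.1!I 5.1!I 1.1!I 7.0!N 6.0!N
SYNC_120 (ANSI)
12.1!I 13.1!I 14.1!I 11.1!I 15.1!I 9.0!N 10.0!N
/GND 8.0!G
/VCC 16.0!P
""".splitlines()

records = parse_parts(sample)
print(records)
```

Storing the symbols in a set means a repeated symbol shows up once, while a section that uses different symbols ends up with more than one entry.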
  12. #7
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2012
    Posts
    4
    Rep Power
    0
    Cheers, so it's a list that it is read into.

    I'll look into handling lists etc.

    This is likely to take me a long time, when I get a chance in between everything else etc lol. If only my local college ran a course, but they don't do them anymore.
