#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    2
    Rep Power
    0

    Usecols function


    Hi,
    I am very new to Python, so I am having problems with even the basic functions. I believe this one comes from numpy. My problem is:

    I have a txt file with heterogeneous column types: the first column is a string, the second an integer, the third a string, the fourth a float, and so on. When I use the genfromtxt function with the usecols=0, 2 option, for example to see only the string columns, it doesn't work: the first column appears as it is in the file, but the third column shows up as whitespace. I tried setting dtype to str, but that didn't work either, and now I don't know what to do.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2013
    Location
    Usually Japan when not on contract
    Posts
    240
    Rep Power
    12
    Can you provide a sample of the text file or, if the data is sensitive, a mock-up of how it looks?

    Anyway, your problem has nothing to do with numpy or anything -- you just need to get data out of the text file and into a data structure within Python so you can start doing things with the data.

    Here is a sample snippet that might give you some ideas:

    Let's say we have a sample file with columns like this
    Code:
    joe   47   $216
    john 137  $4624
    Now we want to split it into a list of tuples so we can do things with it within our program:
    python Code:
    with open('sample.txt', 'r') as f:
        v = [tuple(line.split()) for line in f]
    print(v)

    Produces:
    Code:
    [('joe', '47', '$216'), ('john', '137', '$4624')]
    If I wanted to make a dict of names to integer values from the second column I could change it to this:
    python Code:
    with open('sample.txt', 'r') as f:
        v = [tuple(line.split()) for line in f]
    print(v)
    d = {x[0]: int(x[1]) for x in v}
    print(d)

    to produce:
    Code:
    [('joe', '47', '$216'), ('john', '137', '$4624')]
    {'joe': 47, 'john': 137}
    If this doesn't make sense to you, just think carefully over what is happening in each step. If you don't know what the different things are that you see above, please read the Python docs before going any further. All your basic questions can be answered there much more quickly than on a forum.
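
The list-comprehension approach in the post above can be extended to keep only selected columns, which is what the original question was after. The following is a minimal sketch, not the poster's code: the helper name `read_columns` and the sample values are hypothetical, and `io.StringIO` stands in for a real file so the snippet runs on its own.

```python
import io

# Hypothetical sample data laid out as described in the question:
# string, integer, string, float. The values themselves are made up.
SAMPLE = """\
joe   47  red   1.5
john 137  blue  2.25
"""


def read_columns(lines, wanted):
    """Return one tuple per non-blank line, keeping only the wanted column indexes."""
    rows = []
    for line in lines:
        fields = line.split()
        if not fields:  # skip blank lines
            continue
        rows.append(tuple(fields[i] for i in wanted))
    return rows


# io.StringIO behaves like an open text file: iterating over it yields lines.
with io.StringIO(SAMPLE) as f:
    names_and_colors = read_columns(f, (0, 2))

print(names_and_colors)  # [('joe', 'red'), ('john', 'blue')]
```

With a real file, `open('sample.txt')` would take the place of `io.StringIO(SAMPLE)`. The selected fields stay as strings; numeric columns would still need `int()` or `float()` applied, as in the dict example above.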
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2013
    Posts
    2
    Rep Power
    0
    Thank you for the detailed explanation. I think my syntax was wrong somewhere, because I tried this
    a = np.genfromtxt('this.txt', usecols=(0, 3), dtype=str)
    and it all worked fine. I was also trying to learn how to extract lines, and your post helped me a lot. I am reading the Python, numpy and scipy documentation and trying the functions out in basic programs so I can learn. Before posting, I did try to make sense of what the documentation says about these functions. I think I have to practice a lot to understand which function and option may help in a given situation.

    Again, thank you for your time and all your help.
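
For readers who want to try the `genfromtxt` call from the post above without a file on disk, here is a small self-contained sketch. It assumes a recent NumPy on Python 3. The sample data is invented, and `io.StringIO` is used in place of `'this.txt'`, which works because `genfromtxt` accepts file-like objects. The column indexes are passed to `usecols` as a tuple.

```python
import io

import numpy as np

# Made-up data: string, integer, string, float columns separated by whitespace.
SAMPLE = """\
joe   47  red   1.5
john 137  blue  2.25
"""

# Select columns 0 and 2 (the two string columns) and read them as text.
cols = np.genfromtxt(io.StringIO(SAMPLE), usecols=(0, 2), dtype=str)

print(cols.shape)
print(cols)
```

Each row of `cols` holds the two selected fields as strings. Reading the numeric columns would need a different `dtype`, or a separate call with its own `usecols`.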
