#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    7
    Rep Power
    0

    Question Python: Using Excel CSV file to read only certain columns and rows


    While I can read csv file instead of reading to whole file how can I print only certain rows and columns?

    Imagine as if this is Excel:

    Code:
    A              B              C                  D                    E
    State  |Heart Disease Rate| Stroke Death Rate | HIV Diagnosis Rate |Teen Birth Rate
    
    Alabama     235.5             54.5                 16.7                 18.01
    
    Alaska      147.9             44.3                  3.2                  N/A    
    
    Arizona     152.5             32.7                 11.9                  N/A    
    
    Arkansas    221.8             57.4                 10.2                  N/A    
    
    California  177.9             42.2                  N/A                  N/A    
    
    Colorado    145.3             39                    8.4                 9.25
    Heres what I have:

    Code:
    import csv
    
    try:
        risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file
    
    except:
        while risk != "riskfactors.csv":  # if the file cant be found if there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
    else:
        with open("riskfactors.csv") as f:
            reader = csv.reader(f, delimiter=' ', quotechar='|')
    
            data = []
            for row in reader:# Number of rows including the death rates 
                for col in (2,4): # The columns I want read   B and D
                    data.append(row)
                    data.append(col)
            for item in data:
                print(item) #print the rows and columns
    I need to only read column B and D with all statistics to read like this:

    Code:
    A              B                D                    
     State  |Heart Disease Rate| HIV Diagnosis Rate |
    
     Alabama       235.5             16.7                
    
      Alaska       147.9             3.2                     
    
      Arizona      152.5             11.9                     
    
      Arkansas     221.8             10.2                    
    
     California    177.9             N/A                     
    
     Colorado      145.3             8.4
    no errors

    Any ideas on how to tackle this? Everything I try isn't working. Any help or advice is much appreciated.
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    138
    Rep Power
    2
    Could you please provide the exact input data you have? In my world, a csv file looks like this:

    State,Heart Disease rate,Stroke Death rate
    Alabama,235.5,54.5
    And the code for only printing the second column would be:

    Code:
    >>> import csv
    >>> with open('a') as f:
    ...     rows = csv.reader(f)
    ...     for row in rows:
    ...             print row[1]  # Only print the column in the row
    ... 
    Heart Disease rate
    235.5
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    7
    Rep Power
    0
    Sorry about that the file is really long and I thought I should condense it but here is the actual data:

    State Heart Disease Death Rate (2007) Stroke Death Rate (2007) HIV Diagnosis Rate (2009) CLABSI-SIR (2010)
    Alabama 235.5 54.5 16.7 N/A
    Alaska 147.9 44.3 3.2 N/A
    Arizona 152.5 32.7 11.9 N/A
    Arkansas 221.8 57.4 10.2 N/A
    California 177.9 42.2 N/A N/A
    Colorado 145.3 39 8.4 0.68
    Connecticut 171 34.2 14.4 0.94
    Delaware 200.2 39.4 N/A 0.59
    District of Columbia 239.4 36.9 N/A N/A
    Florida 162.4 33.6 33 N/A
    Georgia 203 49.7 32.9 N/A
    Hawaii 140.3 39.5 N/A N/A
    Idaho 164.1 43.2 3.2 N/A
    Illinois 192.8 43.9 15.7 0.9
    Indiana 203 45.7 8.3 N/A
    Iowa 174.8 42.1 4.6 N/A
    Kansas 178.7 46 5.7 N/A
    Kentucky 220.9 48 9.1 N/A
    Louisiana 230 50.1 28.8 N/A
    Maine 172.9 40.2 6.1 N/A
    Maryland 202.4 42.7 N/A 1.3
    Massachusetts 165.5 36.5 N/A 0.59
    Michigan 221.5 44.3 10 N/A
    Minnesota 129.8 38.1 8 N/A
    Mississippi 266.5 53 21.3 N/A
    Missouri 214.4 48.2 10 N/A
    Montana 163.2 38.5 N/A N/A
    Nebraska 165.3 43.1 6.3 N/A
    Nevada 200 38.3 15.8 N/A
    New Hampshire 174.9 34.3 4.3 0.57
    New Jersey 191.9 35.8 22.8 0.82
    New Mexico 159.2 39.2 9.2 N/A
    New York 225.1 28.2 29.5 0.99
    North Carolina 191 50.3 19.7 N/A
    North Dakota 164.1 37.3 2.3 N/A
    Ohio 204.8 45.2 11.9 N/A
    Oklahoma 241.6 53.8 10.9 0.5
    Oregon 156.9 43.6 N/A 0.61
    Pennsylvania 199.4 42.9 14.5 0.7
    Rhode Island 203.6 33.5 N/A N/A
    South Carolina 192.9 53.4 19.9 1.16
    South Dakota 159.1 38.7 3.5 N/A
    Tennessee 220.6 53.9 17.2 1.15
    Texas 191.9 49 18.4 N/A
    Utah 152.1 38.9 5 N/A
    Vermont 161.2 37.6 N/A 0.27
    Virginia 182.7 44.5 17.2 0.83
    Washington 167.3 41.3 N/A 0.58
    West Virginia 229.4 48.9 5.1 N/A
    Wisconsin 171.9 42.3 5.9 N/A
    Wyoming 178.3 39.5 3.8 N/A
  6. #4
  7. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    7
    Rep Power
    0

    Talking


    Originally Posted by partoj
    Could you please provide the exact input data you have? In my world, a csv file looks like this:



    And the code for only printing the second column would be:

    Code:
    >>> import csv
    >>> with open('a') as f:
    ...     rows = csv.reader(f)
    ...     for row in rows:
    ...             print row[1]  # Only print the column in the row
    ... 
    Heart Disease rate
    235.5
    Sorry about that the file is really long and I thought I should condense it but here is the actual data. Thank you very much for the reply.
  8. #5
  9. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2013
    Posts
    138
    Rep Power
    2
    Okay, so what you have is a csv file with tab as delimiter. Then you also have some padding with spaces in your third column.

    Code:
    State	Heart Disease Death Rate (2007)	Stroke Death Rate (2007)	HIV Diagnosis Rate (2009)	CLABSI-SIR (2010)
    Alabama	235.5	      54.5	16.7	N/A
    So the program has to be modified like so:

    Code:
    >>> import csv
    >>> with open('a') as f:
    ...     rows = csv.reader(f, delimiter='\t')
    ...     for row in rows:
    ...             print row[1].strip(), row[3].strip()  # Print column 2 and 4, strip surrounding spaces
    ...
    Not tested with the entire file, just with heading and first data row.

IMN logo majestic logo threadwatch logo seochat tools logo