Thread: os problem

    #1
  1. onCsdfeu
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jul 2003
    Location
    Canada
    Posts
    100
    Rep Power
    12

    os problem


    The following code is supposed to count the files on a computer:
    Code:
    import os,sys
    
    def count(filepath):
        global counter
        print filepath #temporary
        print os.listdir(filepath) #temporary
        for file in os.listdir(filepath):
            if os.path.isfile(file) or os.path.islink(file):
                counter += 1
            elif os.path.isdir(file):
                count(file)
            else:
                print "Erreur."
                sys.exit(1)
    
    counter = 0
    root = 'C:/' #place your own system root here
    count(root)
    print 'There are ',counter,' files on this computer.'
    For some reason, my C:\ is considered neither a file nor a directory (and obviously not a symlink, since I'm on XP). Why could that be?
    Time is the greatest of teachers ; sadly, it kills all of its students.
    - Hector Berlioz
  2. #2
  3. No Profile Picture
    Hi, I'm Calvin
    Devshed Newbie (0 - 499 posts)

    Join Date
    Sep 2003
    Location
    LosAngeles, SanDiego, Houston
    Posts
    50
    Rep Power
    11

    hmmm


    i looked up the specifications for os.path.isdir() and all those functions...

    you have to specify the path, not just the filename

    so, what you'd want to do is os.path.isfile('%s%s'%(filepath,file))

    'C:/' should work just fine as a directory specification...

    so yeah... that answers why you're unable to do that. because you had a sys.exit(1) in there, you never got any useful messages telling you what was going on

    the way i figured out that the 'C:/' part was working while the isdir and isfile etc functions weren't... was that i replaced "sys.exit(1)" with

    print 'nope',file

    so, when i saw a file printed after 'nope' i immediately saw that the file was not determined to be a valid file or directory... that led me to digging around the os.path module, and that caused me to learn that the path must be specified

    however, i'm still running into problems in trying to count all the files on my computer. i'm still getting 'nope + file' printouts to screen and i can't figure that part out; i changed the recursive call to count(filepath+file) and that allowed me to get into subfolders of C:/, but it stopped working after that =/ ah well... i'll maybe play around with it more later, but for now i'll give it back to you
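
    A minimal sketch of the point made in this post, written for modern Python 3 rather than the Python 2 of the thread: os.listdir() returns bare names, and os.path.isfile() / os.path.isdir() resolve a bare name against the current working directory, not against the directory that was listed. The helper name classify and the demo file names are made up for illustration, and os.path.join() is swapped in here for the string formatting used in the thread.

```python
import os
import tempfile

def classify(directory, name):
    """Return 'file', 'dir', or 'other' for an entry of `directory`."""
    # os.listdir() returns names only, so rebuild the full path first.
    full_path = os.path.join(directory, name)
    if os.path.isfile(full_path):
        return "file"
    if os.path.isdir(full_path):
        return "dir"
    return "other"

# Small demonstration in a throwaway directory (names are made up).
demo_root = tempfile.mkdtemp()
os.mkdir(os.path.join(demo_root, "subdir"))
with open(os.path.join(demo_root, "notes.txt"), "w") as handle:
    handle.write("hello")

for entry in sorted(os.listdir(demo_root)):
    # The bare name is looked up relative to the current working directory,
    # so this check normally fails unless cwd happens to be demo_root.
    bare_ok = os.path.isfile(entry) or os.path.isdir(entry)
    print(entry, "| bare name recognised:", bare_ok, "| joined:", classify(demo_root, entry))
```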
  4. #3
  5. No Profile Picture
    Hi, I'm Calvin

    oh yeah


    i also got this error after doing the whole '%s%s'%(filepath,file) thing...

    WindowsError: [Errno 5] Access is denied: 'C:/System Volume Information/*.*'


    i guess that means there is a protected system folder that you won't be able to get into? well, just be aware of it...
  6. #4
  7. No Profile Picture
    Hi, I'm Calvin

    haha


    sorry for spamming this post... but i figured out why i wasn't able to get beyond the subfolders...

    instead of this:
    os.path.isfile('%s%s'%(filepath,file))

    do this:
    os.path.isfile('%s/%s'%(filepath,file))
    ...the added '/' tells the computer that you're dealing with adding on directory names

    then, for the recursive call, you'll want to do this:
    count('%s/%s/'%(filepath,file))

    yeah... that should do it.

    i still don't know what to do about the 'C:/System Volume Information' directory though =/
  8. #5
  9. No Profile Picture
    Hi, I'm Calvin

    ok, last one from me i swear


    soo sorry for all these msgs

    in my last post, i said to do this:
    count('%s/%s/'%(filepath,file))

    that's not correct. do this instead:
    count('%s/%s'%(filepath,file))

    sorry!

    i did that and i counted up to 9354 files on my computer before i ran into 'C:/System Volume Information' and was forced to stop with the error =/
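
    To see why the three attempts in the posts above behave differently, here is a small comparison sketch in Python 3. The helper names join_naive and join_slash are hypothetical, and ntpath.join() (the Windows flavour of os.path.join(), importable on any platform) is not something the thread used; it is shown only as the usual alternative to hand-built separators.

```python
import ntpath  # Windows path rules, importable on any platform

def join_naive(parent, name):
    # First attempt from the thread: no separator at all.
    return '%s%s' % (parent, name)

def join_slash(parent, name):
    # Corrected attempt from the thread: always insert a '/'.
    return '%s/%s' % (parent, name)

root = 'C:/'
for joiner in (join_naive, join_slash, ntpath.join):
    first = joiner(root, 'Windows')       # one level below the root
    second = joiner(first, 'system32')    # two levels below the root
    print(joiner.__name__, '->', first, '|', second)
```

    The naive version only works one level deep because 'C:/' happens to end in a separator. The '/' version produces a doubled slash at the root, which Windows tolerates; the join function adds a separator only when one is missing.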
  10. #6
  11. onCsdfeu
    Thanks a lot for the input ; I really missed the insertion of /.

    As for that problem of yours, that's easily fixed. You can catch that error with a try/except block (os.access() can also be used to test permissions beforehand):
    Code:
    import os
    
    def count(filepath):
        global counter,access_error
        for file in os.listdir(filepath):
            try:
                if os.path.isfile(filepath+'/'+file) or os.path.islink(filepath+'/'+file):
                    counter += 1
                elif os.path.isdir(filepath+'/'+file):
                    count(filepath+'/'+file)
            except OSError:
                #raised by os.listdir() in the recursive call when a directory cannot be read
                access_error = True
    
    counter = 0
    access_error = False
    root = 'C:/' #place your own system root here
    count(root)
    if access_error:
        print 'You have access to ',counter,' files on this computer.'
        print 'Could not access some of the files.'
    else:
        print 'There are ',counter,' files on this computer.'
    EDIT: does anybody see a problem in this code? Last time I ran Ad-Aware, it scanned 300k+ files, and I had about 138k according to my program.
    Last edited by SolarBear; October 22nd, 2003 at 04:42 PM.
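
    A sketch of the same idea in Python 3, without globals and with the error handling placed around os.listdir() itself, which is the call that fails on a protected directory. The function name count_files and its return shape are assumptions made for this example, not anything from the thread.

```python
import os

def count_files(path, skipped=None):
    """Return (file_count, skipped_dirs) for the tree rooted at `path`."""
    if skipped is None:
        skipped = []
    try:
        names = os.listdir(path)
    except OSError:
        # Covers permission errors as well as paths that no longer exist.
        skipped.append(path)
        return 0, skipped
    total = 0
    for name in names:
        full_path = os.path.join(path, name)
        # Same order as the thread: links are counted as files, not followed.
        if os.path.isfile(full_path) or os.path.islink(full_path):
            total += 1
        elif os.path.isdir(full_path):
            sub_total, _ = count_files(full_path, skipped)
            total += sub_total
    return total, skipped
```

    Called as total, skipped = count_files('C:/'), this would report both the count and the list of directories that could not be read, so the "could not access some of the files" message can name them.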
  12. #7
  13. No Profile Picture
    Hi, I'm Calvin

    :)


    cool, thanks for the pointer about access...

    as for the file count, maybe Ad-Aware is able to go into the access protected directories? Also... doesn't ad-aware also scan the inside of zip files? i don't think the code ya got there reaches inside zip files... i could be wrong of course.
  14. #8
  15. onCsdfeu
    If Python can't access protected directories, I don't believe Ad-Aware could ; however, you could be right about ZIP, RAR, etc. files : it does scan inside them.

    However, I wonder if there's any way of getting the system's root folder without specifying it in the file...
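
    One possible answer to the question above, sketched in Python 3: resolving the bare path separator with os.path.abspath() gives the root of the drive that holds the current working directory. The function name filesystem_root is made up for this example, and on Windows the result covers only the current drive, not every drive on the machine.

```python
import os

def filesystem_root():
    """Return the root of the drive holding the current working directory."""
    # os.sep is '\\' on Windows and '/' elsewhere; abspath() resolves it
    # against the current drive, giving something like 'C:\\' or '/'.
    return os.path.abspath(os.sep)

print(filesystem_root())
```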
  16. #9
  17. No Profile Picture
    Junior Member
    Devshed Newbie (0 - 499 posts)

    Join Date
    Oct 2003
    Location
    Tucson AZ
    Posts
    29
    Rep Power
    0
    instead of using listdir and isfile I chose to use path.walk

    I was able to use path.walk to list all files, including those in system folders (I also have XP). I'm not sure why there'd be a difference between the two approaches. But if we're just after a count of files....

    Code:
    import os
    
    def walkfunc(ext,dir,files):
        global TotalFiles
        for file in files:
            TotalFiles += 1
    
    TotalFiles = 0    
    os.path.walk('c:\\',walkfunc,'.*')
    print TotalFiles
    Last edited by irishtek; October 23rd, 2003 at 11:05 AM.
  18. #10
  19. No Profile Picture
    Junior Member
    also with a minor modification to the last program you can count the number of contents in a zip file....

    Code:
    import os
    import zipfile
    
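    def walkfunc(ext,dir,files):
        global TotalFiles,ZipFiles
        for file in files:
            if file.lower().endswith(".zip"):
                zipfilepath = os.path.join(dir,file)
                try:
                    z = zipfile.ZipFile(zipfilepath)
                    ZipFiles += len(z.namelist())
                    z.close()
                except (IOError, zipfile.BadZipfile):
                    pass #unreadable, or not really a zip archive
            TotalFiles += 1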
    
    TotalFiles = 0
    ZipFiles = 0
    os.path.walk('c:\\',walkfunc,'.*')
    print TotalFiles
    print ZipFiles
    Last edited by irishtek; October 23rd, 2003 at 11:05 AM.
  20. #11
  21. Hello World :)
    Devshed Frequenter (2500 - 2999 posts)

    Join Date
    Mar 2003
    Location
    Hull, UK
    Posts
    2,537
    Rep Power
    69
    I've written a small, hopefully efficient, self-contained function based on Irish's first program, using os.walk() instead of os.path.walk(). Could someone else give it a test and tell me if it works properly? I'm getting different results from the two programs.

    My guess is that os.walk() does the opposite of os.path.walk() when dealing with protected directories and just ignores them, but I can't find anything to confirm or deny this.

    Code:
    import os
    
    def countree(path, type = 2, count = 0):
    	for object in os.walk(path):
    		count = count + len(object[type])
    	return count
    Note: you can pass 1 instead of 2 in the type option to count directories rather than files (index 0 is the directory path itself, so it isn't a useful value here).

    Oh, Irish, your zip count isn't exactly accurate, since the list returned by ZipFile.namelist() includes directories as well as files!

    Mark.
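
    On the differing results: as far as the documentation goes, os.path.walk() hands its callback every name returned by os.listdir(), subdirectories included, so counting those names counts files plus directories, while os.walk() separates the two. os.walk() also skips unreadable directories silently unless an onerror callback is given. The Python 3 sketch below (os.path.walk() no longer exists there) reports all three numbers; the function name count_tree is an assumption made for this example.

```python
import os

def count_tree(path):
    """Return (files, directories, unreadable) for the tree under `path`."""
    unreadable = []
    files = directories = 0
    # onerror receives the OSError raised when a directory cannot be listed;
    # without it os.walk() skips such directories without reporting them.
    for _dirpath, dirnames, filenames in os.walk(path, onerror=unreadable.append):
        directories += len(dirnames)
        files += len(filenames)
    return files, directories, unreadable
```

    If the two programs in this thread differ by exactly the number of directories, that would point to the names list rather than to protected folders.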
    programming language development: www.netytan.com Hula

  22. #12
  23. No Profile Picture
    Junior Member
    You're right.
    The file count also counts directories... although in the case of viruses, directories themselves could be infected. At any rate, there's an even higher level of complication:

    Files that are compressed as a 'zip', with a series of those zip files then compressed together into another 'zip'.

    The files in the nested zip files don't get counted with this algorithm either. That is probably quickest to handle with a little recursion.
  24. #13
  25. Hello World :)
    OK, I've written a little function which should give you a list of ALL the FILES inside a zip archive, regardless of whether they are inside another zip! You shouldn't have a problem modifying this to include directories or both. Obviously it uses recursion.

    Code:
    #!/usr/bin/env python
    
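    import os, tempfile, zipfile
    
    def rezipe(path, files = None):
        # Default to None rather than [] so that results do not
        # accumulate between separate top-level calls.
        if files is None:
            files = []
        
        zip = zipfile.ZipFile(path)
        
        for name in zip.namelist():
            if name.lower().endswith('.zip'):
                # Write the nested archive to a temporary file, recurse, clean up.
                handle, temp = tempfile.mkstemp('.zip')
                os.write(handle, zip.read(name))
                os.close(handle)
                rezipe(temp, files)
                os.remove(temp)
            elif not name.endswith('/'):
                files.append(name)
        
        zip.close()
        return files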
    Note: The zip archives themselves are not actually counted, but again that's easily fixed!

    Have fun,
    Mark.
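
    For comparison, a Python 3 sketch of the same recursion that reads nested archives into memory instead of writing them to disk. The function name list_zip_files and the "outer.zip!inner.txt" naming of nested entries are choices made for this example, ZipInfo.is_dir() needs Python 3.6 or later, and holding a whole inner archive in memory is only reasonable for archives of modest size.

```python
import io
import zipfile

def list_zip_files(source, prefix=""):
    """Return names of all regular files in an archive, nested zips included."""
    found = []
    with zipfile.ZipFile(source) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries are not files
            if info.filename.lower().endswith(".zip"):
                # Read the inner archive into memory instead of extracting it.
                inner = io.BytesIO(archive.read(info))
                try:
                    found.extend(list_zip_files(inner, prefix + info.filename + "!"))
                except zipfile.BadZipFile:
                    # Named .zip but not a valid archive: treat it as a plain file.
                    found.append(prefix + info.filename)
            else:
                found.append(prefix + info.filename)
    return found
```

    The source argument can be a path or a file-like object, so len(list_zip_files('archive.zip')) gives the count the thread is after.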

