#1
  1. Contributing User

    Parsing the array returned by ftp_rawlist()


    I'm going to use the parsing method at:
    http://beben.lanparty.de/smint/ftp_rekdiranalys.phps
    where the parsing process is:
    PHP Code:
        // A Unix-style permission string at the start of the line marks a Unix listing.
        if (ereg("([-dl])[rwxst-]{9}", substr($dirline, 0, 10))) {
            $systyp = "UNIX";
        }

        if (substr($dirline, 0, 5) == "total") {
            $dirinfo[0] = -1;
        } elseif ($systyp == "Windows_NT") {
            if (ereg("[-0-9]+ *[0-9:]+[PA]?M? +<DIR> {10}(.*)", $dirline, $regs)) {
                $dirinfo[0] = 1;
                $dirinfo[1] = 0;
                $dirinfo[2] = $regs[1];
            } elseif (ereg("[-0-9]+ *[0-9:]+[PA]?M? +([0-9]+) (.*)", $dirline, $regs)) {
                $dirinfo[0] = 0;
                $dirinfo[1] = $regs[1];
                $dirinfo[2] = $regs[2];
            }
        } elseif ($systyp == "UNIX") {
            if (ereg("([-d])[rwxst-]{9}.* ([0-9]*) [a-zA-Z]+ [0-9: ]*[0-9] (.+)", $dirline, $regs)) {
                if ($regs[1] == "d") $dirinfo[0] = 1;
                $dirinfo[1] = $regs[2];
                $dirinfo[2] = $regs[3];
            }
        }

        // "." and ".." are never treated as directories to descend into.
        if (($dirinfo[2] == ".") || ($dirinfo[2] == "..")) $dirinfo[0] = 0;

        // array -> 0 = switch, directory or not
        // array -> 1 = filesize (if dir =0)
        // array -> 2 = filename or dirname
    The problem is that the lines from Windows_NT have the same format as the ones from Linux.
    For example,
    -rwxrwxrwx 1 owner group 2579 Dec 31 2001 bannerETK.gif
    which I think does not match the regular expressions above for Windows_NT.
    If anyone knows the proper code to parse the raw line, please let me know.
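    Since the shape of the line matters more than the reported system type, one option is to test each line against both formats. The following is only a minimal sketch under that assumption: the function name parse_list_line() and both patterns are hypothetical, not taken from the linked script, and it uses preg_match() because ereg() was removed in PHP 7. It returns the same three values as $dirinfo above.

```php
<?php
// Hypothetical helper, not from the thread: parse one ftp_rawlist() line by
// looking at the line itself. Returns array(is_dir, size, name) in the same
// order as $dirinfo, or null when the line matches neither format.
function parse_list_line($dirline)
{
    $dirline = rtrim($dirline);

    // Summary line such as "total 42": flagged with -1 like the original code.
    if (strncmp($dirline, 'total', 5) === 0) {
        return array(-1, 0, '');
    }

    // Unix style: permissions, link count, owner, group, size, date, name.
    $unix = '/^([-dl])[rwxstST-]{9}\s+\d+\s+\S+\s+\S+\s+(\d+)\s+'
          . '\w{3}\s+\d{1,2}\s+[\d:]{4,5}\s+(.+)$/';
    if (preg_match($unix, $dirline, $m)) {
        // "." and ".." are not reported as directories, as in the original.
        $isDir = ($m[1] === 'd' && $m[3] !== '.' && $m[3] !== '..') ? 1 : 0;
        return array($isDir, (int) $m[2], $m[3]);
    }

    // DOS style: date, time, then "<DIR>" or a byte count, then the name.
    $dos = '/^[-0-9]+\s+[0-9:]+[AP]?M?\s+(<DIR>|\d+)\s+(.+)$/';
    if (preg_match($dos, $dirline, $m)) {
        $isDir = ($m[1] === '<DIR>') ? 1 : 0;
        return array($isDir, $isDir ? 0 : (int) $m[1], $m[2]);
    }

    return null;
}

// The example line from the question goes through the Unix branch.
$info = parse_list_line('-rwxrwxrwx 1 owner group 2579 Dec 31 2001 bannerETK.gif');
```

    Symbolic links ("name -> target") are not split here, so for a link the returned name still contains the arrow and the target.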
  2. #2
    Registered User

    You know, posts like these are really annoying. They come up first in Google search, and have no good information on them. In fact, they're SOO old (2003!) that they are probably no longer relevant.

    Complain mode off.

    I've found some decent parsing code on php.net, like this:

    function parse_rawlist3($list) {

        $folders = array();
        $files = array();
        $links = array();
        for ($i = 0; $i < count($list); $i++)
        {
            //----convert tabs to blanks
            $list[$i] = str_replace("\t", " ", $list[$i]);
            //----delete multiple blanks (search for two blanks in a row)
            while (($k = strpos($list[$i], "  ")) !== FALSE)
                $list[$i] = substr($list[$i], 0, $k+1).trim(substr($list[$i], $k));
            //----split link reference from filename where available
            if (($k = strpos($list[$i], " -> ")) !== FALSE)
            {
                $filelink = substr($list[$i], $k+4);
                $list[$i] = substr($list[$i], 0, $k);
            }
            else
                $filelink = "";
            //----parse filename (everything after the last blank)
            $k = strrpos($list[$i], " ");
            $filename = substr($list[$i], $k+1);
            $list[$i] = substr($list[$i], 0, $k);
            //----parse the rest of info
            list ($permissions, $list[$i]) = parsenext($list[$i]);
            list ($number, $list[$i]) = parsenext($list[$i]);
            list ($owner, $list[$i]) = parsenext($list[$i]);
            list ($group, $list[$i]) = parsenext($list[$i]);
            list ($size, $time) = parsenext($list[$i]);
            //----ok, put all this into the related array
            if ($filename != "." && $filename != "..")
            {
                $m = array();
                $m["name"] = $filename;
                $m["link"] = $filelink;
                $m["size"] = $size;
                $m["time"] = $time;
                $m["owner"] = $owner;
                $m["group"] = $group;
                $m["permissions"] = $permissions;
                if (substr($permissions, 0, 1) == "d")
                    $folders[count($folders)] = $m;
                else if (substr($permissions, 0, 1) == "l")
                    $links[count($links)] = $m;
                else
                    $files[count($files)] = $m;
            }
        }
        sort($folders);
        sort($files);
        sort($links);

        //----only the files are returned; $folders and $links are built but not used
        return $files;

    }
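
    parse_rawlist3() calls parsenext(), which is not included in the code above. Judging only from the list(...) = parsenext(...) calls, it appears to split off the first blank-delimited field and return that field together with the rest of the line. A minimal sketch under that assumption (the body is a guess, not the php.net original):

```php
<?php
// Assumed behaviour, inferred from how parse_rawlist3() uses the result:
// return array(first field, remainder of the line).
function parsenext($s)
{
    $s = ltrim($s);
    $k = strpos($s, ' ');
    if ($k === false) {
        // No blank left: the whole string is the last field.
        return array($s, '');
    }
    return array(substr($s, 0, $k), ltrim(substr($s, $k + 1)));
}

// Walk one listing line (filename already stripped) field by field,
// in the same order parse_rawlist3() does.
$line = '-rwxrwxrwx 1 owner group 2579 Dec 31 2001';
list($permissions, $line) = parsenext($line);
list($number, $line)      = parsenext($line);
list($owner, $line)       = parsenext($line);
list($group, $line)       = parsenext($line);
list($size, $time)        = parsenext($line);
```

    After the last call, $size holds the byte count and $time holds whatever remains of the line, which is the date and time (or year) portion.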
    Last edited by hiker; November 3rd, 2010 at 09:57 AM. Reason: Signature Removed
