#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Feb 2010
    Posts
    12
    Rep Power
    0

    Regex in my PHP script?


    Hello, I am studying PHP and I came across some regex functions I don't understand. Could somebody please explain, in simple words and in detail, what these regexes do?
    After the script gets a directory name from the $_REQUEST['f'] global variable, it replaces/checks it with these regexes. Why? Here they are:
    // sanitise input
    $f=preg_replace('#^/*|/(/)|/$#','\1',$_REQUEST['f']);
    if(preg_match('#(^|/)\.\./#',$f))exit;

    And in another place it does this:
    $ps=explode('/',preg_replace('#/[^/]*$#','','/'.$f));

    Note: in my case $_REQUEST['f'] = '/dir1/', but I need to understand what the regexes check and replace in this directory name. Maybe something against hackers?

    Thanks
  2. #2
  3. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,696
    Rep Power
    6351
    Please color your code using the [ PHP ] tags (the white PHP page above the edit box will do it for you), it makes it much simpler to read.

    Now, as for regular expressions, they're a completely different language than PHP, which is why they have their own forum. It's not recommended you attempt to learn them until you're ready.

    These particular expressions:
    PHP Code:
    $f=preg_replace('#^/*|/(/)|/$#','\1',$_REQUEST['f']); 
    Replaces any leading slashes with nothing, a single trailing slash with nothing, and each pair of slashes with a single slash. This is confusingly written, but valid.
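
    To try this out, here is a minimal sketch, assuming PHP 7 or later. The function name sanitise_path and the sample inputs are made up for illustration; only the pattern and the '\1' replacement come from the script in question.

    ```php
    <?php
    // Hypothetical helper wrapping the sanitising regex from the thread.
    function sanitise_path(string $raw): string
    {
        // ^/*   every slash at the very start (may match nothing at all)
        // /(/)  a pair of slashes; the second one is captured as group 1
        // /$    a single slash at the very end
        // '\1' is group 1, which is empty unless the middle alternative matched.
        return preg_replace('#^/*|/(/)|/$#', '\1', $raw);
    }

    // Sample inputs are illustrative only.
    foreach (['/dir1/', '//docs//mic', 'docs/', 'zingo'] as $raw) {
        printf("%-14s => %s\n", var_export($raw, true), var_export(sanitise_path($raw), true));
    }
    ```

    With the inputs above, '/dir1/' comes back as 'dir1' and '//docs//mic' as 'docs/mic', while 'zingo' is returned unchanged.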

    PHP Code:
    if(preg_match('#(^|/)\.\./#',$f))exit; 
    Returns 1 (which PHP treats as true) if the path starts with "../" or contains "/../" anywhere, where ".." means "up one directory". This is, as you suspected, an anti-hacking tool.
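
    A small sketch of that check, assuming PHP 7 or later. The function name looks_like_traversal and the sample paths are hypothetical; the pattern is the one from the script.

    ```php
    <?php
    // Hypothetical wrapper around the traversal check from the thread.
    function looks_like_traversal(string $path): bool
    {
        // (^|/)  start of the string, or a slash
        // \.\./  two literal dots followed by a slash
        // preg_match() returns 1 on a match and 0 when nothing matches.
        return preg_match('#(^|/)\.\./#', $path) === 1;
    }

    // Sample paths are illustrative only.
    $samples = ['../docs', 'docs/../net', 'docs/net', 'docs..x/net', '..', 'docs/..'];
    foreach ($samples as $path) {
        printf("%-12s %s\n", $path, looks_like_traversal($path) ? 'rejected' : 'accepted');
    }
    ```

    Note that the pattern requires a slash after the two dots, so a bare ".." or a path ending in "/.." is not caught by it. Whether that matters depends on what the rest of the script does with $f, which is not shown here.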

    PHP Code:
    $ps=explode('/',preg_replace('#/[^/]*$#','','/'.$f)); 
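    The preg_replace removes the last bit of a path, usually the filename. The explode then splits a string in the format path/to/something on the slashes into an array(0=>path, 1=>to, 2=>something). Because a slash is prepended to $f before the call, the array that actually comes out starts with an empty element.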
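
    The same statement as a runnable sketch, assuming PHP 7 or later. The function name parent_segments and the sample paths are hypothetical.

    ```php
    <?php
    // Hypothetical helper reproducing the third statement from the thread.
    function parent_segments(string $f): array
    {
        // '/' . $f guarantees there is at least one slash in the subject.
        // #/[^/]*$# then removes the final slash and everything after it.
        $parent = preg_replace('#/[^/]*$#', '', '/' . $f);

        // Split what is left on slashes.
        return explode('/', $parent);
    }

    print_r(parent_segments('path/to/something')); // sample input, illustrative only
    print_r(parent_segments('dir1'));              // single segment: nothing is left
    ```

    For 'path/to/something' this gives an empty string followed by 'path' and 'to'. For a single segment such as 'dir1' the whole string is removed, and explode on an empty string returns an array holding one empty string.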

    -Dan
    HEY! YOU! Read the New User Guide and Forum Rules

    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin

    "The greatest tragedy of this changing society is that people who never knew what it was like before will simply assume that this is the way things are supposed to be." -2600 Magazine, Fall 2002

    Think we're being rude? Maybe you asked a bad question or you're a Help Vampire. Trying to argue intelligently? Please read this.
  4. #3
  5. No Profile Picture
    Registered User

    Dan, really, thank you. You are great and kind.


    Thanks Dan, your answer helped me so much, but I need a more in-depth explanation of these regular expressions, so I decided to ask more specific questions. I hope you know the answers.
    For example, I will divide each regular expression into small chunks, and let's give some examples for it:

    $f=preg_replace('#^/*|/(/)|/$#','\1',$_REQUEST['f']);
    Let's start.
    The regular expression here is: #^/*|/(/)|/$#
    ^/* = match zero or more forward slashes at the start of the string.
    | = or
    /(/) = or match two forward slashes in a row, like this: // (I am not sure)
    | = or
    /$ = or match a forward slash at the end of the string.

    Examples of matching strings:
    1) zingo "starts with zero slashes" "first rule"
    2) /zingo "starts with one slash" "first rule"
    3) ///zingo "starts with three slashes" "first rule"
    4) zin//go "has two slashes inside it, maybe, not sure" "second rule"
    5) zingo/ "any string that ends with one slash" "last rule"

    * I don't understand the second rule "/(/)". What does it mean?
    * What I don't understand at all is why he/she would replace the matching string with the string "\1". What does this string mean?
    * My conclusion: as I understand it, he/she wants to catch any string that starts with, contains, or ends with slashes and replace the match with this mysterious string "\1".

    I believe that in programming there is no such thing as 99% understanding (it is 100% or nothing); that's why I always get stuck on strange source code.

    I have more questions, but I prefer to keep them until I get a quality answer for this.

    Thank you, Dan.
  6. #4
  7. Sarcky
    The second pattern does indeed match two slashes, and captures the second slash into a capture group (important later).

    '\1' means "the first capture group." The first set of parentheses that appears in the pattern corresponds to the first capture group. In this example, if the second pattern (double slashes) is what's matched, the value of '\1' is a single slash, so two slashes are replaced by one. If the first or third patterns (starting/ending slash) match, '\1' is empty, so the beginning or ending slashes are replaced by nothing.

    This pattern is particularly confusing and relies on an odd behavior. Specifically, it relies on '\1' not being set when the first or third alternative matches, and on PHP substituting an empty string for an unset '\1' instead of throwing an error.
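
    One way to watch this happen is to swap the '\1' replacement string for a callback and record what each match captured. This is a sketch assuming PHP 7 or later; the function name sanitise_verbose, the $log array and the sample input are made up for illustration.

    ```php
    <?php
    // Hypothetical helper: same pattern, but the replacement is computed in a
    // callback so that each match and its capture can be recorded.
    function sanitise_verbose(string $raw, array &$log): string
    {
        return preg_replace_callback(
            '#^/*|/(/)|/$#',
            function (array $m) use (&$log): string {
                // $m[0] is the whole match; $m[1] is only present when /(/) matched.
                $group1 = $m[1] ?? '';
                $log[] = [$m[0], $group1];
                return $group1; // same effect as the '\1' replacement string
            },
            $raw
        );
    }

    $log = [];
    $clean = sanitise_verbose('/a//b/', $log); // sample input, illustrative only
    echo $clean, "\n";
    print_r($log);
    ```

    For '/a//b/' the log holds three matches: the leading slash with an empty capture, the double slash with '/' captured, and the trailing slash with an empty capture. The cleaned string is 'a/b'.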

    -Dan
  8. #5
  9. No Profile Picture
    Registered User

    Hi maniacdan, thank you :)


    So what I understand is this. For example:
    preg_replace ( $pattern , $replacement , $subject )
    The $replacement string will have a value only when two things happen: one of the alternatives in $pattern matches, and that alternative contains a capture group, like (/) in our case. Then the replacement will be the first capture group. Otherwise, if the matching alternative has no capture group, the $replacement will be an empty string ''.

    Now I understand 99%.

    Really, thanks for helping me.

    There is another thing I don't understand.

    Why would he/she want to search for double slashes and replace them with only one slash? Is that something against hackers, or maybe for a mistyped path? What do you think the real purpose is?
    And what will happen if the double slashes are at the start or the end of the $subject string, like //docs/ or /docs// or even both, //docs//?
    As I understand it, this function will replace any double slashes with one slash, even at the start or at the end of the $subject. Am I right?
    A string like //docs//mic//hhh will become /docs/mic/hhh, am I right?
    Maybe my last question on this: what will happen if we have more than two slashes, odd and even counts, like this for example: ///////docs/////mic/////////////////hhh ?

    ------------------------------------------------------
    Now I don't understand this:
    if(preg_match('#(^|/)\.\./#',$f))exit;

    What I don't understand is why he/she would put ^|/ in a capture group (^|/).
    What I understand here is that he/she wants to find one of two possible patterns inside the subject string, $f in our case:
    1) $f starts immediately with two dots and a slash, like ../docs/net
    2) $f has this pattern anywhere inside it: /../
    like docs/../net or even docs/net/../ or /../docs/net/../
    Am I right?
    This is obviously against hackers who want to escape the restricted scope declared inside some webroot directory and snoop inside directories outside it. Am I right?
    They try to pass ../ to our script to fool us?
    -----------------------------------------------------
    Now, what I don't understand at all is this statement too:

    $ps=explode('/',preg_replace('#/[^/]*$#','','/'.$f));

    I know he/she wants to create an array from a given directory path.
    OK, but why don't they just explode $f directly?
    I don't understand what they did before exploding it.

    $pattern here is #/[^/]*$#
    $replacement here is '' (nothing).
    And they prepended a slash to the $subject.
    That is really hard to understand, I think.

    As far as I understand, he/she wants to find any word inside the $subject that has a slash and doesn't end with slashes?

    Can you explain this in more depth for me, please?

    Thank you, really.
  10. #6
  11. Sarcky
    Replacing double slashes is so that later on, when the code explodes on slash, there are no empty array elements. It's not an anti-hacker thing, it's preparing the string for a future piece of code.
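
    Runs of three or more slashes are worth testing, because the pattern consumes slashes two at a time in a single pass. The sketch below, assuming PHP 7 or later, compares the original pattern with an alternative that squeezes whole runs. The alternative, both function names and the sample input are not from the original script; they are only an illustration.

    ```php
    <?php
    // The pattern from the thread, applied once.
    function sanitise_once(string $raw): string
    {
        return preg_replace('#^/*|/(/)|/$#', '\1', $raw);
    }

    // Hypothetical alternative (not from the original script): squeeze every
    // run of slashes down to one, then trim slashes from both ends.
    function sanitise_runs(string $raw): string
    {
        return trim(preg_replace('#/+#', '/', $raw), '/');
    }

    $raw = '//docs///mic//'; // sample input, illustrative only
    echo sanitise_once($raw), "\n";
    echo sanitise_runs($raw), "\n";
    print_r(explode('/', sanitise_once($raw)));
    ```

    With this input the original pattern leaves docs//mic/ behind, which explodes into docs, an empty string, mic and another empty string, while the alternative returns docs/mic.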

    what i dont understand is why he/she would put ^|/ in a capture group
    It's IN a capture group, true, but in this case the parens are ensuring that the OR operator only operates on those two alternatives, not "^ or {rest of the pattern}". As it stands, the pattern is "starts with ../, or has /../ anywhere in it"
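
    The effect of the parentheses can be checked by removing them. A sketch, assuming PHP 7 or later; the ungrouped and non-capturing variants and the sample paths are hypothetical and only there for comparison.

    ```php
    <?php
    $grouped    = '#(^|/)\.\./#';   // the pattern from the thread
    $ungrouped  = '#^|/\.\./#';     // parentheses removed: "start of string" OR "/../"
    $noncapture = '#(?:^|/)\.\./#'; // hypothetical variant: grouping without capturing

    // Sample paths are illustrative only.
    foreach (['docs/net', '../docs', 'docs/../net'] as $path) {
        printf(
            "%-12s grouped=%d ungrouped=%d noncapture=%d\n",
            $path,
            preg_match($grouped, $path),
            preg_match($ungrouped, $path),
            preg_match($noncapture, $path)
        );
    }
    ```

    The ungrouped pattern matches every string, because the bare ^ alternative succeeds at the start of any subject. The non-capturing form (?:...) behaves like the original here and makes it clearer that the group exists only to limit the alternation.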

    but why they dont just explode $f directly ?
    The preg_replace cuts the last bit (usually the filename) off the string. It's really unnecessary; it would be more efficient to explode first, then unset the last element, but the code works.
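
    A sketch of the explode-first approach, assuming PHP 7 or later. It uses array_pop to drop the last element, which is one possible reading of "unset the last element"; the function names and sample inputs are hypothetical.

    ```php
    <?php
    // The statement from the thread, wrapped for comparison.
    function parent_segments_regex(string $f): array
    {
        return explode('/', preg_replace('#/[^/]*$#', '', '/' . $f));
    }

    // Hypothetical rewrite without a regex: explode first, then drop the last element.
    function parent_segments_pop(string $f): array
    {
        $parts = explode('/', '/' . $f); // leading empty element kept, as in the original
        array_pop($parts);               // remove the last segment
        return $parts;
    }

    // Sample inputs are illustrative only.
    foreach (['path/to/something', 'dir1', ''] as $f) {
        var_dump(parent_segments_regex($f) === parent_segments_pop($f));
    }
    ```

    For these three inputs both versions return the same array.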

    I can't speculate as to what they were thinking or what the goal is when all I have are regular expressions, all I can tell you is what they do.

    Also, in the future, do try to write better. You're not going to get too much help with 5 exclamation points and no capital letters. I'm only still here because your first post was so short.

    -Dan
  12. #7
  13. No Profile Picture
    Registered User

    Dan, I don't know how to thank you virtually... "THANK YOU"


    And I think you are 100% right: I am writing too many exclamation points and asking too many questions, like a child. From now on I will ask very short questions, and I will try to write well-formatted text, even though English isn't my native language and I use it only on the net.

    PHP Code:
    $ps=explode('/',preg_replace('#/[^/]*$#','','/'.$f)); 
    Yes, you are right: this regex removes the file name, and the rest is exploded into the array.

    Maniacdan, I took this regex from code that is supposed to show the structure of a directory outside the webroot. I have Windows XP and XAMPP with the Apache web server, "a regular testing server".
    The author of the book I am reading told us to configure Apache or PHP to be able to access this directory: c:/xampp/uploaded_files
    but at the same time not allow web browsers to access this directory; only my specific PHP script should be allowed to access it. I tried countless examples from the web/Google with no success. Do you know how I should do that, please?
    I have opened a thread on this issue in the Apache forum; please help if you can.
    Thank you, maniacdan.
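
    On the PHP side, one possible approach (not necessarily the one the book intends) is to resolve the requested sub-directory with realpath and refuse anything that ends up outside the base directory. PHP reads files by filesystem path, so the base does not have to sit under Apache's document root. The sketch below assumes PHP 7.1 or later; the function name resolve_inside and the temporary demo directory are hypothetical and only stand in for c:/xampp/uploaded_files.

    ```php
    <?php
    // Hypothetical helper: resolve $relative inside $base and return null when
    // the result does not exist or lies outside $base.
    function resolve_inside(string $base, string $relative): ?string
    {
        $baseReal = realpath($base);
        $target   = realpath($base . DIRECTORY_SEPARATOR . $relative);
        if ($baseReal === false || $target === false) {
            return null; // base or target does not exist
        }
        $prefix = rtrim($baseReal, DIRECTORY_SEPARATOR) . DIRECTORY_SEPARATOR;
        if ($target !== $baseReal && strncmp($target, $prefix, strlen($prefix)) !== 0) {
            return null; // resolved path escaped the base directory
        }
        return $target;
    }

    // Demo with a throwaway directory standing in for the real upload folder.
    $base = sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'uploaded_files_demo_' . getmypid();
    $sub  = $base . DIRECTORY_SEPARATOR . 'dir1';
    is_dir($sub) || mkdir($sub, 0777, true);

    var_dump(resolve_inside($base, 'dir1') !== null); // inside the base
    var_dump(resolve_inside($base, '..'));            // escapes the base
    ```

    Unlike the regex check, this compares fully resolved paths, so it does not depend on how the ".." segments are written in the request.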
