#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    11
    Rep Power
    0

    Storing newline in a variable for parsing


    I am trying to replace each carriage return and line feed pair with a special character to make dynamically parsing a string a little easier,
    Code:
    "\r\n"
    . I would like to replace them with say, a pipe.
    Code:
    |
    What I have tried is
    Code:
    str_replace("\r\n", "|", $haystack)
    . This worked 100% fine. The issue I am running into is that I am reading my newline variable from an XML file using DOM.

    Code:
    $string  = $attr->nodeValue;
    So when I run
    Code:
    str_replace($string, "|", $haystack);
    My $haystack still looks like this
    Code:
    haystack Line1\r\nhaystack Line2\r\n
    My question then becomes: what do I need to do to my $string variable so that it escapes \r\n properly?

    Notes: I have checked the value of $string using xdebug, it does contain \r\n. So that leads me to assume that the string is just not escaped properly, or something to that degree.
  2. #2
  3. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,908
    Rep Power
    6352
    Does it contain the literal string '\r\n', those four characters?

    If so, note that '\r\n' is 4 characters (two backslashes and two letters) while "\r\n" is 2 characters (a carriage return and a newline).

    You seem to be mixing interpolated strings with literal ones.
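    A quick way to check which of the two a variable really holds is to look at its length and raw bytes instead of relying on how a debugger displays it. A minimal sketch (the variable names are illustrative only, not from the thread):

```php
<?php
// Single quotes: backslash, r, backslash, n are kept as four literal characters.
$literal = '\r\n';
// Double quotes: PHP turns the escape sequences into CR (0x0D) and LF (0x0A).
$interpolated = "\r\n";

echo strlen($literal), ' ', bin2hex($literal), PHP_EOL;           // 4 5c725c6e
echo strlen($interpolated), ' ', bin2hex($interpolated), PHP_EOL; // 2 0d0a
```

    If a value read from a file reports a length of 4 here, it is the literal text and will never match a real CRLF in a haystack.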
    HEY! YOU! Read the New User Guide and Forum Rules

    "They that can give up essential liberty to obtain a little temporary safety deserve neither liberty nor safety." -Benjamin Franklin

    "The greatest tragedy of this changing society is that people who never knew what it was like before will simply assume that this is the way things are supposed to be." -2600 Magazine, Fall 2002

    Think we're being rude? Maybe you asked a bad question or you're a Help Vampire. Trying to argue intelligently? Please read this.
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    11
    Rep Power
    0
    When I read the string \r\n from the XML and store it in $string, it is stored as a literal string '\r\n', which is why

    Code:
     str_replace("\r\n", "|", $haystack)
    works and
    Code:
    <v>\r\n</v>
    $needle = $attr->nodeValue;
    str_replace($needle, "|", $haystack);
    does not.

    I'm trying to figure out how to get the literal '\r\n' stored as a CRLF "\r\n", which, as you correctly stated, is only two characters.
    My question is: what do I need to run, or how do I need to store $string, to escape \r\n as a carriage return, line feed?
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2003
    Posts
    3,621
    Rep Power
    595
    Perhaps I am not understanding the problem, but why can you not just remove the '\r' and leave the newline character? DOM and other parsers know how to handle that.
    There are 10 kinds of people in the world. Those that understand binary and those that don't.
  8. #5
  9. --
    Devshed Expert (3500 - 3999 posts)

    Join Date
    Jul 2012
    Posts
    3,959
    Rep Power
    1014
    Hi,

    If any input contains a literal \r\n, then it's broken. The only way to fix it is to replace the \r\n with a newline sequence.
    The 6 worst sins of security • How to (properly) access a MySQL database with PHP

    Why can't I use certain words like "drop" as part of my Security Question answers?
    There are certain words used by hackers to try to gain access to systems and manipulate data; therefore, the following words are restricted: "select," "delete," "update," "insert," "drop" and "null".
  10. #6
  11. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    11
    Rep Power
    0
    The reason that I cannot drop \r and use DOM to parse is that I am not using DOM to parse a string. I am only using DOM to pull the value
    Code:
    <v>\r\n</v>
    from an external xml file. (The reason for this is because I am receiving this string externally and not all strings that I receive are formatted the same way so I am pulling an end of line setting from xml so that I can dynamically find a textual end of line setting based on the source of where my string came from.)

    I have a string
    Code:
    haystack Line1\r\nhaystack Line2\r\n
    This string is generated by
    Code:
    $haystack =  quoted_printable_decode();
    So I cannot simply remove '\r\n'.

    if any input contains a literal \r\n, then it's broken. The only way to fix it is to replace the \r\n with a newline sequence.
    This is of absolutely no help, because that is exactly what I am trying to do. My issue is the variable I am loading from xml, $needle is stored as a literal. My haystack string contains proper newline sequences, which is why I'm having this issue.

    $needle = $attr->nodeValue; (is being stored as '\r\n')
    $haystack is stored as "line1\r\nline2"

    What do I need to do to $needle to represent it as "\r\n"?
  12. #7
  13. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,908
    Rep Power
    6352
    Originally Posted by choppyfireballs
    When I store the string \r\n from the xml and store it in $string, it is stored as a literal string '\r\n', which is why

    Code:
     str_replace("\r\n", "|", $haystack)
    works
    That's not possible. That's the problem here: what you're saying is happening cannot possibly be happening. As I said, '\r\n' is 4 characters while "\r\n" is 2. You are saying that the 2 characters representing a Windows-style line ending (carriage return plus line feed) are being used to successfully replace the 4 literal characters '\r\n'.

    If you have the literal 4-character string '\r\n' in a variable and wish for it to be an ACTUAL NEWLINE, you must do this:
    PHP Code:
    $var = str_replace('\r\n', "\r\n", $var);
    It's also possible your use of quoted_printable_decode (which is most likely unnecessary) could be turning this string into a third, heretofore unmentioned string of unknown configuration.
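    Applied to the situation described in the thread, that conversion could look like the sketch below. The `<settings>` wrapper and the way the `<v>` element is located are assumptions made so the example runs on its own; the thread only shows `<v>\r\n</v>` and `$attr->nodeValue`.

```php
<?php
// Hypothetical settings document; <v> holds the four literal characters
// backslash, r, backslash, n (XML does not interpret them in any way).
$doc = new DOMDocument();
$doc->loadXML('<settings><v>\r\n</v></settings>');
$needle = $doc->getElementsByTagName('v')->item(0)->nodeValue;

// Turn the literal text into a real carriage return + line feed.
$needle = str_replace('\r\n', "\r\n", $needle);

$haystack = "haystack Line1\r\nhaystack Line2\r\n";
echo str_replace($needle, '|', $haystack), PHP_EOL; // haystack Line1|haystack Line2|
```

    The conversion is applied to the needle, not the haystack, because in this scenario the haystack already contains real line endings.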
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    11
    Rep Power
    0
    That's not possible. That's the problem here: what you're saying is happening cannot possibly be happening. As I said, '\r\n' is 4 characters while "\r\n" is 2. You are saying that the 2 characters representing a windows-style carriage return are being used to successfully replace the 4 literal characters '\r\n'.
    This is the misunderstanding. This is the problem that I have run into: '\r\n' is not the same as "\r\n". I was looking for a way to load \r\n as an escaped string. However, whether or not PHP can do this, I have found a solution that better fits my needs.

    It's also possible your use of quoted_printable_decode (which is most likely unnecessary) could be turning this string into a third, heretofore unmentioned string of unknown configuration.
    The reason for running this, is that sometimes the string that I receive is encoded html. I have to run quoted_printable_decode() because any tags that I receive appear as such.
    Code:
    <br=3E 
    OR
    <style=2E
    Quoted-printable decoding is the easiest way I have found to fix those tags; if you can recommend something else, I'm all ears.

    My original goal was to dynamically tell php when the end of a line occurred, in text, for parsing purposes, so that I can load variables, and my new lines can be anything from | to "\r\n" to <br>, it's all how I receive the string that I am trying to parse. However these values will always be specified. So I tried loading the setting into an xml file so that I could load the file read the setting, then parse the line given the newline variable that was loaded. This xml IS generated automatically, as is the string that I am trying to parse.

    What I have decided to do is similar to an enumeration: in my class, I generate an array keyed by the preselected identifiers, with the actual (interpolated) newline strings as the values.

    What it would look like:

    Code:
    <value>CRLF</value>
    <value>PIPE</value>
    
    $newLineVariable = $attr->nodeValue;
    
    $newLineArray['CRLF'] = "\r\n";
    $newLineArray['PIPE'] = "|";
    Setting this up in my framework allows me to take whatever text a parameter designates as the new line and replace it with static text. This allows for automatic, dynamic parsing of a string.

    Code:
    str_replace($newLineArray[$newLineVariable], "§", $haystack);
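    The lookup-table idea above can be written out as a small self-contained sketch. The function name `normalizeLineEndings`, the constant `LINE_ENDINGS`, and the `BR` key are hypothetical names chosen for illustration; only `CRLF`, `PIPE` and the `§` marker come from the post. Rejecting unknown keys is an added assumption, so that a typo in the XML fails loudly instead of silently replacing nothing.

```php
<?php
// Map of setting names (as they would appear in the XML) to real delimiters.
const LINE_ENDINGS = [
    'CRLF' => "\r\n",
    'PIPE' => '|',
    'BR'   => '<br>', // hypothetical key for the <br> case mentioned above
];

// Hypothetical helper: replace the named line ending with a fixed marker.
function normalizeLineEndings(string $haystack, string $name, string $marker = '§'): string
{
    if (!array_key_exists($name, LINE_ENDINGS)) {
        throw new InvalidArgumentException("Unknown line ending setting: $name");
    }
    return str_replace(LINE_ENDINGS[$name], $marker, $haystack);
}

echo normalizeLineEndings("line1\r\nline2", 'CRLF'), PHP_EOL; // line1§line2
```

    Because the delimiters are written as double-quoted PHP strings inside the table, nothing read from the XML ever needs to be converted.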
  16. #9
  17. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,908
    Rep Power
    6352
    Quoted_printable is fine, just making sure it isn't screwing up your whitespace.
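    One way to check the whitespace concern is to decode a small sample and confirm the real CRLF is still there afterwards. A minimal sketch with made-up sample input (not the poster's actual data):

```php
<?php
// "=3C" and "=3E" are the quoted-printable encodings of "<" and ">".
$encoded = "Line1=3Cbr=3E\r\nLine2";
$decoded = quoted_printable_decode($encoded);

// A real CRLF in the input is copied through unchanged.
var_dump($decoded === "Line1<br>\r\nLine2"); // bool(true)
```

    Note that a soft line break (an "=" immediately before the line ending) is removed by the decoder, so input that uses soft breaks will lose those particular line endings.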

    This is the problem that I have run into, '\r\n' is not the same as "\r\n" I was looking for a way to load \r\n as an escaped string.
    What do you mean by "escaped string"? You have supplied sample code which does nothing, and claim it does something. This particular topic is confusing because PHP differentiates between these strings while people generally do not.

    PHP absolutely can do this, so I'm going to try to break it down into an actual code snippet:
    PHP Code:
    <?php

    $stringLiteral = 'These two sentences are on the same line when printed.\r\nThe string is single-quoted and therefore what you consider special characters are not interpolated and are simply represented literally.';

    echo $stringLiteral, PHP_EOL;

    $interpolatedString = "These two sentences are on separate lines.\r\nThe string is double-quoted so the special character references are interpolated and replaced with their actual literal equivalents (carriage return and newline).";

    echo $interpolatedString, PHP_EOL;

    //These will work and replace the "newlines" with pipes:
    echo str_replace('\r\n', '|', $stringLiteral), PHP_EOL;
    echo str_replace("\r\n", '|', $interpolatedString), PHP_EOL;

    //these will NOT WORK, as the needle does not appear in the haystack:
    echo str_replace("\r\n", '|', $stringLiteral), PHP_EOL;
    echo str_replace('\r\n', '|', $interpolatedString), PHP_EOL;
    I hope this further illustrates what I'm talking about. Single-quoted strings can contain the four characters '\r\n' without those four characters MEANING anything. Only when the string is double-quoted does '\r\n' become actual carriage return and newline characters (which are completely separate ASCII entities from the 4 entities that made up the original string).

    There is no "escaping" involved in anything we've discussed, though the character \ is usually used to escape quotes and other bounding characters.
  18. #10
  19. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Jan 2013
    Posts
    11
    Rep Power
    0
    $stringLiteral = 'These two sentences are on the same line when printed.\r\nThe string is single-quoted and therefore what you consider special characters are not interpolated and are simply represented literally.';

    echo $stringLiteral, PHP_EOL;

    $interpolatedString = "These two sentences are on separate lines.\r\nThe string is double-quoted so the special character references are interpolated and replaced with their actual literal equivalents (carriage return and newline).";

    echo $interpolatedString, PHP_EOL;
    You are 100% correct
    Code:
    echo str_replace('\r\n', '|', $stringLiteral), PHP_EOL; 
    echo str_replace("\r\n", '|', $interpolatedString), PHP_EOL;
    will work, and
    Code:
    //these will NOT WORK, as the needle does not appear in the haystack: 
    echo str_replace("\r\n", '|', $stringLiteral), PHP_EOL; 
    echo str_replace('\r\n', '|', $interpolatedString), PHP_EOL;
    will not. That is why I posted here.

    I am loading a variable from XML; when I load this variable, it is stored as a string literal.

    The string that I am comparing it to is an interpolated string.

    That is what is causing the issue. My question was: what do I have to do to load the variable from XML AS an interpolated string?

    I was calling them escaped strings because of http://www.php.net/manual/en/language.types.string.php#language.types.string.syntax.double

    You are correct: the string is not escaped, it is interpolated; the characters are escaped as "CR LF" essentially, so my mistake there.

    I come from C# and C++, where '' is a character and "" is a string so I understand the difference between a character representation of a string and an actual string. I agree that there needs to be more knowledge about this among developers.

    Let me close by restating my issue.

    I am loading a variable from xml, when I load this variable it is represented as a LITERAL string. I am loading the variable using DOM $attr->nodeValue.

    My question is: how can I load that variable from XML, OR how can I convert a literal string to an interpolated string? Mind you, I cannot hard-code the value I am converting to, i.e. I do not want to do
    Code:
    str_replace($loadedLiteral, "\r\n", $body);
    I want to either load the variable as an interpolated string or convert the literal string to an interpolated string.
  20. #11
  21. Sarcky
    Devshed Supreme Being (6500+ posts)

    Join Date
    Oct 2006
    Location
    Pennsylvania, USA
    Posts
    10,908
    Rep Power
    6352
    Then we're back to my second reply:
    Originally Posted by ManiacDan
    If you have the literal 4-character string '\r\n' in a variable and wish for it to be an ACTUAL NEWLINE, you must do this:
    PHP Code:
    $var = str_replace('\r\n', "\r\n", $var);
    You have a string with non-interpolated special characters. Replace them with their interpolated equivalents.

    Once you do that, you'll have strings with real newlines in them and your line-ending detection will work again.
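    For the "without hard-coding each value" part of the question, one option that is not mentioned in the thread is PHP's built-in `stripcslashes()`, which interprets C-style escape sequences such as \r, \n and \t in a string. A minimal sketch, assuming the setting comes from a trusted source (the function also strips other backslashes, so it is not suitable for arbitrary input):

```php
<?php
// Value as it would arrive from the XML: four literal characters.
$loadedLiteral = '\r\n';

// stripcslashes() converts the literal escapes into the real characters.
$needle = stripcslashes($loadedLiteral);

$body = "line1\r\nline2";
echo str_replace($needle, '|', $body), PHP_EOL; // line1|line2
```

    Settings without backslashes, such as | or <br>, pass through `stripcslashes()` unchanged, so the same call could be applied to every delimiter read from the file.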
