January 8th, 2008, 02:40 PM
Getting raw string form of string variables
I would like to convert string variable content to raw form.
#What I've got.
>>> str = "\f"
#What I want.
>>> str = r"\f"
#But I don't have the literal because I'm reading my str from a text file.
name = open(sys.argv[1], 'r')
str = name.readline()
#Is there something like raw(str)?
Or...is there a simpler way to say "convert this Windows path to a Unix path"?
January 8th, 2008, 04:28 PM
Escape sequences "\n, \t, \r, ..." exist as a way of displaying unprintable characters to humans.
What creating a raw string ( r"..." ) does is switch off interpreting escape sequences in a string.
When you read from a file this isn't a problem because files can store a newline as a newline (one character), and so can string variables. Your string contains what was in the file. It already is "in raw form".
Or, to put it another way, the distinction between "raw form" and "string with escape sequences" happens when you cross the boundary from computer format to human readable, or human to computer.
When reading from a file to a variable the data can stay in the same form all the way - there is no raw/not raw divide.
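A small sketch may make this concrete. It assumes a throwaway temporary file (created here only for illustration, it is not part of the original question) that contains the three characters a, backslash, f, and compares what comes back from the file with a raw literal:

```python
import os
import tempfile

# Write the three characters a, backslash, f to a temporary file
# (hypothetical stand-in for the text file from the question).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(r"a\f")
    path = tmp.name

# Read it back the same way the question does.
with open(path, "r") as fin:
    from_file = fin.readline()
os.remove(path)

raw_literal = r"a\f"

print(len(from_file))              # 3
print(from_file == raw_literal)    # True
print(from_file.replace("\\", "/"))  # a/f
```

The string read from the file already holds a real backslash, so there is nothing left to "convert to raw".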
There isn't a simple way because it's not a simple operation - what's the unix path equivalent of "c:\windows\system32\cmd.exe"?
Last edited by sfb; January 8th, 2008 at 04:36 PM.
January 9th, 2008, 08:40 AM
Hmmm.... yes, I need to speak more plainly.
As the rest of my post indicates, I'm concerned about the slashes. I'd like to make
(the Windows convention) into
(the Unix convention).
January 9th, 2008, 10:03 AM
Still not quite clear, I’m afraid
Originally Posted by auguri
If you already have a string like "a\b\c\d\e.aaa" somewhere, you can just use the “replace” method:
>>> s = r"a\b\c\d\e.aaa"
>>> print s.replace("\\", "/")
But maybe you should take a look at the os.path module as well.
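As one possible alternative to a manual replace, newer Python versions (3.4 and later, so not available when this thread started) ship pathlib in the standard library. A minimal sketch, assuming the input really is a Windows-style path:

```python
from pathlib import PureWindowsPath

# PureWindowsPath parses Windows-style paths on any operating system.
converted = PureWindowsPath(r"a\b\c\d\e.aaa").as_posix()
print(converted)  # a/b/c/d/e.aaa

# A drive letter is kept as-is; only the separators change.
with_drive = PureWindowsPath(r"c:\windows\system32\cmd.exe").as_posix()
print(with_drive)  # c:/windows/system32/cmd.exe
```

Note that this only swaps the separators. Mapping a drive letter to a Unix mount point is still something you have to decide yourself.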
January 9th, 2008, 01:27 PM
Ugh. Replace works. Before you get angry, hear me out. The first thing I tried was replace! To be cautious, I first tried it in the interactive window. For example:
[Dbg]>>> bla = "a\f"
Replace sees \f as an escape sequence and so it's not replacing the slash. I was asking about getting variable contents into raw form because, of course, if bla = r"a\f", then bla.replace("\\","/") yields "a/f".
But today I threw caution to the wind and tried the replace inside my code (though I thought I tried this yesterday...maybe I was looking at the wrong version of output?). Because I am getting the strings from an iterator (as opposed to assigning them as literals) replace works as desired. Though I'm still not sure I truly understand the reasons, the script is doing its job now. Here's the relevant bit:
output = open(sys.argv, 'w')
#walk through files in mounted directory
for root, dirs, files in os.walk("Y:/"):
    root = root.replace("Y:/", "/servername/")
    root = root.replace("\\","/")
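The two replace calls could also be wrapped in a small helper so they can be tried without the mounted drive. This is only a sketch: the function name to_server_path and its parameters are made up for illustration, and the sample path is hypothetical.

```python
def to_server_path(root, drive_prefix="Y:/", server_prefix="/servername/"):
    """Turn a Windows-style directory name into the Unix-style server path."""
    # Normalise the separators first so the prefix check sees forward slashes.
    unix_style = root.replace("\\", "/")
    if unix_style.startswith(drive_prefix):
        unix_style = server_prefix + unix_style[len(drive_prefix):]
    return unix_style


# A root as os.walk might report it on Windows (hypothetical directory names).
print(to_server_path("Y:/docs\\reports\\2008"))  # /servername/docs/reports/2008
```

Swapping the separators before the prefix check means a root written as "Y:\docs" is handled the same way as "Y:/docs".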
thanks for listening.
January 9th, 2008, 04:49 PM
Replace doesn't see \f as an escape sequence; the escape is already gone by then, handled by the Python interpreter when it creates 'bla' for you:
[Dbg]>>> bla = "a\f"
Create a string, the first character is the character 'a'.
The second character is the character '\' - wait, that indicates the start of an escape sequence... hold that thought...
The next character is 'f' and \f is a valid escape sequence, so treat these two as one character.
So the second character is actually "ascii formfeed character"
End of string.
From now on, bla is two characters: ASCII characters number 97 (a) and 12 (the form-feed instruction).
With bla.replace("\\", "/"), you are telling replace to replace one backslash with one forward slash. In a string literal, one forward slash is a valid character, but one backslash isn't - it marks the start of an escape sequence. So you need the escape sequence which means "I didn't want an escape sequence this time, I just wanted a \ character" - which is "\\".
When replace does its work, there are no backslashes in bla - just 'a' and 'form feed', so it doesn't find what it's looking for, and does nothing.
Afterwards, Python has to show you the result of the replace, and it can't print the formfeed character. So it prints it as an escape sequence - \x0c - which is a two digit number in hexadecimal, value 0C (aka, 12).
Which is why it looks like there's a backslash which has been ignored - but it doesn't really exist, it's only there to show you something which would otherwise be invisible, like tab, return, form feed, new line, etc.
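The walkthrough above can be checked in a few lines. This is just a sketch that reuses the name bla from the example; raw is an extra name added here for comparison:

```python
bla = "a\f"    # normal literal: the interpreter turns \f into one form-feed character
raw = r"a\f"   # raw literal: backslash and f stay as two separate characters

print([ord(ch) for ch in bla])  # [97, 12]
print([ord(ch) for ch in raw])  # [97, 92, 102]

# There is no backslash (character 92) in bla, so replace has nothing to do.
print("\\" in bla)                   # False
print(repr(bla.replace("\\", "/")))  # 'a\x0c'

# The raw version does contain a backslash, so replace finds it.
print(raw.replace("\\", "/"))        # a/f
```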
This is really awkward to explain
Last edited by sfb; January 9th, 2008 at 04:51 PM.
February 2nd, 2010, 07:30 AM
February 6th, 2010, 02:21 PM
When dealing with escape sequences, it usually helps to drop down to decimal/ord.
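fname = r"\a\\b\c\d\e.aaa"
backslash = 92     ## ord("\\")
subpath = ""
path_list = []
for ch in fname:
    if (backslash == ord(ch)):
        if len(subpath):     ## skip empty pieces from doubled backslashes
            path_list.append(subpath)
        subpath = ""
    else:
        subpath += ch
if len(subpath):     ## final name
    path_list.append(subpath)
print("/" + "/".join(path_list))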
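For comparison, the same result as the character-by-character loop can probably be had with str.split. A short sketch, where parts and unix_path are names chosen here for illustration:

```python
fname = r"\a\\b\c\d\e.aaa"

# Split on the backslash and drop the empty pieces that come from the
# leading and doubled backslashes.
parts = [part for part in fname.split("\\") if part]
unix_path = "/" + "/".join(parts)
print(unix_path)  # /a/b/c/d/e.aaa
```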
Last edited by dwblas; February 6th, 2010 at 02:25 PM.
July 31st, 2012, 12:27 PM
Some more thoughts:
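# make sure you save s as a raw string
s = r"a\b\c\d\e.aaa"
fname = "atest.txt"
with open(fname, "w") as fout:
    fout.write(s)
with open(fname, "r") as fin:
    # %r gives the repr: quotes are added and each backslash is doubled
    s2 = "%r" % fin.read()
# replace each doubled backslash with a forward slash
s3 = s2.replace(r'\\', '/')
print(s3)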
Last edited by Dietrich; July 31st, 2012 at 12:30 PM.