    #1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Mar 2004
    Posts
    2
    Rep Power
    0

    Specific SED replace command needed


    Hi all,
    I need to replace the 23rd character on all lines beginning with P with a blank character. I've heard that sed is the best option. Can anybody help with the specific command line that would perform this operation?
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2003
    Posts
    121
    Rep Power
    14
    sed '/^P/s/^\(.\{22\}\)\(.\)\(.*\)$/\1 \3/'

    Tested on hp-ux 11i.
  4. #3
  5. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2003
    Location
    USA
    Posts
    334
    Rep Power
    14
    sed 's/^\(P.\{21\}\)./\1 /' infile
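A quick way to check this command is to run it on a couple of made-up lines. The sketch below assumes a POSIX shell and sed; the function name `blank_col23` and the sample lines are illustrative only and do not come from the thread. The second sample line contains a P that is not in column 1, so it should pass through unchanged.

```shell
# Hypothetical helper: blank out column 23 on lines that begin with "P".
# Reads standard input, writes standard output.
blank_col23() {
    # "P" plus 21 more characters makes 22 columns; the next "." is column 23.
    sed 's/^\(P.\{21\}\)./\1 /'
}

# Made-up sample: an "X" sits in column 23 of both lines,
# but only the first line starts with P.
printf '%s\n' \
    'P234567890123456789012X456' \
    'aP34567890123456789012X456' | blank_col23
```

Only the first line should come back with its X replaced by a space.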
  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2003
    Posts
    121
    Rep Power
    14
    Hmmm... I like fpmurphy's solution better than mine. Good job!
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2007
    Posts
    9
    Rep Power
    0

    ANOTHER specific SED replace command needed


    Hi everyone,

    I need a SED command (or an awk or grep command) that will do the following:
    - Find the last line that begins with "144536". There are multiple lines that begin with 144536 and it's important that it be the last one only
    - Replace the 637th and 638th characters on that line with "PR"

    I need to do that about 40 times -- not a trivial number, but not a huge number. So if it's too much to do all that in one line, I could also use grep -n "144536" to list the line numbers of all of its appearances, then I would only need a sed command that would replace the 637th and 638th characters of a line number that I specify manually. Thanks for your help!
  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2006
    Posts
    177
    Rep Power
    237
    Originally Posted by akleinb
    Hi everyone,

    I need a SED command (or an awk or grep command) that will do the following:
    - Find the last line that begins with "144536". There are multiple lines that begin with 144536 and it's important that it be the last one only
    - Replace the 637th and 638th characters on that line with "PR"

    I need to do that about 40 times -- not a trivial number, but not a huge number. So if it's too much to do all that in one line, I could also use grep -n "144536" to list the line numbers of all of its appearances, then I would only need a sed command that would replace the 637th and 638th characters of a line number that I specify manually. Thanks for your help!
    wow, this thread is from 2004!
    anyway, something like this
    Code:
     awk '/^pm/{line=$0}
     END{ n=split(line,arr,"")
          arr[637]=P;  arr[638]=R
          for (i=1;i<=n;i++) printf arr[i]
          print
        }' "file"
  12. #7
  13. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2007
    Posts
    9
    Rep Power
    0
    ghostdog74,

    Thanks so much for the reply -- I'm amazed you found this ancient thread! Anyway, I don't know awk, so forgive the dumb questions.

    If the line I need to change is line number 358386 of my text file, do I change the part of the first line in braces from line=$0 to line=$358385 (implying that line numbering begins with 0)? And do I need to specify an outfile, or will this code change the data file directly? Finally, is there a way to write this code in one-line format? Thanks so much,

    akleinb


    Originally Posted by ghostdog74
    wow, this thread is from 2004!
    anyway, something like this
    Code:
     awk '/^pm/{line=$0}
     END{ n=split(line,arr,"")
          arr[637]=P;  arr[638]=R
          for (i=1;i<=n;i++) printf arr[i]
          print
        }' "file"
  14. #8
  15. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2006
    Posts
    177
    Rep Power
    237
    Originally Posted by akleinb
    ghostdog74,

    Thanks so much for the reply -- I'm amazed you found this ancient thread! Anyway, I don't know awk, so forgive the dumb questions.

    If the line I need to change is line number 358386 of my text file, do I change the part of the first line in braces from line=$0 to line=$358385 (implying that line numbering begins with 0)? And do I need to specify an outfile, or will this code change the data file directly?
    awk reads the input file line by line; the default record separator is a newline. $0 refers to the entire current line, and $1, $2 and so on refer to the individual fields in that record. for example, if a line looks like this:
    Code:
    one two three 4 5 6
    then $1 is "one", $2 is "two", $6 is "6", and so on. by default awk uses whitespace as the field separator.

    in awk, NR holds the current record number, and it starts at 1, so if you want line 358386, you can use NR==358386 as a check...
    Code:
     awk '/^pm/ && NR == 358386 {line=$0} #this is a comment
    ...
    i use line=$0 to store the value of $0 (the current line). you put the awk code in a script, then call the script as you normally would from the shell. awk does not change the data file itself; to specify an output file, you use the ">" redirection operator....

    Code:
    # ./script.sh > outfile
    Finally, is there a way to write this code in one-line format? Thanks so much,
    i don't advise writing one-liners. they make code hard to read and troubleshoot. if it's short, it should be okay, but if it's too long, then whoever takes over your script next time will have a hard time trying to read your code. sorry, if you really want one-liners, i can't help you there.
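For the simpler variant of the request (edit a line whose number is already known, for example from grep -n), the NR check above can be combined with substr(). The sketch below is one possible way to do it, not the method used in the thread; the function name `put_at` and its argument order are made up for illustration. It assumes a POSIX awk.

```shell
# Hypothetical helper: write TEXT starting at column COL of line LINE_NO.
# Usage: put_at LINE_NO COL TEXT < infile > outfile
put_at() {
    awk -v n="$1" -v col="$2" -v text="$3" '
        NR == n {
            # NR and substr() both count from 1. Keep everything before col,
            # then the new text, then whatever follows the overwritten part.
            $0 = substr($0, 1, col - 1) text substr($0, col + length(text))
        }
        { print }   # every line is printed, edited or not
    '
}

# Made-up three-line input: overwrite columns 3 and 4 of line 2 with "PR".
printf '%s\n' 'abcdef' 'ghijkl' 'mnopqr' | put_at 2 3 PR
```

For the case discussed in this thread the call would look like `put_at 358386 637 PR < datafile > outfile`. The input file is left untouched; the edited copy goes to wherever standard output is redirected.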
  16. #9
  17. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2007
    Posts
    9
    Rep Power
    0
    That was a very helpful explanation of the code, but for some reason, it doesn't work. I had two problems. First, I tried using the "&& NR == 358386", but it kept pulling the wrong line. To solve that problem, I used a sed to give only the single line that needs to be edited as the input to your awk script. Here's exactly what I entered at the Linux shell prompt to make the first change:

    Code:
    sed '358386q;d' datafile | awk '/^pm/{line=$0}
    END{ n=split(line,arr,"")
    arr[637]=P;  arr[638]=R
    for (i=1;i<=n;i++) printf arr[i]
    print
    }'
    Unfortunately, this doesn't seem to work. It gives as output the same line it took as input; characters 637 and 638 are not changed to "PR". In fact the output is identical to the output of "sed '358386q;d' datafile". Any idea what's going wrong? Thanks again.
  18. #10
  19. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2006
    Posts
    177
    Rep Power
    237
    Originally Posted by akleinb
    That was a very helpful explanation of the code, but for some reason, it doesn't work. I had two problems. First, I tried using the "&& NR == 358386", but it kept pulling the wrong line. To solve that problem, I used a sed to give only the single line that needs to be edited as the input to your awk script. Here's exactly what I entered at the Linux shell prompt to make the first change:

    Code:
    sed '358386q;d' datafile | awk '/^pm/{line=$0}
    END{ n=split(line,arr,"")
    arr[637]=P;  arr[638]=R
    for (i=1;i<=n;i++) printf arr[i]
    print
    }'
    Unfortunately, this doesn't seem to work. It gives as output the same line it took as input; characters 637 and 638 are not changed to "PR". In fact the output is identical to the output of "sed '358386q;d' datafile". Any idea what's going wrong? Thanks again.

    No no, i used /^pm/ for my own testing , you should change it to /^144536/ as stated in your requirement. Use
    /^144536/ && NR==358386 when you want to find the 358386th line that is also starting with 144536.
  20. #11
  21. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2007
    Posts
    9
    Rep Power
    0
    OK, now we're making progress! So I re-ran the script as follows:

    Code:
    awk '/^144536/{line=$0}
    END{ n=split(line,arr,"")
    arr[636]="P"; arr[637]="R"
    for (i=1;i<=n;i++) printf arr[i]
    print
    }' datafile > outfile
    I had to make 2 small fixes to your code: (1) I guess awk starts numbering the array with 0, not 1, so I had to subtract 1 from each of the indices to put "PR" in the right place. And (2) I had to put P and R in quotes to make the script recognize them as characters and enter them correctly.

    The only remaining problem with the code is that the output includes 2 lines of data: the one correct line, correctly edited, but also the last line of the data file (unedited), which begins with a totally different sequence of numbers and which shouldn't be there. Any thoughts?

    Also, what part of the script is directing awk to choose only the LAST line that begins with 144536, rather than all lines beginning with 144536? Thanks again, I'm really grateful for the help.

    Originally Posted by ghostdog74
    No no, i used /^pm/ for my own testing , you should change it to /^144536/ as stated in your requirement. Use
    /^144536/ && NR==358386 when you want to find the 358386th line that is also starting with 144536.
  22. #12
  23. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2006
    Posts
    177
    Rep Power
    237
    Originally Posted by akleinb
    OK, now we're making progress! So I re-ran the script as follows:

    Code:
    awk '/^144536/{line=$0}
    END{ n=split(line,arr,"")
    arr[636]="P"; arr[637]="R"
    for (i=1;i<=n;i++) printf arr[i]
    print
    }' datafile > outfile
    I had to make 2 small fixes to your code: (1) I guess awk starts numbering the array with 0, not 1, so I had to subtract 1 from each of the indices to put "PR" in the right place. And (2) I had to put P and R in quotes to make the script recognize them as characters and enter them correctly.
    yup, i forgot to include quotes in "P" and "R". good thing you found out yourself.
    The only remaining problem with the code is that the output includes 2 lines of data: the one correct line, correctly edited, but also the last line of the data file (unedited), which begins with a totally different sequence of numbers and which shouldn't be there. Any thoughts?
    can you provide the output and show what it looks like? (if possible, a snapshot of your input file)

    Also, what part of the script is directing awk to choose only the LAST line that begins with 144536, rather than all lines beginning with 144536? Thanks again, I'm really grateful for the help.
    this line!
    Code:
    ...
    /^144536/{line=$0}
    ...
    as awk goes through every line and finds 144536, it assigns the whole line to "line" variable, right until the very last 144536. the final value of line will be what you want.
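The "last assignment wins" behaviour described above can be seen on a small made-up input. This is a minimal sketch assuming a POSIX awk; the function name `last_match` and the sample lines are hypothetical. It uses index() instead of a regular expression so the prefix can be passed in as a variable.

```shell
# Hypothetical helper: report the number and text of the last line
# on standard input that starts with PREFIX.
last_match() {
    awk -v prefix="$1" '
        # index() returns 1 when the line begins with prefix.
        # The assignments are overwritten on every later match.
        index($0, prefix) == 1 { line = $0; n = NR }
        # By the time END runs, only the final match is left.
        END { if (n) print n ": " line }
    '
}

# Made-up input: two lines start with 144536, the last line does not.
printf '%s\n' '144536 first' '999999 other' '144536 second' '777777 tail' |
    last_match 144536
```

The earlier match on line 1 is silently overwritten by the later one on line 3, which is why no extra bookkeeping is needed to find the last occurrence.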
  24. #13
  25. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    May 2007
    Posts
    9
    Rep Power
    0
    Unfortunately, I can't provide either the input or output files. The input file consists of data in fixed-width format. There are many thousands of lines and each line consists of exactly 700 characters. The first 6 characters are a 6-digit number representing each person's ID number and there are multiple entries for each person. The output file consists of the one "corrected" line followed by the last line of the input file (unmodified), followed by 2 carriage returns. I tried the script on a few of the changes I need to make and that format is consistent for the output file. Thanks for your ongoing help!

    Originally Posted by ghostdog74
    can you provide the output and show what it looks like? (if possible, a snapshot of your input file)
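The stray, unedited line most likely comes from the bare print at the end of the END block in the script above. With no argument, print outputs $0, and in gawk and several other awk implementations $0 still holds the last record of the file when END runs. The sketch below, which is a suggestion and not something posted in the thread, prints only the stored line and uses substr() rather than a character array. The function name `edit_last_match` and the sample data are hypothetical.

```shell
# Hypothetical helper: print only the last line starting with PREFIX,
# with TEXT written over it beginning at column COL.
edit_last_match() {
    awk -v prefix="$1" -v col="$2" -v text="$3" '
        index($0, prefix) == 1 { line = $0; found = 1 }
        END {
            if (!found) exit 1
            # Print the stored line, never a bare "print": in END, $0 may
            # still contain the last record read from the input.
            print substr(line, 1, col - 1) text substr(line, col + length(text))
        }
    '
}

# Made-up input: the last 144536 line is line 2; the file ends with another ID.
printf '%s\n' '144536abcd' '144536efgh' '777777ijkl' | edit_last_match 144536 9 PR
```

Because substr() counts columns from 1, the 637th and 638th characters of a real record would be addressed with a column argument of 637.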
  26. #14
  27. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Apr 2006
    Posts
    177
    Rep Power
    237
    no problem. it's good that you have solved it.
  28. #15
  29. No Profile Picture
    Contributing User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Dec 2004
    Location
    Prague, Czech Rep.
    Posts
    117
    Rep Power
    15



    Originally Posted by akleinb
    Hi everyone,

    I need a SED command (or an awk or grep command) that will do the following:
    - Find the last line that begins with "144536". There are multiple lines that begin with 144536 and it's important that it be the last one only
    - Replace the 637th and 638th characters on that line with "PR"

    I need to do that about 40 times -- not a trivial number, but not a huge number. So if it's too much to do all that in one line, I could also use grep -n "144536" to list the line numbers of all of its appearances, then I would only need a sed command that would replace the 637th and 638th characters of a line number that I specify manually. Thanks for your help!
    Hi there!

    Your statement is not quite complete. I would rather expect that you need to change the whole input file and produce its new version. If so, you must decompose the task into two subtasks.
    1. Read the input file and locate the last appropriate line
    2. Read the input file a second time, correct the line found in step 1, and copy all lines to standard output.

    I have tested the following on Linux under bash, but I think it should run on all flavours of Unix, because I use only the basic features of awk and shell scripting.


    :
    # Find the number of the last appropriate line
    # Note the backticks!
    N=`awk '
    /^144536/{N = NR}
    END {
    print N
    }
    ' infile
    ` # The ending backtick

    # Update the line the number of which is in the shell variable N
    awk '
    {
    # Get the shell variable N into awk script by surrounding it with ''
    if (NR == '$N') {
    printf "%s%s%s\n", substr($0, 1, 636), "PR", substr($0, 639)
    }
    else print
    }

    ' infile

    This script produces the new file in the standard output. Redirect it wherever you need.

    Regards zlutovsky
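The same two-step idea can also be written as a single awk call that is given the file name twice, with the values passed in through -v instead of being spliced into the program text. This is a sketch under assumptions, not a replacement endorsed by anyone in the thread: the function name `replace_in_last_match`, its argument order, and the sample data are made up, and mktemp is used only to create a throwaway demo file.

```shell
# Hypothetical helper: copy FILE to standard output, writing TEXT at column
# COL of the last line that starts with PREFIX.
# Usage: replace_in_last_match PREFIX COL TEXT FILE > outfile
replace_in_last_match() {
    awk -v prefix="$1" -v col="$2" -v text="$3" '
        # First read of the file (NR equals FNR): remember the number of
        # the last matching line and print nothing.
        NR == FNR { if (index($0, prefix) == 1) last = FNR; next }
        # Second read: edit that one line and copy the rest unchanged.
        FNR == last { $0 = substr($0, 1, col - 1) text substr($0, col + length(text)) }
        { print }
    ' "$4" "$4"
}

# Demo on a made-up four-line file.
tmp=$(mktemp)
printf '%s\n' '144536abcd' '999999wxyz' '144536efgh' '777777ijkl' > "$tmp"
replace_in_last_match 144536 9 PR "$tmp"
rm -f "$tmp"
```

In the demo only the third line changes, and all four lines are written out. For the file described in this thread the call would be along the lines of `replace_in_last_match 144536 637 PR datafile > outfile`, repeated once per ID that needs correcting.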