#1
  1. 300lb Bench!
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Aug 2001
    Location
    New York
    Posts
    2,350
    Rep Power
    61

    Help with script to remove null characters


    Ok, so we get a file from a vendor that sometimes has null characters in it. We have a (complicated) process that reads this CSV file and inserts the data into one of our tables. If the file has a null character, the script (which appears to be doing an fread) hangs.

    In any event, I searched and found that I can remove these null characters with

    sed 's/\x0//g' filename.txt >filename.txt

    So search for the null (hex 0) and replace it with nothing, globally. This works perfectly. Now the thing is, there are going to be a bunch of these files and I won't know what they're named ahead of time. So I need to look for each file, search for nulls, replace them, then overwrite the old file. After a little googling, I tried

    find . -name "*.txt" -print0 | xargs -0 -I {} sed 's/\x0//g' {} > {}

    Supposedly {} holds the filename passed from find (when you use -I). The -0 is supposed to make sure that any file passed by find that has a space or a funky character in it doesn't mess things up. The -I helps assign the filename from find to {}.
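
For reference, here is a minimal sketch of how -print0, -0 and -I {} fit together. The temporary directory and file names are made up for the demo; the point is only that a name containing a space reaches the command as a single argument.

```shell
# Sketch only: the demo directory and file names are hypothetical.
demo=$(mktemp -d)
touch "$demo/plain.txt" "$demo/with space.txt"

# -print0 ends each file name with a NUL byte instead of a newline.
# -0 makes xargs split its input on NUL only, so spaces are left alone.
# -I {} runs the command once per name, substituting the name for {}.
listing=$(find "$demo" -name "*.txt" -print0 | xargs -0 -I {} echo "File: {}")
printf '%s\n' "$listing"
```

Each file should appear on its own "File:" line, including the one with a space in its name.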

    Just to see if this made any sense, I tried

    find . -name "*.txt" | xargs -I {} echo File: {}

    and was able to echo out every txt file in the directory. However, when I ran the find/sed command from before, sed did strip out the null characters, but the output of all the files in the directory (I had two in my test directory) was combined into a single file called '{}'. Anybody have insight on this one? Thanks in advance.
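
A likely explanation: the `> {}` redirection is handled by the calling shell before xargs ever runs, so xargs never sees it and everything lands in one file literally named `{}`. One possible workaround, sketched below, is to move the redirection into a small `sh -c` command that xargs runs per file, writing to a temporary file first and then renaming it. This assumes GNU sed (for the `\x00` escape); the demo directory, file contents and `.tmp` suffix are invented for illustration.

```shell
# Sketch only: demo directory, file contents and the .tmp suffix are hypothetical.
demo=$(mktemp -d)
printf 'a\000b,c\n'    > "$demo/one.txt"
printf 'd,e\000\000f\n' > "$demo/two file.txt"

# The redirection sits inside the sh -c string, so it is evaluated once per
# file with "$1" set to that file, rather than once by the outer shell.
# Writing to "$1.tmp" first avoids truncating the file sed is still reading.
find "$demo" -name "*.txt" -print0 |
  xargs -0 -I {} sh -c 'sed "s/\x00//g" "$1" > "$1.tmp" && mv "$1.tmp" "$1"' sh {}
```

Passing the file name as a positional argument (`sh {}` after the command string) rather than pasting `{}` into the string keeps odd characters in file names from being interpreted as shell code.
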
    Correspondence chess
    nothingbutchess.com
  2. #2
  3. 300lb Bench!
    Ok, I searched the man pages and realized that sed has a -i option that takes an optional extension. The -i tells sed to edit the files in place. If you supply an extension, the original files supposedly get saved as backups with that extension appended.
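
As an illustration of the backup behaviour, here is a small sketch assuming GNU sed; the `.bak` suffix and the demo file are arbitrary choices. Editing in place also sidesteps the problem with redirecting output back onto the input file, where the shell truncates the file before sed gets to read it.

```shell
# Sketch only: assumes GNU sed; the demo file and .bak suffix are hypothetical.
demo=$(mktemp -d)
printf 'x\000y\n' > "$demo/data.txt"

# -i.bak rewrites data.txt in place and keeps the original as data.txt.bak.
sed -i.bak 's/\x00//g' "$demo/data.txt"

# Byte counts: the cleaned file should be one byte shorter than the backup.
wc -c "$demo/data.txt" "$demo/data.txt.bak"
```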

    Anyway, the following did the trick:

    Code:
    ls | xargs sed -i 's/\x0//g'
    Thanks a lot, guys.
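
One caveat with `ls | xargs`: xargs splits its input on whitespace by default, so a file name containing a space would be passed to sed as two separate names. A variant that should be safer, again assuming GNU sed, is to let find hand the names to sed directly. Note that find recurses into subdirectories, unlike ls; the demo files below are made up.

```shell
# Sketch only: assumes GNU sed; the demo directory and files are hypothetical.
demo=$(mktemp -d)
printf 'a\000b\n' > "$demo/vendor file.txt"
printf 'c\000\n'  > "$demo/other.txt"

# -exec ... {} + passes the file names straight to sed as arguments,
# batched together, so no shell or xargs word splitting is involved.
find "$demo" -type f -name "*.txt" -exec sed -i 's/\x00//g' {} +
```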

    Comments on this post

    • aitken325i agrees
