#1
  1. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    18
    Rep Power
    0

    How to grep two variables (bash)


    Dear all,
    I have a text file with many lines in the following format (each line starting with # begins a group, which runs up to the next # line):
    # some str for test
    hdfv 12 9 b
    cgj 5 11 t
    # another string to examine
    kinj 58 96 f
    dfg 7 26 u
    fds 9 76 j
    ---
    key.txt:
    string to
    ---
    output:
    # another string to examine
    kinj 58 96 f
    dfg 7 26 u
    fds 9 76 j

    I need to look for keywords (here "string to") in the lines that start with #. If the keywords of a # line do not exist in key.txt (a file with two columns), I should remove that line and the following lines of its group. I've written this code, but it gives no result:

    cat input.txt | while IFS=$'#' read -r -a myarray
    do
    a=${myarray[1]}
    b=${myarray[0]}
    unset IFS
    read -r a x y z <<< "$a"
    key=$(echo "$x $y")
    if grep "$key" key.txt > /dev/null
    then
    echo $key exists
    else
    grep -v -e "$a" -e "$b" input.txt > $$ && mv $$ input.txt
    fi
    done

    Can someone help me?
  2. #2
  3. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,439
    Rep Power
    1688
    Let me try and understand this ...
    You have a file that contains a search word/phrase.
    You have an 'input file' that you wish to search for that word/phrase (but only in a line starting with a #), and if found you wish the rest of the input file (up to the next # or EOF) to be output?
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc
  4. #3
  5. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    18
    Rep Power
    0
    Hello Simon,
    Yes, key.txt contains two columns, and each line defines one key, much like the longitude and latitude of a point in a text file.
    Yes, the output should be the whole groups that contain those keys, as in the example I gave in this thread.

    Thanks


  6. #4
  7. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,439
    Rep Power
    1688
    Do the two words (will it always be two words?) have to be together or in the same order to qualify as a match? For example:

    Hit Miss <--- your key words
    # This is Hit or Miss <--- Hit present, Miss present, but not together
    # Miss Penelope was a Hit <--- Both present, different order, not together

    Would both of those count as a match or not?

    Is it whole words?
    # The stone Hit the Mississippi <--- Miss appears, but as part of another word

    Case sensitivity?
    # Hither and yon I searched, but I always miss them <--- Hit appears (as part of another word), miss appears, but all in lower case

    Will there be more than one match in the input file, or can you be sure that there will only be one? Is the key file only going to have one pair of search terms, or can there be multiple terms?
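Each of these questions maps onto a different way of calling grep. Below is a minimal sketch, assuming a grep that supports -w (GNU and BSD grep do) and a throwaway file named headers.txt, which is hypothetical and built from the example lines above plus one line where the words are adjacent. -F treats the key as a fixed string, -w matches whole words only, and -i ignores case.

```shell
# Hypothetical sample of header lines, taken from the questions above.
tmp=$(mktemp -d)
cat > "$tmp/headers.txt" <<'EOF'
# This is Hit or Miss
# Miss Penelope was a Hit
# The stone Hit the Mississippi
# Hither and yon I searched, but I always miss them
# A clear Hit Miss example
EOF

# Words adjacent, same order, case-sensitive, taken as a fixed string.
adjacent=$(grep -F 'Hit Miss' "$tmp/headers.txt")

# Both words anywhere on the line, whole words only, case-sensitive.
both_words=$(grep -w 'Hit' "$tmp/headers.txt" | grep -w 'Miss')

# Both words anywhere, substrings allowed, case-insensitive.
loose=$(grep -i 'hit' "$tmp/headers.txt" | grep -i 'miss')

printf '%s\n' "$adjacent"
rm -rf "$tmp"
```

The strictest form keeps only the line where the two words sit side by side, while the loosest form accepts every line in the sample, so the answer to these questions changes the result considerably.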
  8. #5
  9. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    18
    Rep Power
    0
    Hit Miss will appear together in the input, as a single keyword. I think I should write here

    I don't know whether the keywords repeat; the probability is very low.



  10. #6
  11. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,439
    Rep Power
    1688
    Am I hallucinating? It was late and I was tired but I could have sworn I saw another example file (this one a proper one, full of numbers?)
    Anyway, regardless of that (though it IS nice to know with what we are actually dealing!) I do not think that grep is 'your man'. It's a good tool for finding stuff, but not so hot at filtering out ranges of stuff.

    Having multiple keys adds to the complexity, suggesting that a loop of some sort is needed. I think we can, possibly, get round that need ...

    Bear in mind that none of this is done with any actual testing or use of *nix, just from memory, so your mileage may vary!!

    Avoiding a loop:
    Take a look at sed: it is good at this sort of thing and, as a bonus, has a -f parameter which points to a script file. Of course we don't have a script file, just a file with keywords in it. But if we were to take that 'simple' keyword file and chuck it through something like awk, we could produce a temporary script file that sed could use.
    With sed we can use regular expressions to mark the first and last lines of a range to process. If we were to use, say, /key1 key2/,/\#/ then that should do what we want (I used \# so that sed sees a literal character rather than the start of a comment). We want sed to print that range, so a 'p' command would be handy.
    So we'd end up with something like:
    Code:
    sed -n -f Temp_sed_file.txt input.txt > output.txt
    You could, if you were feeling all sorts of cunning, do it all in one line, but I'd recommend doing the awk step separately:

    Code:
    awk '{printf("/%s %s/,/\#/p\n",$1,$2)}' key.txt > Temp_sed_file.txt
    You may (probably?) have to adjust the quoting in some of that to get the desired final file format, but that is simple to play with.
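To make the intermediate file concrete, here is a small sketch that only generates and prints the sed commands, without running sed. It uses the demo key from the first post plus a second key, 'foo bar', which is made up for illustration. The backslash before # is left out here on the assumption that # is already a literal character inside a sed regular expression.

```shell
tmp=$(mktemp -d)
# Demo key from the thread, plus one hypothetical extra key.
printf '%s\n' 'string to' 'foo bar' > "$tmp/key.txt"

# Emit one sed range command per key: from a line matching "word1 word2"
# up to the next line that contains '#'.
awk '{ printf("/%s %s/,/#/p\n", $1, $2) }' "$tmp/key.txt" > "$tmp/Temp_sed_file.txt"

sed_script=$(cat "$tmp/Temp_sed_file.txt")
printf '%s\n' "$sed_script"
rm -rf "$tmp"
```

Keys that contain regex metacharacters such as . or / would need escaping before being placed in a sed address; the sketch does not attempt that.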

    With a loop:
    On the basis that the above may not work, we will likely need to go back to using a loop. Here grep may well help out. Your 'driver' for the loop would be the key file. What we'd do is use grep -n to return the line number in input.txt where the match is found, and then use sed -n 'LineNum,/\#/p' to pull out the 'stanza' required (replacing LineNum with the value back from the grep, when there is a match!)
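A rough sketch of that loop, using the demo files from the first post. The variable names (key, line_num, result) and the extra filter that keeps only matches on # lines are my own assumptions, not something settled in the thread.

```shell
tmp=$(mktemp -d)
# Demo files from the first post.
cat > "$tmp/input.txt" <<'EOF'
# some str for test
hdfv 12 9 b
cgj 5 11 t
# another string to examine
kinj 58 96 f
dfg 7 26 u
fds 9 76 j
EOF
printf '%s\n' 'string to' > "$tmp/key.txt"

result=$(
  while IFS= read -r key; do
    [ -n "$key" ] || continue
    # Line numbers of header lines (starting with #) that contain the key.
    grep -n -F -- "$key" "$tmp/input.txt" | grep '^[0-9]*:#' | cut -d: -f1 |
    while IFS= read -r line_num; do
      # Print from the matching header up to the next line containing '#'.
      sed -n "${line_num},/#/p" "$tmp/input.txt"
    done
  done < "$tmp/key.txt"
)
printf '%s\n' "$result"
rm -rf "$tmp"
```

Because a sed range includes its closing line, the header of the following group would also be printed whenever the matching group is not the last one in the file. In the demo data the match is the final group, so this does not show up.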
  12. #7
  13. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,439
    Rep Power
    1688
    Just checked on a Linux machine and the process seems valid with your current demo files:
    Originally Posted by input.txt
    # some str for test
    hdfv 12 9 b
    cgj 5 11 t
    # another string to examine
    kinj 58 96 f
    dfg 7 26 u
    fds 9 76 j
    Originally Posted by key.txt
    string to
    Code:
    awk '{printf("/%s %s/,/\#/p\n",$1,$2)}' key.txt > Temp_sed_file.txt && sed -n -f Temp_sed_file.txt input.txt
    # another string to examine
    kinj 58 96 f
    dfg 7 26 u
    fds 9 76 j
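One caveat: a sed range includes the line that closes it, so if another group followed the matching one, its # header line would be printed too. A possible alternative, which is not the method used in this thread, is a single awk pass that switches printing on or off at each header line. The sketch below assumes keys are matched as plain substrings of the header, and it appends a made-up third group to the demo input to show the boundary.

```shell
tmp=$(mktemp -d)
# Demo input from the thread, with one hypothetical extra group appended.
cat > "$tmp/input.txt" <<'EOF'
# some str for test
hdfv 12 9 b
cgj 5 11 t
# another string to examine
kinj 58 96 f
dfg 7 26 u
fds 9 76 j
# trailing group that should be dropped
abc 1 2 z
EOF
printf '%s\n' 'string to' > "$tmp/key.txt"

kept=$(awk '
  # First file (key.txt): remember every non-empty key line.
  NR == FNR { if ($0 != "") keys[$0] = 1; next }
  # Header line: decide whether this group should be kept.
  /^#/ {
    keep = 0
    for (k in keys) if (index($0, k) > 0) { keep = 1; break }
  }
  # Print header and body lines of kept groups only.
  keep
' "$tmp/key.txt" "$tmp/input.txt")
printf '%s\n' "$kept"
rm -rf "$tmp"
```

Using index() rather than a regex match means characters such as . in a key are taken literally. The sketch assumes key.txt is not empty, since the NR == FNR test cannot tell the two files apart otherwise.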
  14. #8
  15. No Profile Picture
    Registered User
    Devshed Newbie (0 - 499 posts)

    Join Date
    Nov 2013
    Posts
    18
    Rep Power
    0



    Many thanks, Simon. Yes, it works well.



  16. #9
  17. No Profile Picture
    Contributing User
    Devshed Regular (2000 - 2499 posts)

    Join Date
    Mar 2006
    Posts
    2,439
    Rep Power
    1688
    Glad it's working.
