#1
  1. Registered User, Devshed Newbie (0 - 499 posts). Joined Aug 2011; 2 posts.

    Compare 2 files in Unix


    Hi Linux gurus,

    I need help comparing 2 files.

    File 1 :

    DB_NAME FIRST_ACTIVE_LOG DBPARTITIONNUM
    -------- -------------------- --------------
    BP1 279231 0
    BP1 12735 1
    BP1 12734 2
    BP1 12735 3
    BP1 12616 4
    BP1 12730 5
    BP1 12810 6
    BP1 12602 7
    BP1 12669 8
    BP1 12690 9
    BP1 12631 10
    BP1 9850 11
    BP1 9842 12
    BP1 9912 13
    BP1 9792 14
    BP1 6185 15
    BP1 6194 16
    BP1 6164 17
    BP1 6154 18
    BP1 6164 19
    BP1 6147 20

    File 2 :


    Rollforward Status

    Input database alias = bp1
    Number of nodes have returned status = 21

    Node number Rollforward Next log Log files processed Last committed transaction
    status to be read
    ----------- -------------------------- ------------------- ------------------------- --------------------------
    0 DB working S0278273.LOG S0278240.LOG-S0278272.LOG 2011-08-03-05.00.20.000000 UTC
    1 DB working S0011751.LOG S0011717.LOG-S0011750.LOG 2011-08-02-23.11.53.000000 UTC
    2 DB working S0011753.LOG S0011719.LOG-S0011752.LOG 2011-08-02-23.11.56.000000 UTC
    3 DB working S0011731.LOG S0011696.LOG-S0011730.LOG 2011-08-02-23.11.50.000000 UTC
    4 DB working S0011645.LOG S0011611.LOG-S0011644.LOG 2011-08-02-23.11.57.000000 UTC
    5 DB working S0011747.LOG S0011713.LOG-S0011746.LOG 2011-08-02-23.11.51.000000 UTC
    6 DB working S0011814.LOG S0011779.LOG-S0011813.LOG 2011-08-02-23.11.50.000000 UTC
    7 DB working S0011629.LOG S0011595.LOG-S0011628.LOG 2011-08-02-23.11.55.000000 UTC
    8 DB working S0011692.LOG S0011658.LOG-S0011691.LOG 2011-08-02-23.11.56.000000 UTC
    9 DB working S0011704.LOG S0011670.LOG-S0011703.LOG 2011-08-02-23.11.51.000000 UTC
    10 DB working S0011650.LOG S0011616.LOG-S0011649.LOG 2011-08-02-23.11.55.000000 UTC
    11 DB working S0009569.LOG S0009560.LOG-S0009568.LOG 2011-08-02-23.11.23.000000 UTC
    12 DB working S0009560.LOG S0009552.LOG-S0009559.LOG 2011-08-02-23.10.23.000000 UTC
    13 DB working S0009630.LOG S0009621.LOG-S0009629.LOG 2011-08-02-23.11.19.000000 UTC
    14 DB working S0009511.LOG S0009502.LOG-S0009510.LOG 2011-08-02-23.11.23.000000 UTC
    15 DB working S0005904.LOG S0005895.LOG-S0005903.LOG 2011-08-02-23.11.19.000000 UTC
    16 DB working S0005913.LOG S0005904.LOG-S0005912.LOG 2011-08-02-23.11.19.000000 UTC
    17 DB working S0005884.LOG S0005875.LOG-S0005883.LOG 2011-08-02-23.11.21.000000 UTC
    18 DB working S0005874.LOG S0005865.LOG-S0005873.LOG 2011-08-02-23.11.23.000000 UTC
    19 DB working S0005884.LOG S0005875.LOG-S0005883.LOG 2011-08-02-23.11.20.000000 UTC
    20 DB working S0005867.LOG S0005858.LOG-S0005866.LOG 2011-08-02-23.11.22.000000 UTC

    As a first step, I need to match the DBPARTITIONNUM column of file 1 against the Node number column of file 2.

    Then, for each matched pair, I need to compare the FIRST_ACTIVE_LOG column of file 1 with the "Next log to be read" column of file 2 (integer value alone) and print the result.

    The result should look like this:

    If FIRST_ACTIVE_LOG == Next log to be read, then print:

    Log Gap is zero for Node $DBPARTITIONNUM

    Example:

    Log Gap is zero for Node 0

    Otherwise, print:

    Log gap is (FIRST_ACTIVE_LOG minus Next log to be read) for node $DBPARTITIONNUM

    Example:

    Log gap is 100 for node 0


    Please help

    Thanks in advance

    Dilip
  2. #2
  3. Contributing User, Devshed Regular (2000 - 2499 posts). Joined Mar 2006; 2,436 posts.
    Ok, so each file has, in effect, 21 lines (in this example) of actual data.
    You want to match data based on column 3 (of 3) from the first file to column 1 (of 7) in the second file (columns based on white space delimiter).
    So, taking the first rows of data:
    BP1 279231 0
    would match:
    0 DB working S0278273.LOG S0278240.LOG-S0278272.LOG 2011-08-03-05.00.20.000000 UTC
    based on the node number, 0, which is the last field of the first row and the first field of the second.
    Having matched those rows you wish to check column 2 of the first file against the numeric part of the file name in column 4 of the second file:
    BP1 279231 0
    against:
    0 DB working S0278273.LOG S0278240.LOG-S0278272.LOG 2011-08-03-05.00.20.000000 UTC
    So you'd be comparing 279231 against 278273.
    Is that correct?
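    The comparison described above needs the integer part of a log file name such as S0278273.LOG. A minimal sketch of that one step, assuming the names always follow the pattern in the sample; `log_num` is a hypothetical helper name, not something from the original files:

    ```shell
    # Hypothetical helper: print the integer part of a log file name.
    # Assumes names shaped like S0278273.LOG, as in the sample above.
    log_num() {
        # Keep only the digits: S0278273.LOG -> 0278273
        digits=$(printf '%s' "$1" | tr -cd '0-9')
        # Drop leading zeros so later shell arithmetic does not read it as octal
        digits=$(printf '%s' "$digits" | sed 's/^0*//')
        # An all-zero name would leave an empty string, so fall back to 0
        printf '%s\n' "${digits:-0}"
    }
    ```

    With the first sample row, `log_num S0278273.LOG` prints 278273, which is the value to subtract from 279231.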
    The moon on the one hand, the dawn on the other:
    The moon is my sister, the dawn is my brother.
    The moon on my left and the dawn on my right.
    My brother, good morning: my sister, good night.
    -- Hilaire Belloc
  4. #3
    Thanks for the reply, Simon. That is exactly what I am looking for.
  6. #4
    OK. This being *nix, there are of course lots of ways of doing this. The method I'd pick (and will have a go at showing when I get a moment) is, in effect, to merge the two files, perhaps extracting just the required columns into a temporary file, and then to process that data in some way or another, either in awk or in a loop in a script. We shall see ...
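    For the "loop in a script" route, a rough sketch might look like the following. It is only an illustration under assumptions: `report_gaps_loop` is a made-up name, both files are assumed to look like the samples in the first post, and FIRST_ACTIVE_LOG is assumed to have no leading zeros. It rereads file 2 for every row of file 1, which is fine for a couple of dozen nodes but would be slow on large inputs.

    ```shell
    # Sketch only: report_gaps_loop is a hypothetical name.
    # $1 = file 1 (DB_NAME FIRST_ACTIVE_LOG DBPARTITIONNUM), $2 = file 2 (rollforward status)
    report_gaps_loop() {
        f1=$1
        f2=$2
        # Walk the data rows of file 1 (everything after the dashed header line).
        # The db column is read but not used.
        sed '1,/---/d' "$f1" | while read -r db first_active node; do
            [ -n "$node" ] || continue      # skip blank or short lines
            # Look up the same node in file 2 and take its "next log" field.
            next_name=$(sed '1,/---/d' "$f2" | awk -v n="$node" '$1 == n {print $4; exit}')
            [ -n "$next_name" ] || continue # node not present in file 2
            # Keep only the digits and drop leading zeros: S0278273.LOG -> 278273
            next_log=$(printf '%s' "$next_name" | tr -cd '0-9' | sed 's/^0*//')
            gap=$(( $first_active - ${next_log:-0} ))
            if [ "$gap" -eq 0 ]; then
                echo "Log Gap is zero for Node $node"
            else
                echo "Log gap is $gap for node $node"
            fi
        done
    }
    ```

    It would be called as `report_gaps_loop f1.txt f2.txt`, where the two file names are placeholders for the files shown in the first post.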
  8. #5
    OK, here we go. Check the man pages of the commands used (sed, awk, join) for details:

    Code:
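    #!/bin/bash
    #
    # Extract only the data we want from each file:
    # drop everything up to and including the dashed header line, then keep
    # "node value" pairs (NF skips any blank lines).
    sed '1,/---/d' f1.txt | awk 'NF {print $3, $2}' > base_f1.txt
    # For file 2, also strip letters and dots so that S0278273.LOG becomes 0278273
    sed '1,/---/d' f2.txt | awk 'NF {print $1, $4}' | tr -d '[:alpha:].' > base_f2.txt
    
    # Now merge the two files on the node number to allow easier processing
    #  The default settings will work fine for us
    join base_f1.txt base_f2.txt > base_data.txt
    
    # And now process the data: gap = FIRST_ACTIVE_LOG minus next log to be read
    awk '{gap=$2-$3; printf("Log Gap is %d for Node: %d\n",gap,$1)}' base_data.txt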
    Should add that my f1.txt and f2.txt are your two files as shown above!
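    As a possible variation, the same result can be produced in a single awk pass with no temporary files, using the wording asked for in the first post. This is only a sketch: `report_gaps_awk` is a hypothetical name, and both files are assumed to have the layout of the samples. It also sidesteps the fact that join expects its inputs sorted on the join field; the script above appears to get away with that because both sample files list the nodes in the same order.

    ```shell
    # Sketch only: report_gaps_awk is a hypothetical name.
    # $1 = file 1 (DB_NAME FIRST_ACTIVE_LOG DBPARTITIONNUM), $2 = file 2 (rollforward status)
    report_gaps_awk() {
        awk '
            FNR == 1  { in_data = 0 }                        # reset at the start of each file
            !in_data  { if ($0 ~ /---/) in_data = 1; next }  # skip through the dashed header line
            NF == 0   { next }                               # ignore blank lines
            NR == FNR { first_active[$3] = $2; next }        # file 1: node -> FIRST_ACTIVE_LOG
            ($1 in first_active) {                           # file 2: node seen in file 1
                next_log = $4
                gsub(/[^0-9]/, "", next_log)                 # S0278273.LOG -> 0278273
                gap = first_active[$1] - next_log
                if (gap == 0)
                    printf "Log Gap is zero for Node %d\n", $1
                else
                    printf "Log gap is %d for node %d\n", gap, $1
            }
        ' "$1" "$2"
    }
    ```

    Called as `report_gaps_awk f1.txt f2.txt`, it prints one line per node in the order the nodes appear in file 2.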
    Last edited by SimonJM; August 11th, 2011 at 07:42 AM.
