Hi, I'm a total newbie to this.

I would like to compare two large CSV files row by row and write any rows that differ to a third file that clearly states the row ID and the row data. I was wondering if there is a way to do that using a bash script.

I researched the diff command and wrote a command such as this:

Code:
diff -c file1.csv file2.csv >> test.csv
This command doesn't really output the differences in the format I need.

Say file1 has:

Code:
RISK, PTFOLIO, FMLY, GRP, TYPE, NUM1
ALL, FUNDING, BR, IRD, US, -32702
ALL, FUNDING, BR, IRD, US, -40000
Say file2 has:

Code:
RISK, PTFOLIO, FMLY, GRP, TYPE, NUM1
ALL, FUNDING, BR, IRS, CA, -38001
The first row (RISK...) is the heading, so I do not want to compare that. The second row has 3 differences, so I want the output in test.csv to look like this:

Code:
RISK, PTFOLIO, FMLY, GRP, TYPE, NUM1

[Row1 #]
ALL, FUNDING, BR, IRD, US, -32702
ALL, FUNDING, BR, IRS, CA, -38001

[Row2 #]
ALL+, FUNDING, BR, IRD, US, -40000

I believe this clearly highlights the differences. The plus sign in Row2 indicates that this row was present in only one file and not the other.

I also looked into the awk command, but I'm not fluent with it, so I'm not able to use it to create the comparison.
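For reference, here is a rough sketch of the comparison described above, written as a bash function that calls awk. It is only an illustration under some assumptions: the function name compare_csv_rows and the helper mark are made up for this example, rows are matched purely by line position, the first line of each file is the heading, and the first file is non-empty and small enough to be held in memory.

```shell
#!/usr/bin/env bash
# compare_csv_rows FILE1 FILE2   (hypothetical helper, not a standard tool)
# Prints the heading, then every data row that differs between the two
# files, labelled with its data-row number (heading excluded).
compare_csv_rows() {
    awk '
        # Add a plus sign to the first field to flag a row found in one file only.
        function mark(row) {
            if (!sub(/,/, "+,", row)) row = row "+"
            return row
        }

        # First file: remember every line by its line number.
        NR == FNR { left[FNR] = $0; n_left = FNR; next }

        # Second file, line 1: the heading. Print it once and do not compare it.
        FNR == 1 { n_right = 1; print; next }

        # Second file, data rows: compare with the same line of the first file.
        {
            n_right = FNR
            if (!(FNR in left))
                printf "\n[Row%d #]\n%s\n", FNR - 1, mark($0)
            else if ($0 != left[FNR])
                printf "\n[Row%d #]\n%s\n%s\n", FNR - 1, left[FNR], $0
        }

        # Rows left over in the first file have no partner in the second file.
        END {
            start = (n_right > 1 ? n_right : 1) + 1
            for (i = start; i <= n_left; i++)
                printf "\n[Row%d #]\n%s\n", i - 1, mark(left[i])
        }
    ' "$1" "$2"
}
```

It would be called as compare_csv_rows file1.csv file2.csv > test.csv. Because rows are compared by position, a single inserted or deleted row shifts every row after it, so this only suits files whose rows are expected to line up one to one.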

I'm not sure if something like this can be done. Any ideas? Please advise, thanks!

Regards,
ssampath[/QUOTE]