awk - Comparing files based on column
I am trying to write a script that compares two large files based on column 2. Each file contains about 1 million records. As output, I need to know which records are common on column 2 (the value exists in both files) but have a different value in column 1. The files are quoted comma-separated value files.
file1_pair
20151026,1111
20141113,2222
20130102,3333
77777777,9999

file2_pair
20151026,1111
20203344,2222
50506677,3333
77777777,8888

desired_output
20141113,2222,20203344
20130102,3333,50506677
I tried modifying the script below but was not able to get it right.
awk 'FNR==NR { a[$0]; next } !($2) in a { c++ } END { print c }' file1_pair file2_pair
You had the right idea but were operating on the wrong fields.
You need to save the $2 values from the first file in an array and then check the $2 values from the second file against that array. For each match, you then need to compare the $1 values in the corresponding rows.
This awk script does that.
awk -F , -v OFS=, '
    NR==FNR {
        # Store the value of $1 under the $2 key in a
        a[$2]=$1
        next
    }
    # If $2 is in a (this value was seen in the first file) and
    # the value stored in the array (the $1 from the first file)
    # does not match the $1 of the current line in the second file
    ($2 in a) && (a[$2] != $1) {
        # Print the original $1 value (from the array), then $2, then the new $1
        print a[$2],$2,$1
    }' file1_pair file2_pair
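As a quick sanity check, here is a minimal sketch that recreates the sample data from the question and runs the same logic against it. The temporary directory and the `result` variable are assumptions added for illustration only; they are not part of the original question or answer.

```shell
#!/bin/sh
# Recreate the two sample files from the question in a scratch directory
# (the directory and variable names here are illustrative assumptions).
workdir=$(mktemp -d)
printf '%s\n' 20151026,1111 20141113,2222 20130102,3333 77777777,9999 > "$workdir/file1_pair"
printf '%s\n' 20151026,1111 20203344,2222 50506677,3333 77777777,8888 > "$workdir/file2_pair"

# First file: remember $1 under the $2 key.
# Second file: print rows whose $2 was seen before but whose $1 differs.
result=$(awk -F, -v OFS=, '
    NR==FNR { a[$2]=$1; next }
    ($2 in a) && (a[$2] != $1) { print a[$2], $2, $1 }
' "$workdir/file1_pair" "$workdir/file2_pair")

printf '%s\n' "$result"
rm -rf "$workdir"
```

With the sample data this should print the two lines shown under desired_output. Only the first file is held in memory, and matches are printed in the order they appear in the second file.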