Quote:
Originally Posted by zenith
The key is the first 2 columns of the file.
If the first 2 columns match, then the remaining columns of the different records are each combined into one column.
This is complex to implement.
Help is highly appreciated.
Assuming your file is already sorted on the first two columns (the key):
Code:
$
$ cat input.txt
ID,place,org,animal,country
ITS234,chicago,zoo,Tiger,America
ITS234,chicago,USzoo,lion,America
ITS234,chicago,INzoo,zebra,America
ITS235,New York,zoo_1,Tiger,America
ITS235,New York,zoo_2,Tiger,America
ITS236,Dallas,zoo,Tiger,America
ITS236,Dallas,zoo,Camel,America
ITS237,Seattle,zoo,Tiger,America
ITS237,Seattle,zoo,Tiger,Russia
ITS237,Seattle,zoo,Tiger,Australia
ITS238,Memphis,park,Tiger,Russia
ITS238,Memphis,zoo,Eagle,America
ITS238,Memphis,library,Kangaroo,Australia
ITS299,Moscow,Mall,Jaguar,Russia
$
$ awk -F"," '$1","$2 == LastKey {
> if ($3 != ORG) {ORG = ORG" "$3}
> if ($4 != ANML) {ANML = ANML" "$4}
> if ($5 != CTRY) {CTRY = CTRY" "$5}
> }
> $1","$2 != LastKey {
> if (ORG != "") {print LastKey","ORG","ANML","CTRY}
> LastKey = $1","$2
> ORG = $3; ANML = $4; CTRY = $5
> }
> END {print LastKey","ORG","ANML","CTRY}' input.txt
ID,place,org,animal,country
ITS234,chicago,zoo USzoo INzoo,Tiger lion zebra,America
ITS235,New York,zoo_1 zoo_2,Tiger,America
ITS236,Dallas,zoo,Tiger Camel,America
ITS237,Seattle,zoo,Tiger,America Russia Australia
ITS238,Memphis,park zoo library,Tiger Eagle Kangaroo,Russia America Australia
ITS299,Moscow,Mall,Jaguar,Russia
$
$
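If the file is not already sorted on those two columns, sort it first and pipe the result into the same awk script. Note that the header line stays on top here only because "ID" happens to sort before "ITS":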
Code:
$
$ # first 2 keys are not sorted in this file
$
$ cat input.txt
ID,place,org,animal,country
ITS237,Seattle,zoo,Tiger,Australia
ITS234,chicago,zoo,Tiger,America
ITS234,chicago,USzoo,lion,America
ITS234,chicago,INzoo,zebra,America
ITS235,New York,zoo_1,Tiger,America
ITS235,New York,zoo_2,Tiger,America
ITS236,Dallas,zoo,Tiger,America
ITS299,Moscow,Mall,Jaguar,Russia
ITS236,Dallas,zoo,Camel,America
ITS237,Seattle,zoo,Tiger,America
ITS237,Seattle,zoo,Tiger,Russia
ITS238,Memphis,park,Tiger,Russia
ITS238,Memphis,zoo,Eagle,America
ITS238,Memphis,library,Kangaroo,Australia
$
$ sort -t"," -k1,2 input.txt |
> awk -F"," '$1","$2 == LastKey {
> if ($3 != ORG) {ORG = ORG" "$3}
> if ($4 != ANML) {ANML = ANML" "$4}
> if ($5 != CTRY) {CTRY = CTRY" "$5}
> }
> $1","$2 != LastKey {
> if (ORG != "") {print LastKey","ORG","ANML","CTRY}
> LastKey = $1","$2;
> ORG = $3; ANML = $4; CTRY = $5
> }
> END {print LastKey","ORG","ANML","CTRY}'
ID,place,org,animal,country
ITS234,chicago,INzoo USzoo zoo,zebra lion Tiger,America
ITS235,New York,zoo_1 zoo_2,Tiger,America
ITS236,Dallas,zoo,Camel Tiger,America
ITS237,Seattle,zoo,Tiger,America Australia Russia
ITS238,Memphis,library park zoo,Kangaroo Tiger Eagle,Australia Russia America
ITS299,Moscow,Mall,Jaguar,Russia
$
$
Hope that helps,
tyler_durden