The UNIX and Linux Forums  


Closed Thread
  #1 (permalink)  
Old 05-06-2009
pinnacle pinnacle is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 182
Combining many lines to one using awk or any unix cmd



Inputfile:
Quote:
ID,place,org,animal,country
ITS234,chicago,zoo,Tiger,America
ITS234,chicago,zoo,lion,America
ITS234,chicago,zoo,zebra,America
Output :
Quote:
ID,place,org,animal,country
ITS234,chicago,zoo,Tiger lion zebra,America
Appreciate help on this.
  #2 (permalink)  
Old 05-06-2009
vgersh99 vgersh99 is offline Forum Staff  
Moderator
  
 

Join Date: Feb 2005
Location: Boston, MA
Posts: 5,131
zenith,
where exactly are you stuck in your code?
  #3 (permalink)  
Old 05-06-2009
pinnacle pinnacle is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 182
Quote:
Originally Posted by vgersh99 View Post
zenith,
where exactly are you stuck in your code?

I got this far, but it doesn't work:

Code:
nawk -F, '{k = ($1 SUBSEP $2 SUBSEP $3); u[k]++; d[k] = d[k] ? d[k] RS $0 : $0} END { for (K in u) if (u[K] > 1) print d[K] }' infile


I modified the input file as shown below.
Quote:
Inputfile:
ID,place,org,animal,country
ITS234,chicago,zoo,Tiger,America
ITS234,chicago,USzoo,lion,America
ITS234,chicago,INzoo,zebra,America

Output file:

Quote:
ITS234,chicago,zoo USzoo INzoo,Tiger lion zebra,America

The key is the first 2 columns of the file.
If the first 2 columns match across records, then each of the remaining columns is combined into a single column.

This is complex to implement.
Help is highly appreciated.
  #4 (permalink)  
Old 05-06-2009
siquadri siquadri is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 44
I don't think this can be implemented in Unix.
  #5 (permalink)  
Old 05-06-2009
durden_tyler durden_tyler is offline Forum Advisor  
Registered User
  
 

Join Date: Apr 2009
Posts: 553
Quote:
Originally Posted by zenith View Post
The key is the first 2 columns of the file.
If the first 2 columns match across records, then each of the remaining columns is combined into a single column.

This is complex to implement.
Help is highly appreciated.
Assuming the first two keys are already sorted in your file:


Code:
$ 
$ cat input.txt
ID,place,org,animal,country
ITS234,chicago,zoo,Tiger,America
ITS234,chicago,USzoo,lion,America
ITS234,chicago,INzoo,zebra,America
ITS235,New York,zoo_1,Tiger,America
ITS235,New York,zoo_2,Tiger,America
ITS236,Dallas,zoo,Tiger,America
ITS236,Dallas,zoo,Camel,America
ITS237,Seattle,zoo,Tiger,America
ITS237,Seattle,zoo,Tiger,Russia
ITS237,Seattle,zoo,Tiger,Australia
ITS238,Memphis,park,Tiger,Russia
ITS238,Memphis,zoo,Eagle,America
ITS238,Memphis,library,Kangaroo,Australia
ITS299,Moscow,Mall,Jaguar,Russia
$
$ awk -F"," '$1","$2 == LastKey {
>   if ($3 != ORG) {ORG = ORG" "$3}
>   if ($4 != ANML) {ANML = ANML" "$4}
>   if ($5 != CTRY) {CTRY = CTRY" "$5}
> }
> $1","$2 != LastKey {
>   if (ORG != "") {print LastKey","ORG","ANML","CTRY}
>   LastKey = $1","$2
>   ORG = $3; ANML = $4; CTRY = $5
> }
> END {print LastKey","ORG","ANML","CTRY}' input.txt
ID,place,org,animal,country
ITS234,chicago,zoo USzoo INzoo,Tiger lion zebra,America
ITS235,New York,zoo_1 zoo_2,Tiger,America
ITS236,Dallas,zoo,Tiger Camel,America
ITS237,Seattle,zoo,Tiger,America Russia Australia
ITS238,Memphis,park zoo library,Tiger Eagle Kangaroo,Russia America Australia
ITS299,Moscow,Mall,Jaguar,Russia
$
$
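
One thing worth noticing in the script above: each test such as `$3 != ORG` compares the new value against the whole accumulated string, so a value that reappears after a different one (zoo, park, zoo) would be appended a second time. The sample data never hits that case. A minimal sketch that remembers each value per group instead, still assuming the input is already grouped on the first two fields; the names `merge_sorted`, `add` and `seen` are made up for this illustration and are not from the original post:

```shell
# Hypothetical helper: reads comma-separated records on stdin, already
# grouped on fields 1 and 2, and prints one merged line per group.
merge_sorted() {
  awk -F, '
    # Append val to list unless this column has already seen it in this group.
    function add(list, col, val) {
      if ((col, val) in seen) return list
      seen[col, val] = 1
      return (list == "" ? val : list " " val)
    }
    {
      k = $1 "," $2
      if (NR == 1 || k != key) {
        # Key changed: emit the finished group and start a new one.
        if (NR > 1) print key "," org "," anml "," ctry
        key = k
        org = anml = ctry = ""
        split("", seen)          # portable way to empty an array
      }
      org  = add(org,  3, $3)
      anml = add(anml, 4, $4)
      ctry = add(ctry, 5, $5)
    }
    END { if (NR > 0) print key "," org "," anml "," ctry }
  '
}
```

It would be called as `merge_sorted < input.txt`. Because output is written as soon as a key changes, only one group is held in memory at a time.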

And if they are not, then you will have to sort them before you pipe it to the awk script:


Code:
$ 
$ # first 2 keys are not sorted in this file
$                                           
$ cat input.txt                             
ID,place,org,animal,country                 
ITS237,Seattle,zoo,Tiger,Australia          
ITS234,chicago,zoo,Tiger,America
ITS234,chicago,USzoo,lion,America
ITS234,chicago,INzoo,zebra,America
ITS235,New York,zoo_1,Tiger,America
ITS235,New York,zoo_2,Tiger,America
ITS236,Dallas,zoo,Tiger,America
ITS299,Moscow,Mall,Jaguar,Russia
ITS236,Dallas,zoo,Camel,America
ITS237,Seattle,zoo,Tiger,America
ITS237,Seattle,zoo,Tiger,Russia
ITS238,Memphis,park,Tiger,Russia
ITS238,Memphis,zoo,Eagle,America
ITS238,Memphis,library,Kangaroo,Australia
$
$ sort -t"," -k1,2 input.txt |
> awk -F"," '$1","$2 == LastKey {
>   if ($3 != ORG) {ORG = ORG" "$3}
>   if ($4 != ANML) {ANML = ANML" "$4}
>   if ($5 != CTRY) {CTRY = CTRY" "$5}
> }
> $1","$2 != LastKey {
>   if (ORG != "") {print LastKey","ORG","ANML","CTRY}
>   LastKey = $1","$2;
>   ORG = $3; ANML = $4; CTRY = $5
> }
> END {print LastKey","ORG","ANML","CTRY}'
ID,place,org,animal,country
ITS234,chicago,INzoo USzoo zoo,zebra lion Tiger,America
ITS235,New York,zoo_1 zoo_2,Tiger,America
ITS236,Dallas,zoo,Camel Tiger,America
ITS237,Seattle,zoo,Tiger,America Australia Russia
ITS238,Memphis,library park zoo,Kangaroo Tiger Eagle,Australia Russia America
ITS299,Moscow,Mall,Jaguar,Russia
$
$
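
Two side effects of the plain sort are visible in the output above: the header line is sorted along with the data (it stays on top only because "ID" happens to collate before "ITS..."), and the merged values come out in sorted order rather than input order (INzoo USzoo zoo). A rough sketch that avoids both, assuming a sort that supports the `-s` (stable) option; GNU and BSD sort have it, but it is not required by POSIX. The function name `sort_keep_header` is invented for this example:

```shell
# Hypothetical helper: sort_keep_header FILE
# Prints the header line untouched, then the remaining lines sorted on
# fields 1-2 only. With -s, lines with equal keys keep their input order.
sort_keep_header() {
  head -n 1 "$1"
  tail -n +2 "$1" | sort -s -t, -k1,2
}
```

The result can be piped into the same awk script, for example `sort_keep_header input.txt | awk -F"," '...'`. It takes a file name rather than stdin because `head` may read more than one line from a pipe.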

Hope that helps,
tyler_durden

__________________________________________________
"Without pain, without sacrifice, we would have nothing."
  #6 (permalink)  
Old 05-06-2009
pinnacle pinnacle is offline
Registered User
  
 

Join Date: Apr 2009
Posts: 182
tyler_durden
Can you please explain the code?
  #7 (permalink)  
Old 05-07-2009
durden_tyler durden_tyler is offline Forum Advisor  
Registered User
  
 

Join Date: Apr 2009
Posts: 553
Quote:
Originally Posted by zenith View Post
tyler_durden
Can you please explain the code.
Nope, I really don't want to rob you of the joy of discovering things by yourself.

The script is pretty brief and self-explanatory. Try it out on some sample data, comment out different portions and see how the result changes, check the syntax against the man pages or the online gawk manual ("http://www.gnu.org/software/gawk/manual/gawk.html"), put in some effort, and soon enough you'll figure it out yourself. And then you'll *know* it real good.
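
As a starting point for that kind of experimenting, here is the same script wrapped in a shell function with comments added. The logic is meant to mirror the script posted earlier in the thread; the wrapper name `merge_by_last_key` and the extra parentheses are additions for this sketch only:

```shell
# Hypothetical wrapper: reads records on stdin, sorted/grouped on fields 1-2.
merge_by_last_key() {
  awk -F, '
    # Rule 1: same key as the previous record, so extend the running lists.
    # Concatenation binds tighter than == in awk, so the parentheses are
    # optional; they are here only for readability.
    ($1 "," $2) == LastKey {
      # Each test compares against the WHOLE accumulated string, so it only
      # skips a value while every value so far has been identical.
      if ($3 != ORG)  ORG  = ORG  " " $3
      if ($4 != ANML) ANML = ANML " " $4
      if ($5 != CTRY) CTRY = CTRY " " $5
    }
    # Rule 2: the key changed. Print the finished group (nothing is printed
    # before the first record), then start a new group from this record.
    ($1 "," $2) != LastKey {
      if (ORG != "") print LastKey "," ORG "," ANML "," CTRY
      LastKey = $1 "," $2
      ORG = $3; ANML = $4; CTRY = $5
    }
    # The last group is still pending when the input ends.
    END { print LastKey "," ORG "," ANML "," CTRY }
  '
}
```

Run it as `merge_by_last_key < input.txt`. Commenting out Rule 1, for instance, leaves only the first record of every group in the output, which shows quickly what that rule contributes.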

tyler_durden
