De-group data


 
# 1  
Old 11-01-2013

Hi, some help would be highly appreciated. I want to de-group my data for statistical analysis. I made up some sample data; there shouldn't be any repeated lines in the output. My data is in Excel, but I can export it as tab-delimited text.

Code:
A B,C
A B,D,E
X Y
X Y,Z

Expected output

Code:
A B
A C
A D
A E
X Y
X Z

# 2  
Old 11-01-2013
Try:
Code:
awk '{n=split($2,a,","); for (i=1;i<=n;i++) print $1,a[i]}' file | sort | uniq

This User Gave Thanks to bartus11 For This Post:
# 3  
Old 11-01-2013
Perfecto! Thank you.
# 4  
Old 11-01-2013
If you don't care about the output order, the following should be more efficient:
Code:
awk -F '[ \t,]' '
{ for(i = 2; i <= NF; i++) o[$1"\t"$i] }
END { for(i in o) print i }' file

It also produces tab delimited output instead of space delimited output.

If your input file is tab delimited (instead of space delimited as in your sample input), you can change the first line of this script to:
Code:
awk -F '[\t,]'
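
If the original input order matters but a full sort is not wanted, another option (not one of the scripts posted above) is to print each pair the first time it is seen. A minimal sketch, assuming the sample data from the first post is saved as sample.txt (a made-up file name), using the common awk `!seen[key]++` idiom:

```shell
# Recreate the sample input from the first post (space-delimited).
# sample.txt is a hypothetical file name used only for this sketch.
printf '%s\n' 'A B,C' 'A B,D,E' 'X Y' 'X Y,Z' > sample.txt

# Split on space, tab, or comma; print each "key<TAB>value" pair
# only the first time it appears, so input order is preserved.
result=$(awk -F '[ \t,]' '
{ for (i = 2; i <= NF; i++) {
    pair = $1 "\t" $i
    if (!seen[pair]++) print pair
  }
}' sample.txt)
printf '%s\n' "$result"
```

This keeps the single-pass, no-sort approach but prints as it goes instead of waiting for the END block, so the output order no longer depends on how awk iterates over the array.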

If the output does need to be sorted you can make bartus11's script more efficient by replacing sort | uniq with sort -u.
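
For reference, the sort -u variant might look like the sketch below. It assumes the sample input from the first post is saved as sample.txt, which is a made-up file name:

```shell
# Recreate the sample input from the first post (space-delimited).
# sample.txt is a hypothetical file name used only for this sketch.
printf '%s\n' 'A B,C' 'A B,D,E' 'X Y' 'X Y,Z' > sample.txt

# Split the second field on commas and print one pair per line,
# then let sort -u sort and drop duplicates in a single step.
result=$(awk '{n=split($2,a,","); for (i=1;i<=n;i++) print $1,a[i]}' sample.txt | sort -u)
printf '%s\n' "$result"
```

The output is space delimited, as in the original one-liner; sort -u replaces the separate uniq process.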