Merge group numbers and add a column containing group names

# 1  
Old 12-08-2009
Hi All,
I have a file with 6 columns, like the one below. The data is split into groups, and each group's number appears on its own line above the group.

Code:
1
1       12      26      289     3.2e-027        GCGTATGGCGGC
2       12      26      215     6.7e+006        TTCCACCTTTTG
3       9       26      175     8.9e+016        GCGGTAACT
4       20      26      232     1.7e+013        TTTTTATTTTTTTTTTTTCC
5       7       26      161     7.2e+019        ATGCAAA
6       7       26      161     4.2e+019        CTTCAAA
7       7       26      144     7.4e+025        AGAAAAA
8       7       26      155     2.6e+021        TAGGCTG
9       9       26      148     7.3e+028        AATTTATTC
10      7       26      156     1.8e+021        TTGATTT
2
1       16      37      404     2.3e-025        AAAATTGCATGCATGC
2       12      37      351     6.1e-009        AAGAAAAAAAAA
3       9       37      328     1.5e-007        TTTGCCGCC
4       20      37      369     1.2e+001        AAAAGAGGAAAAAAAAAAAA
5       9       37      295     3.1e+007        ATGCATGTA
6       9       37      280     3.3e+014        CATTTTTTT
7       16      37      313     6.1e+015        AGAGAAAAATTAAAAA
8       11      37      288     7.5e+015        AATAATTTGAG
9       7       37      247     4.5e+023        GGAAAGG
4       20      37      369     1.2e+001        AAAAGAGGAAAAAAAAAAAA
3
1       11      36      329     6.0e-012        ATTTGCATGCA
2       7       36      277     7.0e+001        GTGGGGA
3       9       36      273     3.9e+008        CTTACATGC
4       12      36      287     7.1e+010        AAAAAAAGTAAA
5       9       36      254     1.9e+017        ATTTGGCGA
6       7       36      228     6.7e+023        TCCCTTC
7       12      36      255     2.8e+024        TAATAATTTATT
8       16      36      252     5.6e+032        TTTTAAAGAATAATCA
9       16      36      228     1.3e+042        TTTTTTCTGTATTATT
10      12      36      224     5.1e+035        CCACATAAAAAT
.
.
.
.

150
1       7       11      102     7.0e-001        CCCGCCA
2       7       11      90      2.0e+005        GCACTTT
3       12      11      108     7.0e+004        CCCCCAACAATA
4       9       11      94      3.4e+007        GATTTGGAA
5       7       11      87      1.1e+007        AAGAGCT
6       9       11      91      2.1e+009        ATTAAGTTT
7       7       11      84      7.0e+007        CTGGTCA
8       12      11      100     4.4e+009        TTTATTAATCAT
9       7       11      77      3.0e+011        ATTTATG
10      12      11      90      1.7e+013        CATTTTTTTTAC

I want to add another column (separated by a tab) so that the file looks like this:

Code:
1 1       12      26      289     3.2e-027        GCGTATGGCGGC
1 2       12      26      215     6.7e+006        TTCCACCTTTTG
1 3       9       26      175     8.9e+016        GCGGTAACT
1 4       20      26      232     1.7e+013        TTTTTATTTTTTTTTTTTCC
1 5       7       26      161     7.2e+019        ATGCAAA
2 1       16      37      404     2.3e-025        AAAATTGCATGCATGC
2 2       12      37      351     6.1e-009        AAGAAAAAAAAA
2 3       9       37      328     1.5e-007        TTTGCCGCC
2 4       20      37      369     1.2e+001        AAAAGAGGAAAAAAAAAAAA
2 5       9       37      295     3.1e+007        ATGCATGTA
2 6       9       37      280     3.3e+014        CATTTTTTT
.
.
.
.
150 1       7       11      102     7.0e-001        CCCGCCA
150 2       7       11      90      2.0e+005        GCACTTT
150 3       12      11      108     7.0e+004        CCCCCAACAATA
150 4       9       11      94      3.4e+007        GATTTGGAA
150 5       7       11      87      1.1e+007        AAGAGCT

Basically, I want to add a column that contains each group's number, delete the group heading lines, and merge the groups into one table.

Is there an efficient way to do this with shell scripting? I have thousands of such small groups to process.

Please let me know.

LA

Last edited by radoulov; 12-09-2009 at 10:09 AM.. Reason: Added code tags.
# 2  
Old 12-09-2009
Code:
awk 'NF==1{o=$1;next}$0=o"\t"$0' infile
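
The one-liner is terse, so here is a spelled-out sketch of the same idea. A line with a single field is taken as a group heading: its value is stored and the line is dropped. Every other non-empty line gets the stored value and a tab prepended. In the original, the assignment `$0=o"\t"$0` doubles as the pattern; its result is a non-empty string, hence true, so awk's default print action fires. The function name `add_group_column` and the variable `group` are made up for illustration, and the explicit skip of blank lines is an assumption about what is wanted (the one-liner would print the group number followed by a tab for a blank line). As in the one-liner, any single-field line counts as a heading, so placeholder lines such as "." must not appear in the real file.

```shell
#!/bin/sh
# Hypothetical helper: prefix every data row with the number of its group.
# $1 is the input file, laid out as in the question.
add_group_column() {
    awk '
        # Heading line (exactly one field): remember the group number, print nothing.
        NF == 1 { group = $1; next }
        # Data row (more than one field): prepend the group number and a tab.
        # Blank lines have NF == 0 and match neither rule, so they are dropped.
        NF > 1  { print group "\t" $0 }
    ' "$1"
}
```

It could be called as `add_group_column infile > outfile`, where `outfile` is just an example name. awk reads the file once, line by line, so thousands of small groups should not be a problem.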
