Post 302724527 by empyrean on Wednesday 31st of October 2012 06:14:45 PM
Count and merge using common column

I have the following records from multiple files.

Code:
415     A       G
415     A       G
415     A       T
415     A       .
415     A       .
421     G       A
421     G       A,C
421     G       A
421     G       A
421     G       A,C
421     G       .
427     A       C
427     A       C
427     A       .
427     A       .

1) I want to remove the rows that have "." in the third column.
2) I want to count the remaining rows and merge them based on the first column.

I want output like this:

Code:
 
      3  2,1    415     A       G/T
      5  3,2    421     G       A/A,C
      2         427     A       C

Taking the first row, "3 2,1 415 A G/T", as an example:

3 - how many times 415 is repeated once the "." rows are removed
2,1 - the count of each pattern: uniq reports "415 A G" twice and "415 A T" once. I want to merge these into a single final row, "3 2,1 415 A G/T".


I used this command to count the unique rows, but I am unable to merge and combine the columns:

Code:
awk '$3 ~ /[ACGT]/' file | sort | uniq -c

With the above code I get the following output:

Code:
      2 415     A       G
      1 415     A       T
      3 421     G       A
      2 421     G       A,C
      2 427     A       C
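
One possible way to get the merged output is a single awk pass instead of sort | uniq -c. This is only a sketch under some assumptions: the function name summarize is made up for illustration, fields are whitespace-separated, output is tab-separated rather than padded, positions and patterns are printed in order of first appearance, and the per-pattern counts are left empty when a position has only one pattern (as in the desired 427 row).

```shell
# summarize: hypothetical helper name. Reads the named files, or standard
# input when no file is given.
summarize() {
  awk '
    # Skip rows whose third column is "."
    $3 == "." { next }
    {
      key = $1 "\t" $2
      # Remember each position the first time it is seen, to keep input order
      if (!(key in total)) keys[++nkeys] = key
      total[key]++
      # Remember each distinct third-column pattern per position, in order
      if (!((key, $3) in count)) {
        nalt[key]++
        alt[key, nalt[key]] = $3
      }
      count[key, $3]++
    }
    END {
      for (i = 1; i <= nkeys; i++) {
        key = keys[i]
        counts = ""
        patterns = ""
        for (j = 1; j <= nalt[key]; j++) {
          a = alt[key, j]
          counts = counts (j > 1 ? "," : "") count[key, a]
          patterns = patterns (j > 1 ? "/" : "") a
        }
        # Assumption: no breakdown is shown when there is a single pattern
        if (nalt[key] == 1) counts = ""
        printf "%d\t%s\t%s\t%s\n", total[key], counts, key, patterns
      }
    }
  ' "$@"
}

# Sample rows from the post, fed on standard input
summarize <<'EOF'
415 A G
415 A G
415 A T
415 A .
415 A .
421 G A
421 G A,C
421 G A
421 G A
421 G A,C
421 G .
427 A C
427 A C
427 A .
427 A .
EOF
```

With a real file the call would be summarize file. Because the counting is done in awk arrays, the input does not need to be sorted first.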

 
