Post 302724527 by empyrean on Wednesday 31st of October 2012 06:14:45 PM
Count and merge using common column

I have the following records from multiple files.

Code:
415     A       G
415     A       G
415     A       T
415     A       .
415     A       .
421     G       A
421     G       A,C
421     G       A
421     G       A
421     G       A,C
421     G       .
427     A       C
427     A       C
427     A       .
427     A       .

1) I want to remove the rows that have "." in the third column.
2) Then I want to count the remaining rows and merge them based on the first column.
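
Step 1 on its own could be sketched as a one-line awk filter. This assumes the fields are whitespace-separated and that a lone "." is the only marker for a missing third column; `drop_dots` is a hypothetical name used only for illustration.

```shell
# Hypothetical helper: print only the records whose third field is not ".".
# Reads the files named as arguments, or standard input if none are given.
drop_dots() {
  awk '$3 != "."' "$@"
}

# Example call (the file name is an assumption):
#   drop_dots file
```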

I want output like this

Code:
 
      3  2,1    415     A       G/T
      5  3,2    421     G       A/A,C
      2         427     A       C

Take the first row, "3 2,1 415 A G/T":

3 - how many times 415 is repeated
2,1 - counting unique rows gives the pattern "415 A G" twice and the pattern "415 A T" once. I want to merge these into a single row, "3 2,1 415 A G/T".


I used this command to count the unique rows, but I am unable to merge and combine the columns:

Code:
awk '$3 ~ /[ACGT]/' file | sort | uniq -c

With the above command I get the following output:

Code:
      2 415     A       G
      1 415     A       T
      3 421     G       A
      2 421     G       A,C
      2 427     A       C
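
One possible way to get from these per-pattern counts to the merged layout is to do the counting and merging in a single awk pass. The sketch below is a minimal illustration under stated assumptions, not a definitive answer: it assumes whitespace-separated input, groups on the first two columns together, lists the patterns in the order they first appear, writes tab-separated output, and leaves the breakdown column empty when a key has only one pattern (as in the 427 row of the desired output). The name `count_merge` is hypothetical.

```shell
# Hypothetical helper: count and merge three-column records by their first two columns.
# Reads the files named as arguments, or standard input if none are given.
count_merge() {
  awk -v OFS='\t' '
    $3 != "." {                                   # step 1: skip records whose third field is "."
      key = $1 OFS $2
      if (!(key in total)) order[++nkeys] = key   # remember keys in first-seen order
      total[key]++                                # total records for this key
      if (!((key, $3) in cnt)) {                  # first time this pattern is seen for the key
        nalt[key]++
        alt[key, nalt[key]] = $3
      }
      cnt[key, $3]++                              # per-pattern count
    }
    END {
      for (i = 1; i <= nkeys; i++) {
        key = order[i]
        counts = alts = ""
        for (j = 1; j <= nalt[key]; j++) {
          a = alt[key, j]
          counts = counts (j > 1 ? "," : "") cnt[key, a]
          alts   = alts   (j > 1 ? "/" : "") a
        }
        if (nalt[key] == 1) counts = ""           # single pattern: leave the breakdown empty
        print total[key], counts, key, alts
      }
    }
  ' "$@"
}

# Example call (the file name is an assumption):
#   count_merge file
```

Keeping the first-seen order in the `order` and `alt` arrays avoids relying on the iteration order of `for (k in array)`, which awk does not define. If the patterns should be listed in sorted order instead, the input can be piped through `sort` before calling the function.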

 
