Post 302649523 by jacobs.smith on Thursday 31st of May 2012 01:34:28 PM
Column content match and add suffix

My input

Code:
chr3    galGal3_xenoRefFlat     CDS     4178235 4178264 0.000000        +       0       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     exon    4178235 4178264 0.000000        +       .	gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     CDS     4178746 4178826 0.000000        +       0       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     exon    4178746 4178826 0.000000        +       .       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     CDS     4179277 4179338 0.000000        +       0       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     exon    4179277 4179338 0.000000        +       .       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     CDS     4184594 4184751 0.000000        +       0       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     exon    4184594 4184751 0.000000        +       .	gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     CDS     4187403 4187538 0.000000        +       1	gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     exon    4187403 4187541 0.000000        +       .       gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";
chr3    galGal3_xenoRefFlat     CDS     4179280 4179336 0.000000        +       0       gene_id "T15C9.2"; transcript_id "T15C9.2";
chr3    galGal3_xenoRefFlat     exon    4179280 4179336 0.000000        +       .	gene_id "T15C9.2"; transcript_id "T15C9.2";
chr3    galGal3_xenoRefFlat     CDS     4180045 4180087 0.000000        +       1	gene_id "AT3G26020"; transcript_id "AT3G26020_dup1";
chr3    galGal3_xenoRefFlat     exon    4180045 4180087 0.000000        +       .       gene_id "AT3G26020"; transcript_id "AT3G26020_dup1";
chr3    galGal3_xenoRefFlat     CDS     4187410 4187538 0.000000        +       0	gene_id "AT3G26020"; transcript_id "AT3G26020_dup1";
chr3    galGal3_xenoRefFlat     exon    4187410 4187541 0.000000        +       .       gene_id "AT3G26020"; transcript_id "AT3G26020_dup1";
chr3    galGal3_xenoRefFlat     CDS     4178746 4178876 0.000000        +       0	gene_id "si687042f02"; transcript_id "si687042f02";
chr3    galGal3_xenoRefFlat     exon    4178746 4178876 0.000000        +       .	gene_id "si687042f02"; transcript_id "si687042f02";

My output

Code:
chr3    galGal3_xenoRefFlat     CDS     4178235 4178264 0.000000        +       0       gene_id "T6J4.19_1"; transcript_id "T6J4.19_1";
chr3    galGal3_xenoRefFlat     exon    4178235 4178264 0.000000        +       .	gene_id "T6J4.19_2"; transcript_id "T6J4.19_2";
chr3    galGal3_xenoRefFlat     CDS     4178746 4178826 0.000000        +       0       gene_id "T6J4.19_2"; transcript_id "T6J4.19_3";
chr3    galGal3_xenoRefFlat     exon    4178746 4178826 0.000000        +       .       gene_id "T6J4.19_3"; transcript_id "T6J4.19_4";
chr3    galGal3_xenoRefFlat     CDS     4179277 4179338 0.000000        +       0       gene_id "T6J4.19_4"; transcript_id "T6J4.19_5";
chr3    galGal3_xenoRefFlat     exon    4179277 4179338 0.000000        +       .       gene_id "T6J4.19_5"; transcript_id "T6J4.19_6";
chr3    galGal3_xenoRefFlat     CDS     4184594 4184751 0.000000        +       0       gene_id "T6J4.19_6"; transcript_id "T6J4.19_7";
chr3    galGal3_xenoRefFlat     exon    4184594 4184751 0.000000        +       .	gene_id "T6J4.19_7"; transcript_id "T6J4.19_8";
chr3    galGal3_xenoRefFlat     CDS     4187403 4187538 0.000000        +       1	gene_id "T6J4.19_8"; transcript_id "T6J4.19_9";
chr3    galGal3_xenoRefFlat     exon    4187403 4187541 0.000000        +       .       gene_id "T6J4.19_9"; transcript_id "T6J4.19_10";
chr3    galGal3_xenoRefFlat     CDS     4179280 4179336 0.000000        +       0       gene_id "T15C9.2_1"; transcript_id "T15C9.2_1";
chr3    galGal3_xenoRefFlat     exon    4179280 4179336 0.000000        +       .	gene_id "T15C9.2_2"; transcript_id "T15C9.2_2";
chr3    galGal3_xenoRefFlat     CDS     4180045 4180087 0.000000        +       1	gene_id "AT3G26020_1"; transcript_id "AT3G26020_dup1_1";
chr3    galGal3_xenoRefFlat     exon    4180045 4180087 0.000000        +       .       gene_id "AT3G26020_2"; transcript_id "AT3G26020_dup1_2";
chr3    galGal3_xenoRefFlat     CDS     4187410 4187538 0.000000        +       0	gene_id "AT3G26020_3"; transcript_id "AT3G26020_dup1_3";
chr3    galGal3_xenoRefFlat     exon    4187410 4187541 0.000000        +       .       gene_id "AT3G26020_4"; transcript_id "AT3G26020_dup1_4";
chr3    galGal3_xenoRefFlat     CDS     4178746 4178876 0.000000        +       0	gene_id "si687042f02_1"; transcript_id "si687042f02_1";
chr3    galGal3_xenoRefFlat     exon    4178746 4178876 0.000000        +       .	gene_id "si687042f02_2"; transcript_id "si687042f02_2";

Basically, I want to compare the gene_id and transcript_id (which are separated by a space) of each row with those of the other rows. If they are the same, I would like to append a running suffix, i.e. _1, _2, and so on.

Please note that the columns are separated by spaces.

Thanks for your help.
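
One possible way to do this is sketched below in Python. It is not taken from the thread and it rests on a few assumptions: every row ends with `gene_id "..."; transcript_id "...";`, only the first alias before a `;` inside the quotes is kept (as in the desired output, where `T6J4.19; T6J4_19` becomes `T6J4.19`), and the same running counter is applied to both fields. The gene_id suffixes for T6J4.19 in the sample output do not line up with the transcript_id ones, so the sketch treats that as a typo. The names `ATTR_RE`, `first_name` and `add_suffixes` are made up for this example.

```python
import re
from collections import defaultdict

# Captures the quoted values of gene_id and transcript_id at the end of a row.
ATTR_RE = re.compile(r'gene_id "([^"]*)"; transcript_id "([^"]*)";')


def first_name(value):
    """Keep only the first alias, e.g. 'T6J4.19; T6J4_19' -> 'T6J4.19'."""
    return value.split(";")[0].strip()


def add_suffixes(lines):
    """Append _1, _2, ... to rows that share the same gene_id/transcript_id."""
    counts = defaultdict(int)  # (gene, transcript) -> rows seen so far
    out = []
    for line in lines:
        line = line.rstrip("\n")
        m = ATTR_RE.search(line)
        if m is None:
            # Rows without the expected attributes are passed through as-is.
            out.append(line)
            continue
        gene = first_name(m.group(1))
        transcript = first_name(m.group(2))
        counts[(gene, transcript)] += 1
        n = counts[(gene, transcript)]
        new_attr = f'gene_id "{gene}_{n}"; transcript_id "{transcript}_{n}";'
        # Everything before the attributes (columns 1-8) is left untouched.
        out.append(line[:m.start()] + new_attr + line[m.end():])
    return out


sample = [
    'chr3 galGal3_xenoRefFlat CDS 4178235 4178264 0.000000 + 0 '
    'gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";',
    'chr3 galGal3_xenoRefFlat exon 4178235 4178264 0.000000 + . '
    'gene_id "T6J4.19; T6J4_19"; transcript_id "T6J4.19; T6J4_19";',
    'chr3 galGal3_xenoRefFlat CDS 4180045 4180087 0.000000 + 1 '
    'gene_id "AT3G26020"; transcript_id "AT3G26020_dup1";',
]
for row in add_suffixes(sample):
    print(row)
```

To run it on a real file, the `sample` list could be replaced by an open file object, for example `add_suffixes(open("input.gtf"))`, where `input.gtf` is a placeholder name. Because the counter is keyed on the gene_id/transcript_id pair, rows belonging to the same pair do not need to be adjacent in the file.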
 
