Modify record headers from file


 
# 1  
Old 08-22-2011

Dear All

I was wondering if anybody could help me with a script I am struggling with. I have to add comments to the header lines (those starting with >) of one file, taking the comments from another file and matching them on the ID tag. Only the header lines should be modified; the data lines must stay unchanged.

Code:
cat File.txt
>H1_A1_A2_A3_A4_ID1_A5
A:B:C:S:E:E:K
>H2_A1_A2_A3_ID2_A5
A:B:C:S:E:E:K
D:E:G
>H3_A1_A2_A3_ID5_A5
A:B:S:R:E:K
D:E

Code:
cat comments.txt
ID1;NT001;QTR231
ID2;NT002;QTR251
ID3;NT011;QTR331
ID4;NT022;QTR311
ID5;NT023;QTR551

Desired output (RESULTS.txt):
Code:
>H1_A1_A2_A3_A4_ID1_A5 ID1 NT001 QTR231
A:B:C:S:E:E:K
>H2_A1_A2_A3_ID2_A5 ID2 NT002 QTR251
A:B:C:S:E:E:K
D:E:G
>H3_A1_A2_A3_ID5_A5 ID5 NT023 QTR551
A:B:S:R:E:K
D:E

My best attempt so far:

Code:
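#! /bin/bash
# Usage: script.sh File.txt comments.txt

# Collect the ID tag (second-to-last "_" field) of every header line.
grep ">" "$1" | awk -F_ '{print $(NF-1)}' > IDtag.tmp

while read -r ID
do
    # Comment fields for this ID, with the ID column blanked out.
    COMMENT=$(grep "$ID" "$2" | awk -F ";" '{$1="";print}')
    HEADER=$(grep "$ID" "$1")
    #DATA=`grep -v "$ID" $1`
    # Prints only the annotated header; the data lines are still missing.
    echo $HEADER $COMMENT
    #echo $DATA

done < IDtag.tmp

rm IDtag.tmp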

Thank you very much for your help!

Last edited by pludi; 08-22-2011 at 03:30 PM..
# 2  
Old 08-22-2011
Try:
Code:
awk 'NR==FNR{gsub(";"," ");a[$1]=$0;next}/^>/{x=$0;sub(".*_ID","ID",x);sub("_.*","",x);$0=$0" "a[x]}1' comments.txt File.txt
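
For readability, the same lookup idea can be spelled out over several lines. This is only a sketch, not the poster's code: the function name annotate_headers is made up here, and it assumes (as the original attempt does) that the ID tag is always the second-to-last "_"-separated field of a header line. Headers whose ID has no entry in the comments file are printed unchanged.

```shell
#!/bin/sh
# annotate_headers COMMENTS FILE   (hypothetical helper name)
# Appends the matching comment record to each ">" header line of FILE.
annotate_headers() {
    awk '
        # First file (comments): index each record by its ID field.
        NR == FNR {
            split($0, f, ";")
            line = $0
            gsub(/;/, " ", line)        # "ID1;NT001;QTR231" -> "ID1 NT001 QTR231"
            comment[f[1]] = line
            next
        }
        # Second file: header lines start with ">".
        /^>/ {
            n = split($0, part, "_")
            id = part[n - 1]            # assumed: ID tag is the second-to-last field
            if (id in comment)
                $0 = $0 " " comment[id]
        }
        { print }                       # data lines pass through unchanged
    ' "$1" "$2"
}

# Small demonstration with throwaway files.
tmpdir=$(mktemp -d)
printf '%s\n' 'ID1;NT001;QTR231' 'ID5;NT023;QTR551' > "$tmpdir/comments.txt"
printf '%s\n' '>H1_A1_A2_A3_A4_ID1_A5' 'A:B:C:S:E:E:K' > "$tmpdir/File.txt"
annotate_headers "$tmpdir/comments.txt" "$tmpdir/File.txt"
rm -rf "$tmpdir"
```

The demonstration prints the header with " ID1 NT001 QTR231" appended, followed by the untouched data line. Like the one-liner, it relies on NR==FNR, so it expects a non-empty comments file as the first argument.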

This User Gave Thanks to bartus11 For This Post:
# 3  
Old 08-22-2011
Another one:

Code:
gawk -F";" 'NR==FNR{a[$1]=$1" "$2" "$3;next} {for(ID in a){if($0~ID)print $0, a[ID]}} $0!~">.*"{print $0}' comments.txt File.txt
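
One thing to keep in mind with a regex test such as $0~ID: it is a substring match, so an ID like ID1 would also match a header tagged ID10. The sample data in this thread has no such clash; the IDs and header below are hypothetical, chosen only to show the difference between a substring test and an exact comparison on the second-to-last "_" field.

```shell
#!/bin/sh
# Hypothetical IDs chosen so that one is a prefix of the other.
ids='ID1 ID10'
header='>H4_A1_A2_ID10_A5'

# Substring (regex) test, as in $0 ~ ID: count how many IDs match the header.
substring_count=$(printf '%s\n' "$header" | awk -v ids="$ids" '
    { n = split(ids, id, " "); c = 0
      for (i = 1; i <= n; i++) if ($0 ~ id[i]) c++
      print c }')

# Exact comparison against the second-to-last "_" field.
exact_count=$(printf '%s\n' "$header" | awk -F_ -v ids="$ids" '
    { n = split(ids, id, " "); c = 0
      for (i = 1; i <= n; i++) if ($(NF-1) == id[i]) c++
      print c }')

echo "substring matches: $substring_count"   # 2 (ID1 and ID10)
echo "exact matches:     $exact_count"       # 1 (ID10 only)
```

With real data where no ID is a prefix of another, both tests give the same answer.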

This User Gave Thanks to dude2cool For This Post:
# 4  
Old 08-22-2011
Hi,

Using 'perl':
Code:
$ cat infile
>H1_A1_A2_A3_A4_ID1_A5
A:B:C:S:E:E:K
>H2_A1_A2_A3_ID2_A5
A:B:C:S:E:E:K
D:E:G
>H3_A1_A2_A3_ID5_A5
A:B:S:R:E:K
D:E
$ cat script.pl
use warnings;
use strict;
use autodie;

my %comments;

@ARGV == 2 or die qq(Usage: perl $0 file comments\n);

open my $file_h, "<", $ARGV[0];
open my $comments_h, "<", $ARGV[1];

while ( <$comments_h> ) {
        chomp;
        next if /^\s*$/ || ! /;/;
        my @f = split /;/;
        $comments{ $f[0] } = join " ", @f;
}

while ( <$file_h> ) {
        chomp;
        if ( index( $_, ">" ) == 0 ) {
                my @f = split /_/;
                # Parentheses are needed: "." binds tighter than "||".
                printf "%s\n", $_ . " " . ( $comments{ $f[$#f - 1] } || "" );
                next;
        }

        printf "%s\n", $_;
}
$ perl script.pl infile comments 
>H1_A1_A2_A3_A4_ID1_A5 ID1 NT001 QTR231
A:B:C:S:E:E:K
>H2_A1_A2_A3_ID2_A5 ID2 NT002 QTR251
A:B:C:S:E:E:K
D:E:G
>H3_A1_A2_A3_ID5_A5 ID5 NT023 QTR551
A:B:S:R:E:K
D:E

Regards,
Birei
This User Gave Thanks to birei For This Post:
# 5  
Old 08-23-2011
Thank you all!

Dear bartus11, dear dude2cool,
thank you for the awk and gawk lines - both suggestions work perfectly!

Dear birei,
thanks for the Perl code. It works if I remove the "use autodie" line.