how to fetch substring from records into another file
Post 302220690 by aigles on Friday 1st of August 2008 11:57:08 AM
The order of the output corresponds to the order of the headers in file 1.
The logic is:
For each record in file 1,
perform the extractions specified in file 2 for that record.
The script:
Code:
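awk '

# First file on the command line (sm2.dat): load the extraction list.
NR==FNR {
   # The key width is taken from the first record; all keys are assumed
   # to have that width. The +1 accounts for the leading ">" of a header.
   if (NR==1)
      Key_len = length($1) + 1;
   k = ">" substr($0, 1, Key_len-1);
   n = ++Keys[k];            # number of extractions stored for this key
   From[k,n] = $2;
     To[k,n] = $3;
    Len[k,n] = $3 - $2;      # the character at position To is not extracted
   next;
}

# Print every extraction requested for the record held in Header and
# Alphabets, wrapped at 70 characters per line.
function print_selected(    i,k,p,str) {
   if (selected) {
      k = substr(Header, 1, Key_len);
      for (i=1; i<=Keys[k]; i++) {
         printf("%s (%s-%s)\n", Header, From[k,i], To[k,i]);
         str = substr(Alphabets, From[k,i], Len[k,i]);
         p = 1;
         # Loop on the extracted length so that a sequence shorter than
         # the requested range does not produce empty lines.
         while (p <= length(str)) {
            print substr(str, p, 70);
            p += 70;
         }
      }
   }
}

# Second file (sm1.dat), header line: flush the previous record, then
# start a new one if its key appears in the extraction list.
/^>/ {
   print_selected();
   selected  = (substr($0, 1, Key_len) in Keys);
   Header    = $0;
   Alphabets = "";
   next;
}

# Sequence line of a selected record: join it to the sequence so far.
selected {
   Alphabets = Alphabets $0;
}

# Flush the last record.
END {
   print_selected();
}
' sm2.dat sm1.dat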
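
The script relies on the NR==FNR test to tell the two input files apart: NR counts records across all files, FNR restarts at 1 for each file, so the two are equal only while the first file is being read. A minimal sketch of that idiom on its own follows; the function name lookup_demo and the two small test files are hypothetical and not part of the script above.

```shell
# Hypothetical helper: print the lines of the second file whose first
# field appears as the first field of some line in the first file.
lookup_demo() {
  awk '
    NR == FNR { seen[$1] = 1; next }   # first file: remember each key
    $1 in seen                         # second file: print matching lines
  ' "$1" "$2"
}

# Usage with two throwaway files
demo_dir=$(mktemp -d)
printf 'a\nc\n'           > "$demo_dir/keys.txt"
printf 'a 1\nb 2\nc 3\n'  > "$demo_dir/data.txt"
lookup_demo "$demo_dir/keys.txt" "$demo_dir/data.txt"
```

With these two files the call prints the lines "a 1" and "c 3". Note that the idiom assumes the first file is not empty; with an empty first file, NR==FNR stays true while the second file is read.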

Input file 1 (sm1.dat):
Code:
>bi|2138271|geb|AAC15885.1|precursor [Sambucus nigra]
MRVIAAAMLYLYIVVLAICSVGIQGIDYPSVSFNLAGAKSATWDFLRMPHDLVGEDNKYNDGEPITGNII
GRDGLCVDVRNGYDTDGTPLQLWPCGTQRNQQWTFYTDDTIRSMGKCMTANGLSNGSNIMIFNCSTAVEN
AIKWEVTIDGSIINPSSG
>bi|21083|em|CAA26939.1| precursor [Ricinus communis]
MKPGGNTIVIWMYAVATWLCFGSTSGWSFTLEDNNIFPKQYPIINFTTAGATVQSYTNFIRAVRGRLTTG
ADVRHEIPVLPNRVGLPINQRFILVELSNHAELSVTLALDVTNAYVVGYRAGNSAYFFHPDNQEDAEAIT
HLFTDVQNRYTFAFGGNYDRLEQLAGNLRENIELGNGPLEEAISALYYYSTGGTQLPTL
>bi|19526601|geb|AAL87006.1| chain A [Viscum album]
YERLRLRVTHQTTGEEYFRFITLLRDYVSSGSFSNEIPLLRQSTIPVSDAQRFVLVELTNEGGDSITAAI
DVTNLYVVAYQAGDQSYFLRDAPRGAETHLFTGTTRSSLPFNGSYPDLERYAGHRDQIPLGIDQLIQSVT
ALRFPGGNTRTQARSILILIQMISEAARFNPILWRARQYINSGASFLPDVY

Input file 2 (sm2.dat):
Code:
bi|2138271|geb|AAC15885 92      110
bi|19526601|geb|AAL8700 74      92
bi|2138271|geb|AAC15885 20      132
bi|21083|em|CAA26939.1| 19      37
bi|21083|em|CAA26939.1| 52      70
bi|2138271|geb|AAC15885 26      38
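
Each key in sm2.dat is 23 characters wide and is not a complete identifier: it is the beginning of a header of sm1.dat without the leading ">". The comparison the script makes can be sketched as below, assuming every key has the same width; the function name key_matches_header is hypothetical.

```shell
# Hypothetical helper, not part of the script above.
# $1 = key as written in sm2.dat, $2 = header line from sm1.dat
key_matches_header() {
  awk -v key="$1" -v header="$2" 'BEGIN {
    key_len = length(key) + 1            # +1 for the leading ">"
    if (substr(header, 1, key_len) == (">" key)) {
      print "match"
    } else {
      print "no match"
    }
  }'
}

key_matches_header 'bi|19526601|geb|AAL8700' '>bi|19526601|geb|AAL87006.1| chain A [Viscum album]'
key_matches_header 'bi|21083|em|CAA26939.1|' '>bi|2138271|geb|AAC15885.1|precursor [Sambucus nigra]'
```

The first call prints "match" and the second "no match". Because only a prefix is compared, a key shorter than the others in sm2.dat would need a different approach.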

Output:
Code:
>bi|2138271|geb|AAC15885.1|precursor [Sambucus nigra] (92-110)
LWPCGTQRNQQWTFYTDD
>bi|2138271|geb|AAC15885.1|precursor [Sambucus nigra] (20-132)
SVGIQGIDYPSVSFNLAGAKSATWDFLRMPHDLVGEDNKYNDGEPITGNIIGRDGLCVDVRNGYDTDGTP
LQLWPCGTQRNQQWTFYTDDTIRSMGKCMTANGLSNGSNIMI
>bi|2138271|geb|AAC15885.1|precursor [Sambucus nigra] (26-38)
IDYPSVSFNLAG
>bi|21083|em|CAA26939.1| precursor [Ricinus communis] (19-37)
LCFGSTSGWSFTLEDNNI
>bi|21083|em|CAA26939.1| precursor [Ricinus communis] (52-70)
TVQSYTNFIRAVRGRLTT
>bi|19526601|geb|AAL87006.1| chain A [Viscum album] (74-92)
NLYVVAYQAGDQSYFLRD
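
Note that the script computes the length as To - From, so the character at the To position is not part of the extraction: (92-110) yields 18 characters, not 19. If the end position should be included, the length would need one more character. A small sketch of the difference, with a hypothetical function name slice_demo and a made-up test string:

```shell
# Hypothetical helper: show both ways of turning from/to into a length.
# $1 = string, $2 = from position, $3 = to position
slice_demo() {
  awk -v s="$1" -v from="$2" -v to="$3" 'BEGIN {
    print substr(s, from, to - from)       # as in the script: "to" excluded
    print substr(s, from, to - from + 1)   # variant: "to" included
  }'
}

slice_demo ABCDEFGHIJ 3 6
```

This prints "CDE" and then "CDEF". Which of the two is wanted depends on how the positions in sm2.dat are meant to be read.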

Jean-Pierre.
 
