how to fetch substring from records into another file
Post 302220575 by smriti_shridhar on Friday 1st of August 2008 07:32:50 AM
One more thing Jean..

I ran that script on files that need longer substrings extracted, and the format of the output file gets disturbed. I tried some printf options but couldn't correct it. Please help me out.

===FILE1====
>bi|37779709|geb|AAP20876.1| kinetic [ ternata]
MASKLLLFLLPAILGLIIPRPAVAVGTNYLLSGETLDTDGHLKNGDFDFIMQEDCNAVLYNGNWQSNTAN
KGRDCKLTLTDRGELVINNGEGSAVWRSGSQSAKGNYAAVLHPEGKLVIYGPSVFKINPWVPGLNSLRLG
NVPFTCNMLFSGQVLYGDGKITARNHMLVMQGDCNLVLYGGKCDWQSNTHGNGEHCFLRLNHKGELIIKD
DDFKSIWSSQSSSKQGDYVFILQDNGYGVIYGPAIWATSSKRSVAAQETMIGMVTEKVN
>bi|146403769|geb|ABQ32294.1| pure [an eg]
MAKLLLFLLPAILGLLIPRSAVALGTNYLLSGQTLNTDGHLKNGDFDLVMQNDCNLVLYNGNWQSNTANN
GRDCKLTLTDYGELVIKNGDGSTVWRSRAKSVKGNYAAVLHPDGRLVVFGPSVFKIDPWVPGLNSLRFRN
IPFTDNLLFSGQVLYGDGRLTAKNHQLVMQGDCNLVLYGGKYGWQSNTHGNGEHCFLRLNHKGELIIKDD
DFRPSGAAVPAPSR

===FILE2====
bi|37779709|geb|AAP20876.1| 28 264
bi|146403769|geb|ABQ32294.1| 27 224

===OUTPUTFILE===
>gi|37779709|gb|AAP20876.1| lectin [Pinellia ternata] (28-264)
NYLLSGETLDTDGHLKNGDFDFIMQEDCNAVLYNGNWQSNTANKGRDCKLTLTDRGELVINNGEGSAVWRSGSQSAKGNYAAVLHPEGKLVIYGPSVFKI NPWVPGLNSLRLGNVPFTCNMLFSGQVLYGDGKITARNHMLVMQGDCNLVLYGGKCDWQSNTHGNGEHCFLRLNHKGELIIKDDDFKSIWSSQSSSKQGD YVFILQDNGYGVIYGPAIWATSSKRSVAAQETMIGM
>gi|146403769|gb|ABQ32294.1| lectin [Colocasia esculenta] (27-224)
NYLLSGQTLNTDGHLKNGDFDLVMQNDCNLVLYNGNWQSNTANNGRDCKLTLTDYGELVIKNGDGSTVWRSRAKSVKGNYAAVLHPDGRLVVFGPSVFKI DPWVPGLNSLRFRNIPFTDNLLFSGQVLYGDGRLTAKNHQLVMQGDCNLVLYGGKYGWQSNTHGNGEHCFLRLNHKGELIIKDDDFRPSGAAVPAPS


whereas the output should look like this:
>gi|37779709|gb|AAP20876.1| lectin [Pinellia ternata] (28-264)
NYLLSGETLDTDGHLKNGDFDFIMQEDCNAVLYNGNWQSNTANKGRDCKLTLTDRGELVINNGEGSAVWR
SGSQSAKGNYAAVLHPEGKLVIYGPSVFKINPWVPGLNSLRLGNVPFTCNMLFSGQVLYGDGKITARNHML
VMQGDCNLVLYGGKCDWQSNTHGNGEHCFLRLNHKGELIIKDDDFKSIWSSQSSSKQGDYVFILQDNGY
GVIYGPAIWATSSKRSVAAQETMIGM
>gi|146403769|gb|ABQ32294.1| lectin [Colocasia esculenta] (27-224)
NYLLSGQTLNTDGHLKNGDFDLVMQNDCNLVLYNGNWQSNTANNGRDCKLTLTDYGELVIKNGDGSTVWR
SRAKSVKGNYAAVLHPDGRLVVFGPSVFKIDPWVPGLNSLRFRNIPFTDNLLFSGQVLYGDGRLTAKNHQLV
MQGDCNLVLYGGKYGWQSNTHGNGEHCFLRLNHKGELIIKDDDFRPSGAAVPAPS

where each line after the header line should have no more than 70 characters.

I will be thankful to you.
-smriti
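
For reference, the requirement above (cut the sub-sequence for each id, then wrap it at 70 characters per line) could be sketched roughly as follows. This is not the script discussed earlier in the thread, only a minimal sketch under assumptions: the function name extract_ranges and the file names are hypothetical, the ids in FILE2 are assumed to match the first field of each FILE1 header without the leading ">", and the start and end positions are assumed to be 1-based and inclusive (use stop - start as the length if the end position is meant to be exclusive).

```shell
#!/bin/sh
# extract_ranges FASTA RANGES [WIDTH]
# Hypothetical helper, not from the original thread.
# FASTA  : records made of a ">id description" header plus sequence lines (FILE1)
# RANGES : lines of "id start end" (FILE2)
# WIDTH  : maximum characters per output sequence line (default 70)
extract_ranges() {
    awk -v width="${3:-70}" '
        # Print the pending record if its id has a range.
        function emit(   sub_seq, i) {
            if (id == "" || !(id in start)) return
            # Assumption: positions are 1-based and the end is inclusive.
            sub_seq = substr(seq, start[id], stop[id] - start[id] + 1)
            printf "%s (%d-%d)\n", header, start[id], stop[id]
            # Wrap by slicing fixed-width chunks instead of using printf padding.
            for (i = 1; i <= length(sub_seq); i += width)
                print substr(sub_seq, i, width)
        }
        # First file read (RANGES): remember start and end for each id.
        NR == FNR {
            if (NF >= 3) { start[$1] = $2; stop[$1] = $3 }
            next
        }
        # Second file read (FASTA): a header line closes the previous record.
        /^>/ { emit(); header = $0; id = substr($1, 2); seq = ""; next }
        # Sequence line: drop stray blanks and carriage returns, then append.
        { gsub(/[ \t\r]+/, ""); seq = seq $0 }
        END { emit() }
    ' "$2" "$1"
}

# Example call (file names are placeholders):
#   extract_ranges file1.txt file2.txt > outputfile.txt
```

The whole sequence of a record is joined into one string before cutting, so the original line breaks in FILE1 do not affect the positions, and the output is re-wrapped independently of how long the extracted piece is.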
 
