Search and replace problem


 
# 1  
Old 12-13-2013
Hi,

I am looking for a bash or awk script to solve the following problem.
Input File 1:
Code:
>Min_0-t10270-RA|>Min_0-t10270-RA protein AED:0.41 eAED:0.46 QI:0|0|0|0.25|1|1|4|0|190
MIGLGFKYLDTSYFGGFCEPSEDMNKVCTMRADCCEGIEMRFHDLKLVLEDWRNFTKLST
EEKRLWATPAAEDFF
>Min_0-t10271-RA|>Min_0-t10271-RA protein AED:0.02 eAED:0.02 QI:0|-1|0|1|-1|1|1|0|97
MDWQGQKLAEQLMQIMLVVFAVGSFITGYAIGSFQMMLIIYAAGVVLTTLVTVPNWPFFN
RHPLKWLDPIEAERHPKPQPQPQPASSKKKPTKQHQK

I want the entire header line (starting with '>') to be replaced by its first part (everything before the first '|'). Example:
Code:
>Min_0-t10270-RA|>Min_0-t10270-RA protein AED:0.41 eAED:0.46 QI:0|0|0|0.25|1|1|4|0|190

will become:
Code:
>Min_0-t10270-RA

Expected output:
Code:
>Min_0-t10270-RA
MIGLGFKYLDTSYFGGFCEPSEDMNKVCTMRADCCEGIEMRFHDLKLVLEDWRNFTKLST
EEKRLWATPAAEDFF
>Min_0-t10271-RA
MDWQGQKLAEQLMQIMLVVFAVGSFITGYAIGSFQMMLIIYAAGVVLTTLVTVPNWPFFN
RHPLKWLDPIEAERHPKPQPQPQPASSKKKPTKQHQK

If possible, a few comments in the script would help me understand and learn as well.
Many thanks.
# 2  
Old 12-13-2013
Code:
sed 's/^\(>[^|]*\).*/\1/' file

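This is a simple substitution command. On lines starting with ">", it captures everything up to (but not including) the first "|" and replaces the whole line with the captured part.
\( \) delimits the group whose match is held in \1.
The first ^ anchors the match to the beginning of the line; the second ^, inside [ ], negates the bracket expression, so [^|] means any character that is not "|".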
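
The pieces of that regular expression can be spelled out one by one. The following is a minimal sketch that runs the same sed command on the first record from the question; the variable names input and result are only there for illustration and are not part of the original answer.

```shell
# First record from the question, held in a variable for illustration.
input='>Min_0-t10270-RA|>Min_0-t10270-RA protein AED:0.41 eAED:0.46 QI:0|0|0|0.25|1|1|4|0|190
MIGLGFKYLDTSYFGGFCEPSEDMNKVCTMRADCCEGIEMRFHDLKLVLEDWRNFTKLST'

# ^        anchor at the start of the line
# \(       start of group 1
# >[^|]*   a ">" followed by any run of characters that are not "|"
# \)       end of group 1
# .*       the rest of the line (matched, then thrown away)
# \1       replacement: only what group 1 captured
result=$(printf '%s\n' "$input" | sed 's/^\(>[^|]*\).*/\1/')

# The header is shortened; the sequence line does not start with ">",
# so the pattern does not match it and it passes through unchanged.
printf '%s\n' "$result"
```

Because the pattern requires a leading ">", sequence lines are never modified, even if one happened to contain a "|".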
This User Gave Thanks to neutronscott For This Post:
# 3  
Old 12-13-2013
Try

Code:
$ awk -F"|" '/^>/{NF=1}1' file

>Min_0-t10270-RA
MIGLGFKYLDTSYFGGFCEPSEDMNKVCTMRADCCEGIEMRFHDLKLVLEDWRNFTKLST
EEKRLWATPAAEDFF
>Min_0-t10271-RA
MDWQGQKLAEQLMQIMLVVFAVGSFITGYAIGSFQMMLIIYAAGVVLTTLVTVPNWPFFN
RHPLKWLDPIEAERHPKPQPQPQPASSKKKPTKQHQK

This User Gave Thanks to Akshay Hegde For This Post:
# 4  
Old 12-13-2013
You could also try the slightly simpler awk and sed commands:
Code:
awk -F'|' '{print $1}' input
# and
sed 's/|.*//' input

The awk command uses "|" as the field separator and prints the 1st field on every input line.

The sed command removes "|" and anything that follows it from every input line that contains a "|" and then prints the resulting lines (whether or not they changed).
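
Both commands act on every line, not only on headers. That is harmless for the sample data, where sequence lines contain no "|". The following is a small sketch of the difference an address makes, assuming a made-up input in which a non-header line does contain a "|"; the sample text and variable names are hypothetical.

```shell
# Hypothetical input: the second line is NOT a header but contains a "|".
input='>id1|description with | more pipes
SEQ|ENCE'

# No address: every line is cut at its first "|".
all_lines=$(printf '%s\n' "$input" | sed 's/|.*//')

# With the /^>/ address, only lines starting with ">" are edited.
headers_only=$(printf '%s\n' "$input" | sed '/^>/s/|.*//')

printf '%s\n---\n%s\n' "$all_lines" "$headers_only"
```

The addressed form costs a few extra characters and leaves non-header lines untouched regardless of their content.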
This User Gave Thanks to Don Cragun For This Post:
# 5  
Old 12-13-2013
Code:
#!/usr/bin/env perl
use strict;
use warnings;

# "yourfile" is a placeholder for the name of your input file
open( my $fh, "<", "yourfile" ) or die "Cannot open file: $!\n";
# go through the file line by line
while ( my $line = <$fh> ) {
	chomp($line);                           # get rid of the newline
	$line =~ s/\|.*$// if $line =~ /^>/;    # on lines starting with >, remove the first pipe and everything after it
	print $line . "\n";
}
close($fh);

This User Gave Thanks to brianadams For This Post:
# 6  
Old 01-23-2014
Hello All,

The following may be a solution too.

Input:
Code:
>Min_0-t10270-RA|>Min_0-t10270-RA protein AED:0.41 eAED:0.46 QI:0|0|0|0.25|1|1|4|0|190
MIGLGFKYLDTSYFGGFCEPSEDMNKVCTMRADCCEGIEMRFHDLKLVLEDWRNFTKLST
EEKRLWATPAAEDFF
>Min_0-t10271-RA|>Min_0-t10271-RA protein AED:0.02 eAED:0.02 QI:0|-1|0|1|-1|1|1|0|97
MDWQGQKLAEQLMQIMLVVFAVGSFITGYAIGSFQMMLIIYAAGVVLTTLVTVPNWPFFN
RHPLKWLDPIEAERHPKPQPQPQPASSKKKPTKQHQK

Code:
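# On header lines, delete the first "|" and everything after it; the trailing 1 prints every line.
awk '/^>/ { sub(/\|.*/, "") } 1' check_data_range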

Output will be as follows.

Code:
>Min_0-t10270-RA
MIGLGFKYLDTSYFGGFCEPSEDMNKVCTMRADCCEGIEMRFHDLKLVLEDWRNFTKLST
EEKRLWATPAAEDFF
>Min_0-t10271-RA
MDWQGQKLAEQLMQIMLVVFAVGSFITGYAIGSFQMMLIIYAAGVVLTTLVTVPNWPFFN
RHPLKWLDPIEAERHPKPQPQPQPASSKKKPTKQHQK


NOTE: check_data_range is the name of the input file.


Thanks,
R. Singh
 