Concatenate lines between lines starting with a specific pattern


 
# 1  
Old 10-22-2010

Hi,

I have a file such as:
---
>contig00001 length=35524 numreads=2944
gACGCCGCGCGCCGCGGCCAGGGCTGGCCCA
CAGGCCGCGCGGCGTCGGCTGGCTGAG
>contig00002 length=4242 numreads=43423
ATGCCGAAGGTCCGCCTGGGGCTGG
CGCCGGGAGCATGTAGCG
---
I would like to concatenate the lines that do not start with ">" (that is, join all lines between lines starting with ">"). My desired output is:
---
>contig00001 length=35524 numreads=2944
gACGCCGCGCGCCGCGGCCAGGGCTGGCCCACAGGCCGCGCGGCGTCGGCTGGCTGAG
>contig00002 length=4242 numreads=43423
ATGCCGAAGGTCCGCCTGGGGCTGGCGCCGGGAGCATGTAGCG
---

Thanks



---------- Post updated at 01:54 PM ---------- Previous update was at 01:48 PM ----------




I have tried this:
% awk '{if(substr($0,1)==">") print $0"\n";else printf("%s",$0);}' test2.fna | fold -w60
But my output looks like:
Code:
>contig00001 length=35524   numreads=2944gACGCCGCGCGCCGCGGCC
AGGGCTGGCCCACGGCCcTCTTCCGGCGCGCTGCGCAGGCGTTCGGCCAGGCCGCGCGGC
GTCGGCTGGCTGAGCGCCCAGCGTAGCAGGCGATCGAACGGATGCCGACGGGCGCTTTCC
AGTCGTTCGCGCAAACGGGCGATCAACTGGGCGATCAACAGCGAGTCGCCGCCAGCCCCG
AAGAAGTCTTGCTCGACGCCCAGCGACGGGTTGTCCAGCACCTCCCGCCAGAGTGCCAGC

Instead of what I want, which is this:
Code:
>contig00001 length=35524   numreads=2944
gACGCCGCGCGCCGCGGCCAGGGCTGGCCCACGGCCcTCTTCCGGCGCGCTGCGCAGGCG
TTCGGCCAGGCCGCGCGGCGTCGGCTGGCTGAGCGCCCAGCGTAGCAGGCGATCGAACGG
ATGCCGACGGGCGCTTTCCAGTCGTTCGCGCAAACGGGCGATCAACTGGGCGATCAACAG
CGAGTCGCCGCCAGCCCCGAAGAAGTCTTGCTCGACGCCCAGCGACGGGTTGTCCAGCAC
CTCCCGCCAGAGTGCCAGC

# 2  
Old 10-22-2010
Try this:
Code:
awk '{printf (/>/)?RS"%s"RS:"%s",$0}END{print x}' infile

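Your code does not work because substr($0,1) returns the whole line rather than its first character. To compare only the first character, use:
Code:
substr($0,1,1)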
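
A quick way to see the difference between the two substr() forms is to run both against a header line from the question. This is only an illustrative sketch; the function name first_char_demo is made up for the example.

```shell
# first_char_demo: hypothetical helper, not part of the original thread.
# Feeds one header line to awk and compares both substr() forms with ">".
first_char_demo() {
  printf '%s\n' '>contig00001 length=35524 numreads=2944' |
    awk '{
      # Two-argument form: everything from position 1 onward (the whole line).
      if (substr($0, 1) == ">") print "substr($0,1): match"
      else print "substr($0,1): no match"
      # Three-argument form: exactly one character starting at position 1.
      if (substr($0, 1, 1) == ">") print "substr($0,1,1): match"
      else print "substr($0,1,1): no match"
    }'
}

first_char_demo
```

Since the two-argument form never equals ">", the original script always took the printf branch, which is why the header and the sequence ended up glued together.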

# 3  
Old 10-22-2010
Thanks. It worked. Yeah.
# 4  
Old 10-23-2010
Code:
awk '{printf "%s", /^>/?RS $0 RS:$0}' infile

# 5  
Old 10-23-2010
Hi Scruti ... ready for some nitpicking?

Your code adds an unexpected empty line if the input file starts with a ">" line.
# 6  
Old 10-23-2010
I know, but I figured it would complicate the code and would not really matter. I did add the linefeed at the end; otherwise, if the output gets written to a file, the last line is invalid because it is not terminated with a linefeed. This version skips the leading linefeed on the first line:
Code:
awk '{printf />/?(NR>1?RS:x)"%s"RS:"%s",$0}END{print x}' infile
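
For readability, the same idea can be spelled out over several lines. The following is a sketch, not a drop-in replacement: the function name join_fasta is hypothetical, and the pattern is anchored as /^>/ on the assumption that header lines always begin with ">".

```shell
# join_fasta: hypothetical helper name. Reads the named files (or stdin)
# and joins all lines between ">" header lines into a single line.
join_fasta() {
  awk '
    /^>/ {                        # header line
      if (NR > 1) print ""        # end the previous joined line, except on line 1
      print                       # keep the header on its own line
      next
    }
    { printf "%s", $0 }           # sequence line: append without a linefeed
    END { if (NR > 0) print "" }  # terminate the last line with a linefeed
  ' "$@"
}
```

It can be called as join_fasta infile. To re-wrap the sequences at 60 columns as in the original attempt, pipe the result through fold -w60, keeping in mind that fold would also wrap any header line longer than 60 characters.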

# 7  
Old 10-23-2010
Dude, you're right, but I like to see your skill in action; that's why I challenged you.