Selectively deleting newlines with sed

# 1  
Old 1 Week Ago

I have a file that looks like this:
Code:
>Muestra-1
agctgcgagctgcgaccc
gggttatata
ggaagagacacacacaccccc
>Muestra-2
agctgcg
agctgcgacccgggttatataggaagagac
acacacaccccc
>Muestra-3
agctgcgagctgcgaccc
gggttatata
ggaagagacacacacaccccc

I use the following sed script to remove the newlines from lines not starting with >:
Code:
sed ':a /^>/!N;s/\r\?\n\([^>]\)/\1/;ta'

I was trying to use b instead of t. So, this is what I did:
Code:
sed '/^>/!{:a;N;$!ba};s/\r\?\n//g'

but didn't get the desired result. Is there any way to use b in the second script to eliminate the newlines while skipping the lines that start with >?
# 2  
Old 1 Week Ago
Quote:
Originally Posted by Xterra
So, this is what I did:
Code:
sed '/^>/!{:a;N;$!ba};s/\r\?\n//g'

but didn't get the desired result. Is there any way to use b in the second script to eliminate the newlines while skipping the lines that start with >?
The problem does not have anything to do with "t" or "b" but with how sed actually works. Let's say you have a sed script like this:

Code:
sed 'command1
     command2
     /regexp/ {
           command3
           command4
     }' /some/file

What happens is this: sed reads the first line of the input file into a buffer called the "pattern space", then applies the first command of its script to it ("command1"), then the next one, and so on until it reaches the end of the script. At that point the content of the pattern space is printed to stdout (unless sed runs with -n or the pattern space was deleted). Then the next line of input is read, replacing the content of the pattern space, and the script starts again with the first command. So, step by step:

Code:
read line1 of input
apply "command1" to it
apply "command2" to the result of previous line
if /regexp/ matches
     apply "command3" to the result of previous line
     apply "command4" to the result of previous line
endif
print the pattern space
read next line of input
apply "command1" to it
...
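
To make the cycle concrete, here is a small sketch. The two-line input and the bracket substitutions are made up for illustration. Each command works on whatever the previous command left in the pattern space, and the result is printed once the end of the script is reached:

```shell
# Hypothetical two-line input; every input line runs through the whole script once.
cycle_demo=$(printf 'line1\nline2\n' | sed 's/^/[/
s/$/]/')
printf '%s\n' "$cycle_demo"
```

The first command puts "[" in front of the line, and the second command appends "]" to that already modified line, so each input line comes out wrapped in brackets.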

Now, what does your code do?
Code:
/^>/!{              # do the following for all lines not starting with a ">"
     # define a return point ("a") for any "t" or "b" command
     :a
     N              # append the next line immediately, without returning to the beginning of the script
     # if this is not the last line, jump back to "a"
     $!ba
}
s/\r\?\n//g         # reached only after the loop has ended
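
Running the same logic on a small sample shows the symptom. This is only a sketch: the sample data and the names sample and loop_demo are made up, the script is written one command per line, and the "\r\?" part is left out so that it does not rely on GNU extensions.

```shell
# Made-up sample in the same layout as the original file.
sample() { printf '>s1\naa\nbb\n>s2\ncc\ndd\n'; }

loop_demo=$(sample | sed '/^>/!{
:a
N
$!ba
}
s/\n//g')
printf '%s\n' "$loop_demo"
```

The header ">s1" is printed on its own, but everything after it, including the ">s2" header, ends up on one single line: "aabb>s2ccdd".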

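Do you spot it? Once you are inside the block, the condition "/^>/!" is never checked again. You only loop inside the block, always appending more text to the pattern space but never doing anything with it, until you hit the last line. Only then does the s command run, and it removes every newline collected so far, including the ones in front of lines starting with ">". Also notice that an address like "/^>/" is tested against the whole pattern space, so it is true for ANY pattern space content starting with ">". That means, for this: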

Code:
> bla foo

but also for this, after adding a line:

Code:
> bla foo
more text

And the same goes the other way: "/^>/!" (that is, not "/^>/") is true for this:

Code:
foo bar

but also for this:

Code:
foo bar
> a line starting with ">"
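
A quick way to check this, as a sketch with made-up input: append the second line with N and then let a "/^>/!" address decide whether a substitution runs. The " | " replacement is only there to make the joined pattern space visible.

```shell
# After N the pattern space is "foo bar<newline>> header".
# It does not START with ">", so the /^>/! address still matches.
addr_demo=$(printf 'foo bar\n> header\n' | sed -n 'N
/^>/!s/\n/ | /p')
printf '%s\n' "$addr_demo"
```

The substitution is executed even though the pattern space now contains a line starting with ">", because only the very beginning of the pattern space is compared against "^>".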

This means your logic is wrong, regardless of whether you use "t" or "b". The difference between the two is that "t" branches only if an s command has made a substitution since the last input line was read (or since the last "t" was taken), whereas "b" always branches. Say, this is the input file:

Code:
xxx
yyy
xxx

and this is your sed-script working on the file:

Code:
sed 's/xxx/XXX/
b end
s/yyy/YYY/
:end'

Then the substitution of "yyy" with "YYY" will never take place because it is unconditionally skipped over. If you change the "b" to a "t", it will be executed for the lines without "xxx": there the first substitution does nothing, and therefore the "t" does not branch to "end".
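
Both variants can be run side by side. In this sketch the helper name xy_input and the variable names are made up; the scripts and the input are the ones from above.

```shell
# The three-line input file from above, produced by a helper function.
xy_input() { printf 'xxx\nyyy\nxxx\n'; }

# "b end" always jumps, so s/yyy/YYY/ is never reached.
with_b=$(xy_input | sed 's/xxx/XXX/
b end
s/yyy/YYY/
:end')

# "t end" jumps only if s/xxx/XXX/ changed the current line.
with_t=$(xy_input | sed 's/xxx/XXX/
t end
s/yyy/YYY/
:end')

printf '%s\n' "with b:" "$with_b" "with t:" "$with_t"
```

With "b" the middle line stays "yyy"; with "t" it becomes "YYY". The "xxx" lines are turned into "XXX" in both cases.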

I hope this helps.

bakunin
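
For completeness, here is one possible way to get the desired result with "b". It is a sketch that does not come from the thread, and it makes a few assumptions: the input has Unix line endings (the "\r\?" part is left out), the shortened sample data and the names fasta_sample and joined are made up, and the sed in use follows POSIX for N, P and D. The idea is to re-check the pattern space on every pass through the loop instead of only once.

```shell
# Shortened, made-up sample in the same layout as the original file.
fasta_sample() {
  printf '>Muestra-1\nagct\ngggt\nggaa\n>Muestra-2\nag\nct\n'
}

joined=$(fasta_sample | sed ':a
# append the next input line unless the last one has already been read
$!N
# only sequence data is joined: the pattern space must not start with ">"
/^>/!{
  # and the line just appended must not be a header either
  /\n>/!{
    s/\n//
    $!ba
  }
}
# print the first line of the pattern space, delete it, and restart the
# script with whatever is left, without reading new input
P
D')
printf '%s\n' "$joined"
```

Each header stays on its own line and the sequence lines below it are joined, so the sample comes out as ">Muestra-1", "agctgggtggaa", ">Muestra-2", "agct". The two conditions are evaluated again after every N, which is exactly what the second script in the question does not do.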
# 3  
Old 1 Week Ago
Got it! Thanks
Unix & Linux Forums Content Copyright 1993-2018. All Rights Reserved.