Selectively deleting newlines with sed


 
# 1  
Old 11-10-2018

I have a file that looks like this:
Code:
>Muestra-1
agctgcgagctgcgaccc
gggttatata
ggaagagacacacacaccccc
>Muestra-2
agctgcg
agctgcgacccgggttatataggaagagac
acacacaccccc
>Muestra-3
agctgcgagctgcgaccc
gggttatata
ggaagagacacacacaccccc

I use the following sed script to remove the newlines from lines that do not start with >:
Code:
sed ':a /^>/!N;s/\r\?\n\([^>]\)/\1/;ta'

I was trying to use b instead of t. So, this is what I did:
Code:
sed '/^>/!{:a;N;$!ba};s/\r\?\n//g'

but I didn't get the desired result. Is there any way to use b in the second script to eliminate the newlines while skipping the lines that start with >?
# 2  
Old 11-10-2018
Quote:
Originally Posted by Xterra
So, this is what I did:
Code:
sed '/^>/!{:a;N;$!ba};s/\r\?\n//g'

but I didn't get the desired result. Is there any way to use b in the second script to eliminate the newlines while skipping the lines that start with >?
The problem does not have anything to do with "t" or "b" but with how sed actually works. Let's say you have a sed script like this:

Code:
sed 'command1
     command2
     /regexp/ {
           command3
           command4
     }' /some/file

What happens is this: sed reads the first line of the input file into a buffer called the "pattern space", then applies the first command of its script to it ("command1"), then the next, and so on until it reaches the end of the script. If anything is still in the pattern space at that point, it is printed to stdout (unless -n was given). Then the next line of input is read, replacing the pattern space, and the script starts again from the first command. So, schematically:

Code:
read line1 of input
apply "command1" to it
apply "command2" to the result of previous line
if /regexp/ matches
     apply "command3" to the result of previous line
     apply "command4" to the result of previous line
endif
read next line of input
apply "command1" to it
...
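
To see the cycle in action, here is a small sketch with made-up input (the four words are only an illustration). Each cycle starts with one line in the pattern space, "N" appends the next input line to it, the substitution works on the combined text, and the result is printed when the end of the script is reached:

```shell
# Cycle 1 starts with "one", cycle 2 starts with "three".
# N pulls the following line into the same pattern space,
# so the s command sees "one\ntwo" and then "three\nfour".
out=$(printf 'one\ntwo\nthree\nfour\n' | sed 'N
s/\n/+/')
printf '%s\n' "$out"
# prints:
# one+two
# three+four
```

Note that the script runs only twice here, not four times, because "N" consumes an input line without starting a new cycle.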

Now, what does your code do:
Code:
/^>/!            # do the following for all lines not starting with a ">"
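     {:a                 # define a label that any "t" or "b" command can jump to
     N                   # append the next input line to the pattern space immediately, without returning to the beginning of the script
     $! ba               # if this is not the last line, jump to a
     }
s/\r\?\n//g              # remove every newline (and a preceding carriage return, if any) from the pattern space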

Do you spot it? Once you are inside the condition it is never checked again. You only loop inside it, always adding more text to the pattern space but never doing anything with it, until you hit the last line. Only then does the "s" command run, and it strips every newline from everything collected so far, header lines included. Also notice that "/^>/" is true for ANY pattern space content starting with ">". That means, for this:

Code:
> bla foo

but also for this, after adding a line:

Code:
> bla foo
more text

And the same goes the other way: the negated address "/^>/!" is true for this:

Code:
foo bar

but also for this:

Code:
foo bar
> a line starting with ">"

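A quick way to check this behaviour, as a sketch with the hypothetical two-line input from above: after "N" the pattern space holds both lines, but "^" still anchors only at the very beginning of the pattern space, so the negated address matches even though the second line starts with ">".

```shell
# After N the pattern space is "foo bar\n>header".
# /^>/ looks only at the start of the pattern space, so /^>/! is true
# and the substitution runs although a ">" line has been appended.
out=$(printf 'foo bar\n>header\n' | sed -n 'N
/^>/!s/\n/ | /p')
printf '%s\n' "$out"
# prints: foo bar | >header
```

To test for an appended header line you would have to look for a newline followed by ">" inside the pattern space, i.e. "/\n>/".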
This means your logic is wrong, regardless of using "t" or "b". The difference is that "t" branches only if an "s" command has actually substituted something since the last input line was read (or since the last "t" branched), whereas "b" always branches. Say, this is the input file:

Code:
xxx
yyy
xxx

and this is your sed-script working on the file:

Code:
sed 's/xxx/XXX/
b end
s/yyy/YYY/
:end'

Then the substitution of "yyy" to "YYY" will never take place because it is unconditionally skipped over. If you change the "b" to a "t", it will be executed: in the lines with no "xxx" the first substitution does nothing, and therefore the "t" does not branch to end.
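
As for the literal question: "b" can do the job, provided every unconditional jump is guarded by an address that decides whether it runs at all. The following is one possible sketch, not the only way to do it. It assumes Unix line endings (it does not strip carriage returns the way the "\r\?" in the original script does) and that header lines start with ">" in the first column. The input is the first two records of the sample file from post #1.

```shell
# Sample input: the first two records from the question.
input='>Muestra-1
agctgcgagctgcgaccc
gggttatata
ggaagagacacacacaccccc
>Muestra-2
agctgcg
agctgcgacccgggttatataggaagagac
acacacaccccc'

out=$(printf '%s\n' "$input" | sed '
# A header line is printed unchanged: jump to the end of the script.
/^>/b
:a
# On the last input line there is nothing left to append: print and stop.
$b
# Append the next input line to the pattern space.
N
# If the appended line is a header, print what was collected (P),
# delete it (D) and restart the script with the header alone.
/\n>/{
P
D
}
# Otherwise join the two pieces and go around again.
s/\n//
ba
')
printf '%s\n' "$out"
```

Each record should come out as its header line followed by a single joined sequence line. The point is that the tests ("/^>/", "$" and "/\n>/") are re-evaluated on every pass through the loop, which is exactly what the second script in post #1 was missing.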

I hope this helps.

bakunin
# 3  
Old 11-11-2018
Got it! Thanks.