Remove multiline text between brackets


 
# 8  
Old 12-03-2013
This might fail on large files...
Code:
awk 'gsub (/{[^}]*} /,"")' RS= file

# 9  
Old 12-03-2013
Quote:
Originally Posted by RudiC
This might fail on large files...
Code:
awk 'gsub (/{[^}]*} /,"")' RS= file

It will also remove empty lines, as well as any paragraphs that do not contain curly brackets.
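
A small sketch of that side effect, using made-up sample text (the `sample` and `result` names are only for illustration). The opening brace is escaped here as a portability precaution, which is my assumption and not part of the quoted command. Because `gsub()` returns the number of replacements and is used as the pattern, a paragraph with no replacement is never printed:

```shell
# Hypothetical sample: three paragraphs, only the middle one contains braces.
sample=$(printf 'first paragraph\n\n\nkeep {drop\nthis} that\n\nlast paragraph\n')

# Paragraph mode (RS set to the empty string): each blank-line-separated
# paragraph is one record. gsub() returns the replacement count, and a
# count of zero is a false pattern, so that record is not printed.
result=$(printf '%s\n' "$sample" | awk 'gsub(/\{[^}]*} /, "")' RS=)
printf '%s\n' "$result"
```

With this input only the middle paragraph should survive, with the bracketed text removed.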

Last edited by Scrutinizer; 12-03-2013 at 10:17 AM..
# 10  
Old 12-12-2013
The awk answer to this problem

This one intrigued me because I'm more of an awk person than a perl person...
Code:
awk '{print $2} NR==1' RS={ FS="} " ORS= file

I wasn't familiar with RS, FS and ORS as built-in variables, and I now realize that's some very powerful stuff. (By the way, I don't know how I made it so many years without knowing about FNR... argh!)

To explain your example: RS is the record separator, which defines what awk treats as a "line" (a record). The default RS is "\n", i.e. the newline. FS is the field separator; its default is a single space, which awk treats as any run of whitespace. ORS is the output record separator, which print appends after each record (a newline by default).
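
To make the splitting visible, here is a rough sketch that prints each record with its first two fields, using the same RS and FS as the one-liner. The sample line and the `sample`/`result` names are invented for illustration, and ORS is left at its default so each record lands on its own line:

```shell
# Hypothetical one-line sample, not taken from the thread.
sample='alpha {one} beta {two} gamma'

# RS={ starts a new record at every opening brace; FS="} " splits each
# record at a closing brace followed by a space.
result=$(printf '%s' "$sample" |
  awk '{ printf "record %d: $1=[%s] $2=[%s]\n", NR, $1, $2 }' RS={ FS="} ")
printf '%s\n' "$result"
```

The first record is the text before the first `{`, so it has no second field; in every later record `$1` is the bracketed text and `$2` is what follows it.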

In other words, start a new record at every {. Break each record into two fields separated by "} " (note the space after the bracket, so "my big}date yesterday" is not a match). The first field is what's inside the {}, the second is what comes after. Print the second field (printing the first field would only show what was inside the brackets). The NR==1 pattern also prints the first record whole, since that record is the text before the first {. Make the output record separator empty (otherwise awk would insert a line break wherever there was a {}). One downside of this, though, is that if a line break falls inside the {}, that break disappears. So

Yesterday we ate {some
giant} cake and
drank tea

becomes

Yesterday we ate cake and
drank tea

which is not the end of the world I suppose.
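
For anyone who wants to reproduce the example above, a minimal sketch that feeds the same three lines through the one-liner on standard input instead of a file (the `sample` and `result` variable names are just for illustration):

```shell
# The sample text from the example above.
sample=$(printf 'Yesterday we ate {some\ngiant} cake and\ndrank tea\n')

# Same program and variable assignments as in the post, reading stdin.
result=$(printf '%s\n' "$sample" | awk '{print $2} NR==1' RS={ FS="} " ORS=)
printf '%s\n' "$result"
```

The line break that sat inside the brackets is gone, so three input lines come out as two.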

Great answer, Scrutinizer!

Last edited by Franklin52; 12-12-2013 at 03:39 AM.. Reason: Please use code tags
# 11  
Old 12-12-2013
Thanks, climatron. See if this works better:
Code:
awk 'NR>1{gsub(/[^\n]/,x,$1)}1' RS={ FS="} ?" OFS= ORS= file
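
A commented, spread-out sketch of the same idea, run on the sample text from the earlier post. The comments are my reading of the one-liner, and the `sample`/`result` names are only for illustration. It uses `""` where the one-liner uses the unset variable x, which amounts to the same empty replacement:

```shell
sample=$(printf 'Yesterday we ate {some\ngiant} cake and\ndrank tea\n')

result=$(printf '%s\n' "$sample" | awk '
  NR > 1 {
    # $1 is the text that was inside the braces. Delete every character
    # except newlines, so line breaks inside the braces are preserved.
    # Changing $1 also rebuilds $0 with OFS (empty here) between fields.
    gsub(/[^\n]/, "", $1)
  }
  1   # always-true pattern: print the (possibly rebuilt) record
' RS={ FS="} ?" OFS= ORS=)
printf '%s\n' "$result"
```

This time the line break inside the brackets is kept, so the output has three lines again. The first line still ends with the space that preceded the `{`.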


Last edited by Scrutinizer; 12-12-2013 at 03:23 PM..
# 12  
Old 12-12-2013
Code:
#!/usr/bin/env perl
use strict;
use warnings;

open( my $fh, "<", "yourfile" ) or die "Cannot open file: $!\n";

my $content = do { local $/; <$fh> };		# slurp in whole file
close($fh);

$content =~ s/\{.*?\}//sg;			# remove from { to }; /s lets . match newlines

print $content."\n";

# 13  
Old 12-12-2013
And just to be a completist on this issue, building on Scrutinizer's suggestion, here's the code that does everything I asked for in the original message:
Code:
awk 'NR>1{gsub(/[^\n]/,x,$1)}1' RS={ FS="} ?" OFS= ORS= filename | awk 'NF > 0 { gsub(/^[ \t]+|[ \t]+$/, ""); print }'

It gets rid of all text within brackets AND trims whitespace and removes empty lines.
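
A quick way to check that claim is to run the two-stage pipeline on a small made-up input with leading and trailing spaces and an empty line. The sample text and the `sample`/`result` names are assumptions for illustration, and the input comes from standard input instead of a file:

```shell
# Hypothetical sample with extra whitespace and an empty line.
sample=$(printf '  Yesterday we ate {some\ngiant} cake and\n\ndrank tea  \n')

# Stage 1 removes the bracketed text but keeps its line breaks.
# Stage 2 skips lines with no fields and trims blanks at both ends.
result=$(printf '%s\n' "$sample" |
  awk 'NR>1{gsub(/[^\n]/,x,$1)}1' RS={ FS="} ?" OFS= ORS= |
  awk 'NF > 0 { gsub(/^[ \t]+|[ \t]+$/, ""); print }')
printf '%s\n' "$result"
```

The second awk works line by line with the default separators, which is why it can drop empty lines with a simple `NF > 0` test.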