any better way to remove line breaks


 
# 1  
Old 10-06-2008

Hi,

I have some log files in which each XML message is printed across several lines,
e.g.
2008-10-01 14:21:44,561 INFO do something
2008-10-01 14:21:44,561 INFO print xml : <?xml version="1.0" encoding="UTF-8"?>
<a>
<b>my data</b>
</a>
2008-10-01 14:21:44,563 INFO do something again

I want to join the XML part into a single line, e.g.
2008-10-01 14:21:44,561 INFO do something
2008-10-01 14:21:44,561 INFO print xml : <?xml version="1.0" encoding="UTF-8"?><a><b>my data</b></a>
2008-10-01 14:21:44,563 INFO do something again

I have a script like this:
gzip -dc log.gz | sed -n -e ":a" -e "$ s/>\n/>/gp;N;b a"

but it is very slow and runs out of memory.

Is there a better way to achieve this?
# 2  
Old 10-06-2008
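Maybe replace all line breaks with a placeholder character, drop the placeholder wherever the next character is a wedge (<), and then turn the remaining placeholders back into line breaks.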

Code:
gzip -dc log.gz | tr '\n' '§' | sed -e 's/§</</g' -e 's/§/\n/g'

The character § is unlikely to occur in the log file, but might be problematic if your locale doesn't handle it as a single byte. If your file doesn't contain any underscores, using an underscore instead is safer; or maybe you can come up with another character which doesn't occur in the file (literal vertical bar perhaps? exclamation mark?)

The notation \n might or might not be understood to mean newline by your tr and/or sed; read the manual page and/or experiment with other possible notations, including \012 (for tr) and literal newline:

Code:
gzip -dc log.gz | tr '
' '§' | sed -e 's/§</</g' -e 's/§/\
/g'

Yes, it looks weird, but it's valid string syntax in the shell. (Might want to try without the backslash before the newline in the sed script if it still doesn't work.)
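
If the notation for a newline in the sed replacement is the sticking point, one possible variation is to let tr do the conversion in both directions, so sed only ever sees the placeholder. This is only a sketch: the function name join_with_tr and the choice of a vertical bar as placeholder are assumptions, and it presumes the bar never occurs in the log and that your tr understands '\n'. Note that sed still receives the whole file as one very long line, so memory use grows with the file size.

```shell
# join_with_tr: hypothetical helper, reads the log on stdin.
# Assumes "|" never appears in the input.
join_with_tr() {
  tr '\n' '|' |            # every newline becomes the placeholder
    sed 's/|</</g' |       # drop the placeholder in front of a "<"
    tr '|' '\n'            # remaining placeholders become newlines again
}

# Small sample in the shape of the log from the first post.
printf '%s\n' \
  '2008-10-01 14:21:44,561 INFO print xml : <?xml version="1.0" encoding="UTF-8"?>' \
  '<a>' \
  '<b>my data</b>' \
  '</a>' \
  '2008-10-01 14:21:44,563 INFO do something again' | join_with_tr
```

With the real file it would be called as gzip -dc log.gz | join_with_tr. Some sed implementations add a newline to input that lacks a trailing one, which may leave an extra empty line at the end of the output.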
# 3  
Old 10-06-2008
Code:
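gzip -dc log.gz | awk '
   /^</   { printf "%s", $0; next }   # XML continuation: append to the current line
   NR > 1 { print "" }                # any other line: end the previous line first
          { printf "%s", $0 }         # then start the new line without a newline
   END    { if (NR) print "" }        # terminate the last line
'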
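
To try the awk approach without the real log.gz, it can be run on the sample from the first post. This is a sketch only: the function name join_xml is made up for illustration, and the NR > 1 guard is there so the output does not start with an empty line.

```shell
# join_xml: hypothetical helper wrapping the awk approach, reads stdin.
# Lines starting with "<" are appended to the previous line;
# every other line starts a new output line.
join_xml() {
  awk '
    /^</   { printf "%s", $0; next }   # XML continuation: no newline before it
    NR > 1 { print "" }                # end the previous output line
           { printf "%s", $0 }         # start a new output line
    END    { if (NR) print "" }        # terminate the last line, if any
  '
}

# Sample taken from the first post.
printf '%s\n' \
  '2008-10-01 14:21:44,561 INFO do something' \
  '2008-10-01 14:21:44,561 INFO print xml : <?xml version="1.0" encoding="UTF-8"?>' \
  '<a>' \
  '<b>my data</b>' \
  '</a>' \
  '2008-10-01 14:21:44,563 INFO do something again' | join_xml
```

awk holds only the current line in memory here, so the cost per line stays the same no matter how large the file gets.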

# 4  
Old 01-13-2009
Great Thanks!!!

Initially, we implemented the sed solution. It works, but it hangs on some of our hosts (using 99.9% of CPU time and producing no output) once the log file grows to about 100 MB.

We then switched to awk, and the results are generated in seconds.

I don't know much about shell programming.
May I ask why there is such a difference? Is sed limited by CPU or by memory here?