Questions on removing unexpected line breaks


 
# 1  
Old 09-05-2012

I am new to Linux and I am having trouble with some data I have on hand.
The source data looks like this:

Code:
a|b|c|d
e|f|g
|h
i|j|k|l
m|n|o
|p
1|2|3|4
5|6|7|
8
a|b|c|d
e|f|g|h

Each line should have 4 fields separated by "|", but unfortunately there are unexpected line breaks that make it a mess.
How can I clean this up by reformatting the lines so that they look like this:

Code:
a|b|c|d
e|f|g|h
i|j|k|l
m|n|o|p
1|2|3|4
5|6|7|8
a|b|c|d
e|f|g|h

I guess it should work by checking the number of "|" in each line: if a line has fewer than 3 "|", then the line break of that line has to be removed.
But I have no idea which tool should be used. sed, say?
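
As a hedged first step toward that idea, the separators can be counted per line with awk. This is only a sketch: the temporary file and the names `sample` and `counts` are made up for illustration, and only a few lines of the sample data are used.

```shell
# A few lines of the sample data, written to a temporary file (hypothetical).
sample=$(mktemp)
printf '%s\n' 'a|b|c|d' 'e|f|g' '|h' '5|6|7|' '8' > "$sample"

# gsub() returns the number of replacements it made, so replacing each "|"
# with itself ("&") counts the separators without changing the line.
counts=$(awk '{ n = gsub(/[|]/, "&"); print n ": " $0 }' "$sample")
printf '%s\n' "$counts"
rm -f "$sample"
```

A count below 3 marks a fragment, but a line such as 5|6|7| already has 3 separators and is still incomplete, so the per-line count alone is not enough to decide where to join.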

Has anyone encountered such an issue before?
If so, could someone share how it could be tackled?

Thanks a lot.

Last edited by Scrutinizer; 09-05-2012 at 01:20 AM.. Reason: code tags
# 2  
Old 09-05-2012
To remove the line break in awk if the number of fields (NF) is less than 4, you could for example do this:
Code:
awk -F\| 'NF<4{getline p; $0=$0 FS p}1' file

# 3  
Old 09-05-2012
Try this...

Code:
tr '\n' " " < file  | sed -e 's/ //g' -e 's/.\{7\}/&\n/g'

# 4  
Old 09-05-2012

Quote:
Originally Posted by Scrutinizer
To remove the line break in awk if the number of fields (NF) is less than 4, you could for example do this:
Code:
awk -F\| 'NF<4{getline p; $0=$0 FS p}1' file

I tried this command, but the result is not as expected:

Code:
a|b|c|d
e|f|g||h
i|j|k|l
m|n|o||p
1|2|3|4
5|6|7|
8|a|b|c|d
e|f|g|h

Some lines end up with more than 4 fields!
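
A possible explanation, offered as a sketch rather than a definitive diagnosis: printing NF for a few of the fragments shows what the quoted command sees. The variable name `nf` is just for illustration.

```shell
# Print the field count awk computes for three of the broken lines.
nf=$(printf '%s\n' 'e|f|g' '5|6|7|' '8' | awk -F'|' '{ print NF ": " $0 }')
printf '%s\n' "$nf"
```

Because `$0=$0 FS p` always inserts one more "|", e|f|g (NF 3) followed by |h becomes e|f|g||h. The line 5|6|7| already counts as 4 fields because of its empty last field, so it is printed alone, and the lone 8 (NF 1) is then joined with the complete line that follows it.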

---------- Post updated at 02:16 PM ---------- Previous update was at 02:14 PM ----------

Quote:
Originally Posted by pamu
Try this...

Code:
tr '\n' " " < file  | sed -e 's/ //g' -e 's/.\{7\}/&\n/g'

This one works well on the sample!

Could you please explain what the code is doing?
My Linux knowledge is very limited.
# 5  
Old 09-05-2012
Quote:
Originally Posted by Nekki Basara

Could you please explain what the code is doing?
My Linux knowledge is very limited.
Code:
tr '\n' " " < file  | sed -e 's/ //g' -e 's/.\{7\}/&\n/g'

tr '\n' " " < file # Replaces every newline "\n" with a space " ",
# so all the lines end up on one single line.

sed -e 's/ //g' # Replaces every space " " with nothing, which removes the spaces from the string.

-e 's/.\{7\}/&\n/g' # Adds a newline after every 7 characters of the string.
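
The three steps can also be run one at a time to see the intermediate results. This is a minimal sketch on a shortened input; the variable names `input`, `step1`, `step2` and `step3` are made up for illustration, and the last step assumes GNU sed.

```shell
# Two logical records, the second one broken across two lines.
input=$(printf '%s\n' 'a|b|c|d' 'e|f|g' '|h')

# Step 1: turn every newline into a space, leaving one long line.
step1=$(printf '%s\n' "$input" | tr '\n' ' ')

# Step 2: delete the spaces again, so the fragments are glued together.
step2=$(printf '%s' "$step1" | sed 's/ //g')

# Step 3: put a newline after every 7 characters
# ("\n" in the replacement is understood by GNU sed).
step3=$(printf '%s' "$step2" | sed 's/.\{7\}/&\n/g')

printf '[%s]\n' "$step1" "$step2"   # brackets make the trailing space visible
printf '%s\n' "$step3"
```

The 7 comes from 4 one-character fields plus 3 separators.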


Hope this helps.

pamu

Last edited by pamu; 09-05-2012 at 03:37 AM.. Reason: icode tags..
# 6  
Old 09-05-2012
That sed solution might not work if your real data has longer strings between the pipes, since it assumes that every record is exactly 7 characters long. Try:
Code:
awk '{while(gsub(/[|]/,"&")!=3 || $0 ~ /[|]$/){getline p;$0=$0 p}}1' file

Blank lines in input will be removed by this. If you want to retain them, use:
Code:
awk -F\|  'NF{while(gsub(/[|]/,"&")!=3 || $0 ~ /[|]$/){getline p;$0=$0 p}}1' file

Also, this assumes that the last field is never empty: a line ending in "|" is always treated as incomplete and joined with the next line.
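
The same idea can also be sketched without getline, by collecting lines in a buffer and flushing whatever is left when the input ends. This is a variant under the same assumptions (4 fields, last field never empty), not a replacement for the commands above; the names `sample`, `fixed`, `buf`, `tmp` and `n` are made up for illustration.

```shell
# Sample with broken records and an incomplete final one (hypothetical file).
sample=$(mktemp)
printf '%s\n' 'a|b|c|d' 'e|f|g' '|h' '5|6|7|' '8' 'x|y' > "$sample"

fixed=$(awk '
    {
        buf = buf $0                  # append this physical line to the record
        tmp = buf
        n = gsub(/[|]/, "&", tmp)     # separators collected so far
        # The record is complete once it holds at least 3 separators
        # and does not end in "|".
        if (n >= 3 && buf !~ /[|]$/) { print buf; buf = "" }
    }
    END { if (buf != "") print buf }  # flush an incomplete final record
' "$sample")
printf '%s\n' "$fixed"
rm -f "$sample"
```

With this layout a truncated last record is printed as it is instead of being waited on, and blank lines in the input are dropped.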

Last edited by elixir_sinari; 09-05-2012 at 04:32 AM..
# 7  
Old 09-05-2012

Hi elixir_sinari,

Yes, you are right.

Thanks for giving a more robust solution.