Deleting new line characters


 
# 1  
Old 02-07-2012

Hi,

I have an unusual requirement. I have a file with 12 fields per record. Each record ends with "\n" (just \n, no carriage returns) and the field delimiter is "|". The problem is that newline characters can appear inside any field of the data, including the last field. That is the tricky part, because it makes it very hard to tell where the last field ends.

Can anyone help me delete the newline characters that are embedded in the data?

Please tell me how to handle these two cases:
1. The first field is a decimal field.
2. The first field is a string field that can itself contain newline characters. It looks to me as if cleaning the file is almost impossible in this case, but I would like confirmation from the experts.

Let me know if my question is not clear.

Thanks in advance.
# 2  
Old 02-07-2012
Are there always exactly 11 pipe delimiters in a correct record?
When a record is broken, do its pieces taken together always contain 11 pipe delimiters?

If so, the "awk" programmers should be able to help.

Please post your Operating System and version. There is much variation between awk, nawk and gawk.
# 3  
Old 02-07-2012
Try this:
Code:
sed -i -e 's:\\n::g' filename

This will replace the \n with nothing.

Does that work?

Last edited by Franklin52; 02-08-2012 at 06:23 AM.. Reason: Please use code tags for code and data samples, thank you
# 4  
Old 02-07-2012
@brianj
I can answer that ... no.

The \n referred to by the original poster is the newline character itself, not the two characters backslash and n, which is what that sed pattern matches. The normal line terminator in a unix text file is also newline, so newlines inside text file records are a complete no-no.
This situation often arises from bad field validation at the terminal when entering data into a database. When data is extracted from the database, any embedded newlines get confused with real line terminators.
The real solution is to build a filter into your data extract program, or to do all the programming with a database programming language rather than unix Shell tools.
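
To illustrate the point, here is a small sketch. The file name sample.txt and its two records are made up for the demonstration and are not the original poster's data. It shows that the sed command from post #3 leaves the file as it was, and that blindly deleting every newline is no answer either, because the record terminators go too.
Code:
```shell
# Hypothetical two-record sample; the only newlines are real line terminators.
printf 'a|b\nc|d\n' > sample.txt

# The pattern \\n matches a backslash followed by the letter n.
# The data contains no such pair, so the output is identical to the input.
sed -e 's:\\n::g' sample.txt

# tr removes every newline character, including the record terminators,
# so both records are glued together into the single string a|bc|d
tr -d '\n' < sample.txt
echo
```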
# 5  
Old 02-08-2012
Thanks for the replies. Yes, methyl: for both correct and broken records the number of delimiters will always be 11. Could you let me know the script?

---------- Post updated at 05:18 PM ---------- Previous update was at 05:14 PM ----------

True, the real solution is to get this fixed at the source, but unfortunately I can't get that done. So I have no option except using the shell.

---------- Post updated 02-08-12 at 04:12 AM ---------- Previous update was 02-07-12 at 05:18 PM ----------

Sorry methyl, I didn't read your message fully. The OS I am using is
Linux-x86-gcc3p32
# 6  
Old 02-08-2012
Try this:
Code:
awk -F'|' '{ while (NF < 12 && (getline p) > 0) $0 = $0 p } 1' infile

# 7  
Old 02-08-2012
Thanks Scrutinizer. This handles the newline characters in every field except the last one, but I need to handle those as well.

My input records are like:
Code:
3025|DUM_
MY_1|Class DUMMY_1
|5000|GBP|3025|2|14|T|1|1|Dummy
Invoice
for dummy1
3026|DUMMY_2|class DUMMY_2|5000|GBP|3026|2|0|T|1|1|N/A

I want the output to be:
Code:
3025|DUM_MY_1|ClassDUMMY_1|5000|GBP|3025|2|14|T|1|1|DummyInvoicefor dummy1
3026|DUMMY_2|class DUMMY_2|5000|GBP|3026|2|0|T|1|1|N/A

With the command you gave, the output comes out as:
Code:
3025|DUM_MY_1|ClassDUMMY_1|5000|GBP|3025|2|14|T|1|1|Dummy
Invoicefor dummy13026|DUMMY_2|class DUMMY_2|5000|GBP|3026|2|0|T|1|1|N/A

Some of the data from the last field ends up in the next record.

The only thing I can rely on is that the first field is always a decimal field.

Last edited by Franklin52; 02-08-2012 at 07:14 AM.. Reason: Please use code tags for code and data samples, thank you
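
One possible approach to the case in post #7, offered as a sketch and not as a tested solution for the real data. It relies on the two facts stated in the thread (11 pipe delimiters per record, numeric first field) plus one assumption the thread does not confirm: text continuing the last field never starts with a number followed by "|" at the beginning of a line. The output file name outfile is made up; infile holds the sample from post #7.
Code:
```shell
# Sample input from post #7.
cat > infile <<'EOF'
3025|DUM_
MY_1|Class DUMMY_1
|5000|GBP|3025|2|14|T|1|1|Dummy
Invoice
for dummy1
3026|DUMMY_2|class DUMMY_2|5000|GBP|3026|2|0|T|1|1|N/A
EOF

awk '
# buf collects the pieces of the current record, n counts its pipes so far.
# A new record starts only when the buffer already holds all 11 pipes
# and the incoming line begins with a number followed by a pipe.
n >= 11 && /^[0-9]+(\.[0-9]+)?[|]/ { print buf; buf = ""; n = 0 }

# Append the line with no separator, which drops the embedded newline.
# gsub replaces each pipe with itself and returns how many it found.
{ buf = buf $0; n += gsub(/[|]/, "&") }

# Flush the last record.
END { if (NR) print buf }
' infile > outfile

cat outfile
```

Lines are joined exactly as they appear, so the space in "Class DUMMY_1" is kept. If a last field can continue on a new line with something like "123|", this sketch will split the record in the wrong place, and for the second case in post #1 (a string first field that may itself contain newlines) there is no such marker to rely on.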