Removing line breaks inside a field


 
# 1  
Old 08-01-2017

Hi all,

I have a CSV input file with 60 fields in total; the fields are not enclosed in double quotes. One of the fields (the 50th) contains line breaks, which splits the row across multiple lines and causes my load into a table to fail. I tried to force double quotes around this field using a regular expression. This worked for most rows but not for some of them, and I cannot find the reason. The command I used is:

Code:
cat input.csv | tr -d '\r' | tr '\n' '§' | sed -E 's/(§([^,]*,){49})([^",]+),/\1"\3",/g' | tr '§' '\n' > output.csv

Can someone please give me the command to remove all the line breaks in this field?

# 2  
Old 08-01-2017
Please get into the habit of providing decent context for your problem.
It always helps to support a request with system info such as OS and shell, the related environment (variables, options), preferred tools, adequate (representative) sample input and desired output data and the logic connecting the two, and, if there are any, system (error) messages verbatim, to avoid ambiguity and keep people from guessing.

This is one of the prevalent problems in these fora - did you try searching for solutions? One approach would be to read / append lines until the field count is correct.
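
The "append lines until the field count is correct" idea could look roughly like the sketch below. It is only an illustration under stated assumptions: the function name `join_broken_records` is made up, no field contains a comma or quotes, the pieces are joined with a single space (use an empty string instead if you prefer), and the break never falls inside the very last field (there the count is already complete before the continuation line arrives).

```shell
# Hypothetical helper: glue physical lines together until a record
# holds the expected number of comma-separated fields.
# Usage: join_broken_records 60 < input.csv > output.csv
join_broken_records() {
    awk -v want="${1:-60}" '
        {
            sub(/\r$/, "")                      # drop a trailing carriage return, if any
            buf = have ? (buf " " $0) : $0      # append this line to the pending record
            have = 1
            if (split(buf, parts, ",") >= want) {   # enough fields: record is complete
                print buf
                buf = ""
                have = 0
            }
        }
        END { if (have) print buf }             # flush an incomplete trailing record
    '
}
```

Because it works line by line, this avoids turning the whole file into one giant line before processing it.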
# 3  
Old 08-01-2017
Hi Bobby_2000,
Expanding a little bit on what RudiC has already said...

How big are the files you're trying to process?

What operating system are you using?

What output do you get from the following command?
Code:
getconf LINE_MAX

(Note that sed is only specified to work on text files and you are turning your input files into a single, partial line to be processed by sed. By definition, a text file can't have any lines with more bytes than the number printed by the above command and each line has to have a <newline> character line terminator. Some versions of sed will let you get by with some input files that have long lines, missing line terminators, or both; others won't.)
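
To get a feeling for whether a file stays within that limit, something like the following sketch might help. The function names are invented for illustration; `LC_ALL=C` is there so that awk's `length` counts bytes rather than characters, and one byte is added per line for the <newline> terminator.

```shell
# Hypothetical helper: print the byte length (including the newline)
# of the longest line in the given file(s) or standard input.
longest_line() {
    LC_ALL=C awk '{ n = length($0) + 1; if (n > max) max = n } END { print max + 0 }' "$@"
}

# Hypothetical helper: succeed if some line in file $1 is longer than LINE_MAX.
exceeds_line_max() {
    limit=$(getconf LINE_MAX)
    longest=$(longest_line "$1")
    [ "$longest" -gt "$limit" ]
}

# Example:
#   if exceeds_line_max input.csv; then echo "lines too long for a text file"; fi
```

Running it on the output of the `tr '\n' '§'` step rather than on the original file shows how long the single line handed to sed really is.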

Please show us some sample input that produces output that doesn't match what you want (in CODE tags), show us the output you get with the pipeline you showed us in post #1 (in CODE tags) with that sample input, and show us the output you want (also in CODE tags) from that sample input.

And, please show us any diagnostics produced by your pipeline exactly as they are printed (also in CODE tags) if there are any.
# 4  
Old 08-01-2017
Code:
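# A line containing a comma starts a new output line; comma-free lines are appended to the previous one.
awk '/,/ {if (e) print e; if (NR>1 && !e) print ""; printf "%s", $0; e=""} ! /,/ {e=e $0} END {print e}' infile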
