Sponsored Content
Top Forums UNIX for Beginners Questions & Answers Remove newline character from column spread over multiple lines in a file Post 303038972 by Prathmesh on Wednesday 18th of September 2019 01:16:26 PM
Old 09-18-2019
Remove newline character from column spread over multiple lines in a file

Hi,

I came across one issue recently where output from one of the columns of the table from where i am creating input file has newline characters hence, record in the file is spread over multiple lines. Fields in the file are separated by pipe (|) delimiter. As header will never have newline character, I am trying to compare if other rows have same number of fields as that of header and if number of fields in particular row is less than number of fields in header line then I am removing newline character at the end of the line. I was able to do this for row spread over two lines but, I am not getting correct output for lines spread over multiple lines.

Below is test input file and expected output file -

Input file -
Code:
$ cat input
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is
KATHMANDU|Nepali
4|QATAR|DOHA
is capital of
QATAR|Urdu
5|INDIA|capital
of
INDIA
is DELHI|Hindi
$

Expected output file -
Code:
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is KATHMANDU|Nepali
4|QATAR|DOHA is capital of QATAR|Urdu
5|INDIA|capital of INDIA is DELHI|Hindi

Below code worked for row spread over two lines -
Code:
$ awk -F"|" '{if(NR==1){COL=NF}}{if(NF < COL){ sub(/\n/, ""); T=$0; getline; print T $0; next}}1' input
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is KATHMANDU|Nepali
4|QATAR|DOHA is capital of
QATAR|Urdu5|INDIA|capital
of INDIA
is DELHI|Hindiis DELHI|Hindi
$

I also tried below code but it is not giving expected output -
Code:
 $ awk -F"|" '{if(NR==1){COL=NF}}{
> L_NF=NF
> C_NR=NR
> NL=$0
> CNT=0
> while(L_NF != COL)
> {
> C_NF=NF
> sub(/\n/, "");
> getline;
> NL=NL" "$0;
> CNT=+1
> L_NF=C_NF+NF
> }
> print NL
> }
> {
> for(i=0;i<=CNT;i++)
> {
> next
> }
> {
> print $0
> }}' input
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is  KATHMANDU|Nepali 4|QATAR|DOHA  is capital of
QATAR|Urdu 5|INDIA|capital  of
INDIA  is DELHI|Hindi is DELHI|Hindi
$

Can someone please help me in this?
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

Can I spread commands over multiple lines?

Below an example of what I mean. The first attempt does what I want; the second doesn't, because bash assumes a line break means the end of an individual "command unix". Is there some way that I can convince bash to parse out, eg, to the closing parenthesis? I'm thinking this would allow for... (1 Reply)
Discussion started by: tphyahoo
1 Replies

2. Shell Programming and Scripting

removing pattern which is spread in multiple lines

I have several huge files wich contains oracle table creation scripts as follows: I would need to remove the pattern colored in red above. Any sed/awk/pearl code will be of much help. Thanks (2 Replies)
Discussion started by: sabyasm
2 Replies

3. Shell Programming and Scripting

How to remove a newline character at the end of filename

Hi All, I have named a file with current date,time and year as follows: month=`date | awk '{print $2}'` date=`date | awk '{print $3}'` year=`date | awk '{print $6}'` time=`date +%Hh_%Mm_%Ss'` filename="test_"$month"_"$date"_"$year"_"$time".txt" > $filename The file is created with a... (2 Replies)
Discussion started by: amio
2 Replies

4. Shell Programming and Scripting

To remove the newline character while appending into a file

Hi All, We append the output of a file's size in a file. But a newline character is appended after the variable. Pls help how to clear this. filesize=`ls -l test.txt | awk `{print $5}'` echo File size of test.txt is $filesize bytes >> logfile.txt The output we got is, File size of... (4 Replies)
Discussion started by: amio
4 Replies

5. Shell Programming and Scripting

Remove newline character conditionally

Hi All, I have 5000 records like this Request_id|Type|Status|Priority|Ticket Submitted Date and Time|Actual Resolved Date and Time|Current Ticket Owner Group|Case final Ticket Owner Group|Customer Severity|Reported Symptom/Request|Component|Hot Topic|Reason for Missed SLA|Current Ticket... (2 Replies)
Discussion started by: j_53933
2 Replies

6. Shell Programming and Scripting

[AWK] handeling data spread on multiple lines

Hello all, first off great forum. Now for my little problem. Using RHEL 5.4 and awk. Been doing code since a few month. So just starting. My problem is handeling data on multiple lines. { if ($1 != LASTKEY && h ~ /.*\/s_fr_/) { checkgecos( h, h ) h="" ... (2 Replies)
Discussion started by: maverick72
2 Replies

7. Shell Programming and Scripting

Remove \n <newline> character inside the records.

Hi, In my file, I have '\n' characters inside a single record. Because of this, a single records appears in many lines and looks like multiple records. In the below file. File 1 ==== 1,nmae,lctn,da\n t 2,ghjik,o\n ut,de\n fk Expected output after the \n removed File 2 =====... (5 Replies)
Discussion started by: machomaddy
5 Replies

8. Shell Programming and Scripting

Remove newline character between two delimiters

hi i am having delimited .dat file having content like below. test.dat(5 line of records) ====== PT2~Stag~Pt2 Stag Test. Updated~PT2 S T~Area~~UNCEF R20~~2012-05-24 ~2014-05-24~~ PT2~Stag y~Pt2 Stag Test. Updated~PT2 S T~Area~METR~~~2012-05-24~2014-05-24~~test PT2~Pt2 Stag Test~~PT2 S... (4 Replies)
Discussion started by: sushine11
4 Replies

9. Shell Programming and Scripting

Remove last newline character..

Hi all.. I have a text file which looks like below: abcd efgh ijkl (blank space) I need to remove only the last (blank space) from the file. When I try wc -l the file name,the number of lines coming is 3 only, however blank space is there in the file. I have tried options like... (14 Replies)
Discussion started by: Sathya83aa
14 Replies

10. Shell Programming and Scripting

How to remove newline character if it is the only character in the entire file.?

I have a file which comes every day and the file data look's as below. Vi abc.txt a|b|c|d\n a|g|h|j\n Some times we receive the file with only a new line character in the file like vi abc.txt \n (8 Replies)
Discussion started by: rak Kundra
8 Replies
COLRM(1)						    BSD General Commands Manual 						  COLRM(1)

NAME
colrm -- remove columns from a file SYNOPSIS
colrm [start [stop]] DESCRIPTION
The colrm utility removes selected columns from the lines of a file. A column is defined as a single character in a line. Input is read from the standard input. Output is written to the standard output. If only the start column is specified, columns numbered less than the start column will be written. If both start and stop columns are spec- ified, columns numbered less than the start column or greater than the stop column will be written. Column numbering starts with one, not zero. Tab characters increment the column count to the next multiple of eight. Backspace characters decrement the column count by one. ENVIRONMENT
The LANG, LC_ALL and LC_CTYPE environment variables affect the execution of colrm as described in environ(7). EXIT STATUS
The colrm utility exits 0 on success, and >0 if an error occurs. SEE ALSO
awk(1), column(1), cut(1), paste(1) HISTORY
The colrm command appeared in 3.0BSD. BSD
August 4, 2004 BSD
All times are GMT -4. The time now is 06:04 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy