Remove newline character from column spread over multiple lines in a file

Post 303038972 by Prathmesh on Wednesday 18th of September 2019 01:16:26 PM (UNIX for Beginners Questions & Answers)

Hi,

I recently ran into an issue where one column of the table from which I create my input file contains newline characters, so a single record ends up spread over multiple lines in the file. Fields in the file are separated by a pipe (|) delimiter. Since the header never contains a newline character, I compare the number of fields in each row with the number of fields in the header; if a row has fewer fields than the header, I remove the newline character at the end of that line. I got this working for a record spread over two lines, but I do not get the correct output when a record is spread over more than two lines.

Below are a test input file and the expected output file -

Input file -

Code:

$ cat input
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is
KATHMANDU|Nepali
4|QATAR|DOHA
is capital of
QATAR|Urdu
5|INDIA|capital
of
INDIA
is DELHI|Hindi
$

Expected output file -

Code:

id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is KATHMANDU|Nepali
4|QATAR|DOHA is capital of QATAR|Urdu
5|INDIA|capital of INDIA is DELHI|Hindi

The code below worked for a record spread over two lines, but not for records spread over three or more lines -

Code:

$ awk -F"|" '{if(NR==1){COL=NF}}{if(NF < COL){ sub(/\n/, ""); T=$0; getline; print T $0; next}}1' input
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is KATHMANDU|Nepali
4|QATAR|DOHA is capital of
QATAR|Urdu5|INDIA|capital
of INDIA
is DELHI|Hindiis DELHI|Hindi
$
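The one-liner above calls getline only once per short line, so it can join at most two physical lines, and it never checks whether getline succeeded (at end of file `$0` is left unchanged, which is why the last line is printed twice). Note also that `sub(/\n/, "")` has no effect, because awk strips the record separator before `$0` is set. A minimal sketch of the same idea with a loop follows. It assumes the embedded newline never falls in the last column, that there are no blank lines between records, and that the pieces should be joined with a single space; the function name `join_with_getline` is made up for illustration.

```shell
# join_with_getline [FILE...]: hypothetical helper, not from the original post.
# Joins continuation lines until each record has as many fields as the header.
join_with_getline() {
    awk -F'|' '
    NR == 1 { ncol = NF }              # the header defines the expected field count
    {
        rec = $0
        sub(/[ \t\r]+$/, "", rec)      # drop trailing blanks so the join adds exactly one space
        # Keep reading while the joined record is still short of fields.
        # getline returns 1 on success, 0 at end of file and -1 on error.
        while (split(rec, parts, "|") < ncol && (getline line) > 0) {
            sub(/[ \t\r]+$/, "", line)
            rec = rec " " line
        }
        print rec
    }' "$@"
}
```

It can be called as `join_with_getline input`. The field count is recomputed on the joined text with `split`, so a field that was broken across lines is counted once, not once per fragment.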

I also tried the code below, but it does not give the expected output -

Code:

 $ awk -F"|" '{if(NR==1){COL=NF}}{
> L_NF=NF
> C_NR=NR
> NL=$0
> CNT=0
> while(L_NF != COL)
> {
> C_NF=NF
> sub(/\n/, "");
> getline;
> NL=NL" "$0;
> CNT=+1
> L_NF=C_NF+NF
> }
> print NL
> }
> {
> for(i=0;i<=CNT;i++)
> {
> next
> }
> {
> print $0
> }}' input
id|country|desscription|Language
1|UNITED STATES|WASHINGTON, D.C.|English
2|UNITED KINGDOM|Capital of UK is LONDON|English
3|NEPAL|Capital of NEPAL is  KATHMANDU|Nepali 4|QATAR|DOHA  is capital of
QATAR|Urdu 5|INDIA|capital  of
INDIA  is DELHI|Hindi is DELHI|Hindi
$
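A few things in this second attempt are likely to cause trouble: `CNT=+1` assigns 1 instead of incrementing, `L_NF=C_NF+NF` adds only the previous line's field count and counts the field that was split across lines twice, and the return value of getline is never checked. One way to sidestep all of this is to avoid getline and accumulate lines in a buffer instead. The sketch below does that, under the same assumptions as before (no newline inside the last column, pieces joined with one space); `join_records` and the variable names are illustrative only.

```shell
# join_records [FILE...]: hypothetical helper, not from the original post.
# Buffers physical lines and prints a record once it has enough fields.
join_records() {
    awk -F'|' '
    NR == 1 { ncol = NF }                    # expected number of fields, taken from the header
    {
        sub(/[ \t\r]+$/, "")                 # trim trailing blanks on the current line
        buf = (buf == "" ? $0 : buf " " $0)  # append the line to the pending record
        if (split(buf, parts, "|") >= ncol) {
            print buf                        # record is complete
            buf = ""
        }
    }
    END { if (buf != "") print buf }         # flush a truncated final record as-is
    ' "$@"
}
```

Run as `join_records input > output`. Because each line is processed by the normal awk read loop, no bookkeeping of skipped lines (the `CNT` counter above) is needed.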

Can someone please help me with this?

Prathmesh
