Replace CRLF between pipe (|) delimiter with LF


 
# 1  
Old 12-12-2017

Hi Folks!

I need a solution for the following:

Source data
-------------
Code:
123|123|<CRLF><CRLF><CRLF>|321<CRLF>

Required output
------------------
Code:
123|123|<LF><LF><LF>|321<CRLF>

<CRLF> represents carriage return
<LF> represents line feed

I've been hunting high and low for a suitable awk or sed statement to get the ball rolling, but have not turned up anything that works yet.

Appreciate your expertise!

Zz
# 2  
Old 12-12-2017
Does <CRLF> represent a (binary control character) carriage return OR a <CR><LF> combination? Which should persist at line end? Are you aware that both the original data and the result will be difficult to handle with the usual *nix text tools?
Are there more lines like above in your file? How are those separated?

Last edited by RudiC; 12-12-2017 at 01:48 PM..
# 3  
Old 12-12-2017
Hi Rudi,

<CRLF> represents the binary control character for carriage return. It shows up as ^M in vi.

The record delimiter is <CRLF> and should remain as it is. It is the <CRLF> within a field (between two pipes) that should be converted to <LF>.

And yes, unfortunately there are many more lines in the file with similar issues in the data.

I do understand it's going to be tricky to handle with the usual Unix tools, but I am looking for any possible approach, just to try it out :)

Zz
# 4  
Old 12-12-2017
Sorry to keep nagging. How are the lines separated, and how does that differ from the in-field control characters? Are you sure there's NO <LF> char?
Please post the output of
Code:
od -tx1c file
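
A tally of line-ending types can complement the byte dump when the file is large. The following is only a sketch and not something posted in the thread: the function name `count_line_endings` is made up for illustration, and it assumes the input ends with a line break and that the awk in use accepts `\r` in a regular expression (as POSIX awk specifies).

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): classify every line break on stdin.
# Assumption: the input ends with a line break, so each awk record
# corresponds to exactly one LF in the data.
count_line_endings() {
    awk '
    {
        if (sub(/\r$/, "")) crlf++   # CR directly before the LF: a CRLF pair
        else lf++                    # LF with no CR in front of it
        cr += gsub(/\r/, "")         # any CR still left is not part of a CRLF
    }
    END { printf "CRLF=%d LF=%d CR=%d\n", crlf, lf, cr }'
}
```

It would be called as `count_line_endings < file`. A result with LF=0 and CR=0 would suggest that every line break in the file is a <CR><LF> pair.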
# 5  
Old 12-12-2017
No worries Rudi! I am the needy one here hehe..

Code:
0000000  32  30  31  36  2d  31  31  2d  33  30  7c  32  30  31  36  2d
          2   0   1   6   -   1   1   -   3   0   |   2   0   1   6   -
0000020  32  30  31  37  7c  32  30  31  36  2d  31  31  2d  33  30  7c
          2   0   1   7   |   2   0   1   6   -   1   1   -   3   0   |
0000040  31  32  33  34  7c  73  6f  6d  65  66  69  6c  65  2e  74  78
          1   2   3   4   |   s   o   m   e   f   i   l   e   .   t   x
0000060  74  7c  50  72  6f  64  75  63  74  69  6f  6e  7c  4e  6f  7c
          t   |   P   r   o   d   u   c   t   i   o   n   |   N   o   |
0000100  7c  7c  4c  4f  7c  7c  43  65  6e  74  65  72  7c  7c  4e  6f
          |   |   L   O   |   |   C   e   n   t   e   r   |   |   N   o
0000120  7c  7c  7c  31  32  33  34  7c  49  6d  70  6f  72  74  61  6e
          |   |   |   1   2   3   4   |   I   m   p   o   r   t   a   n
0000140  74  7c  3c  20  24  32  30  20  4d  69  6c  6c  69  6f  6e  7c
          t   |   <       $   2   0       M   i   l   l   i   o   n   |
0000160  51  75  61  72  74  65  72  6c  79  7c  7c  7c  7c  32  30  31
          Q   u   a   r   t   e   r   l   y   |   |   |   |   2   0   1
0000200  31  2d  30  32  2d  32  34  7c  7c  7c  53  6f  6d  65  20  64
          1   -   0   2   -   2   4   |   |   |   S   o   m   e       d
0000220  65  73  63  72  69  70  74  69  6f  6e  20  68  65  72  65  7c
          e   s   c   r   i   p   t   i   o   n       h   e   r   e   |
0000240  0d  0a  0d  0a  0d  0a  0d  0a  0d  0a  0d  0a  55  70  64  61
         \r  \n  \r  \n  \r  \n  \r  \n  \r  \n  \r  \n   U   p   d   a
0000260  74  65  20  73  6f  6d  65  74  68  69  6e  67  7c  74  65  73
          t   e       s   o   m   e   t   h   i   n   g   |   t   e   s
0000300  74  66  69  6c  65  2e  74  78  74  7c  48  69  73  68  61  6d
          t   f   i   l   e   .   t   x   t   |   H   i   s   h   a   m
0000320  0d  0a
         \r  \n
0000322

This is a sample record as it appears in the file. The CRLF can appear in any of the columns before the last one.

Zz
# 6  
Old 12-12-2017
That's one single line, obviously. And, obviously, as anticipated, we're talking of <CR><LF> combinations. How do you tell one line from another? Do they all have the same field count? Do they all have the same <CR> count?
# 7  
Old 12-12-2017
Yes Rudi. It is a single sample record.

The field count should be consistent. That is, if there are 7 fields there ought to be 6 pipes in the data, and that is how you identify a complete record.

For example:

Code:
2016-11-30|2016-2017|2016-11-30|123|123.xlsm|Production|No|||AHB||Center||No|||2222|Unit Important|< $20 Million|Quarterly||||2011-02-24|||Some descripto|





Mandatory Fiel|xlsm|Hisham
2016-11-30|2016-2017|2016-11-30|3123|123.xlsm|Production|No|||AHB||Center||No|||2222|Unit Important|< $20 Million|Quarterly||||2011-02-24|||Some descripto|





Mandatory Fiel|xlsm|Hisham

Each record has 30 fields (29 pipes). So an entire record should contain 29 pipes for it to be considered one single record.

Which is where the question comes in: how can we remove only the <CRLF> between the pipes?

Zz
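
The pipe count described above suggests one possible approach: keep appending physical lines until the buffer holds the expected number of pipes, turning every line break seen before that point into a bare LF. The following is a minimal sketch under stated assumptions, not a solution posted in the thread. The function name `fix_infield_crlf` is made up for illustration. It assumes every record has exactly the given number of pipes, that the last field never contains a line break (as stated in post #5), and that the awk in use accepts `\r` in a regular expression.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): rejoin records split by in-field CRLFs.
# $1 = number of pipes in a complete record (29 for the data in this thread).
# Assumptions: fixed pipe count per record; no line break inside the last field.
fix_infield_crlf() {
    awk -v need="$1" '
    {
        buf = buf $0                          # $0 still ends with the CR
        if (gsub(/[|]/, "|", buf) < need) {   # gsub returns the number of pipes
            sub(/\r$/, "", buf)               # in-field break: drop the CR ...
            buf = buf "\n"                    # ... and keep a bare LF
            next
        }
        printf "%s\n", buf                    # complete: trailing CR + LF = CRLF
        buf = ""
    }
    END { if (buf != "") printf "%s", buf }   # pass through incomplete leftovers
    '
}
```

For the file described here it would be called as `fix_infield_crlf 29 < file > file.fixed`. The record-terminating <CRLF> stays as it is because awk's default record separator consumes only the LF, which `printf` then puts back after the CR that is still in the buffer.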