How to cut a pipe delimited file and paste it with another file to form a comma separated outputfile


 
# 15  
Old 10-05-2014
1st: This is a UNIX and Linux forum. Unless explicitly told otherwise, we do not expect input files using DOS <carriagereturn><newline> line terminators; we expect UNIX and Linux <newline> line terminators. So all of the code we suggested that adds quotes ends up with a final field that just contains a quoted <carriagereturn> character.

2nd: Running the following script with the A4.txt that you provided:
Code:
#!/bin/ksh
sed 's/|/","/g; s/^/"/; s/$/"/' A4.txt > A5sed.txt
awk -F'|' -v OFS='","' 'NF{$1 = $1; $0 = "\"" $0 "\""}1' A4.txt > A5q.txt
awk -F'|' -v OFS=',' '{$1 = $1}1' A4.txt > A5nq.txt

produces the following in the three output files (shown using cat -v to make the carriage returns visible):
Code:
$ cat -v A5sed.txt
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      250000.00","      121232.87","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.47.32.050493              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      500000.00","      242465.75","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.47.32.050627              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      315285.83","      152882.45","           0.00","B64C8100                                ","01/01/0001                              ","2014-08-19-00.45.47.744917              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      520376.42","      250916.73","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.47.20.793454              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","     1000131.51","      482246.54","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.47.20.793644              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      150037.33","       72344.30","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.46.47.306701              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      646358.39","      311668.70","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.47.08.815658              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      110000.00","       53041.08","           0.00","B64C8100                                ","08/04/2014                              ","2014-08-19-00.46.50.346962              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      158213.08","      160383.33","         750.00","B64C8100                                ","08/23/2012                              ","2014-08-19-00.45.33.451061              ","^M"
"DEP","2","08/19/2014","SECOND TEST FILE DESCRIPTION                      ","      140383.43","      132266.13","        1400.00","B64C8100                                ","09/06/2012                              ","2014-08-19-00.45.33.451359              ","^M"
$

The contents of A5q.txt are identical to the contents of A5sed.txt.
Code:
$ cat -v A5nq.txt
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      250000.00,      121232.87,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.47.32.050493              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      500000.00,      242465.75,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.47.32.050627              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      315285.83,      152882.45,           0.00,B64C8100                                ,01/01/0001                              ,2014-08-19-00.45.47.744917              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      520376.42,      250916.73,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.47.20.793454              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,     1000131.51,      482246.54,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.47.20.793644              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      150037.33,       72344.30,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.46.47.306701              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      646358.39,      311668.70,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.47.08.815658              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      110000.00,       53041.08,           0.00,B64C8100                                ,08/04/2014                              ,2014-08-19-00.46.50.346962              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      158213.08,      160383.33,         750.00,B64C8100                                ,08/23/2012                              ,2014-08-19-00.45.33.451061              ,^M
DEP,2,08/19/2014,SECOND TEST FILE DESCRIPTION                      ,      140383.43,      132266.13,        1400.00,B64C8100                                ,09/06/2012                              ,2014-08-19-00.45.33.451359              ,^M
$

All of these are exactly what we would expect for the input file you provided!

To get the output you showed us, you had to use different commands than those we suggested you use. (Most likely you are using the wrong quotes in -F'|' or are using something like ¦ instead of |.)

To correctly process your DOS files on UNIX systems, change the DOS line terminators in your input file to UNIX line terminators using:
Code:
dos2unix input output

where input is a DOS file and output is the name of the file you want to create with corrected line terminators.
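
If dos2unix is not installed, the same check and cleanup can be sketched with standard tools. This is only an illustration: the file names sample_dos.txt and sample_unix.txt and the two sample records are made up for the example and are not the poster's A4.txt.

```shell
#!/bin/sh
# Hypothetical file names, used only for this illustration.
in=sample_dos.txt
out=sample_unix.txt

# Build a two-line sample with DOS <carriagereturn><newline> terminators.
printf 'DEP|2|first\r\nDEP|2|second\r\n' > "$in"

# Count the lines that contain a carriage return (0 means UNIX terminators).
cr=$(printf '\r')
crlines=$(grep -c "$cr" "$in")
echo "lines containing a carriage return: $crlines"

# Remove every carriage return; tr is available where dos2unix is not.
tr -d '\r' < "$in" > "$out"

# With the carriage returns gone, none ends up inside the last quoted field.
sed 's/|/","/g; s/^/"/; s/$/"/' "$out"
```

Running cat -v on the cleaned file should no longer show ^M at the end of each line.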
# 16  
Old 10-05-2014
If none of the above suggestions work and you have exactly copied the commands suggested, try changing:
Code:
awk -F'|' ...

to:
Code:
awk -F'[|]' ...

and see if that makes any difference with the awk on AIX systems.
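
A quick way to see whether the two forms behave the same with a given awk is to count the fields on a known line. In an extended regular expression a bare | means alternation, while [|] is always a literal pipe; a single-character FS other than a space is supposed to be taken literally, so a conforming awk should report the same count for both. The sample record below is made up for this sketch.

```shell
#!/bin/sh
# One sample record with four pipe-separated fields.
line='DEP|2|08/19/2014|B64C8100'

# Field count with a plain single-character separator.
plain=$(printf '%s\n' "$line" | awk -F'|' '{print NF}')

# Field count with the pipe inside a bracket expression.
bracket=$(printf '%s\n' "$line" | awk -F'[|]' '{print NF}')

echo "plain: $plain  bracket: $bracket"
```

If the two numbers differ, or either one is not 4, the awk in use is not splitting on the pipe the way the suggested commands assume.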
# 17  
Old 10-06-2014
Two things, Don:

1) I still didn't get the AWk working for either case (q and nq). I also tried:
Code:
awk -F'[|]' -v OFS='","' 'NF{$1 = $1; $0 = "\"" $0 "\""}1' A4.txt > A5q.csv

I don't know if it has something to do with AIX.

But the SED works well. I am wondering whether it takes care of both q and nq; if not, can you please help me with that sed code for q?

2) The second thing is that I don't know about DOS vs. UNIX line terminators. However, I didn't see the necessity of using dos2unix input output, as the sed code gives me a perfect CSV output which, when transferred back to Windows using WinSCP text mode, opens perfectly aligned in Microsoft Excel.
Please tell me where I should use dos2unix.

Last edited by Don Cragun; 10-06-2014 at 03:54 AM.. Reason: Fix CODE tags again.
# 18  
Old 10-06-2014
Quote:
Originally Posted by etldev
Two things, Don:

1) I still didn't get the AWk working for either case (q and nq). I also tried:
Code:
awk -F'[|]' -v OFS='","' 'NF{$1 = $1; $0 = "\"" $0 "\""}1' A4.txt > A5q.csv

I don't know if it has something to do with AIX.

But the SED works well. I am wondering whether it takes care of both q and nq; if not, can you please help me with that sed code for q?

2) The second thing is that I don't know about DOS vs. UNIX line terminators. However, I didn't see the necessity of using dos2unix input output, as the sed code gives me a perfect CSV output which, when transferred back to Windows using WinSCP text mode, opens perfectly aligned in Microsoft Excel.
Please tell me where I should use dos2unix.

I know that awk (not AWk and not AWK) works OK on AIX. Something else is going on here.

I assume that you have the three commands I suggested in a file that you executed to get the results you got. Show us the output from the command:
Code:
od -bc file

where file is the name of the file containing those commands.
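
As a rough guide to reading that output, here is a small sketch. The file name demo_cmds.sh and its one-line content are invented for the demonstration; the file is deliberately written with a DOS line terminator so there is something to find.

```shell
#!/bin/sh
# Hypothetical script file written with a DOS line terminator for the demo.
printf 'awk -F"|" 1 A4.txt\r\n' > demo_cmds.sh

# -b prints each byte in octal, -c prints it as a character or escape.
od -bc demo_cmds.sh

# A carriage return appears as \r in the -c rows and as 015 in the -b rows.
found=$(od -c demo_cmds.sh | grep -c '\\r')
echo "od rows showing a carriage return: $found"
```

In the same dump, a real pipe symbol shows up as | (octal 174). A look-alike character such as ¦ is not ASCII and would show up as one or more bytes with octal values above 177.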

The sed command I gave you (copied from MadeInGermany's much earlier suggestion) changes all pipe symbols to "," and then adds " to the start and end of each line. You said that is what you want. I don't know what you mean by "help me with that sed code for q"???

You say that having the carriage return at the end of your input lines included between quotes in the last input field is what you want. If that is true, you don't need to worry about dos2unix. (I don't believe you, but if that is what you want, there is no reason to try to change it.)
# 19  
Old 10-06-2014
I'm very puzzled by Don's trick using
Code:
...{$1=$1 ; ... .

It makes awk rebuild $0 in a specific way, using the FS and OFS variables.

Of course it works fine on my PC's Cygwin version, as I have the most recent version of awk.
It looks rather like an undocumented feature, which may have unpredictable effects on older versions.

You can check your own version of awk using the following (this option is understood by gawk; other awk implementations may reject it):
Code:
awk -V

So far it seems the best option is still to use the original suggestion made by MadeInGermany:

Code:
sed 's/|/","/g; s/^/"/; s/$/"/'

We can translate this to awk, for instance, using gsub, the exact equivalent of sed's s command with the g flag.
Let's also add Don's NF test to preserve empty lines.

Try this:
Code:
$ awk -F'|' 'NF{ gsub(/\|/,"\",\"") ; $0 = "\"" $0 "\"" }1'
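
To see what this does, here is a small sketch that feeds the command three sample lines, the middle one empty. The records are made up for the example and are not taken from A4.txt.

```shell
#!/bin/sh
# Two records separated by an empty line (sample data only).
out=$(printf 'DEP|2|first\n\nDEP|2|second\n' |
    awk -F'|' 'NF{ gsub(/\|/,"\",\"") ; $0 = "\"" $0 "\"" }1')

# Show the converted lines; the empty line is passed through untouched.
printf '%s\n' "$out"
```

The NF test skips the quoting for the empty line, so it stays empty instead of becoming a pair of quotes.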

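Regarding the dos2unix tool: the opposite conversion (UNIX to DOS line terminators, which is what unix2dos does) can easily be done with a short sed line, provided your sed understands \r in the replacement (GNU sed does):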
Code:
$ sed 's/$/\r/'  unixfile > dosfile

but of course winscp does this conversion perfectly.

Jean-Paul

Last edited by blastit.fr; 10-06-2014 at 04:58 AM.. Reason: typoes
# 20  
Old 10-06-2014
Quote:
Originally Posted by blastit.fr
I'm very puzzled by Don's trick using
Code:
...{$1=$1 ; ... .

It makes awk rebuild $0 in a specific way, using the FS and OFS variables.

Of course it works fine on my PC's Cygwin version, as I have the most recent version of awk.
It looks rather like an undocumented feature, which may have unpredictable effects on older versions.

... ... ...

Jean-Paul
This is not some undocumented trick. From the standards:
Quote:
The symbol $0 shall refer to the entire record;
setting any other field causes the re-evaluation of $0.
Assigning to $0 shall reset the values of all other fields and the NF built-in variable.
Part of that re-evaluation includes using OFS as the field delimiter when that record is printed.

This might not work in /usr/bin/awk on Solaris systems (a 1970s-vintage awk), but will work on any 1988 or later version of awk, which will be installed as awk, gawk, or mawk on most systems; and as /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, and /usr/bin/nawk on Solaris systems.
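
A minimal sketch of that behaviour, using a made-up three-field record: the first command never assigns to a field, so $0 is printed as it was read; the second assigns $1 to itself, which makes awk rebuild $0 from the fields joined by OFS.

```shell
#!/bin/sh
# Without an assignment to a field, $0 is printed unchanged.
untouched=$(echo 'a|b|c' | awk -F'|' -v OFS=',' '1')

# Assigning $1 to itself makes awk rebuild $0 from the fields, joined by OFS.
rebuilt=$(echo 'a|b|c' | awk -F'|' -v OFS=',' '{$1 = $1}1')

echo "untouched: $untouched"
echo "rebuilt:   $rebuilt"
```

The first line printed should still contain the pipes (a|b|c) and the second should contain commas (a,b,c).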
# 21  
Old 10-07-2014
The sed code works fine, but the only issue I am faced with now is that once the pipes are converted to commas (i.e. the file is made a CSV) and the file is opened in Microsoft Excel, I lose the leading zeroes in some fields. Is there any way to keep the leading zeroes from being dropped in Excel?

Code:
sed 's/|/","/g; s/^/"/; s/$/"/' A4.txt > A5.csv


Last edited by vbe; 10-07-2014 at 09:26 AM.. Reason: code -in between the code tags!