Minimum whitespace separated CSV file generation


 
# 1  
Old 04-25-2009
Minimum whitespace separated CSV file generation

Hi,

I have a flat text file consisting of rows from a table, with the fields separated by '|', e.g.:

NSW|Gulliver Travels|236||5000|BW

This has to be converted to the following format

NSW "Gulliver Travels" 236   5000 BW

An empty field has to be output as a single blank character, so in the example above there are in effect three blank characters between the two neighbouring populated fields.

I also have other flat files with the same requirement, but the number of fields differs from file to file. So I need a Unix script that can be applied directly to all the flat files, irrespective of the number of columns or fields. I have been trying with awk, but so far I have only managed the following, which is not what I want:

NSW Gulliver Travels 236 5000 BW

Also, I had to hard-code the fields for this.
Any pointers to solve this would be greatly appreciated and would help me a lot.
# 2  
Old 04-25-2009
Maybe something like this:

Code:
$
$ cat data.txt
NSW|Gulliver Travels|236||5000|BW
PQR|Robinson Crusoe|999|123||XY
XYZ|The Merchant of Venice|999 abc|123||XY ZZ
XYZ LMN|Alice in Wonderland|xyz 777|123|345|PQ RS ZZ
$
$ perl -ne '{$l=""; chomp; @x=split/\|/; foreach $i (@x) { if ($i =~ / /) {$l = $l."\"".$i."\" "; } elsif ($i ne "") {$l = $l.$i." ";}}print substr($l,0,-1)."\n"}' data.txt
NSW "Gulliver Travels" 236 5000 BW
PQR "Robinson Crusoe" 999 123 XY
XYZ "The Merchant of Venice" "999 abc" 123 "XY ZZ"
"XYZ LMN" "Alice in Wonderland" "xyz 777" 123 345 "PQ RS ZZ"
$
$

HTH,
tyler_durden
# 3  
Old 04-25-2009
Here is one method for when you have a known set of columns. You can use any variable names you like for the 'read' command; I just picked some as if it were a library card file.
Code:
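# Set IFS only for 'read' so the rest of the script keeps the default;
# -r stops 'read' from treating backslashes as escapes.
while IFS="|" read -r type title shelf row section author; do
  # sed squeezes runs of spaces, so empty fields leave no gap in the output
  echo "$type \"$title\" $shelf $row $section $author" | sed 's/  */ /g'
done < pipe_delimited_file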
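
If the number of columns is not known in advance, the same kind of loop can split each row on '|' without naming the fields. This is only a sketch in plain POSIX sh: the function name pipe_to_space and its variable names are made up for illustration, and it assumes an empty field should come out as a single blank and that only fields containing a space need quotes.

```shell
# Hypothetical helper, not from the thread: reads pipe-delimited rows on
# stdin and writes them space-separated, whatever the number of columns.
pipe_to_space() {
  while IFS= read -r line; do
    out="" first=1
    old_ifs=$IFS
    set -f                      # no filename expansion while splitting
    IFS='|'
    set -- $line                # one positional parameter per field
    IFS=$old_ifs
    set +f
    for field do
      case $field in
        "")    field=" " ;;            # empty field becomes one blank
        *" "*) field="\"$field\"" ;;   # quote fields that contain a space
      esac
      if [ "$first" -eq 1 ]; then
        out=$field
        first=0
      else
        out="$out $field"
      fi
    done
    printf '%s\n' "$out"
  done
}
```

It would be called as pipe_to_space < pipe_delimited_file. A final line with no trailing newline is skipped by the read loop, so the input is assumed to end with a newline.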

# 4  
Old 04-26-2009
blank fields ignored

Hi tyler, thanks a lot for your quick reply. The script works well, but the only problem is that it doesn't take the blank fields into account. They are dropped from the output, so the next populated field follows immediately. For example:

Input

D BRUX EXCC CC||JLCSVD|1|VIC|
T BGNW BSAT SA|||7|VIC|
T PNTO SCTC E0|NGS PK B/B|JLCNSD||NSW|
P GLSE P4BR Z5||||NSW|
P GLSE P4BR Z5|PK B/B NON AUG|JLCQLD|30||
D DANB EXCC CC|OTHER|JLCSVD|1|VIC|
T PNTO SPTG E0|NGS PK B/B|JLCNSD|3|NSW|

Output

"D BRUX EXCC CC" JLCSVD 1 VIC
"T BGNW BSAT SA" 7 VIC
"T PNTO SCTC E0" "NGS PK B/B" JLCNSD NSW
"P GLSE P4BR Z5" NSW
"P GLSE P4BR Z5" "PK B/B NON AUG" JLCQLD 30
"D DANB EXCC CC" OTHER JLCSVD 1 VIC
"T PNTO SPTG E0" "NGS PK B/B" JLCNSD 3 NSW

Ideal output should have been

"D BRUX EXCC CC" JLCSVD 1 VIC (3 blank characters)
"T BGNW BSAT SA" 7 VIC (6 blank characters)
"T PNTO SCTC E0" "NGS PK B/B" JLCNSD NSW (3 blanks again)
"P GLSE P4BR Z5" NSW (9 blank characters)
"P GLSE P4BR Z5" "PK B/B NON AUG" JLCQLD 30
"D DANB EXCC CC" OTHER JLCSVD 1 VIC
"T PNTO SPTG E0" "NGS PK B/B" JLCNSD 3 NSW

Is there a way to address this in your script as well? Thanks again for your input.

ldapswandog, thanks to you as well for your input.
# 5  
Old 04-26-2009
Code:
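awk -F"|" 'BEGIN{dq="\042";}
{ 
  for(i=1;i<=NF;i++){
   if ($i ~ / /){
    $i = dq $i dq
   }
  }
  $1 = $1   # force $0 to be rebuilt with OFS (a single space) even if no field changed
  print     # without this the script produces no output
}'  file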

# 6  
Old 04-26-2009
Quote:
Originally Posted by vharsha
hi tyler, thanks a lot for your quick reply. The script works well but the only problem is that it doesn't take into account the blank fields. These blank fields are ignored in the output and the immediate populated field gets displayed.

Try this in awk:

awk -F "|" '{
  for (i = 1; i <= NF; i++) {
    if ($i == "") printf(" ")                          # extra blank for an empty field
    if (split($i, t, " ") > 1) printf(" \"%s\"", $i)   # quote fields made of several words
    else printf(" %s", $i)
  }
  printf "\n"
}' filename
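
A possible variant of the awk approach, in case the leading space that the printf(" %s") calls put at the start of every line is a problem. It is only a sketch: the wrapper name pipe_to_space_awk is made up, and it assumes each empty field should become exactly one blank, with the fields joined by single spaces.

```shell
# Hypothetical wrapper, not from the thread. Reads the named files,
# or stdin when no file is given.
pipe_to_space_awk() {
  awk -F'|' '{
    out = ""
    for (i = 1; i <= NF; i++) {
      f = $i
      if (f == "")          f = " "            # empty field: one blank
      else if (f ~ /[ \t]/) f = "\"" f "\""    # field with whitespace: quote it
      out = out (i > 1 ? " " : "") f           # join fields with one space
    }
    print out
  }' "$@"
}
```

Called as pipe_to_space_awk filename, it gives three blanks between two populated fields that have one empty field between them. A trailing '|' counts as one more empty field in awk, so such lines end with two blanks.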


cheers,
Devaraj Takhellambam
# 7  
Old 04-26-2009
it works

Many thanks, Devaraj, it works. You made my evening.
Cheers.