The UNIX and Linux Forums  
#1  05-17-2007
nelson553011 (Registered User)
Join Date: Sep 2004
Location: Minnesota
Posts: 13
Help Replacing Characters in Flat File

I was wondering if somebody could help me with something on UNIX. I have a file that looks like this -

"nelson,bill","bill","123 Main St","Mpls","MN",55444,8877,william

I want to replace all commas with pipes (|), except where the comma is within double quotes. (The first field is an example of this.) I can't just run a sed that looks for "," and replaces it with a pipe, because not all fields have double quotes around them. Side note: I do not have access to Perl, so that's not an option.
#2  05-17-2007
Shell_Life (Registered User)
Join Date: Mar 2007
Location: Bahia, Brazil
Posts: 695
Nelson,
See if this works for you:
Code:
sed -e 's/",/"|/g' -e 's/\([0-9]\),/\1|/g' input_file
#3  05-17-2007
vgersh99 (Forum Staff, Moderator)
Join Date: Feb 2005
Location: Boston, MA
Posts: 5,119
Shell,
That won't work for this pattern, because the last comma follows neither a double quote nor a digit:
Code:
"nelson,bill","bill","123 Main St","Mpls","MN",55444,8877,william,foo
#4  05-17-2007
Shell_Life (Registered User)
Join Date: Mar 2007
Location: Bahia, Brazil
Posts: 695
Vgersh,
I agree with you, but the sample is not precise: it treats two strings
in two different ways:
Quote:
"bill"
william
I tried to solve the problem based on the sample data.
Thank you for analyzing it.
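
For readers following along, the behaviour discussed in posts #2 through #4 can be reproduced with a short shell sketch. The function name quote_or_digit_sed is made up for illustration; the two sed expressions are the ones from post #2, and the input is the line from post #3.

```shell
# Hypothetical wrapper around the sed command from post #2.
# It turns a comma into a pipe only when the comma directly follows
# a double quote or a digit.
quote_or_digit_sed() {
  sed -e 's/",/"|/g' -e 's/\([0-9]\),/\1|/g'
}

# The line from post #3: the last two fields are unquoted text.
line='"nelson,bill","bill","123 Main St","Mpls","MN",55444,8877,william,foo'
printf '%s\n' "$line" | quote_or_digit_sed
# prints: "nelson,bill"|"bill"|"123 Main St"|"Mpls"|"MN"|55444|8877|william,foo
```

The comma between william and foo survives because it follows a letter. The second expression also knows nothing about quoting, so a comma that follows a digit inside a quoted field would be turned into a pipe as well.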
#5  05-17-2007
nelson553011 (Registered User)
Join Date: Sep 2004
Location: Minnesota
Posts: 13
Thanks...But what if

That worked for the sample that I gave you, but I thought of another scenario that I need to account for. What would happen if I added another comma to the end and then some more text? Sample -

"nelson,bill,jr","bill","123 Main St","Mpls","MN",55444,8877,william,bill

I tried modifying your sed command and couldn't figure out how to make it work.
#6  05-17-2007
vgersh99 (Forum Staff, Moderator)
Join Date: Feb 2005
Location: Boston, MA
Posts: 5,119
echo '"nelson,bill,jr","bill","123 Main St","Mpls","MN",55444,8877,william,bill' | nawk -f doCSV.awk

doCSV.awk:
Code:
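# FS is a character unlikely to appear in the data; output fields are joined with "|".
BEGIN { FS=SUBSEP; OFS="|" }

{
  result = setcsv($0, ",")   # rebuilds $0 with the CSV fields as $1..$NF
  print
}

# setcsv(str, sep): split the CSV record str (separator sep) into $1..$NF.
# i and middle are local variables.
function setcsv(str, sep, i, middle) {
  gsub(/""/, "\035", str)    # protect doubled quotes
  gsub(sep, FS, str)         # turn every separator into FS

  # Inside each quoted section, restore the separators and drop the quotes.
  while (match(str, /"[^"]*"/)) {
    middle = substr(str, RSTART+1, RLENGTH-2)
    gsub(FS, sep, middle)
    str = sprintf("%.*s%s%s", RSTART-1, str, middle,
      substr(str, RSTART+RLENGTH))
  }

  if (index(str, "\"")) {
    # Unbalanced quote: the field continues on the next line, so append it and retry.
    # RT is set only by gawk; other awks fall back to RS.
    return ((getline) > 0) ? setcsv(str (RT != "" ? RT : RS) $0, sep) : !setcsv(str "\"", sep)
  } else {
    gsub(/\035/, "\"", str)  # restore the protected quotes
    $0 = str

    # A field made up only of quotes loses its first quote.
    for (i = 1; i <= NF; i++)
      if (match($i, /^"+$/))
        $i = substr($i, 2)

    $1 = $1 ""               # force $0 to be rebuilt with OFS
    return 1
  }
}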
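
A simpler alternative, offered only as a sketch and not as part of vgersh99's script: if the quotes should stay in the output and no quoted field spans more than one line, a character-by-character pass in awk can flip a flag at every double quote and replace only the commas seen while the flag is off. The function name commas_to_pipes and the variable names are assumptions made for this example.

```shell
# Hypothetical helper: replace commas that sit outside double quotes with pipes.
# Assumes each record is a single line and the data contains no pipes already.
commas_to_pipes() {
  awk '{
    out = ""
    inq = 0                               # 1 while inside a quoted field
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\"")
        inq = !inq                        # every double quote flips the flag
      else if (c == "," && !inq)
        c = "|"                           # separator outside quotes
      out = out c
    }
    print out
  }'
}

line='"nelson,bill,jr","bill","123 Main St","Mpls","MN",55444,8877,william,bill'
printf '%s\n' "$line" | commas_to_pipes
# prints: "nelson,bill,jr"|"bill"|"123 Main St"|"Mpls"|"MN"|55444|8877|william|bill
```

A doubled quote inside a quoted field flips the flag twice, so the state after it is unchanged. Unlike doCSV.awk, this sketch keeps the quotes and makes no attempt to join a quoted field that continues on the next line.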
#7  05-17-2007
Shell_Life (Registered User)
Join Date: Mar 2007
Location: Bahia, Brazil
Posts: 695
Nelson,
Per my previous reply, your sample led me to believe the following:
1) Every non-numeric field would be surrounded by double quotes,
except for "william".
2) No numeric field would be surrounded by double quotes.
If the specs are ambiguous, no final solution can be found.