Replacing all but the first and last double quote in a line with a single quote with awk
From:
Is there an easy syntax I'm overlooking? There will always be an even number of quotes (0, 2, 4, 6, etc.). If present, the first quote will follow a comma, and if present, the final quote will precede an LF.
Prior to this, I have replaced any LF (but not CR+LF) with |, then replaced any CR+LF with LF, and replaced any , in the rightmost field with ; (by using a loop to look for more csv columns and OFS=;). I was not aware of the "" means " convention in the .csv standard, however.
Mike
Last edited by Michael Stora; 05-21-2015 at 04:14 PM..
I think I understand the code, but what does the 1 do? Also why is the or in quotes?
Unfortunately I often have single quotes at the end of a line, so this often fails. The following example is based on actual data:
I also think this would break with commas inside quotes (as I initially posed the question before editing). Fortunately I already replaced them with semicolons.
Mike
PS. I have kicked myself many times for not using .tsv instead of .csv at the beginning of this project . . .
Last edited by Michael Stora; 05-21-2015 at 07:01 PM..
Another way of doing this is like this, which is perhaps a bit clearer:
Which replaces all the double quotes with single quotes and then changes the first and the last single quote in the last field back to double quotes..
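A minimal sketch of that idea, assuming the quoted field is the last one on the line and no longer contains commas. The function name `requote_last_field` is made up for illustration, and `\047` is the octal escape for a single quote:

```shell
# Hypothetical helper: reads CSV lines on stdin, writes the fixed lines to stdout.
requote_last_field() {
    awk -F, -v OFS=, '
    {
        gsub(/"/, "\047")          # every double quote becomes a single quote
        sub(/^\047/, "\"", $NF)    # restore the opening quote of the last field
        sub(/\047$/, "\"", $NF)    # restore the closing quote of the last field
    }
    1'
}

printf '%s\n' '1,abc,"He said ""hi"" to me"' | requote_last_field
# prints: 1,abc,"He said ''hi'' to me"
```

Note that this version cannot tell a converted quote from a single quote that was already in the data.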
The 1 means: the condition is true, no action was specified , so perform the default action, which is {print $0}
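To illustrate, these two commands should behave the same; the sample input and the variable names `short` and `long` are only for demonstration:

```shell
# A bare 1 is a pattern that is always true with no action, so awk prints the record.
short=$(printf 'a\nb\n' | awk '{ sub(/a/, "x") } 1')

# The same program with the default action written out in full.
long=$(printf 'a\nb\n' | awk '{ sub(/a/, "x") } { print $0 }')

printf '%s\n' "$short"
printf '%s\n' "$long"
```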
If you have single quotes in your input, then a third, intermediate character is needed, and it can be any character that does not occur in your input. This example uses a newline character (which is equal to RS), which cannot be in the input, since awk is reading line by line and strips the newline.
With commas inside double quotes this becomes more complicated, since you would need to combine it with the earlier solution.
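A rough sketch of the placeholder approach, again assuming the quoted field is the last one and contains no commas. The function name `requote_with_placeholder` and the sample line are invented for illustration:

```shell
# Hypothetical helper: reads CSV lines on stdin, writes the fixed lines to stdout.
requote_with_placeholder() {
    awk -F, -v OFS=, '
    {
        gsub(/"/, "\n")            # park every double quote on a newline placeholder
        sub(/^\n/, "\"", $NF)      # opening quote of the last field
        sub(/\n$/, "\"", $NF)      # closing quote of the last field
        gsub(/\n/, "\047")         # remaining placeholders become single quotes
    }
    1'
}

printf '%s\n' "22,\"Rack 19\"\" wide; 6' tall\"" | requote_with_placeholder
# prints: 22,"Rack 19'' wide; 6' tall"
```

Single quotes that were already in the data are never touched, so an unquoted field ending in ' comes through unchanged.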
--
Edit: only just noticed that in your original example commas inside double quotes needed to be converted to semicolons, but you had already changed them, as you said.
Last edited by Scrutinizer; 05-22-2015 at 01:40 AM..
I got around using an intermediate character (in the past I have used some of the old ASCII punch card/paper tape control characters 28-32) by brute forcing first and last quote removal. There should be no need to exclude a character now.
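One way the first-and-last-quote removal could look, as a guess at the general shape rather than the actual script. It assumes the quoted field is the last one and has no commas left in it, and `requote_by_stripping` is a made-up name:

```shell
# Hypothetical helper: reads CSV lines on stdin, writes the fixed lines to stdout.
requote_by_stripping() {
    awk -F, -v OFS=, '
    {
        f = $NF
        if (f ~ /^".*"$/) {                    # only touch a field wrapped in double quotes
            f = substr(f, 2, length(f) - 2)    # drop the first and last quote
            gsub(/"/, "\047", f)               # inner double quotes become single quotes
            $NF = "\"" f "\""                  # put the outer quotes back
        }
    }
    1'
}

printf '%s\n' "22,\"Rack 19\"\" wide; 6' tall\"" | requote_by_stripping
# prints: 22,"Rack 19'' wide; 6' tall"
```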
Took me a long time to figure out that you could not escape ' characters in AWK with \' or a whole bunch of other things with and without a variable declaration (but you can do so in BASH with the awk -v option), so I used \x27
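For comparison, two ways of getting a single quote into an awk program that is itself wrapped in shell single quotes. This sketch uses the octal escape `\047` instead of `\x27`, because hex escapes are not supported by every awk; the variable names are only for illustration:

```shell
# Octal escape inside the awk program text.
with_escape=$(echo 'it is' | awk '{ print "\047" $0 "\047" }')

# Single quote handed in from the shell as an awk variable.
with_var=$(echo 'it is' | awk -v q="'" '{ print q $0 q }')

printf '%s\n' "$with_escape"   # prints: 'it is'
printf '%s\n' "$with_var"      # prints: 'it is'
```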
Excel column 22:
.CSV column 22:
Output of script column 22:
Mike
Last edited by Michael Stora; 05-22-2015 at 06:43 PM..
You can do that with awk; it is not an awk thing, it is a shell thing. In the shell you cannot escape characters that are inside single quotes, and the actual awk script is enclosed in single quotes...
What you can do is have a file that contains the awk script and call it like this:
Then you do not need to worry about escaping quotes...
For example:
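A sketch of the script-file approach, with a hypothetical file name `fixquotes.awk` written to a temporary directory. Inside the file, single quotes can be typed literally because the shell never parses them:

```shell
script="${TMPDIR:-/tmp}/fixquotes.awk"

# The quoted 'EOF' keeps the shell from expanding anything in the script body.
cat > "$script" <<'EOF'
# fixquotes.awk (hypothetical name): assumes the quoted field is the last one
BEGIN { FS = OFS = "," }
{
    gsub(/"/, "'")          # all double quotes to single quotes
    sub(/^'/, "\"", $NF)    # opening quote of the last field back to a double quote
    sub(/'$/, "\"", $NF)    # closing quote of the last field back to a double quote
}
1
EOF

result=$(printf '%s\n' '1,abc,"He said ""hi"" to me"' | awk -f "$script")
printf '%s\n' "$result"
# prints: 1,abc,"He said ''hi'' to me"
```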
Last edited by Scrutinizer; 05-23-2015 at 04:41 AM..