Removing " from a text using awk


 
# 15  
Old 10-24-2013
Sorry Jotne...

As I know little about awk and its derivatives, I decided to use only shell builtins and /bin commands
to see how easy it would be...
Code:
#!/bin/bash
# Using shell builtins and /bin ONLY...
# Needs bash: ${text:offset:length} is not POSIX sh.
# Generate the string (printf avoids relying on echo's \c handling).
printf '%s' 'The quick brown fox "jumps", over the lazy \"dog\".' > /tmp/text
# Load the file into a string variable.
text=$(cat < /tmp/text)
# Show it...
printf '%s\n' "$text"
newtext=""
subscript=0
length=${#text}
while [ "$subscript" -lt "$length" ]
do
	if [ "${text:$subscript:2}" = '\"' ]
	then
		# Escaped quote: keep both characters.
		newtext=$newtext'\"'
		subscript=$(( subscript + 2 ))
	elif [ "${text:$subscript:1}" = '"' ]
	then
		# Bare quote: skip it, so consecutive quotes ("") are all dropped.
		subscript=$(( subscript + 1 ))
	else
		newtext=$newtext${text:$subscript:1}
		subscript=$(( subscript + 1 ))
	fi
done
# Print the final string...
printf '%s\n' "$newtext"

Results on OSX 10.7.5 (where /bin/sh is bash)...
Code:
Last login: Thu Oct 24 17:44:02 on ttys000
AMIGA:barrywalker~> ./quotes.sh
The quick brown fox "jumps", over the lazy \"dog\".
The quick brown fox jumps, over the lazy \"dog\".
AMIGA:barrywalker~>

I liked this challenge but only wish I knew more about awk...
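
For comparison, the same character walk can be sketched in awk itself. This is only a minimal sketch, not a solution posted in this thread: the function name strip_bare_quotes is made up for illustration, and it assumes a backslash only ever escapes the double quote directly after it.

```shell
#!/bin/sh
# Hypothetical helper (the name is illustrative): drop bare double quotes,
# keep escaped ones (\"), reading stdin line by line.
strip_bare_quotes() {
    awk '{
        out = ""
        n = length($0)
        for (i = 1; i <= n; i++) {
            c = substr($0, i, 1)
            if (c == "\\" && substr($0, i + 1, 1) == "\"") {
                out = out "\\\""   # escaped quote: copy both characters
                i++
            } else if (c != "\"") {
                out = out c        # ordinary character: copy it
            }                      # bare quote: nothing is copied
        }
        print out
    }'
}

printf '%s\n' 'The quick brown fox "jumps", over the lazy \"dog\".' | strip_bare_quotes
```

Like the shell loop, this checks for the escaped pair first, so two bare quotes in a row are both removed.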
# 16  
Old 10-24-2013
@alister, very nice! The only caveat would be that a long run of characters without any double quote could exceed the maximum record length...

@wisecracker, that is off-topic, considering that this thread is specifically about awk.

Last edited by Scrutinizer; 10-24-2013 at 04:50 PM..
# 17  
Old 10-25-2013
Quote:
Originally Posted by Scrutinizer
That will still pose a problem if there are two consecutive double quotes ("") in the input file
The revised script below seems to work fine (I have not found an example where it fails):
Code:
awk '{gsub(/^"|[^\\]"/,"&\\");gsub(/^"|"\\"?/,"")}1' file

Advantage: no need to look for a placeholder string that does not occur in the input...

Problem for me (at the moment): I don't know how to explain why it works.

Regards.

Last edited by disedorgue; 10-25-2013 at 08:44 AM..
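
One way to see why the two gsub calls above work is to run the first pass on its own and look at the intermediate line. The sketch below does that; the function names tag_quotes and strip_quotes and the sample line are made up for illustration, and the outputs in the comments assume an awk that treats a trailing backslash in the replacement string literally (gawk and mawk appear to).

```shell
#!/bin/sh
# Pass 1 alone: append a backslash to every quote that is at the start of the
# line or follows a character other than a backslash (i.e. tag the bare quotes).
tag_quotes() {
    awk '{gsub(/^"|[^\\]"/,"&\\")}1'
}

# Both passes: the second gsub deletes each tagged quote ("\), together with
# an untagged quote directly after it ("\"), which covers the "" case.
strip_quotes() {
    awk '{gsub(/^"|[^\\]"/,"&\\");gsub(/^"|"\\"?/,"")}1'
}

line='say "hi" and \"bye\" and ""!'
printf '%s\n' "$line" | tag_quotes     # say "\hi"\ and \"bye\" and "\"!
printf '%s\n' "$line" | strip_quotes   # say hi and \"bye\" and !
```

An escaped quote is written backslash-then-quote, while a tagged quote is quote-then-backslash, so the second pass can tell them apart as long as no escaped quote is itself followed by a backslash.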
# 18  
Old 10-25-2013
Quote:
Originally Posted by disedorgue
The revised script below seems to work fine (I have not found an example where it fails):
Code:
awk '{gsub(/^"|[^\\]"/,"&\\");gsub(/^"|"\\"?/,"")}1' file

Advantage: no need to look for a placeholder string that does not occur in the input...

Problem for me (at the moment): I don't know how to explain why it works.

Regards.
This does, however, pose a problem with the sequence \"\ (an escaped quote followed by a backslash).
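
A small reproduction of that case, as a sketch: the wrapper name strip_quotes and the sample input are made up for illustration, and the awk program is the one-liner quoted above.

```shell
#!/bin/sh
# The quoted one-liner, wrapped in a function so it can read stdin.
strip_quotes() {
    awk '{gsub(/^"|[^\\]"/,"&\\");gsub(/^"|"\\"?/,"")}1'
}

# The input contains no bare quote, so it should come back unchanged.
# The first gsub tags nothing, but the second gsub still finds "\ in the
# original data and deletes it, so the escaped quote is lost.
printf '%s\n' 'path\"\name' | strip_quotes   # prints: path\name
```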
This User Gave Thanks to Scrutinizer For This Post:
# 19  
Old 10-25-2013
Thanks for finding this problem with \"\. I don't think a solution exists along these lines without using a placeholder string that does not occur in the file.

Regards.
# 20  
Old 10-25-2013
Quote:
Originally Posted by Scrutinizer
@alister, very nice! The only caveat would be that a long run of characters without any double quote could exceed the maximum record length...
True, but that can be an issue regardless of record delimiter. I hope that implementations with a hardcoded limit are smart enough to fail loudly (scream and stop) in such cases.

The caveat I had in mind was the case of input data ending with a backslash (which would produce a spurious trailing double quote). This would not be a valid text file, but "text" that does not end with a newline isn't unheard of.

Regards,
Alister
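
The trailing-backslash caveat can be illustrated with a sketch. Alister's script is not shown in this part of the thread, so the code below is an assumed stand-in that uses the double quote as record separator; the function name strip_by_record is hypothetical.

```shell
#!/bin/sh
# Assumed stand-in for the record-separator approach (not Alister's actual script):
# split the input on ", and put the quote back when a record ends in a backslash.
strip_by_record() {
    awk -v RS='"' '{ printf "%s", $0; if (/\\$/) printf "\"" }'
}

# Well-formed line: bare quotes go, escaped quotes stay.
printf '%s\n' 'say \"hi\" to "bob"' | strip_by_record   # say \"hi\" to bob

# Input ending in a backslash with no trailing newline: the last record also
# matches /\\$/, so a quote is appended that was never in the input.
printf '%s' 'dir\' | strip_by_record                    # dir\"
```

When the input ends with a newline, the last record ends in that newline rather than the backslash, which is why only text without a final newline is affected.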