Removal of multiple characters within double quotes


 
# 8  
Old 01-12-2018
Hi Jag_1981,
No. You changed the format of the input data in between post #4 and post #6!

In your sample input data in post #4 in this thread, records are separated by a blank line. But, the sample data that you say is not working has no separator between records and, therefore, there is no way to determine where one record ends and the next begins.

Your sample output also differs by removing the blank lines between output records.

Please do not blame RudiC for providing you with bad code when the code he provided works perfectly with the description of what was to be done on the sample input you originally provided.

Note also that your output samples are not consistent with your input samples in either post #4 or post #6. You say you want double-quoted <newline>s and <vertical-bar>s to be removed, but that is only true part of the time. In the sample outputs that you have shown us, some of those characters are replaced by <space> characters instead of being removed. For an example, look at Tour World versus TourWorld in both of those posts. RudiC's code removes all of them, as requested in your problem statements, but it doesn't match (and can't match) the sample output you provided in either case. If some <newline> characters are to be replaced by <space> instead of being removed, you need to CLEARLY specify the logic that can be used to determine which action is to be taken. In one place in post #4, you also replace a single <space> with two adjacent <space> characters.

Similarly, if there aren't any blank lines between records in your input file, you need to clearly specify how the end of a record is supposed to be identified. Please help us help you by clearly specifying what is supposed to happen and by providing sample inputs and outputs that match the behavior that you describe.
# 9  
Old 01-12-2018
Dear Don/RudiC,

My sincere thanks for being patient with me and for helping me with my need.

I now fully understand that by sharing incorrect or partial input/output files without paying full attention to them, I am wasting your valuable time.

I am attempting to summarize my need again with the details below.

1. My input file is a pipe (|) delimited CSV file.
2. It has multiple records, and the end of a record is identified by a newline character.
3. There are no blank lines between records (either in the input or the output file).
4. I want only double-quoted <newline>s and <vertical-bar>s to be removed (replaced by Null).
5. The double quotes themselves should be removed (replaced by Null).

Sample Input File:

Code:
111|"IKJA - SPORTS"|00IIQ|Normal|100 Hall Road|
123|"ABCD RENT-A-
CAR XYZ LTD"|00N0H|Enterprise Lake|"
100 View Way"|
244|"DEFG Travel | Tour
World LTD"|"AK|0Q"|Praire Lake|"
105 NE Main St"|

Expected Output file:

Code:
111|IKJA - SPORTS|00IIQ|Normal|100 Hall Road|
123|ABCD RENT-A-CAR XYZ LTD|00N0H|Enterprise Lake|100 View Way|
244|DEFG Travel  TourWorld LTD|AK0Q|Praire Lake|105 NE Main St|

# 10  
Old 01-12-2018
With the sample input shown in post #9 stored in a file named file.csv, the following code:
Code:
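awk -F'\n' -v dq='"' '
{	# Append this line to the pending record; joining lines without a
	# separator is what drops the embedded <newline> characters.
	record = record $0
	#printf("record in:\n%s\n", record)
}
# Splitting on double quotes gives an odd number of pieces only when every
# quote has been closed, i.e., when the record is complete.
(n = split(record, f, dq)) % 2 {
	#printf("split into %d fields\n", n)
	# Even-numbered pieces are the quoted strings; strip <vertical-bar>s there.
	for(i = 2; i <= n; i += 2) {
		gsub(/[|]/, "", f[i])
		#printf("f[%d] updated to: \"%s\"\n", i, f[i])
	}
	# Print the pieces back to back, which leaves out the double quotes.
	for(i = 1; i <= n; i++)
		printf("%s%s", f[i], (i == n) ? ORS : "")
	record = ""
}' file.csv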

produces the output requested in post #9:
Code:
111|IKJA - SPORTS|00IIQ|Normal|100 Hall Road|
123|ABCD RENT-A-CAR XYZ LTD|00N0H|Enterprise Lake|100 View Way|
244|DEFG Travel  TourWorld LTD|AK0Q|Praire Lake|105 NE Main St|

But note that it removes <newline> and <vertical-bar> characters found between pairs of <double-quote> characters; it does NOT replace them with <NUL> characters. Replacing those characters with <NUL> characters would give you a binary file instead of a text file.
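
As a rough illustration of that difference (this is separate from the awk script above; the field AK|0Q is taken from the sample in post #9, and the function names strip_bars and nul_bars are made up for this sketch), tr can show both behaviors side by side:

```shell
# Hypothetical helpers, for illustration only.

# Delete every <vertical-bar> from standard input.
strip_bars() {
	tr -d '|'
}

# Replace every <vertical-bar> with a <NUL> byte instead of deleting it.
nul_bars() {
	tr '|' '\000'
}

printf 'AK|0Q' | strip_bars | wc -c   # 4 bytes: AK0Q
printf 'AK|0Q' | nul_bars | wc -c     # 5 bytes: a <NUL> byte sits where the bar was
printf 'AK|0Q' | nul_bars | od -c     # dump the bytes so the \0 is visible
```

The first form is what the awk code does; the second leaves a byte in the data that most text tools will not handle well.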

If someone else wants to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk or nawk.

If you uncomment the commented out printf() statements, you can get an inside view of how it accumulates records and removes unwanted <vertical-bar>s and <newline>s.
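
A cut-down sketch of the same idea, separate from the script above: it only reports how many pieces split() finds as each physical line is appended to the pending record. The function name count_pieces is made up for this example. A record is complete when the count is odd, which means the number of <double-quote> characters seen so far is even.

```shell
# Hypothetical helper: for each input line, append it to the accumulated
# record and report how many pieces split() finds around double quotes.
count_pieces() {
	awk -v dq='"' '
	{	record = record $0
		printf("line %d: %d pieces\n", NR, split(record, f, dq))
	}'
}

# The three physical lines of the second record from post #9.
printf '%s\n' \
	'123|"ABCD RENT-A-' \
	'CAR XYZ LTD"|00N0H|Enterprise Lake|"' \
	'100 View Way"|' | count_pieces
# line 1: 2 pieces
# line 2: 4 pieces
# line 3: 5 pieces  (odd, so the record is complete)
```

Unlike the full script, this sketch never resets record, so it is only meant to be fed one record at a time.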