How to delete columns of a CSV file which has cell values with a string enclosed in " , "?


 
# 1  
Old 05-17-2014

Hi

How can I delete columns from a CSV file in which some values are strings enclosed in double quotes or square brackets and contain embedded commas?

I have a CSV file in the format below.

Code:
 
Template,Target Server,Target Component,Rule Group,Rule,Rule Reference Number,Rule Definition,Mismatched Conditions,Result,Documented Exception,Compliant,Exception Name,Exception Reference,Expiration Date
 
 
RHEL_Pwd,server1,RHEL_Pwd (server1),/User w3server,Password expires,,"""Entry:RHEL_Pwd//w3server-Password expires"".""value (All OS)"" equals ""never""",,Pass,,Y,,,
RHEL_Pwd,server1,RHEL_Pwd (server1),/User w3server,Password inactive,,"""Entry:RHEL_Pwd//w3server-Password inactive"".""value (All OS)"" equals ""never""",,Pass,,Y,,,
RHEL_Pwd,server1,RHEL_Pwd (server1),/User w3server,Max Days Between Changes,,"""Entry:RHEL_Pwd//w3serverMax pwd days"".""Value1 (All OS)"" equals 999989 OR
""Entry:RHEL_Pwd//w3server-Max pwd days"".""value (All OS)"" = ""999989""","""Entry:RHEL_Pwd//w3serverMax pwd days"".""Value1 (All OS)"" = 999,989 [[ ""Entry:RHEL_Pwd//w3serverMax pwd days"".""Value1 (All OS)"" [99,998] = 999,989 ]] OR
""Entry:RHEL_Pwd//w3serverMax pwd days"".""value (All OS)"" = ""999989"" [[ ""Entry:RHEL_Pwd//w3serverMax pwd days"".""value (All OS)"" [""99998""] = ""999989"" ]]
",Fail,,N,,,

I am looking for something like this

Code:
 
Template,Target Server,Target Component,Rule Group,Rule,Rule Reference Number,Result,Documented Exception,Compliant,Exception Name,Exception Reference,Expiration Date
 
RHEL_Pwd,server1,RHEL_Pwd (server1),/User w3server,Password expires,,Pass,,Y,,,
RHEL_Pwd,server1,RHEL_Pwd (server1),/User w3server,Password inactive,,Pass,,Y,,,
RHEL_Pwd,server1,RHEL_Pwd (server1),/User w3server,Max Days Between Changes,,Fail,,N,,,

I have tried to remove the unwanted data from column positions 7 and 8, but with no luck. How can I remove those columns (Rule Definition and Mismatched Conditions) and write the result to a new CSV file?
Any suggestions would be greatly appreciated.

---------- Post updated at 05:27 PM ---------- Previous update was at 04:52 PM ----------

Just to add a point here: I think there are carriage returns, blanks or other white space inside the "Rule Definition" and "Mismatched Conditions" fields.

I am very new to scripting and do not know enough about how to handle carriage returns or white space.

A few things I have tried so far:

Code:
cut -d"," -f1,2,3,4,5,6,9,10,11,12,13 file.csv > file1.csv

Code:
awk -F, 'NR==1{NF-=3}NF>1{sub(/",.*/,"\"")}1' OFS=, file.csv >file.csv


Last edited by Scrutinizer; 05-17-2014 at 04:13 PM.. Reason: Additional CODE tags
# 2  
Old 05-17-2014
Hi.

There are numerous suggestions for tools dealing with CSV files at unix - Linux Tool To Parse CSV files - Stack Overflow

If you are on a Debian-based Linux system, there is: csvtool (1) - tool for performing manipulations on CSV files available for install:
Code:
Package: csvtool
Description-en: handy command line tool for handling CSV files
 OCaml CSV is a library to read and write CSV (comma-separated values)
 files.  It also supports all extensions used by Excel - eg. quotes,
 newlines, 8 bit characters in fields, etc.
 .
 This package contains csvtool, a handy command line tool for handling
 CSV files from shell scripts.

Best wishes ... cheers, drl
# 3  
Old 05-17-2014
Would that work?

Code:
awk '{$7=$8=":"; gsub(/:,/, "")};1' OFS=, FS=, file.csv > another.file.csv
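
A note on comma splitting: with FS=, awk splits on every comma, including commas inside double-quoted fields, so field numbers shift whenever a quoted value contains a comma. A minimal sketch of the effect, where the sample line and the helper name count_fields are made up for illustration:

```shell
#!/bin/sh
# Hypothetical sample: 3 CSV fields, the second one quoted with an embedded comma.
sample='a1,"b1,c1 d1",e1'

# count_fields is an illustrative helper, not part of any tool:
# it reports how many fields awk sees when it splits on every comma.
count_fields() {
    printf '%s\n' "$1" | awk -F, '{ print NF }'
}

count_fields "$sample"    # prints 4, although the record has 3 CSV fields
```

So $7 and $8 only line up with "Rule Definition" and "Mismatched Conditions" on lines where no earlier field contains a quoted comma and no field spans several lines.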

# 4  
Old 05-17-2014
I think this will still include the content of the 7th and 8th positions in the new file. I am looking to create a new CSV file with all the data except the 7th and 8th positions (Rule Definition and Mismatched Conditions).

---------- Post updated at 01:56 AM ---------- Previous update was at 01:52 AM ----------

Thanks drl. Will I be able to access the commands that this tool uses, or is it something that processes the original data and gives the result back in a new file? If that is the case, then it is not going to serve my purpose.
# 5  
Old 05-17-2014
Hi.

Here is an example with 5 columns (I usually say fields), and the commands to csvtool will copy 1-2 and 5, effectively deleting columns 3 and 4. Note that some columns contain embedded commas and spaces:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate CSV file manipulation, csvtool.

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C csvtool

FILE=${1-data1}

pl " Input data file $FILE:"
cat "$FILE"

pl " Results, copy columns 1-2,5, omitting 3,4:"
csvtool col 1,2,5 "$FILE"

exit 0

producing:
Code:
$ ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
csvtool - ( /usr/bin/csvtool, 2008-05-22 )

-----
 Input data file data1:
"a1","b1,c1 d1",e1,"f1 g1,h1","i1"
"a2","b2,c2 d2",e2,"f2 g2,h2","i2"
"a3","b3,c3 d3",e3,"f3 g3,h3","i3"

-----
 Results, copy columns 1-2,5, omitting 3,4:
a1,"b1,c1 d1",i1
a2,"b2,c2 d2",i2
a3,"b3,c3 d3",i3

Quote:
Originally Posted by Litu19
Thanks drl. Will I be able to access the commands that this tool uses, or is it something that processes the original data and gives the result back in a new file? If that is the case, then it is not going to serve my purpose.
Sorry, I don't understand this statement / question.

See man page for details ... cheers, drl
# 6  
Old 05-18-2014
Hi drl,

Thanks for such a nice explanation along with the example. I understand that csvtool is an application that takes data as input, processes it according to the instructions given, and produces the result as output.

Let me explain what we are looking for here.

An automated application gives us an input data file in CSV format, and we need to convert that file into the specific format I mentioned above. We may not use any other application such as csvtool to format the data (since the file may not be accessible to a third-party application). We are looking for scripting commands that format the input CSV file and generate another CSV file inside the same automated tool that initially generated the input file.

Below is the part of the script that generates the initial CSV file and then creates another CSV file from it.

Code:
 
# Command that generates the initial CSV file (file.csv)
blcli_execute Utility exportComplianceRunLatest "${COMPLIANCE_JOB_NAME}" "$LOG_LOCATION/$LOG_NAME.csv" "CSV"
 
# Generate a new CSV file from the initial one by removing the line break after each "]]"
 
sed -e ':a' -e 'N' -e '$!ba' -e 's/]]\n/]]/g' file.csv > file1.csv


Currently these two commands are used in a shell script that runs inside the automation tool I described above. We are observing some problems with the second command, which is used to remove the carriage returns. Instead of using the second command, can you help me create a new CSV file containing only specific columns/fields?

I hope you understand my limitation of not being able to use csvtool. We can only use scripting commands to create the new CSV file.
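
One possible plain-awk replacement for the sed step, offered as a sketch rather than a tested fix for the tool in question: a record is only complete once the number of double quotes read so far is even, so physical lines can be glued together until that holds. The function name join_records is hypothetical, and replacing each embedded line break with a single space is an assumption about what is wanted:

```shell
#!/bin/sh
# join_records is an illustrative helper name, not an existing command.
# It reads CSV on stdin and prints one logical record per line: physical
# lines are glued together (with a space) while a quoted field is still open.
join_records() {
    awk '
    {
        line = $0
        sub(/\r$/, "", line)                # drop a trailing carriage return, if any
        rec = (inq ? rec " " : "") line     # continue the open record or start a new one
        n = gsub(/"/, "\"", line)           # count the double quotes on this line
        if (n % 2) inq = !inq               # an odd count toggles the inside-quotes state
        if (!inq) print rec                 # quotes balanced: the record is complete
    }
    END { if (inq) print rec }              # unterminated quote: print what was collected
    '
}

# Example with the multi-line record from this post:
printf '%s\n' \
    'Joel,18,"lanjipalli, ram nagar' \
    '1st lane.street4","baner gaon, bitthal' \
    'temple Pune", India' | join_records
```

Doubled quotes ("") inside a quoted field add two to the count, so they do not disturb the odd/even test. Unlike the sed command, this does not depend on the broken lines ending in "]]".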



---------- Post updated at 04:47 PM ---------- Previous update was at 02:32 PM ----------

In our case we have a CSV file with data in the format below.

Code:
 
file.csv
 
Name,age,permanent_address,current_address,job_location
 
Joel,18,"lanjipalli, ram nagar
1st lane.street4","baner gaon, bitthal
temple Pune", India
 
Litu,24,,"lteligudipalli, hariram nagar
1st lane.street no 8","nuonar gaon, bitthal
temple Pune", India
 
 
expected result(file.csv)
 
name,age,job_location
Joel,18,India
litu,24,India

Note: Here we have spaces, carriage returns and other white space in the permanent and current address fields.

I think the issue is that there are carriage returns in the 3rd and 4th fields of the CSV file.
# 7  
Old 05-18-2014
Hi.

Thanks for your extended reply.

Let me review what I've understood so far.
Quote:
How can I delete columns from a CSV file in which some values are strings enclosed in double quotes or square brackets and contain embedded commas?
I then supplied a demonstration script to illustrate one solution with a utility called csvtool.
Quote:
Here we have spaces, carriage returns and other white space in the permanent and current address fields.
You have stated a fact. You have implied that you wish to handle these with awk, cut, and/or sed. You seem to be saying that those commands are OK to use, but that csvtool is not acceptable. I don't understand your basis for that, and it seems to me that it is a distinction without a difference (see: Distinction without a difference - Wikipedia, the free encyclopedia).

The csvtool command can, in fact, handle embedded spaces, commas, and newlines. Here's an example of that using exactly the same csvtool command, but with a data file that has an embedded newline in a column that will be kept, and in a column that will be effectively deleted:
Code:
$ ./s1 data2

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian 5.0.8 (lenny, workstation) 
bash GNU bash 3.2.39
csvtool - ( /usr/bin/csvtool, 2008-05-22 )

-----
 Input data file data2:
"a1","b1,c1 d1",e1,"f1 g1,h1","i1"
"a2","b2,c2 
d2",e2,"f2 g2,
h2","i2"
"a3","b3,c3 d3",e3,"f3 g3,h3","i3"

-----
 Results, copy columns 1-2,5, omitting 3,4:
a1,"b1,c1 d1",i1
a2,"b2,c2 
d2",i2
a3,"b3,c3 d3",i3

The output is as I would expect: the column to be kept with an embedded newline is still there, and the column also with an embedded newline to be deleted, is gone.

Because we don't seem to be communicating, I will withdraw and wait for someone else to reply who understands your situation better than I do.

Best wishes ... cheers, drl
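
For readers who, like the original poster, are limited to standard tools, here is a minimal sketch of dropping columns with plain awk. It assumes double-quoted fields in the usual CSV style (embedded commas, doubled quotes and newlines allowed) and passes field text through unchanged, so unlike csvtool it does not strip unneeded quotes. The function name drop_csv_columns and its argument format are assumptions made for this illustration; the sample data is taken from the data2 example above (without the trailing space):

```shell
#!/bin/sh
# drop_csv_columns is a hypothetical helper, used as:
#   drop_csv_columns 7,8 < in.csv > out.csv
# The argument is a comma-separated list of column numbers to leave out.
drop_csv_columns() {
    awk -v drop="$1" '
    BEGIN {
        n = split(drop, d, ",")
        for (i = 1; i <= n; i++) skip[d[i] + 0] = 1
    }
    {
        sub(/\r$/, "")                         # tolerate CRLF line endings
        rec = (inq ? rec "\n" : "") $0         # keep embedded newlines as they are
        tmp = $0
        if (gsub(/"/, "", tmp) % 2) inq = !inq # odd number of quotes toggles the state
        if (inq) next                          # record continues on the next line
        if (rec ~ /^[ \t]*$/) next             # ignore blank lines between records
        emit(rec)
    }
    # Split one complete record on commas that are outside quotes,
    # then print every field whose number is not in the skip list.
    function emit(s,    i, c, q, nf, f, out, sep, len) {
        nf = 1; f[1] = ""; q = 0; len = length(s)
        for (i = 1; i <= len; i++) {
            c = substr(s, i, 1)
            if (c == "\"") q = !q              # quotes stay part of the field text
            if (c == "," && !q) { f[++nf] = ""; continue }
            f[nf] = f[nf] c
        }
        out = ""; sep = ""
        for (i = 1; i <= nf; i++)
            if (!(i in skip)) { out = out sep f[i]; sep = "," }
        print out
    }
    '
}

# Keep columns 1, 2 and 5 by dropping 3 and 4:
printf '%s\n' \
    '"a1","b1,c1 d1",e1,"f1 g1,h1","i1"' \
    '"a2","b2,c2' \
    'd2",e2,"f2 g2,' \
    'h2","i2"' | drop_csv_columns 3,4
```

For the file in the first post the call would be along the lines of drop_csv_columns 7,8 < file.csv > file1.csv. The character-by-character loop is slow on large files; it is meant to show the idea, not to compete with a dedicated CSV tool.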