CSV with commas in field values, remove duplicates, cut columns
Hi
Description of input file I have:
-------------------------
1) CSV with double quotes around string fields.
2) Some string fields contain commas as part of the field value.
3) It has duplicate lines.
4) It has 200 columns/fields.
5) The file size is more than 10 GB.
Description of output file I need:
-------------------------------
1) Can be CSV or pipe-delimited.
2) Commas within field values must be preserved.
3) No duplicate lines.
4) I need only the first 150 columns.
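A possible approach, sketched in Python rather than with cut, because cut splits on every delimiter character and has no notion of quoting. The standard csv module does understand double-quoted fields, so embedded commas survive the column cut. The function name `dedupe_and_cut`, the 16-byte digest per row, and the choice to treat rows as duplicates when their first 150 columns match are my assumptions, not something stated in the post.

```python
import csv
import hashlib


def dedupe_and_cut(src, dst, keep=150, out_delimiter="|"):
    """Copy rows from src to dst, keeping only the first `keep` columns
    and skipping rows whose kept columns repeat an earlier row."""
    seen = set()  # digests of rows already written
    reader = csv.reader(src)  # parses "quoted, fields" as one field
    writer = csv.writer(dst, delimiter=out_delimiter, lineterminator="\n")
    for row in reader:
        row = row[:keep]
        # Assumption: the unit separator (0x1f) never occurs in the data,
        # so joining on it cannot make two different rows look alike.
        joined = "\x1f".join(row).encode("utf-8")
        digest = hashlib.blake2b(joined, digest_size=16).digest()
        if digest in seen:
            continue
        seen.add(digest)
        writer.writerow(row)
```

To run it on real files, open both with `newline=""` (as the csv module documentation recommends) and pass the file objects in. The file is streamed row by row, but the set of digests still grows with the number of distinct rows, so for a 10 GB input it is worth checking that this fits in memory; if it does not, writing the cut rows out and de-duplicating them with an external sort is an alternative.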
Hello everyone, I'm new here and this is my first post, so first of all I want to say that this is a great forum and I have managed to find most of my answers in these forums :)
So with that I ask you my first question:
I have an Excel file which I saved as a CSV. However the Excel file... (3 Replies)
Hi team,
I have CSV files with 20 columns. I want to find the duplicates in a file based on column1, column10, column4, column6, column8, and column2. If those columns have the same values, the record should be treated as a duplicate.
Can anyone help me find the duplicates?
Thanks in advance.
... (2 Replies)
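A minimal sketch of key-based duplicate detection in Python, assuming the file is ordinary comma-separated text and every row has at least as many columns as the largest key column. The function name `find_duplicates` is hypothetical; the default key columns are the ones listed in the question, counted from 1.

```python
import csv


def find_duplicates(lines, key_columns=(1, 10, 4, 6, 8, 2)):
    """Yield each row whose key columns (1-based) match an earlier row."""
    seen = set()
    for row in csv.reader(lines):
        # Raises IndexError if a row is shorter than the largest key column.
        key = tuple(row[i - 1] for i in key_columns)
        if key in seen:
            yield row
        else:
            seen.add(key)
```

The first row with a given key is treated as the original and only later rows are reported. Whether the first or the last occurrence should count as the original is a design choice the question leaves open.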
Hi All,
I have a text file with three columns. I would like a simple script that removes lines in which column 1 has duplicate entries, but use the largest value in column 3 to decide which one to keep. For example:
Input file:
12345a rerere.rerere len=23
11111c fsdfdf.dfsdfdsf len=33 ... (3 Replies)
I have a .CSV file (file.csv) whose data are all enclosed in double quotes. Sample format of the file is as below:
column1,column2,column3,column4,column5,column6, column7, Column8, Column9, Column10
"12","B000QRIGJ4","4432","string with quotes, and with a comma, and colon: in... (3 Replies)
I am trying to see if I can use awk to remove duplicates from a file. This is the file:
-==> Listvol <==
deleting /vol/eng_rmd_0941
deleting /vol/eng_rmd_0943
deleting /vol/eng_rmd_0943
deleting /vol/eng_rmd_1006
deleting /vol/eng_rmd_1012
rearrange /vol/eng_rmd_0943
... (6 Replies)
I have data as below
123,"paul phiri",paul@yahoo.com,"po.box 23, BT","Eco Bank,Blantyre,Malawi"
I need the output to be
123,"paul phiri",paul@yahoo.com,"po.box 23 BT","Eco Bank Blantyre Malawi" (5 Replies)
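One way this could be done, sketched in Python: find each double-quoted span with a regular expression, replace its commas with spaces, and collapse the doubled spaces that result. The name `strip_inner_commas` is hypothetical, and the sketch assumes quoted fields contain no line breaks. Note that collapsing also squeezes any run of spaces that was already inside a quoted field.

```python
import re

_QUOTED = re.compile(r'"[^"]*"')  # one double-quoted span, quotes included


def strip_inner_commas(line):
    """Replace commas inside double-quoted fields with a single space."""
    def clean(match):
        text = match.group(0).replace(",", " ")
        # "23, BT" becomes "23  BT" above; squeeze it back to "23 BT".
        return re.sub(r" {2,}", " ", text)
    return _QUOTED.sub(clean, line)
```

Commas outside the quotes are untouched, so the field separators and the quoting of the original line are preserved.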
Hi,
I have a file of csv data, which looks like this:
file1:
1AA,LGV_PONCEY_LES_ATHEE,1,\N,1,00020460E1,0,\N,\N,\N,\N,2,00.22335321,0.00466628
2BB,LES_POUGES_ASF,\N,200,200,00006298G1,0,\N,\N,\N,\N,1,00.30887539,0.00050312... (10 Replies)
In the attached file I am trying to remove all the "" and , (quotes and commas) from $2 and $3 and the "" (quotes) from $4.
I tried the below as a start:
awk -F"|" '{gsub(/\,/,X,$2)} 1' OFS="\t" enhancer.txt > comma.txt
Thank you :). (6 Replies)
How to remove unwanted commas from a .csv file
Input file format
"Server1","server-PRI-Windows","PRI-VC01","Microsoft Windows Server 2012, (64-bit)","Powered On","1,696.12","server-GEN-SFCHT2-VMS-R013,server-GEN-SFCHT2-VMS-R031,server-GEN-SFCHT2-VMS-R023"... (5 Replies)
Discussion started by: ranjancom2000
5 Replies
LEARN ABOUT ULTRIX
cut
cut(1) General Commands Manual cut(1)
Name
cut - cut out selected fields of each line of a file
Syntax
cut -clist [file1 file2...]
cut -flist [-dchar] [-s] [file1 file2...]
Description
Use the cut command to cut out columns from a table or fields from each line of a file. The fields as specified by list can be fixed length,
that is, character positions as on a punched card (-c option), or the length can vary from line to line and be marked with a field
delimiter character like tab (-f option). The cut command can be used as a filter. If no files are given, the standard input is used.
Use grep(1) to make horizontal ``cuts'' (by context) through a file, or paste(1) to put files together in columns. To reorder columns in a
table, use cut and paste.
Options
list Specifies a comma-separated list of integer field numbers in increasing order, with an optional - to indicate
ranges as in the -o option of nroff/troff for page ranges; for example, 1,4,7; 1-3,8; -5,10 (short for 1-5,10); or 3- (short
for third through last field).
-clist Specifies character positions to be cut out. For example, -c1-72 would pass the first 72 characters of each line.
-flist Specifies the fields to be cut out. For example, -f1,7 copies the first and seventh field only. Lines with no field delim-
iters are passed through intact (useful for table subheadings), unless -s is specified.
-dchar Uses the specified character as the field delimiter. Default is tab. Space or other characters with special meaning to the
shell must be quoted. The -d option is used only in combination with the -f option, according to XPG3 and SVID2/SVID3.
-s Suppresses lines with no delimiter characters. Unless specified, lines with no delimiters are passed through untouched.
Either the -c or -f option must be specified.
Examples
Mapping of user IDs to names:
cut -d: -f1,5 /etc/passwd
To set name to the current login name for the csh shell:
set name=`who am i | cut -f1 -d" "`
To set name to the current login name for the sh, sh5, and ksh shells:
name=`who am i | cut -f1 -d" "`
Diagnostics
"line too long" A line can have no more than 511 characters or fields.
"bad list for c/f option"
Missing -c or -f option or incorrectly specified list. No error occurs if a line has fewer fields than the list calls
for.
"no fields" The list is empty.
See Also
grep(1), paste(1)
cut(1)