How to align/sort the column pairs of a CSV file, based on a keyword specified in another file? Post: 303034663

Post 303034663 by RudiC on Thursday 2nd of May 2019 03:59:33 AM

No surprise things are going amiss. You changed both the field separator AND the file structure several times. In post #1, you had <TAB> field separators and an empty field between the key string and the numerical value pertaining to it. In post #3, there were just a handful of spaces separating the fields. And then, in post #7, comma separators and no empty field. As Don Cragun already pointed out, that won't fly without the necessary code adaptations.
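
To illustrate why the separator matters, here is a minimal sketch. The three sample lines are made up to mimic the layouts described above (they are not taken from the thread's files), and count_fields is just an illustrative helper name.

```shell
# Hypothetical one-line samples of the three layouts described above.
tab_line=$(printf 'xop\t\t80')   # TAB-separated, with an empty field in between
spc_line='xop    80'             # a handful of spaces
csv_line='xop,80'                # comma-separated, no empty field

# Count the fields awk sees when the script is run with -F",".
count_fields() { printf '%s\n' "$1" | awk -F"," '{print NF}'; }

count_fields "$tab_line"   # prints 1: no comma, so the whole line is one field
count_fields "$spc_line"   # prints 1
count_fields "$csv_line"   # prints 2

# With a TAB separator, the empty field pushes the value into field 3.
printf '%s\n' "$tab_line" | awk -F'\t' '{print NF, $3}'   # prints "3 80"
```

Only the comma-separated line splits into two fields under -F","; the other two layouts arrive as a single field, so $1, $2, ... no longer hold what the script expects.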
With your last input sample file, try

Code:

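awk -F"," '
# First file (org1.csv): remember each name,value pair, keyed by name prefix and pair number
NR==FNR {for (i=1; i<=3; i++)   {IX = (i-1)*2+1            # field number of the i-th name
                                 split ($IX, T, "_")       # T[1] = part of the name before the "_"
                                 O[T[1] FS i] = $IX FS $(IX+1)
                                }
         next
        }
# Second file (key.txt): print the three pairs stored for each key.
# _ is an unset variable (empty string); an absent pair is printed as two empty fields.
        {for (i=1; i<=3; i++)  printf "%s%s", O[$1 FS i] (O[$1 FS i]?_:FS) , i==3?ORS:FS
        }
'  org1.csv key.txt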
xop_thy,80,xop_nmg,50,xop_nth,40
avr_irt,70,avr_njk,50,avr_ngt,50
str_tgt,80,str_nhj,60,str_nyu,60
cyv_gty,40,,,,
vir_plo,20,vir_thk,40,vir_tyk,80
cop_thy,70,cop_thl,40,,
,,ijk_yuc,80,ijk_yuc,70
,,,,irt_hgt,80

And don't change either the code or the input structure without knowing exactly what you are doing!
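
For anyone who wants to try the command without the original files, here is a minimal sketch. The two small input files are hypothetical (modeled on the output shown above, not the poster's real data), and align_pairs is only an illustrative wrapper name.

```shell
# Hypothetical sample data in a scratch directory (not the poster's real files).
dir=$(mktemp -d)
cat > "$dir/org1.csv" <<'EOF'
xop_thy,80,avr_njk,50,xop_nth,40
avr_irt,70,xop_nmg,50,str_nyu,60
EOF
printf '%s\n' xop avr str > "$dir/key.txt"

align_pairs() {   # $1 = data file, $2 = key file
  awk -F"," '
  NR==FNR {for (i=1; i<=3; i++) {IX = (i-1)*2+1
                                 split ($IX, T, "_")
                                 O[T[1] FS i] = $IX FS $(IX+1)
                                }
           next
          }
          {for (i=1; i<=3; i++) printf "%s%s", O[$1 FS i] (O[$1 FS i]?_:FS), i==3?ORS:FS
          }
  ' "$1" "$2"
}

align_pairs "$dir/org1.csv" "$dir/key.txt"
```

With this sample, the output should be three lines: `xop_thy,80,xop_nmg,50,xop_nth,40`, then `avr_irt,70,avr_njk,50,,`, then `,,,,str_nyu,60`. Each pair stays in its own column but moves to the row of its key, and missing pairs show up as empty fields.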

EDIT: With your "new and simpler" file structure, the above can be simplified to

Code:

awk -F"," '
# Same logic as before, but i is now the field number of each name (1, 3, 5)
NR==FNR {for (i=1; i<=5; i+=2)  {split ($i, T, "_")
                                 O[T[1] FS i] = $i FS $(i+1)
                                }
         next
        }
        {for (i=1; i<=5; i+=2)  printf "%s%s", O[$1 FS i] (O[$1 FS i]?_:FS) , i==5?ORS:FS
        }
' org1.csv key.txt
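
If the number of pairs per line changes, the hard-coded 5 has to change with it. As a sketch only (this variant is not from the post above): the loop limit can be passed in as an awk variable. The names last, align_pairs_n and org2.csv are assumptions made for illustration, and the sample data is made up.

```shell
# Hypothetical sample data: two name,value pairs per line instead of three.
dir=$(mktemp -d)
cat > "$dir/org2.csv" <<'EOF'
xop_thy,80,avr_njk,50
avr_irt,70,xop_nmg,50
EOF
printf '%s\n' xop avr cyv > "$dir/key.txt"

# $1 = field number of the last name column (3 for two pairs, 5 for three, ...)
# $2 = data file, $3 = key file
align_pairs_n() {
  awk -F"," -v last="$1" '
  NR==FNR {for (i=1; i<=last; i+=2) {split ($i, T, "_")
                                     O[T[1] FS i] = $i FS $(i+1)
                                    }
           next
          }
          {for (i=1; i<=last; i+=2) printf "%s%s", O[$1 FS i] (O[$1 FS i]?_:FS), i==last?ORS:FS
          }
  ' "$2" "$3"
}

align_pairs_n 3 "$dir/org2.csv" "$dir/key.txt"
```

With this sample, the output should be `xop_thy,80,xop_nmg,50`, then `avr_irt,70,avr_njk,50`, then `,,,` for the key cyv, which has no matching pair in either column.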

Last edited by RudiC; 05-02-2019 at 05:50 AM.
