Sponsored Content
Top Forums UNIX for Beginners Questions & Answers How to align/sort the column pairs of an csv file, based on keyword word specified in another file? Post 303034663 by RudiC on Thursday 2nd of May 2019 03:59:33 AM
Old 05-02-2019
No surprise things are going amiss. You changed both the field separator AND the file structure several times. In post #1, you had <TAB> field separators, and an empty field between the key string and the numerical value pertaining to it. In post #3, there were just a handful spaces separating the fields. And then, in post #7, comma separators, and no empty field. As Don Cragun already pointed out, that won't fly without the necessary code adaptions.
With your last input sample file, try
Code:
awk -F"," '
NR==FNR {for (i=1; i<=3; i++)   {IX = (i-1)*2+1
                                 split ($IX, T, "_")
                                 O[T[1] FS i] = $IX FS $(IX+1)
                                }
         next
        }
        {for (i=1; i<=3; i++)  printf "%s%s", O[$1 FS i] (O[$1 FS i]?_:FS) , i==3?ORS:FS
        }
'  org1.csv key.txt
xop_thy,80,xop_nmg,50,xop_nth,40
avr_irt,70,avr_njk,50,avr_ngt,50
str_tgt,80,str_nhj,60,str_nyu,60
cyv_gty,40,,,,
vir_plo,20,vir_thk,40,vir_tyk,80
cop_thy,70,cop_thl,40,,
,,ijk_yuc,80,ijk_yuc,70
 ,,,,irt_hgt,80

And, don't change neither code nor input structure without exactly knowing what you are doing!


EDIT: With your "new and simpler" file structure, above can be simplified to

Code:
awk -F"," '
NR==FNR {for (i=1; i<=5; i+=2)  {split ($i, T, "_")
                                 O[T[1] FS i] = $i FS $(i+1)
                                }
         next
        }
        {for (i=1; i<=5; i+=2)  printf "%s%s", O[$1 FS i] (O[$1 FS i]?_:FS) , i==5?ORS:FS
        }
' org1.csv key.txt


Last edited by RudiC; 05-02-2019 at 05:50 AM..
 

9 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

sorting csv file based on column selected

Hi all, in my csv file it'll look like this, and of course it may have more columns US to UK;abc-hq-jcl;multimedia UK to CN;def-ny-jkl;standard DE to DM;abc-ab-klm;critical FD to YM;la-yr-tym;standard HY to MC;la-yr-ytm;multimedia GT to KJ;def-ny-jrt;critical I would like to group... (4 Replies)
Discussion started by: tententen
4 Replies

2. Shell Programming and Scripting

Sort file based on column

Hi, My input file is $cat samp 1 siva 1 raja 2 siva 1 siva 2 raja 4 venkat i want sort this name wise...alos need to remove duplicate lines. i am using cat samp|awk '{print $2,$1}'|sort -u it showing raja 1 (3 Replies)
Discussion started by: rsivasan
3 Replies

3. Shell Programming and Scripting

Pick the column value based on another column from .csv file

My scenario is that I need to pick value from third column based on fourth column value, if fourth column value is 1 then first value of third column.Third column (2|3|4|6|1) values are cancatenated. Main imp point, in my .csv file, third column is having price value with comma (1,20,300), it has... (2 Replies)
Discussion started by: Ganesh L
2 Replies

4. UNIX for Dummies Questions & Answers

Sort csv file by duplicated column value

hello, I have a large file (about 1gb) that is in a file similar to the following: I want to make it so that I can put all the duplicates where column 3 (delimited by the commas) are shown on top. Meaning all people with the same age are listed at the top. The command I used was ... (3 Replies)
Discussion started by: jl487
3 Replies

5. Linux

Filter a .CSV file based on the 5th column values

I have a .CSV file with the below format: "column 1","column 2","column 3","column 4","column 5","column 6","column 7","column 8","column 9","column 10 "12310","42324564756","a simple string with a , comma","string with or, without commas","string 1","USD","12","70%","08/01/2013",""... (2 Replies)
Discussion started by: dhruuv369
2 Replies

6. Shell Programming and Scripting

Fetching values in CSV file based on column name

input.csv: Field1,Field2,Field3,Field4,Field4 abc ,123 ,xyz ,000 ,pqr mno ,123 ,dfr ,111 ,bbb output: Field2,Field4 123 ,000 123 ,111 how to fetch the values of Field4 where Field2='123' I don't want to fetch the values based on column position. Instead want to... (10 Replies)
Discussion started by: bharathbangalor
10 Replies

7. Shell Programming and Scripting

Get maximum per column from CSV file, based on date column

Hello everyone, I am using ksh on Solaris 10 and I'm gathering data in a CSV file that looks like this: 20170628-23:25:01,1,0,0,1,1,1,1,55,55,1 20170628-23:30:01,1,0,0,1,1,1,1,56,56,1 20170628-23:35:00,1,0,0,1,1,2,1,57,57,2 20170628-23:40:00,1,0,0,1,1,1,1,58,58,2... (6 Replies)
Discussion started by: ejianu
6 Replies

8. UNIX for Beginners Questions & Answers

Filtering records of a csv file based on a value of a column

Hi, I tried filtering the records in a csv file using "awk" command listed below. awk -F"~" '$4 ~ /Active/{print }' inputfile > outputfile The output always has all the entries. The same command worked for different users from one of the forum links. content of file I was... (3 Replies)
Discussion started by: sunilmudikonda
3 Replies

9. UNIX for Beginners Questions & Answers

How to sort a column in excel/csv file?

I have to sort the 4th column of an excel/csv file. I tried the following command sort -u --field-separator=, --numeric-sort -k 2 -n dinesh.csv > test.csv But, it's not working. Moreover, I have to do the same for more than 30 excel/csv file. So please help me to do the same. (6 Replies)
Discussion started by: dineshkumarsrk
6 Replies
join(1) 						      General Commands Manual							   join(1)

NAME
join - relational database operator SYNOPSIS
[options] file1 file2 DESCRIPTION
forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If file1 or file2 is the standard input is used. file1 and file2 must be sorted in increasing collating sequence (see Environment Variables below) on the fields on which they are to be joined; normally the first in each line. The output contains one line for each pair of lines in file1 and file2 that have identical join fields. The output line normally consists of the common field followed by the rest of the line from file1, then the rest of the line from file2. The default input field separators are space, tab, or new-line. In this case, multiple separators count as one field separator, and lead- ing separators are ignored. The default output field separator is a space. Some of the below options use the argument n. This argument should be a or a referring to either file1 or file2, respectively. Options In addition to the normal output, produce a line for each unpairable line in file n, where n is or Replace empty output fields by string s. Join on field m of both files. The argument m must be delimited by space characters. This option and the following two are provided for backward compatibility. Use of the and options ( see below ) is recommended for portability. Join on field m of file1. Join on field m of file2. Each output line comprises the fields specified in list, each element of which has the form where n is a file number and m is a field number. The common field is not printed unless specifically requested. Use character c as a separator (tab character). Every appearance of c in a line is significant. The character c is used as the field sepa- rator for both input and output. Instead of the default output, produce a line only for each unpairable line in file_number, where file_number is or Join on field f of file 1. Fields are numbered starting with 1. Join on field f of file 2. Fields are numbered starting with 1. EXTERNAL INFLUENCES
Environment Variables determines the collating sequence expects from input files. determines the alternative blank character as an input field separator, and the interpretation of data within files as single and/or multi- byte characters. also determines whether the separator defined through the option is a single- or multi-byte character. If or is not specified in the environment or is set to the empty string, the value of is used as a default for each unspecified or empty variable. If is not specified or is set to the empty string, a default of ``C'' (see lang(5)) is used instead of If any internationaliza- tion variable contains an invalid setting, behaves as if all internationalization variables are set to ``C'' (see environ(5)). International Code Set Support Single- and multi-byte character code sets are supported with the exception that multi-byte-character file names are not supported. EXAMPLES
The following command line joins the password file and the group file, matching on the numeric group ID, and outputting the login name, the group name, and the login directory. It is assumed that the files have been sorted in the collating sequence defined by the or environment variable on the group ID fields. The following command produces an output consisting all possible combinations of lines that have identical first fields in the two sorted files sf1 and sf2, with each line consisting of the first and third fields from and the second and fourth fields from WARNINGS
With default field separation, the collating sequence is that of with the sequence is that of a plain sort. The conventions of and are incongruous. Numeric filenames may cause conflict when the option is used immediately before listing filenames. AUTHOR
was developed by OSF and HP. SEE ALSO
awk(1), comm(1), sort(1), uniq(1). STANDARDS CONFORMANCE
join(1)
All times are GMT -4. The time now is 09:16 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy