Sponsored Content
Top Forums Shell Programming and Scripting Sorting based on multiple delimiters Post 302519092 by kevintse on Tuesday 3rd of May 2011 03:13:21 AM
Old 05-03-2011
Quote:
Originally Posted by rdcwayx
No problem I found.

If you run the awk in Solaris, please replace the command with nawk or /usr/xpg4/bin/awk
Code:
awk -F = 'NR==1{max=NF;min=NF}
         {max=(max>NF)?max:NF;min=(min<NF)?min:NF;a[NF]=(a[NF]=="")?$0:a[NF] ORS $0}
    END{for (i=max;i>=min;i--) {if (a[i]!="") print i-1 " delimiters" ORS a[i]}}' test |head -10

6 delimiters
pathan=inayat=khan=rashid=khan=sahebzadi=m
shiv=ram=tandale=ganesh=laxman=hirabai=m
5 delimiters
gore=bibi=sakina=irfanali=tayeba=f
jamadar=aves=ahmed=ashfaque=sherbano=m
ram=tandale=ganesh=laxman=hirabai=m
4 delimiters
kale=amita=bhanudas=shobha=f
lande=amit=chandrabhan=asha=m

---------- Post updated at 04:32 PM ---------- Previous update was at 04:25 PM ----------



Clever way.

little adjust (!a[$1]++) to look better, and -k1 is useless.
Code:
awk -F= '{print NF, $0}' infile | sort -nr |awk '!a[$1]++ {print $1-1 " delimiters" }{print $2}'

!a[$1]++ does look better, but it exposes a little overhead than !d||$1!=d, because it has to increment a[$1] by 1 for each line.
And again, -k1 is not useless. it is still for performance reason, if it is left out, sort has to take the entire line to sort the output, while if it is present, sort only needs to sort the first field(the delimiter count).
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Sorting a flat file based on multiple colums(using character position)

Hi, I have an urgent task here. I am required to sort a flat file based on multiple columns which are based on the character position in that line. I am restricted to use the character position instead of the space and sort +1 +2 etc to do the sorting. I understand that there is a previous... (8 Replies)
Discussion started by: cucubird
8 Replies

2. Shell Programming and Scripting

Cut based on Two Delimiters at one go

Hi I wanted to cut the feilds comming after % and After $ at one go can we do some thing like this cut -f 2 -d "%|$" (But it doesnot work) Input File BWPG %TCPRP1 $SCSPR000 BWPH %TCPRP1 $SCSPR003 BWPI %TRTYUP ResourceDescription="IMPRIMANTE " $BWOPTY BWPJ %ZOMBIE ... (4 Replies)
Discussion started by: pbsrinivas
4 Replies

3. Shell Programming and Scripting

Sorting based on Multiple columns

Hi, I have a requirement whereby I have to sort a flat file based on Multiple Columns (similar to ORDER BY Clause of Oracle). I am getting 10 columns in the flat file and I want the file to be sorted on 1st, 3rd, 4th, 7th and 9th columns in ascending order. The flat file is pipe seperated. Any... (15 Replies)
Discussion started by: dharmesht
15 Replies

4. Shell Programming and Scripting

sorting(both Ascending & Descending) files based on multiple fields

Hi All, I am encountered with a problem while sorting a file based on multiple columns . I need to sort like: (field2,ascending) , (field3,ascending) ,(field8,descending) , (field7,ascending),(field13,ascending). So far i was sorting only in ascending order but here i need to use one... (1 Reply)
Discussion started by: apjneeraj
1 Replies

5. Shell Programming and Scripting

AWK with multiple delimiters

I have the following string sample: bla bla bla bla bla I would like to extract the "123" using awk. I thought about awk -F"]" '{ print $1 }' but it doesn't work Any ideas ? (7 Replies)
Discussion started by: gdub
7 Replies

6. Shell Programming and Scripting

Sorting problem: Multiple delimiters, multiple keys

Hello If you wanted to sort a .csv file that was filled with lines like this: <Ticker>,<Date as YYYYMMDD>,<Time as H:M:S>,<Volume>,<Corr> (H : , M, S: ) by date, does anybody know of a better solution than to turn the 3rd and 4th colons of every line into commas, sorting on four keys,... (20 Replies)
Discussion started by: Ryan.
20 Replies

7. Shell Programming and Scripting

Concatinating the lines based on number of delimiters

Hi, I have a problem to concatenate the lines based on number of delimiters (if the delimiter count is 9 then concatenate all the fields & remove the new line char bw delimiters and then write the following data into second line) in a file. my input file content is Title| ID| Owner|... (4 Replies)
Discussion started by: bi.infa
4 Replies

8. Shell Programming and Scripting

treating multiple delimiters[solved]

Hi, I need to display the last column value in the below o/p. sam2 PS 03/10/11 0 441 Unable to get o/p with this awk code awk -F"+" '{ print $4 }' pwdchk.txt I need to display 441(in this eg.) and also accept it as a variable to treat it with if condition and take a decision.... (1 Reply)
Discussion started by: sam_bd
1 Replies

9. Shell Programming and Scripting

awk multiple delimiters

Hi Folks, This is the first time I ever encountered this situation My input file is of this kind cat input.txt 1 PAIXAF 0 1 1 -9 0 0 0 1 2 0 2 1 2 1 7 PAIXEM 0 7 1 -9 1 0 2 0 1 2 2 1 0 2 9 PAKZXY 0 2 1 -9 2 0 1 1 1 0 1 2 0 1 Till the sixth column (which is -9), I want my columns to... (4 Replies)
Discussion started by: jacobs.smith
4 Replies

10. Shell Programming and Scripting

Insert Columns before the last Column based on the Count of Delimiters

Hi, I have a requirement where in I need to insert delimiters before the last column of the total delimiters is less than a specified number. Say if the delimiters is less than 139, I need to insert 2 columns ( with blanks) before the last field awk -F 'Ç' '{ if (NF-1 < 139)} END { "Insert 2... (5 Replies)
Discussion started by: arunkesi
5 Replies
JOIN(1) 						      General Commands Manual							   JOIN(1)

NAME
join - relational database operator SYNOPSIS
join [ options ] file1 file2 DESCRIPTION
Join forms, on the standard output, a join of the two relations specified by the lines of file1 and file2. If one of the file names is the standard input is used. File1 and file2 must be sorted in increasing ASCII collating sequence on the fields on which they are to be joined, normally the first in each line. There is one line in the output for each pair of lines in file1 and file2 that have identical join fields. The output line normally con- sists of the common field, then the rest of the line from file1, then the rest of the line from file2. Input fields are normally separated spaces or tabs; output fields by space. In this case, multiple separators count as one, and leading separators are discarded. The following options are recognized, with POSIX syntax. -a n In addition to the normal output, produce a line for each unpairable line in file n, where n is 1 or 2. -v n Like -a, omitting output for paired lines. -e s Replace empty output fields by string s. -1 m -2 m Join on the mth field of file1 or file2. -jn m Archaic equivalent for -n m. -ofields Each output line comprises the designated fields. The comma-separated field designators are either 0, meaning the join field, or have the form n.m, where n is a file number and m is a field number. Archaic usage allows separate arguments for field designators. -tc Use character c as the only separator (tab character) on input and output. Every appearance of c in a line is significant. EXAMPLES
sort /adm/users | join -t: -a 1 -e "" - bdays Add birthdays to password information, leaving unknown birthdays empty. The layout of is given in users(6); bdays contains sorted lines like tr : ' ' </adm/users | sort -k 3 3 >temp join -1 3 -2 3 -o 1.1,2.1 temp temp | awk '$1 < $2' Print all pairs of users with identical userids. SOURCE
/sys/src/cmd/join.c SEE ALSO
sort(1), comm(1), awk(1) BUGS
With default field separation, the collating sequence is that of sort -b -ky,y; with -t, the sequence is that of sort -tx -ky,y. One of the files must be randomly accessible. JOIN(1)
All times are GMT -4. The time now is 12:24 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy