Sorting based on multiple delimiters Post: 302519092

Shell Programming and Scripting: post 302519092 by kevintse on Tuesday 3rd of May 2011 03:13:21 AM

Quote:

Originally Posted by rdcwayx

I found no problem.

If you run the awk command on Solaris, please replace awk with nawk or /usr/xpg4/bin/awk.

Code:

awk -F = 'NR==1{max=NF;min=NF}
         {max=(max>NF)?max:NF;min=(min<NF)?min:NF;a[NF]=(a[NF]=="")?$0:a[NF] ORS $0}
    END{for (i=max;i>=min;i--) {if (a[i]!="") print i-1 " delimiters" ORS a[i]}}' test |head -10

6 delimiters
pathan=inayat=khan=rashid=khan=sahebzadi=m
shiv=ram=tandale=ganesh=laxman=hirabai=m
5 delimiters
gore=bibi=sakina=irfanali=tayeba=f
jamadar=aves=ahmed=ashfaque=sherbano=m
ram=tandale=ganesh=laxman=hirabai=m
4 delimiters
kale=amita=bhanudas=shobha=f
lande=amit=chandrabhan=asha=m
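
The script prints i-1 because, with -F=, awk's NF is the number of fields on a line, so the number of = delimiters is NF-1. A minimal check of that relationship, using one line from the output above (the variable names line and count are only for illustration):

```shell
# One sample row taken from the output above; it has 5 fields separated by "=".
line='kale=amita=bhanudas=shobha=f'

# NF is the field count, so NF-1 is the number of "=" delimiters.
count=$(printf '%s\n' "$line" | awk -F= '{print NF-1}')

echo "$count delimiters"
```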

---------- Post updated at 04:32 PM ---------- Previous update was at 04:25 PM ----------

Clever approach.

A small adjustment, !a[$1]++, makes it look better, and -k1 is useless.

Code:

awk -F= '{print NF, $0}' infile | sort -nr |awk '!a[$1]++ {print $1-1 " delimiters" }{print $2}'

!a[$1]++ does look better, but it adds slightly more overhead than !d||$1!=d, because it has to increment a[$1] by 1 for every line.
And again, -k1 is not useless: it is there for performance. If it is left out, sort has to compare entire lines, while with a key restricted to the first field sort only needs to compare the delimiter count. (Strictly, the key must be written -k1,1 for that; a bare -k1 key runs from field 1 to the end of the line.)
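
A minimal sketch of the two header tests side by side, for comparison. The sample file, the variable names by_array and by_scalar, and the full form of the !d||$1!=d pipeline are assumptions for illustration; only the test expression itself appears in this post. The sort key is written -k1,1 so that it covers just the count field.

```shell
# Hypothetical sample file; the rows are taken from the output quoted above.
sample=$(mktemp)
cat > "$sample" <<'EOF'
kale=amita=bhanudas=shobha=f
pathan=inayat=khan=rashid=khan=sahebzadi=m
gore=bibi=sakina=irfanali=tayeba=f
shiv=ram=tandale=ganesh=laxman=hirabai=m
EOF

# Variant 1: associative array. a[$1] is incremented on every input line,
# and the header is printed only the first time a count is seen.
by_array=$(awk -F= '{print NF, $0}' "$sample" |
  LC_ALL=C sort -k1,1nr |
  awk '!a[$1]++ {print $1-1 " delimiters"} {print $2}')

# Variant 2: scalar d remembers the previous count. It relies on the input
# being sorted, and assigns d only when the count changes.
by_scalar=$(awk -F= '{print NF, $0}' "$sample" |
  LC_ALL=C sort -k1,1nr |
  awk '!d || $1!=d {print $1-1 " delimiters"; d=$1} {print $2}')

printf '%s\n' "$by_array"
rm -f "$sample"
```

On sorted input both variants should print the same grouping. The scalar version avoids the per-line array update but depends on equal counts being adjacent, whereas the array version would still print each header once on unsorted input.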

kevintse
