awk: sort lines by count of a character or string in a line

09-03-2010

I want to sort lines by how many times a string occurs in each line (the most times first).
I know how to do this in two passes (add a count field in the first pass then sort on it in the second pass).

However, can it be done more optimally with a single AWK command? My AWK has improved tremendously in the last few days, but I'm not there yet.
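
For reference, the two-pass approach mentioned above could be sketched roughly like this. It is only an illustration, not code from this thread: the function name depth_sort_two_pass is made up, "/" is assumed to be the string being counted, and lines are assumed not to contain embedded newlines.

```shell
# Hypothetical helper (not from the thread): decorate, sort, undecorate.
# Reads lines on stdin and prints them with the most "/" characters first.
# Step 1 (awk):  prefix each line with its "/" count and a tab. gsub() returns
#                the number of replacements, and "&" puts each match back, so
#                the line itself is unchanged.
# Step 2 (sort): numeric, descending sort on the count field only.
# Step 3 (cut):  strip the count again.
depth_sort_two_pass() {
  awk '{ n = gsub("/", "&"); print n "\t" $0 }' |
    LC_ALL=C sort -k1,1nr |
    cut -f2-
}
```

It would be used as a filter, for example: find "$dir" -mindepth 1 -type d | depth_sort_two_pass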

I have written a script that "raises" all files in subdirectories up to the supplied path, appending -1, -2, -3, etc. to the end of the filename (before the '.' file extension, if present) when multiple files have the same name. It leaves all or most of those subdirectories empty. I want to clean up the empty directories at the end.

Code:

find "$dir" -mindepth 1 -type d

finds them all, but I need to sort them deepest first; otherwise my attempts to rmdir them will fail whenever a folder still contains deeper empty folders.

Start with this:

./dir
./dir/dir2
./dir/dira
./dir/dirb
./dir/dirc
./dir2
./dir2/dira
./dir2/dirb
./dir2/dirc
./dir3
./dir3/dira
./dir3/dirb
./dir3/dirb/dirI
./dir3/dirb/dirII
./dir3/dirb/dirIII
./dir3/dirb/dirIV
./dir3/dirb/dirV
./dir3/dirb/dirVI
./dir3/dirc

And end up with this:

./dir3/dirb/dirI
./dir3/dirb/dirII
./dir3/dirb/dirIII
./dir3/dirb/dirIV
./dir3/dirb/dirV
./dir3/dirb/dirVI
./dir/dir2
./dir/dira
./dir/dirb
./dir/dirc
./dir2/dira
./dir2/dirb
./dir2/dirc
./dir3/dira
./dir3/dirb
./dir3/dirc
./dir
./dir2
./dir3

Mike

Michael Stora

09-03-2010

Hi,

Why would you want to delve into the depths if you have to remove all the dirs?

You can use "-maxdepth" instead of "-mindepth",
or
you can use rm -rf instead.

If I got your issue right.

Regards,

gaurav1086

09-03-2010

Nope. I've got a script that moves all the regular files in subdirectories into the main directory, appending a number on the end when there are duplicate file names.
I want to clean up the remaining empty directories SAFELY (there is a chance one or more contains a special file, in which case I want the rmdir to fail--I do not want to use -rf!).

-mindepth is being used to exclude the main directory. Specifically, -mindepth 1 excludes the starting point ('.').
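
A small sketch of that effect, for anyone following along. The helper name list_subdirs is made up for illustration, and -mindepth is assumed to be available (it is a GNU/BSD find extension, not POSIX).

```shell
# Hypothetical helper (not from the thread): list every directory below "$1",
# but not "$1" itself. Without -mindepth 1 the starting point would be
# listed too.
list_subdirs() {
  find "$1" -mindepth 1 -type d
}
```

On a tree containing only a/ and a/b/, this lists two directories, while a plain find "$1" -type d lists three (the starting point plus those two).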

Mike

Michael Stora

09-03-2010

I believe a simple reverse sort will suffice for your purpose, no?

sort -r will produce:

Code:

./dir/dirc
./dir/dirb
./dir/dira
./dir/dir2
./dir3/dirc
./dir3/dirb/dirVI
./dir3/dirb/dirV
./dir3/dirb/dirIV
./dir3/dirb/dirIII
./dir3/dirb/dirII
./dir3/dirb/dirI
./dir3/dirb
./dir3/dira
./dir3
./dir2/dirc
./dir2/dirb
./dir2/dira
./dir2
./dir
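
The reason a reverse sort is enough: a directory's path is a prefix of every path below it, so in byte order the parent sorts before its children, and in reverse order after them. A rough sketch of using that for the cleanup follows. The function name is made up, LC_ALL=C is my assumption to get plain byte ordering, and directory names are assumed not to contain newlines.

```shell
# Hypothetical helper (not from the thread): remove empty directories below
# "$1" by processing the find output in reverse sorted order, so that every
# child is handled before its parent.
remove_empty_dirs_sorted() {
  find "$1" -mindepth 1 -type d |
    LC_ALL=C sort -r |
    while IFS= read -r d; do
      # rmdir refuses to remove a non-empty directory; report it and move on.
      rmdir -- "$d" 2>/dev/null || printf 'kept: %s\n' "$d" >&2
    done
}
```

Directories that still contain something (a special file, for instance) are left in place and reported on stderr.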

Scrutinizer

09-03-2010

Edit: Actually, I think this would work! Another solution.

Mike

Last edited by Michael Stora; 09-03-2010 at 06:55 AM..

Michael Stora

09-03-2010

Or you can simply add -depth to your find command and run rmdir, discarding any error messages (redirect them to /dev/null).

Code:

find <source> -mindepth 1 -depth -type d -exec rmdir {} + 2>/dev/null

In addition, some find implementations (like GNU find, for instance) support the -empty option:

Quote:

-empty File is empty and is either a regular file or a directory.
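
One hedged note on -empty: the test is evaluated when find visits the directory. In a plain listing, a directory that holds only empty subdirectories does not match. Combined with -delete (which implies -depth), the children are removed before the parent is tested, so nested empty directories can be pruned in one go. A sketch, assuming GNU or BSD find (-mindepth, -empty and -delete are not POSIX); the function name is made up:

```shell
# Hypothetical helper (not from the thread); needs a find that supports the
# non-POSIX -mindepth, -empty and -delete (GNU and BSD find do).
prune_empty_dirs() {
  # -delete implies -depth, so children are visited (and removed) before
  # -empty is tested on their parent. Directories that contain anything
  # else never match -empty and are left alone.
  find "$1" -mindepth 1 -type d -empty -delete
}
```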

radoulov

09-03-2010

-empty will exclude folders containing other empty folders, so I don't want that.

However, -depth does exactly what I'm looking for. For each branch, it has the deeper ones first so they can be deleted in the correct order. THANKS!

$ find . -mindepth 1 -depth -type d
./dir/dir2
./dir/dira
./dir/dirb
./dir/dirc
./dir
./dir2/dira
./dir2/dirb
./dir2/dirc
./dir2
./dir3/dira
./dir3/dirb/dirI
./dir3/dirb/dirII
./dir3/dirb/dirIII
./dir3/dirb/dirIV
./dir3/dirb/dirV
./dir3/dirb/dirVI
./dir3/dirb
./dir3/dirc
./dir3

So ultimately I ended up using:

Code:

find "$dir" -mindepth 1 -depth -type d -exec rmdir {} +
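
Wrapped in a function for reuse, that might look like the sketch below. The name cleanup_empty_dirs is made up; the find command itself is the one above.

```shell
# Hypothetical wrapper around the command above.
cleanup_empty_dirs() {
  # -depth lists each directory after its contents, so rmdir receives
  # children before parents. rmdir fails for any directory that still holds
  # something, and find then returns a non-zero status, which is the safety
  # net wanted here.
  find "$1" -mindepth 1 -depth -type d -exec rmdir {} +
}
```

A non-zero exit status therefore means at least one directory was not empty and has been left in place.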

Mike

PS. Still interested in the general AWK solution . . .

Last edited by Michael Stora; 09-03-2010 at 07:06 AM..

Michael Stora
