Sponsored Content
Top Forums Shell Programming and Scripting awk: sort lines by count of a character or string in a line Post 302450563 by Michael Stora on Friday 3rd of September 2010 04:49:13 AM
Old 09-03-2010
awk: sort lines by count of a character or string in a line

I want to sort lines by how many times a string occurs in each line (the most times first).
I know how to do this in two passes (add a count field in the first pass then sort on it in the second pass).

However, can it be done more optimally with a single AWK command? My AWK has improved tremendously in the last few days, but I'm not there yet.

I have written a script that "raises" all files in subdirectores up the supplied path, appending a -1, -2, -3, etc to the end of the filename (before the '.' file extension if present) if there are multiple files with the same name. It leaves all of most of those sub directories empty. I want to clean up the empty directories at the end..

Code:
find "$dir" -mindepth 1 -type d

finds them all but I need to sort them in order of deepest first or my attempts to rmdir them will fail when a folder contains deeper empty folders.

Start with this:
 
./dir
./dir/dir2
./dir/dira
./dir/dirb
./dir/dirc
./dir2
./dir2/dira
./dir2/dirb
./dir2/dirc
./dir3
./dir3/dira
./dir3/dirb
./dir3/dirb/dirI
./dir3/dirb/dirII
./dir3/dirb/dirIII
./dir3/dirb/dirIV
./dir3/dirb/dirV
./dir3/dirb/dirVI
./dir3/dirc

And end up with this:
 
./dir3/dirb/dirI
./dir3/dirb/dirII
./dir3/dirb/dirIII
./dir3/dirb/dirIV
./dir3/dirb/dirV
./dir3/dirb/dirVI
./dir/dir2
./dir/dira
./dir/dirb
./dir/dirc
./dir2/dira
./dir2/dirb
./dir2/dirc
./dir3/dira
./dir3/dirb
./dir3/dirc
./dir
./dir2
./dir3

Mike
 

10 More Discussions You Might Find Interesting

1. UNIX for Advanced & Expert Users

How to count no of occurences of a character in a string in UNIX

i have a string like echo "a|b|c" . i want to count the | symbols in this string . how to do this .plz tell the command (11 Replies)
Discussion started by: kamesh83
11 Replies

2. Shell Programming and Scripting

awk to print lines based on string match on another line and condition

Hi folks, I have a text file that I need to parse, and I cant figure it out. The source is a report breaking down softwares from various companies with some basic info about them (see source snippet below). Ultimately what I want is an excel sheet with only Adobe and Microsoft software name and... (5 Replies)
Discussion started by: rowie718
5 Replies

3. Shell Programming and Scripting

awk find a string, print the line 2 lines below it

I am parsing a nagios config, searching for a string, and then printing the line 2 lines later (the "members" string). Here's the data: define hostgroup{ hostgroup_name chat-dev alias chat-dev members thisisahostname } define hostgroup{ ... (1 Reply)
Discussion started by: mglenney
1 Replies

4. Shell Programming and Scripting

Count character in one line

Please check the attachment for the example. Purpose: count how many "|" character in one line and also display the line number. expect result: Line 1 : there are 473 "|" characters Line 2 : there are 473 "|" characters I have tried to use awk to count it, it's ok when the statistic... (8 Replies)
Discussion started by: ambious
8 Replies

5. Shell Programming and Scripting

sed or awk delete character in the lines before and after the matching line

Sample file: This is line one, this is another line, this is the PRIMARY INDEX line l ; This is another line The command should find the line with “PRIMARY INDEX” and remove the last character from the line preceding it (in this case , comma) and remove the first character from the line... (5 Replies)
Discussion started by: KC_Rules
5 Replies

6. Shell Programming and Scripting

awk new line issue, saying string can't contain new line character

Hi , I am doing some enhancements in an existing shell script. There it used the awk command in a function as below : float_expr() { IFS=" " command eval 'awk " BEGIN { result = $* print result exit(result == 0) }"' } It calls the function float_expr to evaluate two values ,... (1 Reply)
Discussion started by: mady135
1 Replies

7. Shell Programming and Scripting

Character count of each line

Hi, I have a file with more than 1000 lines. Most of the lines have 16 characters. I want to find out lines that have less than 14 characters (usually 12 or 13). wc -l gives me the line count and wc -c gives me the total characters in a file. I could not get the total characters for each line.... (1 Reply)
Discussion started by: bobbygsk
1 Replies

8. Shell Programming and Scripting

awk - count character count of fields

Hello All, I got a requirement when I was working with a file. Say the file has unloads of data from a table in the form 1|121|asda|434|thesi|2012|05|24| 1|343|unit|09|best|2012|11|5| I was put into a scenario where I need the field count in all the lines in that file. It was simply... (6 Replies)
Discussion started by: PikK45
6 Replies

9. UNIX for Dummies Questions & Answers

Getting the character count of the last line

I need the character count of the last line of each file in a directory, and not the total. Now I have been doing this but unfortunately, -exec doesn't support pipes: find sent/ -type f -exec tail -1|wc -c {} \; If I try this: find sent/ -type f -exec tail -1 {} \; | wc -c It will give... (6 Replies)
Discussion started by: MIA651
6 Replies

10. Shell Programming and Scripting

Count specific character of a file in each line and delete this character in a specific position

I will appreciate if you help me here in this script in Solaris Enviroment. Scenario: i have 2 files : 1) /tmp/TRANSACTIONS_DAILY_20180730.txt: 201807300000000004 201807300000000005 201807300000000006 201807300000000007 201807300000000008 2)... (10 Replies)
Discussion started by: teokon90
10 Replies
SORT(1)                                                            User Commands                                                           SORT(1)

NAME
sort - sort lines of text files SYNOPSIS
sort [OPTION]... [FILE]... sort [OPTION]... --files0-from=F DESCRIPTION
Write sorted concatenation of all FILE(s) to standard output. With no FILE, or when FILE is -, read standard input. Mandatory arguments to long options are mandatory for short options too. Ordering options: -b, --ignore-leading-blanks ignore leading blanks -d, --dictionary-order consider only blanks and alphanumeric characters -f, --ignore-case fold lower case to upper case characters -g, --general-numeric-sort compare according to general numerical value -i, --ignore-nonprinting consider only printable characters -M, --month-sort compare (unknown) < 'JAN' < ... < 'DEC' -h, --human-numeric-sort compare human readable numbers (e.g., 2K 1G) -n, --numeric-sort compare according to string numerical value -R, --random-sort shuffle, but group identical keys. See shuf(1) --random-source=FILE get random bytes from FILE -r, --reverse reverse the result of comparisons --sort=WORD sort according to WORD: general-numeric -g, human-numeric -h, month -M, numeric -n, random -R, version -V -V, --version-sort natural sort of (version) numbers within text Other options: --batch-size=NMERGE merge at most NMERGE inputs at once; for more use temp files -c, --check, --check=diagnose-first check for sorted input; do not sort -C, --check=quiet, --check=silent like -c, but do not report first bad line --compress-program=PROG compress temporaries with PROG; decompress them with PROG -d --debug annotate the part of the line used to sort, and warn about questionable usage to stderr --files0-from=F read input from the files specified by NUL-terminated names in file F; If F is - then read names from standard input -k, --key=KEYDEF sort via a key; KEYDEF gives location and type -m, --merge merge already sorted files; do not sort -o, --output=FILE write result to FILE instead of standard output -s, --stable stabilize sort by disabling last-resort comparison -S, --buffer-size=SIZE use SIZE for main memory buffer -t, --field-separator=SEP use SEP instead of non-blank to blank transition -T, --temporary-directory=DIR use DIR for temporaries, not $TMPDIR or /tmp; multiple options specify multiple directories --parallel=N change the number of sorts run concurrently to N -u, --unique with -c, check for strict ordering; without -c, output only the first of an equal run -z, --zero-terminated line delimiter is NUL, not newline --help display this help and exit --version output version information and exit KEYDEF is F[.C][OPTS][,F[.C][OPTS]] for start and stop position, where F is a field number and C a character position in the field; both are origin 1, and the stop position defaults to the line's end. If neither -t nor -b is in effect, characters in a field are counted from the beginning of the preceding whitespace. OPTS is one or more single-letter ordering options [bdfgiMhnRrV], which override global order- ing options for that key. If no key is given, use the entire line as the key. Use --debug to diagnose incorrect key usage. SIZE may be followed by the following multiplicative suffixes: % 1% of memory, b 1, K 1024 (default), and so on for M, G, T, P, E, Z, Y. *** WARNING *** The locale specified by the environment affects sort order. Set LC_ALL=C to get the traditional sort order that uses native byte values. AUTHOR
Written by Mike Haertel and Paul Eggert. REPORTING BUGS
GNU coreutils online help: <http://www.gnu.org/software/coreutils/> Report sort translation bugs to <http://translationproject.org/team/> COPYRIGHT
Copyright (C) 2017 Free Software Foundation, Inc. License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>. This is free software: you are free to change and redistribute it. There is NO WARRANTY, to the extent permitted by law. SEE ALSO
shuf(1), uniq(1) Full documentation at: <http://www.gnu.org/software/coreutils/sort> or available locally via: info '(coreutils) sort invocation' GNU coreutils 8.28 January 2018 SORT(1)
All times are GMT -4. The time now is 02:10 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy