Full Discussion: sort find results
Top Forums UNIX for Dummies Questions & Answers sort find results Post 302599515 by drl on Friday 17th of February 2012 10:44:36 AM
Hi.

You could use a non-standard collating sequence with a non-standard utility:
Code:
#!/usr/bin/env bash

# @(#) s1	Demonstrate msort custom collating sequence.
# http://billposer.org/Software/msort.html

# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
# edges: show the first and last 3 lines of a file, with its line count.
edges() { local _f _n _l;: ${1?"edges: need file"}; _f=$1;_l=$(wc -l < "$_f");
  head -n ${_n:=3} "$_f" ; pe "--- ( $_l: lines total )" ; tail -n $_n "$_f" ; }
# db: debug message to stderr; the second definition switches it off.
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
# Optional local helper that reports the environment and tool versions.
C=$HOME/bin/context && [ -f "$C" ] && "$C" msort

FILE=${1-data1}

pl " Input data file $FILE:"
cat "$FILE"

pl " Results, default collating sequence:"
msort -l -q -w "$FILE"

pl " Results, custom collating sequence:"
msort -l -q -w -s collating-sequence.txt "$FILE"

pl " Custom: collating-sequence.txt:"
cat collating-sequence.txt

exit 0

producing:
Code:
% ./s1

Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 2.6.26-2-amd64, x86_64
Distribution        : Debian GNU/Linux 5.0.8 (lenny) 
bash GNU bash 3.2.39
msort 8.44

-----
 Input data file data1:
H:\FileList\A\E\F\G\newCppFile.cpp
H:\FileList\header01.h
H:\FileList\B\nextCppFile.cpp

-----
 Results, default collating sequence:
H:\FileList\A\E\F\G\newCppFile.cpp
H:\FileList\B\nextCppFile.cpp
H:\FileList\header01.h

-----
 Results, custom collating sequence:
H:\FileList\header01.h
H:\FileList\A\E\F\G\newCppFile.cpp
H:\FileList\B\nextCppFile.cpp

-----
 Custom: collating-sequence.txt:
a
b
c
d
e
f
g
h
A
B

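The custom sequence ranks the lowercase letters a through h ahead of uppercase A and B, which is why header01.h sorts before the A and B directories.

The msort utility was in the Debian repository. See the web page noted in the script for a PDF of documentation and other details.

cheers, drl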
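If msort is not at hand, roughly the same ordering can be had from standard tools. What follows is only a sketch, assuming plain ASCII input: it swaps letter case with tr, sorts by byte value in the C locale, and swaps the case back, so every lowercase letter ranks ahead of every uppercase one (not just the a-h, A, B listed in collating-sequence.txt). The function name sort_lower_first is made up for illustration.

```shell
#!/usr/bin/env bash
# Sketch: sort with lowercase letters ranked before uppercase ones,
# using only tr and sort. Assumes ASCII input.
# sort_lower_first is a hypothetical helper name, not part of any tool.
# Step 1 swaps case, step 2 sorts by byte value, step 3 swaps case back.
sort_lower_first() {
  LC_ALL=C tr 'a-zA-Z' 'A-Za-z' |
    LC_ALL=C sort |
    LC_ALL=C tr 'a-zA-Z' 'A-Za-z'
}

# Same three lines as the data1 sample in the post.
printf '%s\n' \
  'H:\FileList\A\E\F\G\newCppFile.cpp' \
  'H:\FileList\header01.h' \
  'H:\FileList\B\nextCppFile.cpp' | sort_lower_first
```

For the data1 sample this should list header01.h first, followed by the A and B paths. Digits and punctuation keep their usual byte order, so results may differ from msort on other data.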

Quote:
"Non-standard" extant tools often: are general, have the
simplest, most appropriate interface, and are convenient
alternatives in the context of equally useful, but
"non-standard", nonce (one-off) awk, perl, ruby scripts.
The knowledge that such tools exist can be of advantage for
solving future similar, but specifically different problems.
( edit 1: minor typo )

Last edited by drl; 02-17-2012 at 05:42 PM..
 

asort(1)                      General Commands Manual                      asort(1)

NAME
       asort - Sorts or merges files and supports multiple collating weight sequences

SYNOPSIS
       asort [-m] [-o output_file] [-Abdfinruv] [-Ccollate_sequence] [-k keydef]... [-t character] [-T directory] [-y] [kilobytes] [-z record_size]... file...

       asort -c [-u] [-Abdfinruv] [-Ccollate_sequence] [-k keydef]... [-t character] [-T directory] [-y] [kilobytes] [-z record_size]... file...

       The following syntax is maintained for backward compatibility but may be withdrawn in a future release:

       asort [-Abcdfimnruv] [-Ccollate_sequence] [-o output_file] [-t character] [-T directory] [-y] [kilobytes] [-z record_size] [+fskip] [.cskip] [-fskip] [.cskip] [-bdfinr]... file...

OPTIONS
       The asort command includes the same options as the sort command (see sort(1)) in addition to the following options:

       -Ccollate_sequence
              Specifies the collating weight sequence to be used in sorting the data files. When this option is specified, the asort command does not use the collating table from the locale database. Instead, the command uses a set of special system and user collating tables to determine the collating weights of characters, including user-defined characters (UDCs).

              The collate_sequence argument can be in long form (for example, "Pinyin Radical Stroke") or short form (for example, prs). The codeset of the locale determines which collation weight names can be specified for collate_sequence. The following list specifies the long and short collation weight names that are valid for supported codesets.

              For DEC Hanzi:
                     Pinyin (or p)
                     Radical (or r)
                     Stroke (or s)

              For DEC Hanyu, Taiwanese EUC, and BIG-5:
                     Phonetic (or p)
                     Radical (or r)
                     Stroke (or s)

       -v     Uses a breadth-first sorting mechanism instead of the default depth-first mechanism to sort the input data. To have any effect, the -v option must be used together with the -C option.

DESCRIPTION
       The asort command sorts lines in its input files and writes the result to standard output. The asort command is similar to the sort command. See the sort(1) reference page for information about features the two commands have in common.

       The asort command provides additional features for processing multiple collating weight sequences used with Asian languages, such as Chinese. For example, pinyin (p), stroke (s), and radical (r) are three dimensions along which characters can be ordered in Simplified Chinese. The -C option allows users to specify the priority level that these dimensions have during sorting. For example, -C srp specifies that characters should be sorted first by stroke, then by radical, then by pinyin. The specified sequence is applied to user-defined characters (UDCs) as well as to standard characters.

       When the -C option is specified, the default behavior of the asort command is to use a depth-first sorting mechanism to sort the input files. With the depth-first mechanism, pairs of multibyte characters in a sort field are compared by exhausting all the specified collating weights and/or internal codes one at a time until the collating order is resolved. Only when two characters are identical is the next pair of characters compared. The depth-first sorting mechanism is also called character sorting.

       However, the asort command provides the -v option to use the Asian VMS-like breadth-first sorting mechanism. With the breadth-first mechanism, pairs of multibyte characters in a sort field are compared using the first collating weight for all the characters in the sort field first. Only when two sets of data in a sort field are computed to have the same collating order are succeeding collating weights used for resolving the collating order. The breadth-first sorting mechanism is sometimes called string sorting.

NOTES
       Currently, the asort command is supported for use only with Chinese codesets.

EXIT STATUS
       The asort command returns the following exit values:

       0      All input files were output successfully, or -c was specified and the input file was correctly sorted.

       1      If -c was specified, the file was not ordered as specified, or if the -c and -u options were both specified, two input lines were found with equal keys.

       >1     An error occurred.

EXAMPLES
       Unless stated otherwise, the following examples assume the locale setting is zh_TW.dechanyu:

       To perform character sorting first by stroke and then by radical, enter:

              asort -C"Stroke Radical" names

       This command displays the lines in names sorted in ascending order according to the number of strokes in characters. If the number of strokes happens to be the same for two characters, the radicals of the characters determine how the characters are ordered. An alternative short form of the same command is as follows:

              asort -Csr names

       To perform string sorting first by stroke and then by radical in a way similar to the sort command available on an Asian VMS system, enter:

              asort -v -C"Stroke Radical" names

SEE ALSO
       Commands: sort(1)

       Functions: setlocale(3)

       Files: locale(4)

       Others: Chinese(5), i18n_intro(5)

asort(1)
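The difference between the depth-first (character) and breadth-first (string) mechanisms described in the asort page can be illustrated without asort itself. The sketch below is a toy model, not asort's implementation: the letters p, q, r stand in for multibyte characters, their stroke and radical weights are invented, and it assumes single-digit weights and words of equal length. It builds a sort key per word with awk and lets sort compare the keys; weighted_sort is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Toy model of depth-first vs breadth-first collation (not asort itself).
# Invented weights, character -> stroke radical:
#   p = 1 2,   q = 1 1,   r = 2 1
# depth key:   stroke and radical of char 1, then of char 2, and so on
# breadth key: strokes of all chars, then radicals of all chars
# Assumes single-digit weights and words of equal length.
weighted_sort() {   # usage: weighted_sort depth|breadth   (words on stdin)
  awk -v mode="$1" '
    BEGIN { s["p"]=1; r["p"]=2; s["q"]=1; r["q"]=1; s["r"]=2; r["r"]=1 }
    {
      depth = ""; strokes = ""; radicals = ""
      for (i = 1; i <= length($0); i++) {
        c = substr($0, i, 1)
        depth    = depth s[c] r[c]
        strokes  = strokes s[c]
        radicals = radicals r[c]
      }
      key = (mode == "depth") ? depth : (strokes radicals)
      print key "\t" $0
    }' | LC_ALL=C sort -k1,1 | cut -f2
}

printf '%s\n' pq qr | weighted_sort depth     # first characters decide: qr, pq
printf '%s\n' pq qr | weighted_sort breadth   # whole-word strokes decide: pq, qr
```

With these weights the two modes disagree: depth-first resolves the tie between p and q on radical before looking at the second character, while breadth-first compares the stroke weights of the whole word first.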