How do i sort lines lexigraphical in bash? Post 302987979 by bakunin on Monday 19th of December 2016 09:12:53 AM
Quote:
Originally Posted by kidi
The error seem to be localized here where i sort the utt2spk file, which is done like this..

Code:
    for x in test train; do
            for f in text utt2spk; do
                sort data/$x/$f -o data/$x/$f
            done
    done

There are several problems here and they are not necessarily related. Let me address them one by one:

1) Input file as output file
In general you cannot read from a file and write to it at the same time: with shell redirection ("sort file > file") the shell truncates the file before sort ever reads it. The "-o" option of sort is the exception, because POSIX allows the output file to be one of the input files, so your command is not wrong in itself. Writing to an intermediate file and then moving it over the original is still worthwhile, because the original stays untouched in case something goes wrong. Take the following as a sketch and modify the error handling according to your needs:

Code:
for x in test train; do
    for f in text utt2spk; do
        file="data/${x}/${f}"

        # was: sort data/$x/$f -o data/$x/$f
        # sort into a temporary file, replace the original only on success
        if sort -o "${file}.tmp" "${file}" ; then
            mv "${file}.tmp" "${file}"
        else
            echo "something went wrong with ${file}" >&2
            exit 1
        fi
    done
done
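
To make the difference between redirection and the temporary-file approach visible, here is a small self-contained sketch. The scratch directory and the file names redirect.txt and safe.txt are made up for the demonstration and have nothing to do with your data directory:

```shell
# Hypothetical scratch files, created only for this demonstration.
tmpdir=$(mktemp -d)
printf '%s\n' b a c > "${tmpdir}/redirect.txt"
printf '%s\n' b a c > "${tmpdir}/safe.txt"

# Redirection: the shell empties redirect.txt before sort gets to read it.
sort "${tmpdir}/redirect.txt" > "${tmpdir}/redirect.txt"
if [ -s "${tmpdir}/redirect.txt" ]; then
    redirect_result="kept"
else
    redirect_result="lost"
fi

# Temporary file plus mv: the original is replaced only after sort succeeded.
if sort -o "${tmpdir}/safe.txt.tmp" "${tmpdir}/safe.txt"; then
    mv "${tmpdir}/safe.txt.tmp" "${tmpdir}/safe.txt"
fi
safe_result=$(paste -sd ' ' "${tmpdir}/safe.txt")

echo "redirection: content ${redirect_result}"
echo "temp file:   ${safe_result}"
rm -rf "${tmpdir}"
```

The first file ends up empty, the second one contains the sorted lines.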

2) Note the difference between numerical and alphabetical sorting
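In your request you imply that you expect the file to be (partially) sorted numerically. The difference is this: alphabetically "a12bc" comes after "a123bc", because the two strings first differ at the 4th character and "3" comes before "b" in ASCII. Numerically, however, you will want "12" before "123". For that you need to request a numeric sort order with the "-n" switch of sort. Note that "-n" only helps when the sort key starts with the number; if the number is embedded in the text, as in the example, you also have to define the key with "-k" so that it starts at the number. I suggest reading the man page of sort for the details.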
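
A quick way to see both orderings side by side is the sketch below. It uses made-up sample values, not your utt2spk data, and it assumes the number starts at the second character of the line; LC_ALL=C is set only so that the result does not depend on your locale:

```shell
# Purely numeric keys: lexical order versus "-n".
lexical=$(printf '%s\n' 123 12 9 | LC_ALL=C sort | paste -sd ' ' -)
numeric=$(printf '%s\n' 123 12 9 | LC_ALL=C sort -n | paste -sd ' ' -)

# Number embedded after one letter: plain sort compares character by character,
# "-k1.2n" starts the key at character 2 of field 1 and compares it numerically.
embedded_lexical=$(printf '%s\n' a12bc a123bc | LC_ALL=C sort | paste -sd ' ' -)
embedded_numeric=$(printf '%s\n' a12bc a123bc | LC_ALL=C sort -k1.2n | paste -sd ' ' -)

echo "lexical:          ${lexical}"
echo "numeric:          ${numeric}"
echo "embedded lexical: ${embedded_lexical}"
echo "embedded numeric: ${embedded_numeric}"
```

For real data you will have to adapt the key definition to wherever the number sits in your lines.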

3) Internationalisation
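This is - according to the POSIX documentation - already done: when starting, sort uses the internationalisation variables (LANG, LC_*, ...) to determine the collation sequence applying to the sort. This affects special characters (like umlauts in German ["ä", "ö", ...], the Spanish eñe ["ñ"], etc.), but in many locales it also changes how upper and lower case letters and punctuation are ordered relative to each other. It normally won't affect the sorting of numbers vs. letters. If you need plain byte-wise (ASCII) order regardless of the environment, set LC_ALL=C for the sort call.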
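
To check what the collation sequence does on your system you could try something like the following sketch. The four sample letters are arbitrary; the second result depends on whatever locale your environment has set, so I cannot tell you what it will print for you:

```shell
# Byte-wise order: all upper case letters come before all lower case letters.
c_order=$(printf '%s\n' b A a B | LC_ALL=C sort | paste -sd ' ' -)

# Same input with the locale taken from the environment (LANG, LC_*);
# the result varies from system to system.
env_order=$(printf '%s\n' b A a B | sort | paste -sd ' ' -)

echo "LC_ALL=C:       ${c_order}"
echo "current locale: ${env_order}"
```

If the two lines differ, the locale is influencing your sort, and prefixing the sort in your loop with LC_ALL=C will give you the byte-wise order.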

I hope this helps.

bakunin

Last edited by bakunin; 12-19-2016 at 03:33 PM.. Reason: typos
 
