Sorting on length with identification of number of characters
Post 302758427 by Scrutinizer on Saturday 19th of January 2013 08:17:07 AM
Try:
Code:
gawk '{print length, $1}' infile | sort -n | gawk '$1!=p{print $1}{print $2; p=$1}'

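If the input contains multi-byte characters (UTF-8 text, for example), you would need to use a version of awk that counts characters rather than bytes. gawk does this when it is run in a UTF-8 locale.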
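To see what the pipeline does, here is a small self-contained sketch. The sample words and the `sort_by_length` function name are made up for illustration (the post only refers to `infile`), and plain `awk` stands in for `gawk`, which should behave the same for ASCII-only input like this.

```shell
#!/bin/sh
# Hypothetical sample data; "infile" in the post is the poster's own file.
tmp=$(mktemp)
printf '%s\n' pear fig apple kiwi plum banana > "$tmp"

# Same three stages as the one-liner in the post:
#   1. prefix every line with its length
#   2. sort numerically on that length
#   3. print the length once as a header whenever it changes, then the word
sort_by_length() {
    awk '{ print length, $1 }' "$1" |
        sort -n |
        awk '$1 != p { print $1 } { print $2; p = $1 }'
}

result=$(sort_by_length "$tmp")
printf '%s\n' "$result"
rm -f "$tmp"
```

Each length should show up once on its own line, followed by the words of that length (here: 3, fig, 4, kiwi, pear, plum, 5, apple, 6, banana, one per line). Note that `length` with no argument measures the whole line while only `$1` is printed, so this assumes one word per line.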
All times are GMT -4.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.