sort file with non ascii chars and cjk with perl
Post 302304482 by jim mcnamara on Monday 6th of April 2009 02:51:12 PM
Regular sort handles a sort request by comparing strings according to what is called a collation sequence. The collation sequence is defined by your locale settings.

What output does this command give? Please post it:
Code:
locale

Look at the variable named LC_COLLATE. It determines the order in which sort compares characters.
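To see the effect of LC_COLLATE from a program rather than from sort, here is a minimal C sketch. It assumes the standard setlocale(), strcoll(), and qsort() calls; the helper names compare_by_locale and sort_lines_in_locale are made up for illustration and are not part of any library. It is not how sort itself is implemented, only the same idea on a small scale.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* qsort comparator: orders two strings by the LC_COLLATE rules in effect. */
static int compare_by_locale(const void *a, const void *b)
{
    const char *const *sa = a;
    const char *const *sb = b;
    return strcoll(*sa, *sb);
}

/* Hypothetical helper: select a collation locale, then sort the array
 * in place. Returns 0 on success, -1 if the locale is not available. */
int sort_lines_in_locale(const char **lines, size_t count,
                         const char *locale_name)
{
    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;
    qsort(lines, count, sizeof lines[0], compare_by_locale);
    return 0;
}
```

Passing "C" as the locale name gives plain byte-value ordering, so uppercase ASCII letters sort before lowercase ones. Passing "" picks up whatever LC_COLLATE is set to in the environment, which is the value the locale command reports.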
 

wcsxfrm(3)                    Library Functions Manual                    wcsxfrm(3)

NAME
       wcsxfrm - Transforms wide-character strings for collation in the current locale

LIBRARY
       Standard C Library (libc)

SYNOPSIS
       #include <wchar.h>

       size_t wcsxfrm(wchar_t *ws1, const wchar_t *ws2, size_t n);

STANDARDS
       Interfaces documented on this reference page conform to industry standards as follows:

       wcsxfrm(): XSH5.0

       Refer to the standards(5) reference page for more information about industry standards and associated tags.

PARAMETERS
       ws1    Contains a pointer to the destination wide-character string.

       ws2    Contains a pointer to the source wide-character string.

       n      Specifies the maximum number of wide-character codes to place into the array specified by ws1.

DESCRIPTION
       The wcsxfrm() function is used to transform the wide-character string specified by the ws2 parameter into a string of wide-character codes based on the collation values of the wide characters in the current setting of the LC_COLLATE locale category. No more than n wide characters (including the terminating null wide character) are copied into the array specified by the ws1 parameter. When two transformed wide-character strings are obtained and the transformed strings are compared using wcscmp(), the result is the same as that obtained by a direct call to wcscoll() on the two original wide-character strings.

       The application can call wcsxfrm() with ws1 as a null pointer to determine the number of wide characters to allocate for the ws1 parameter. For example, the following expression calculates the size of ws1:

              1 + wcsxfrm(NULL, ws2, 0)

       As shown in this example, the n parameter must be 0 (zero) when ws1 is a null pointer.

       When operating on overlapping strings, the behavior of the wcsxfrm() function is unreliable.

NOTES
       If an application does multiple comparisons based on the current locale's collation values and uses the same set of text strings, the wcsxfrm() transformation function in conjunction with the wcscmp() function may be more efficient than the wcscoll() collation function. This is because the string is transformed based on the locale tables only once. However, the transformation function must convert all characters in the string for each level of a multilevel collation. In comparison, the collation function stops comparing characters at the first inequality. These tradeoffs make the most efficient method for a specific application dependent on the number of repeated comparisons of strings within the set, the number of collation levels for the current locale, and the values of the strings within the set.

RETURN VALUES
       If the ws1 parameter is a wide-character null pointer, the wcsxfrm() function returns the number of wide-character elements (not including the terminating null wide character) required to store the transformed wide-character string.

       If the count specified by the n parameter is sufficient to hold the transformed string in the ws1 parameter (including the terminating null wide character), the return value is set to the actual number of wide-character elements placed in the ws1 parameter, not including the terminating null wide character.

       If the return value is equal to or greater than the value specified by the n parameter, the contents of the array pointed to by the ws1 parameter are indeterminate.

       On error, the wcsxfrm() function returns (size_t)-1 and sets errno to indicate the error.

ERRORS
       If any of the following conditions occur, the wcsxfrm() function sets errno to the corresponding value:

       The ws2 parameter contains wide-character codes outside the domain of the collating sequence defined by the current locale.

       [Tru64 UNIX] There was insufficient memory available to allocate temporary storage for this operation.

RELATED INFORMATION
       Functions: setlocale(3), string(3), wcscmp(3), wcscoll(3)

       Standards: standards(5)
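
The two-call sizing idiom and the wcsxfrm() plus wcscmp() pairing described in this reference page can be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names make_collation_key and compare_keys are hypothetical, and the check for (size_t)-1 follows the error return documented on this page, which other C libraries may not use.

```c
#include <locale.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: returns a newly allocated collation key for src,
 * or NULL on failure. The caller frees the result. */
wchar_t *make_collation_key(const wchar_t *src)
{
    /* First call: a null destination with n == 0 only reports the
     * length of the transformed string, excluding the terminator. */
    size_t needed = wcsxfrm(NULL, src, 0);
    if (needed == (size_t)-1)   /* error return as documented on this page */
        return NULL;

    wchar_t *key = malloc((needed + 1) * sizeof *key);
    if (key == NULL)
        return NULL;

    /* Second call: fill the buffer, terminating null included. A result
     * larger than needed means the key did not fit or the call failed. */
    if (wcsxfrm(key, src, needed + 1) > needed) {
        free(key);
        return NULL;
    }
    return key;
}

/* Compare two precomputed keys. The sign of the result matches what
 * wcscoll() would return for the two original strings. */
int compare_keys(const wchar_t *key_a, const wchar_t *key_b)
{
    return wcscmp(key_a, key_b);
}
```

A caller would normally run setlocale(LC_COLLATE, "") once, build one key per string, and then sort on the keys. As the NOTES section points out, whether this beats calling wcscoll() directly depends on how many times each string is compared.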