Are any of you guys aware of any problems when trying to sort compressed fields? When I uncompress the file I am trying to sort, I have no problem sorting, but when I try to sort compressed fields it doesn't work properly. I need to be able to sort these compressed fields. Any explanation why? (1 Reply)
Hi Friends,
I have a large data file and need to sort its records based on character positions. For example:
ccc112IVEAGH VETERINARY SERVICES
cca110SHOOTER PROPERTY SERVICES
bbb111JUNIOR GOLF ACADEMY LIMITED
aaa110AULD PROPERTIES T/A R&J AULD
ccb111LISBURN FUELS
aac112P & MRS C... (1 Reply)
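The post is cut off before it says which positions matter, so the following is only a sketch under an assumption: the three digits in columns 4-6 are the primary key (numeric) and the three letters in columns 1-3 break ties. The function name sort_by_positions is made up for illustration. Using a tab as the field separator is a trick that makes the whole line one field (it assumes the data contains no tabs), so the character offsets count from the start of the line.

```shell
# Hypothetical helper: sort fixed-width records by character position.
# Assumption (not stated in the post): columns 4-6 are a number and
# columns 1-3 are a letter code; adjust the offsets to the real layout.
sort_by_positions() {
    tab=$(printf '\t')
    # With a tab separator and no tabs in the data, field 1 is the whole line,
    # so 1.4,1.6 means characters 4 through 6 of the record.
    LC_ALL=C sort -t "$tab" -k1.4,1.6n -k1.1,1.3 "$@"
}
```

Called as sort_by_positions datafile, or at the end of a pipeline. LC_ALL=C is there so the letter comparison is plain byte order and does not depend on the locale.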
Hi all,
I have a file that looks like this...
##########
1zz2_15-43 1ouy_0-13-35 1.12619901947
2gfs_41-7 1yqj_3 0.793602121208
1bl7_11-3 1wbo_1-3-4 0.791065168287
1ywr_16-3 2ghl_22 0.956896171134
2exc_11-35 1pmq_13-15-87 0.597677672501
2bal_25-7 1ouk_17-19-21-228-58 0.668388304836... (6 Replies)
Hi All,
I have run into a problem while sorting a file based on multiple columns. I need to sort like this:
(field2, ascending), (field3, ascending), (field8, descending), (field7, ascending), (field13, ascending).
So far I was sorting only in ascending order but here I need to use one... (1 Reply)
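One way this kind of mixed-order sort is usually written is with one -k option per field and an r modifier on the key that should run backwards. This is a sketch, not the poster's script: the delimiter is not given in the post, so the made-up function sort_multi takes it as its first argument, and the keys are compared as text.

```shell
# Hypothetical helper: sort on fields 2, 3, 8 (descending), 7, 13.
# Assumption: the delimiter is a single character passed as $1.
sort_multi() {
    delim=$1
    shift
    # Each -kN,N restricts the key to exactly field N;
    # the trailing r reverses only that key.
    LC_ALL=C sort -t "$delim" -k2,2 -k3,3 -k8,8r -k7,7 -k13,13 "$@"
}
```

Example: sort_multi ',' input.csv. If field 8 holds numbers rather than text, -k8,8nr would be the variant to try.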
I have a bash script which takes a log file with each record separated by a #. The records have multiple fields but field $1 is always the date and time. When the script is run it prints the records just fine from oldest to newest. I need to have the records printed newest first.
Here is the... (7 Replies)
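The script itself is cut off, so this is only a sketch of one possible approach. It assumes the records are already stored oldest to newest in the file (as the post suggests), so reversing their order is enough and the date in $1 never needs to be parsed. The function name reverse_records is made up.

```shell
# Hypothetical helper: print #-separated records in reverse order.
# Assumption: the file is already in chronological order.
reverse_records() {
    awk 'BEGIN { RS = "#" }
         {
             gsub(/^[ \t\n]+|[ \t\n]+$/, "")   # trim whitespace around the record
             if ($0 != "") rec[++n] = $0       # skip empty records
         }
         END {
             # Walk the stored records backwards, putting # between them again.
             for (i = n; i >= 1; i--)
                 printf "%s%s", rec[i], (i > 1 ? "#" : "\n")
         }' "$@"
}
```

Usage would be reverse_records logfile. It holds the whole file in memory, which is fine for ordinary logs but worth keeping in mind for very large ones.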
Hi,
I have the following file. I need to sort it based on a column and write to different output files based on this column.
request_guid iso_country_cd address_data response_time
32895901-d17f-414c-ac93-3e7e0f5ec240 AND BaseName:CATALUNYA; HouseNumber:1; ISOCountryCode:AND;... (1 Reply)
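The post does not say which column or which delimiter, so the sketch below assumes a tab-delimited file with one header line and splits on the second column (iso_country_cd). The function name split_by_country and the .txt output names are made up for illustration.

```shell
# Hypothetical helper: sort on column 2 and write one file per value.
# Assumptions: tab-delimited input, first line is a header,
# column 2 (iso_country_cd) is the split key.
split_by_country() {
    infile=$1
    outdir=$2
    tab=$(printf '\t')
    mkdir -p "$outdir"
    # Drop the header, sort the rows on column 2 only, then hand them to awk.
    tail -n +2 "$infile" |
        LC_ALL=C sort -t "$tab" -k2,2 |
        awk -F '\t' -v dir="$outdir" '{
            out = dir "/" $2 ".txt"
            # Input is sorted, so each file is finished once the key changes;
            # closing it keeps the number of open files at one.
            if (out != prev) { if (prev != "") close(prev); prev = out }
            print > out
        }'
}
```

For example, split_by_country requests.tsv by_country would leave files such as by_country/AND.txt. Sorting first is what makes it safe to close each output file as soon as the key changes.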
Good afternoon!
I am a perl newbie. I hope you will be patient with me.
I have a script that needs to be written in perl. I can't do it in awk or shell scripting.
Here is the script:
#!/usr/bin/perl
use POSIX qw(strftime);
use FileHandle;
use Getopt::Long;
use IO::Handle;... (0 Replies)
Dear folks.
I have a list of IP addresses from different countries and would like to check, from a script, which country each address belongs to. Please suggest a command. I have more than 200 IP addresses, so I need to do it via a script. Can the script show me the country code and, if possible, the city?
... (0 Replies)
Hi all, I have a tab separated file, and one of the fields is sub-delimited by colon. The problem is there can be zero to 4 colons within this field. When I try to change colons to tabs the result is a file with a differing number of fields.
I want to go from:
a:b:c:d:e
a:b:c
a:b:c:d:e
a... (4 Replies)
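The desired output is cut off, so this is a guess at the goal: expand the colon field into a fixed five columns, padding short values with empty columns so every row ends up with the same number of fields. The function name pad_colon_field is made up, and which column holds the colon-delimited value is passed as an argument because the post does not say.

```shell
# Hypothetical helper: turn a:b:c in column $1 into five tab-separated
# columns, padding with empty strings. Assumes tab-delimited input and
# at most 5 sub-fields (zero to 4 colons, as described in the post).
pad_colon_field() {
    col=$1
    shift
    awk -F '\t' -v OFS='\t' -v col="$col" -v want=5 '{
        n = split($col, parts, ":")
        expanded = ""
        for (i = 1; i <= want; i++)
            expanded = expanded (i > 1 ? OFS : "") (i <= n ? parts[i] : "")
        $col = expanded    # assigning a field rebuilds the line with OFS
        print
    }' "$@"
}
```

Example: pad_colon_field 2 input.tsv. If empty columns are awkward downstream, the "" in the loop could be swapped for a filler such as "-".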
Hi all,
I have a file with a list of RPMs that have different dates. I am trying to just grab the latest RPM and install date, and discard the rest. The file has thousands of entries, all with different names and dates.
I have tried sort -k on the file and I am not grabbing the info,
... (4 Replies)
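The file layout is not shown in the post, so the following rests on assumptions: each line is a package name, a single space, and an install date in YYYY-MM-DD form (which sorts correctly as text), and "latest" means the newest entry per package name. The function name latest_per_package is made up.

```shell
# Hypothetical helper: keep only the newest line for each package name.
# Assumptions: "name YYYY-MM-DD" per line, separated by exactly one space.
latest_per_package() {
    # Group by name, newest date first within each name,
    # then let awk keep the first line it sees for every name.
    LC_ALL=C sort -t ' ' -k1,1 -k2,2r "$@" | awk '!seen[$1]++'
}
```

Usage: latest_per_package rpmlist.txt. A date format that does not sort as text (for example "Mon 05 Jan 2012") would need converting first, which this sketch does not attempt.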
Discussion started by: gartie
mmseg
MMSEG(1) User Contributed Perl Documentation MMSEG(1)
NAME
mmseg - maximum matching segment Chinese text.
SYNOPSIS
mmseg -d dict_file [option]... [corpus_file]...
DESCRIPTION
mmseg is a tool for segmenting Chinese text into words using the maximum matching algorithm. mmseg segments corpus_file, or standard input if
no filename is specified, and writes the segmented result to standard output.
OPTIONS
-d dict_file
Use dict_file as lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f, --format (text|bin)
Output format, either 'text' or 'bin'. Default is 'bin'. In text mode, the word text is output; in binary mode, the word ids are
written to stdout as binary short integers.
-s, --stok STOK_ID
Sentence token id. Default 10. It will be written to output in binary mode after every sentence.
-i, --show-id
Show id info. In text output mode, attach the id after each known word. In binary mode, print the id(s) as text.
-a, --ambiguious-id AMBI-ID
Ambiguous means that ABC could be segmented as A BC or AB C. If specified (AMBI-ID != 0), the sequence ABC will not be segmented: in binary mode, the AMBI-ID
is written out; in text mode, "<ambi>ABC</ambi>" is output. Default is 0.
NOTES
In binary mode, consecutive ids of 0 are merged into a single 0. In text mode, no spaces are inserted between unknown words.
AUTHOR
Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
SEE ALSO
slmseg(1), ids2ngram(1).
perl v5.14.2 2012-06-09 MMSEG(1)