If you mean that the 1st column should act as the key for the output, then I assume that ABC appearing twice in your shown output (ABC,AD,AS,ER,YT,YU,ER,GT and ABC,SE) is a typo. In that case the following may help you, provided the output does not have to keep the same order as Input_file.
Output will be as follows.
Thanks,
R. Singh
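A minimal Python sketch of the idea described above, using the first column as the key and collecting everything else behind it. Input_file is not shown here, so the whitespace-separated layout and the sample rows are assumptions made only for illustration, and `merge_by_first_column` is a hypothetical helper name.

```python
from collections import defaultdict


def merge_by_first_column(lines, sep=","):
    """Collect all values that share the same first column into one output row."""
    groups = defaultdict(list)
    for line in lines:
        fields = line.split()  # assumption: whitespace-separated columns
        if len(fields) < 2:
            continue  # skip blank lines and lines that only hold a key
        groups[fields[0]].extend(fields[1:])
    # dicts keep insertion order, so keys come out in first-seen order
    return [sep.join([key] + values) for key, values in groups.items()]


if __name__ == "__main__":
    sample = ["ABC AD", "ABC AS", "XYZ QW", "ABC SE"]  # made-up rows
    for row in merge_by_first_column(sample):
        print(row)
```

Each key is printed once, so a key such as ABC cannot show up on two output lines.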
I have a case where I need to look into a file.
I need to go through each line of the file and find its length. If the line is longer than 75 characters, I need to split it into multiple lines of at most 75 characters each. If the line is 75 characters or fewer, nothing needs to be done.
So at the... (4 Replies)
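One way to sketch the splitting described in this question, in Python. The rest of the question is cut off, so this only covers the part that is stated: lines longer than 75 characters are cut into pieces of at most 75, and shorter lines pass through unchanged. `split_long_line` is a hypothetical name.

```python
def split_long_line(line, width=75):
    """Return the line cut into pieces of at most `width` characters."""
    if len(line) <= width:
        return [line]  # short lines are left alone
    return [line[i:i + width] for i in range(0, len(line), width)]


if __name__ == "__main__":
    for piece in split_long_line("x" * 160):
        print(len(piece))
```

On the command line, `fold -w 75` performs a similar cut by column width.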
Can anyone give me an idea for replacing multiple blank lines with a single blank line?
Please consider it for a file that has more than 100 characters.
Regards,
Siba (3 Replies)
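A small Python sketch of squeezing runs of blank lines down to one. Treating whitespace-only lines as blank is an assumption, and `squeeze_blank_lines` is a hypothetical name.

```python
def squeeze_blank_lines(lines):
    """Keep only the first blank line of every run of blank lines."""
    result = []
    previous_blank = False
    for line in lines:
        is_blank = line.strip() == ""  # assumption: spaces/tabs only counts as blank
        if is_blank and previous_blank:
            continue  # drop the 2nd, 3rd, ... blank line of a run
        result.append(line)
        previous_blank = is_blank
    return result


if __name__ == "__main__":
    print(squeeze_blank_lines(["a", "", "", "b"]))  # prints ['a', '', 'b']
```

The length of the individual lines does not matter for this approach, since each line is only tested for being empty.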
I have a file that contains the following:
:@:176:@:4:@:name:@:file:@:this is
a summary:@:description can be
long but who knows
can even have <br> tags.:@:how to:@:type:@:18544:@:550:@:400:END:
:@:177:@:9:@:name:@:file:@:summary:@:this
will containg... (18 Replies)
Hi,
I have some files with the following data, and I need to convert the lines between the --- separators into a single line. I have tried the paste command, but my main problem is that the number of lines between the separators is not fixed; it can vary between 1 and 4 lines.
Input
---
2010-02-22... (4 Replies)
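A possible Python sketch for joining the lines between `---` separators when the number of lines per block varies. The sample input is cut off above, so the space used as joiner, the dropping of the separator lines from the output, and the name `join_between_separators` are all assumptions.

```python
def join_between_separators(lines, separator="---", joiner=" "):
    """Join each block of lines found between separator lines into one line."""
    records = []
    current = []
    for line in lines:
        if line.strip() == separator:
            if current:  # close the block collected so far
                records.append(joiner.join(current))
                current = []
        else:
            current.append(line.strip())
    if current:  # last block when the file does not end with a separator
        records.append(joiner.join(current))
    return records
```

Because a block is closed only when the next separator (or the end of input) is reached, it does not matter whether a block has 1 line or 4.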
This is related to one of my previous posts, but now with a slight difference: I need "Updated:" to be on one line, and "Information:" to be on one line as well. Right now they are spread over multiple lines, as seen below. Each can have 2 or more lines that need to be joined into one line.
System name:... (8 Replies)
Hi,
I have a requirement with the following data:
No~Dt~Notes
1~2011/08/1~"aaa
bbb
ccc
ddd
eee
fff
ggg
hhh"
A single column got split across multiple lines.
I require the output as
No~Dt~Notes
1~2011/08/1~"aaa<>bbb<>ccc<>ddd<>eee<>fff<>ggg<>hhh"
mean to say those new lines to be... (1 Reply)
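The sample above can be handled by counting double quotes: as long as the collected text holds an odd number of `"`, the field is still open and the next line belongs to it. A minimal Python sketch under that assumption (quotes inside the Notes field are assumed not to occur, and `merge_quoted_lines` is a hypothetical name):

```python
def merge_quoted_lines(lines, joiner="<>"):
    """Re-join records whose double-quoted field was broken across lines."""
    merged = []
    pending = None  # text of a record whose quote is still open
    for line in lines:
        candidate = line if pending is None else pending + joiner + line
        if candidate.count('"') % 2 == 1:
            pending = candidate  # odd quote count: field not closed yet
        else:
            merged.append(candidate)
            pending = None
    if pending is not None:
        merged.append(pending)  # unterminated quote at end of input
    return merged


if __name__ == "__main__":
    data = ['No~Dt~Notes', '1~2011/08/1~"aaa', 'bbb', 'ccc', 'ddd',
            'eee', 'fff', 'ggg', 'hhh"']
    for row in merge_quoted_lines(data):
        print(row)
```

With the sample data from the question this prints the header line followed by `1~2011/08/1~"aaa<>bbb<>ccc<>ddd<>eee<>fff<>ggg<>hhh"`.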
Hi,
1. Each message starts with a date.
2. There is a blank line between messages.
3. The messages do not all contain the same number of lines.
Any help in merging multiple lines in each message to a single line is much appreciated.
AIX: Korn Shell
Error log file looks like below.
... (5 Replies)
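Since point 2 says the messages are separated by a blank line, the blank line alone can serve as the record boundary. A Python sketch under that assumption follows; the actual error log is not shown, so the sample lines in the demo are made up, and `merge_messages` is a hypothetical name.

```python
def merge_messages(lines, joiner=" "):
    """Turn blank-line-separated, multi-line messages into one line each."""
    messages = []
    current = []
    for line in lines:
        if line.strip():
            current.append(line.strip())
        elif current:  # blank line ends the message collected so far
            messages.append(joiner.join(current))
            current = []
    if current:  # last message when there is no trailing blank line
        messages.append(joiner.join(current))
    return messages


if __name__ == "__main__":
    log = ["2011-01-01 error A", "detail 1", "", "2011-01-02 error B"]  # made-up log
    for message in merge_messages(log):
        print(message)
```

The date at the start of each message is not needed for the split here; it would only matter if blank lines could also appear inside a message.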
Hi All,
I have a spool file as shown below. I want everything up to each semicolon to be on a single line. In this case there should be 2 lines in the vi editor.
I am not used to using sed, so could you please help me out?
exec spk_dba.sp_runsql('ALP','CREATE DATABASE LINK "TEST" CONNECT TO... (2 Replies)
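A Python sketch of joining lines until one ends with a semicolon. The spool file is cut off above, so two things are assumed: a statement ends only where a semicolon is the last character of a line, and the pieces are joined with a single space (pass `joiner=""` if the lines were simply wrapped mid-word). `join_until_semicolon` is a hypothetical name.

```python
def join_until_semicolon(lines, joiner=" "):
    """Join consecutive lines into one statement, ending at a trailing ';'."""
    statements = []
    current = []
    for line in lines:
        text = line.strip()
        if not text:
            continue  # ignore empty lines in the spool output
        current.append(text)
        if text.endswith(";"):
            statements.append(joiner.join(current))
            current = []
    if current:  # leftover text without a closing semicolon
        statements.append(joiner.join(current))
    return statements
```

For a file holding two statements this yields two lines, matching the "2 lines in the vi editor" expectation.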
I have code as below
# create temporary table `temp4277`(key(waybill_no))
select waybill_no,concat_ws('',card_type,card_series_no) cardinfo
from rfid_temp_ticket where waybill_no='4277'
group by... (4 Replies)
Discussion started by: kaushik02018
4 Replies
LEARN ABOUT DEBIAN
mmseg
MMSEG(1) User Contributed Perl Documentation MMSEG(1)
NAME
mmseg - maximum matching segment Chinese text.
SYNOPSIS
mmseg -d dict_file [option]... [corpus_file]...
DESCRIPTION
mmseg is a tool for segmenting Chinese text into words using the maximum matching algorithm. mmseg segments corpus_file, or standard input if
no filename is specified, and writes the segmented result to standard output.
OPTIONS
-d dict_file
Use dict_file as lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f,--format (text|bin)
Output format, can be 'text' or 'bin'. Default is 'bin'. Normally, in text mode, the word text is output, while in binary mode, binary short
integers of the word IDs are written to stdout.
-s, --stok STOK_ID
Sentence token id. Default 10. It will be written to output in binary mode after every sentence.
-i, --show-id
Show ID info. In text output mode, attach the ID after known words. In binary mode, print the ID(s) in text.
-a, --ambiguious-id AMBI-ID
Ambiguous means ABC => A BC or AB C. If specified (AMBI-ID != 0), the sequence ABC will not be segmented: in binary mode, the AMBI-ID
is written out; in text mode, "<ambi>ABC</ambi>" will be output. Default is 0.
NOTES
In binary mode, consecutive IDs of 0 are merged into one 0. In text mode, no spaces are inserted between unknown words.
AUTHOR
Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
SEE ALSO
slmseg(1), ids2ngram(1).
perl v5.14.2 2012-06-09 MMSEG(1)
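The maximum matching idea named in the DESCRIPTION can be illustrated with a greedy forward variant: at each position, take the longest lexicon entry that matches, otherwise emit a single character. This is only a sketch of the general technique, not mmseg's actual code. Its lexicon handling and ambiguity detection are not reproduced, and the tiny lexicon below is made up.

```python
def forward_maximum_match(text, lexicon):
    """Greedy forward maximum matching: the longest known word wins."""
    max_word_len = max((len(word) for word in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        match = text[i]  # fall back to a single character (unknown word)
        longest = min(max_word_len, len(text) - i)
        for size in range(longest, 1, -1):  # try the longest candidates first
            candidate = text[i:i + size]
            if candidate in lexicon:
                match = candidate
                break
        words.append(match)
        i += len(match)
    return words


if __name__ == "__main__":
    lexicon = {"研究", "研究生", "生命", "起源"}  # hypothetical lexicon
    print(forward_maximum_match("研究生命起源", lexicon))
```

With this lexicon the greedy choice yields 研究生 / 命 / 起源 rather than 研究 / 生命 / 起源, which is the kind of A BC versus AB C ambiguity that the -a option refers to.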