10-29-2009
Try transferring the file in ASCII mode instead of binary.
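A rough sketch of what an ASCII-mode transfer looks like, assuming the file is fetched over FTP (the original question is not shown here, so that is a guess). It uses Python's standard `ftplib`; `download_ascii`, the host, and the file names are made up for illustration. `retrlines` asks the server for ASCII mode and passes each line to the callback with the trailing CRLF already stripped, so the local copy ends up with plain LF line endings.

```python
from ftplib import FTP


def download_ascii(ftp, remote_name, local_path):
    """Fetch remote_name in ASCII (text) mode and save it with LF line endings."""
    # newline="\n" keeps Python from translating the line endings we write.
    with open(local_path, "w", newline="\n") as out:
        # retrlines() switches the connection to ASCII mode (TYPE A) and calls
        # the callback once per line, without the trailing CRLF.
        ftp.retrlines("RETR " + remote_name, lambda line: out.write(line + "\n"))


# Hypothetical usage (host, login and file names are placeholders):
#   with FTP("ftp.example.com") as ftp:
#       ftp.login("user", "password")
#       download_ascii(ftp, "report.txt", "report.txt")
```

With a command-line ftp client, the equivalent is usually typing `ascii` before `get`.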
MMSEG(1) User Contributed Perl Documentation MMSEG(1)
NAME
mmseg - segment Chinese text using maximum matching.
SYNOPSIS
mmseg -d dict_file [option]... [corpus_file]...
DESCRIPTION
mmseg is a tool for segmenting Chinese text into words using the maximum matching algorithm. mmseg segments corpus_file, or standard input if
no filename is specified, and writes the segmented result to standard output.
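The idea behind maximum matching can be sketched in a few lines. This is only an illustration of the forward variant (scan left to right, always take the longest lexicon word that fits), not mmseg's actual implementation; `max_match` and the tiny lexicon are assumptions made for the example.

```python
def max_match(text, lexicon, max_len=None):
    """Forward maximum matching: split text into the longest lexicon words."""
    if max_len is None:
        # The longest lexicon entry bounds how far ahead we need to look.
        max_len = max((len(word) for word in lexicon), default=1)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted, so unknown text still advances.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


# Toy lexicon, assumed for illustration only.
lexicon = {"研究", "研究生", "生命", "起源"}
print(max_match("研究生命起源", lexicon))
# prints ['研究生', '命', '起源']
```

The greedy choice of "研究生" over "研究" + "生命" is the kind of ambiguity (ABC => A BC or AB C) that a purely greedy matcher cannot resolve by itself.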
OPTIONS
-d dict_file
Use dict_file as the lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f,--format (text|bin)
Output format, either 'text' or 'bin'. Default is 'bin'. Normally, in text mode, the word text is output, while in binary mode, the
word ids are written to stdout as binary short integers.
-s, --stok STOK_ID
Sentence token id. Default 10. It will be written to output in binary mode after every sentence.
-i, --show-id
Show id info. In text mode, attach the id after known words. In binary mode, print the id(s) as text.
-a, --ambiguious-id AMBI-ID
Ambiguous means that ABC can be segmented as either A BC or AB C. If specified (AMBI-ID != 0), the sequence ABC is not segmented: in binary
mode, the AMBI-ID is written out; in text mode, "<ambi>ABC</ambi>" is output. Default is 0.
NOTES
In binary mode, consecutive ids of 0 are merged into a single 0. In text mode, no spaces are inserted between unknown words.
AUTHOR
Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
SEE ALSO
slmseg(1), ids2ngram(1).
perl v5.14.2 2012-06-09 MMSEG(1)