02-09-2009
Please post an example of input and expected output. Please make the number base and character set clear, or state that it is raw data. We normally assume ASCII characters, but your sample characters are mostly outside the normal printable range.
Your example looks like hexadecimal rather than binary, which may be why nobody has responded.
Please also post your version of Unix/Linux. Most Unix variants ship with core tools that handle this kind of conversion.
If you are trying to fix a non-text data file, this is not a job for shell scripting.
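To show why the number base matters, here is a minimal Python sketch that prints the same bytes in hexadecimal, binary, and octal side by side. The function name `describe_bytes` and the sample input are made up for illustration; this is not a tool that ships with any Unix.

```python
def describe_bytes(data: bytes) -> str:
    """Render each byte in several number bases so the intended base is unambiguous."""
    lines = []
    for offset, byte in enumerate(data):
        # Show the ASCII character only when it is in the printable range.
        char = chr(byte) if 32 <= byte < 127 else "."
        lines.append(
            f"{offset:04d}  hex={byte:02x}  bin={byte:08b}  oct={byte:03o}  char={char}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical sample: one printable byte and one byte outside the ASCII range.
    print(describe_bytes(b"A\xff"))
```

Posting input in a form like this, alongside the expected output, removes any doubt about whether a value such as `41` is meant as hex, decimal, or text.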
SLMSEG(1) User Contributed Perl Documentation SLMSEG(1)
NAME
slmseg - segment Chinese text using maximum matching.
SYNOPSIS
slmseg -d dict_file [option]... [corpus_file]...
DESCRIPTION
slmseg is a tool for segmenting Chinese text into words using the maximum matching algorithm. slmseg segments corpus_file, or standard input if
no filename is specified, and writes the segmented result to standard output.
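As an illustration of the general idea, here is a minimal Python sketch of greedy forward maximum matching. It is not slmseg's actual implementation; the function name `max_match` and the tiny lexicon are assumptions made up for the example.

```python
def max_match(text, lexicon, max_len=None):
    """Greedy forward maximum matching: at each position take the longest lexicon word."""
    if max_len is None:
        # Longest entry in the lexicon bounds how far ahead we need to look.
        max_len = max((len(word) for word in lexicon), default=1)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, shrinking until a lexicon word matches.
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if candidate in lexicon:
                break
        else:
            # No lexicon word starts here: fall back to a single unknown character.
            candidate = text[i]
        tokens.append(candidate)
        i += len(candidate)
    return tokens


if __name__ == "__main__":
    # Hypothetical lexicon, for illustration only.
    lexicon = {"北京", "大学", "北京大学", "生"}
    print(max_match("北京大学生", lexicon))
```

Because the match is greedy, the longest entry wins at each position, so "北京大学" is preferred over "北京" followed by "大学".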
OPTIONS
-d dict_file
Use dict_file as lexicon. A default lexicon can be found at /usr/share/sunpinyin-slm/dict.utf8.
-f,--format (text|bin)
Output format, either 'text' or 'bin'; the default is 'bin'. In text mode, the word text is output, while in binary mode, the word ids
are written to stdout as binary short integers.
-s, --stok STOK_ID
Sentence token id. Default 10. It will be written to output in binary mode after every sentence.
-i, --show-id
Show id info. In text mode, attach the id after each known word. In binary mode, print the id(s) as text.
-m, --model language-model-file
Specify the language model file. This file is always generated by slmthread.
NOTES
Under binary mode, consecutive ids of 0 are merged into a single 0. Under text mode, no spaces are inserted between unknown words.
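The merging rule for binary mode can be sketched in Python as follows. This only illustrates the behaviour described in the note; it is not taken from slmseg's source, and the function name `merge_zero_runs` and the sample ids are hypothetical.

```python
def merge_zero_runs(ids):
    """Collapse every run of consecutive 0 ids into a single 0, leaving other ids untouched."""
    merged = []
    for word_id in ids:
        # Skip a 0 when the previously kept id is already 0.
        if word_id == 0 and merged and merged[-1] == 0:
            continue
        merged.append(word_id)
    return merged


if __name__ == "__main__":
    # Hypothetical id stream, for illustration only.
    print(merge_zero_runs([7, 0, 0, 0, 12, 0, 5]))
```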
AUTHOR
Originally written by Phill.Zhang <phill.zhang@sun.com>. Currently maintained by Kov.Chai <tchaikov@gmail.com>.
SEE ALSO
mmseg(1), ids2ngram(1).
perl v5.14.2 2012-06-09 SLMSEG(1)