Hi
I have a file with data arranged into columns. The first column is the chromosome name.
When I use grep to subset only rows with chr1, I get chr1 but also chr10, chr11,..
How do I get only rows with chr1?
grep chr1 filein > fileout
head fileout
chr1 59757841
chr11 108258691 ... (2 Replies)
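A minimal sketch of one way to approach this: plain grep chr1 matches the substring anywhere on the line, so chr10 and chr11 qualify too. Comparing the first field exactly with awk, or anchoring the pattern and requiring whitespace after it, avoids that. The sample rows written to filein below are made up for illustration, and fileout_grep is a hypothetical second output name.

```shell
# Made-up sample rows standing in for the real file (assumption).
printf '%s\n' 'chr1 59757841' 'chr10 1234567' 'chr11 108258691' 'chr1 60000000' > filein

# Option 1: keep rows whose first whitespace-separated field is exactly "chr1".
awk '$1 == "chr1"' filein > fileout

# Option 2: anchor at the start of the line and require whitespace after "chr1".
grep '^chr1[[:space:]]' filein > fileout_grep

cat fileout
```

GNU and BSD grep also accept -w (whole-word match), so grep -w chr1 filein is a shorter alternative, though it matches chr1 as a word in any column, not only the first.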
Hi,
I'm struggling with a regex that would match a 'b' that follows an 'a' and is at the end of a string of non-whitespace characters. For example:
Line 1: aba abab b abb aab bab baa
I can find the right strings but I'm lacking knowledge of how to "discard" the bits that precede bs.... (2 Replies)
Hi all,
any idea how to match the following:
char*<no or any string or space> buf and
char *<no or any string or space> buf
i need to capture the buf characters too.
currently i need two checks to cover this:
#search char* <any string> buf or char *<any string> buf
@noarray =... (2 Replies)
Hi all,
I am looking for a regex syntax to match repeated occurrences. For example,
']+]+' matches for string '65A SOME MORE AND 78B'
Now, this gets messy if I need to extract all such repeated appearance. I don't want to write ] four or five times for matching repeated appearance.
Thanks in... (2 Replies)
I am trying to match a similar line using grep with regular expression
the line is
/remote/mac/pbbbb/abc/def/hij/hop/include/abc/tif/element/test/testfiles/Office.cpp:57: const OfficeType& getType().get() const;
I just need to extract the bold characters using grep with regular expression.... (5 Replies)
Hi everyone,
Suppose we have two scenarios:
echo ABCD | grep \{4\}
DATE
echo SYSDATE | grep \{4\}
SYSDATE
I want to match only strings that are exactly four characters long. Please help. (5 Replies)
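A possible sketch, under the assumption that the intent is "exactly four uppercase letters": the pattern in the post appears to have lost its bracket expression, so [A-Z] here is a guess, and is_four_upper is a hypothetical helper name. Without anchors, \{4\} only requires four matching characters somewhere in the string, which is why SYSDATE also matches. Anchoring with ^ and $ forces the whole line to be four characters.

```shell
# is_four_upper: succeed only if the whole argument is exactly four uppercase letters.
# The [A-Z] bracket expression is an assumption about the original pattern.
is_four_upper() {
    printf '%s\n' "$1" | grep -q '^[A-Z]\{4\}$'
}

for word in DATE SYSDATE; do
    if is_four_upper "$word"; then
        echo "$word: match"
    else
        echo "$word: no match"
    fi
done
```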
Hi,
I am reading the book "Unix Shell Programming". The regular expression ^\(.\)\1 matches the first character on the line and stores it in register 1. The expression then matches whatever is stored in register 1, as specified by the \1. The net effect of this regular expression is to match... (2 Replies)
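As a small illustration of the back-reference: \(.\) captures the first character and \1 requires that same character again, so the pattern selects lines whose first two characters are identical. The sample words below are made up, and doubled is a hypothetical variable name.

```shell
# Made-up sample words (assumption); only lines that start with a doubled character survive.
doubled=$(printf '%s\n' 'aardvark' 'apple' 'llama' 'oops' | grep '^\(.\)\1')
echo "$doubled"
```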
Hi Everybody!
I need some help with a regular expression in Perl that will match files named messages, but also files named message.1, message.2 and so on. So really I need one that will find messages and messages that might be followed by a period and a digit without matching other files like... (2 Replies)
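A sketch of one pattern that fits the description, assuming the rotated files are named messages.1, messages.2 and so on (the post writes both "message" and "messages"). It is shown with grep -E so it can be tried from the shell; the same expression, /^messages(\.[0-9])?$/, is also valid Perl regex syntax. The file names are hypothetical.

```shell
# Hypothetical file names used only to exercise the pattern.
names='messages
messages.1
messages.2
messages.10
messages.old
mymessages'

# Match "messages" optionally followed by a period and exactly one digit.
matched=$(printf '%s\n' "$names" | grep -E '^messages(\.[0-9])?$')
echo "$matched"
```

The anchors ^ and $ are what keep names such as mymessages or messages.old out; allowing multi-digit suffixes would mean changing [0-9] to [0-9]+.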
I cannot seem to get this to work correctly:
my ($k, $v) = split(/F/, $fc{$DIR}{symbolic}, 2);
Below is the input (the $fc{$DIR}{symbolic} variable):
QMH2562 FW:v5.06.03 DVR:v8.03.07.15.05.09-k
but I also need it to break on FV:
Emulex NC553i FV4.2.401.6 DV8.3.5.86.2p
the code above... (2 Replies)
echo 20110101 | awk '{ print match($0,/^((17||18||19||20)|)-*(|0|1)-*(|0||3)$/))
I am getting a match for the above, whereas it shouldn't match, as there is no hyphen in the echoed date.
Another question: what is the difference between || and | in the above statement? (4 Replies)
Discussion started by: tostay2003
sylseg-sk
sylseg-sk(1) USER COMMANDS sylseg-sk(1)
NAME
sylseg-sk - segments Slovak words into syllables
SYNOPSIS
sylseg-sk [--best] [--color] [--dl debug level] [--help] [--ofile <file_name>] [<input_file>]
DESCRIPTION
Syllabic segmentation is essential for some linguistic and speech recognition applications. Depending on the language, either a rule-based or
a statistical approach is used. For Slovak, the statistical approach seems to be more suitable.
sylseg-sk implements one of the statistical approaches to syllabic segmentation. Each input word is segmented into syllables. Several
possible segmentations are generated and sorted by likelihood. If no input file is specified, standard input is read.
If an input file is used, the output is also written to a file. Its name is the input filename with the extension ".syllables".
The input and output code page is ISO 8859-2. To use sylseg-sk with a different code page, use a code page converter and pipes. For example, to have
input and output in UTF-8, use (for interactive use):
filterm UTF8-iso2 iso2-UTF8 sylseg-sk
or (for batch processing):
iconv -f UTF-8 -t ISO_8859-2 | sylseg-sk | iconv -f ISO_8859-2 -t UTF-8
The performance of the syllabic segmentation depends on the statistics used. To improve the quality of the segmentation, it is possible to train a
better system with the sylseg-sk-training tool and replace the original file located in /usr/share/sylseg_sk/sylseg-sk.stats
The design of sylseg-sk is language independent. With retrained statistics it should theoretically work for any language.
OPTIONS
--best
Print the best result only.
--color
Enable color output.
--dl 1..5
Set the debug level, which controls the amount of displayed information. Debug level 0 displays nothing. The maximum level 5 displays
a full debugging report. The default debug level is 1.
--help
Display a short help text.
--ofile <file_name>
Also write output to the given file.
EXAMPLES
Use standard input and debug level 3:
sylseg-sk --dl 3
Process all the words from file aaa.txt and print just the best segmentation:
sylseg-sk --best aaa.txt
EXIT STATUS
sylseg-sk returns zero if it succeeds in processing all the input words.
AUTHOR
Jozef Ivanecky (dodo (at) kanoistika.sk)
SEE ALSO
sylseg-sk-training(1), filterm(1), iconv(1), konwert(1)
version 0.5 December 1, 2006 sylseg-sk(1)