07-13-2009
Split strings based on length
Hi All
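I need help splitting strings based on length in Perl. For each word I want its prefixes of length 1 to 4 and its suffixes of length 1 to 4, with __nil__ wherever the word is too short. For example:
Input text: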
International NOUN
Corp. NOUN
's POS
Tulsa NOUN
Desired output:
International I In Int Inte l al nal onal NOUN
Corp. C Co Cor Corp . p. rp. orp. NOUN
's ' 's __nil__ __nil__ s 's __nil__ __nil__ POS
Tulsa T Tu Tul Tuls a sa lsa ulsa NOUN
Please help me. Thanks in advance.
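One possible approach, as a minimal sketch rather than a definitive answer: take the first N and last N characters of each word with substr for N = 1 to 4, and substitute __nil__ when the word is shorter than N. The helper name affix_line and its optional $max parameter are made up for illustration, and the sketch assumes every input line is a word and a tag separated by whitespace.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption, not from the original post):
# returns the word, its prefixes of length 1..$max, its suffixes of
# length 1..$max, and the tag, joined by single spaces.
sub affix_line {
    my ($word, $tag, $max) = @_;
    $max = 4 unless defined $max;
    my $len = length $word;

    # Prefix of length N, or __nil__ when the word has fewer than N characters.
    my @prefixes = map { $_ <= $len ? substr($word, 0, $_) : '__nil__' } 1 .. $max;

    # Suffix of length N; a negative offset counts from the end of the string.
    my @suffixes = map { $_ <= $len ? substr($word, -$_) : '__nil__' } 1 .. $max;

    return join ' ', $word, @prefixes, @suffixes, $tag;
}

# Sample lines copied from the question.
my @input = (
    'International NOUN',
    'Corp. NOUN',
    "'s POS",
    'Tulsa NOUN',
);

for my $line (@input) {
    my ($word, $tag) = split ' ', $line;
    next unless defined $tag;    # skip blank or malformed lines
    print affix_line($word, $tag), "\n";
}
```

For the four sample lines this prints the same four lines as the desired output above. To process a real file, the for loop could be replaced by a while (my $line = <>) loop with a chomp, so that the file name is given on the command line.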