Hello,
I am interested in writing a context-driven NGram analysis, i.e. detecting how often a given character occurs based on its immediate context: the characters that can precede and follow it. For word-initial and word-final characters, the context is only the character immediately following or immediately preceding the entity, respectively.
An example would illustrate what is meant. Given the following words as Input
The NGram output with frequency would be as under:
Although this is feasible in a program in C or Java, I wonder if a Perl or AWK script would do the job.
I am sure this tool will help quite a few people working in Natural language processing.
Two remarks, please:
My working environment is Windows, hence piping is not an option.
The data on which the script would be trained is very large.
Many thanks.
I don't understand what you're trying to do.
Why isn't the following your NGram output list:
Even if you define "entity" to be a single character, you still seem to be missing:
from your output list.
I guess I did the analysis manually and hence slipped up.
I agree traditional ngrams work the way you have defined, but I am interested in contextual ngrams in which the frequency of occurrence of a given string is determined by its immediate context.
Since the analysis is at a micro level rather than a macro level, such NGrams can be used to predict whether a given string complies with the training data and, with a few additional tweaks, even to suggest a valid structure.
I hope this makes the idea clear, and shows why analysis in terms of context-driven NGrams differs slightly from the traditional kind.
Many thanks for your response
You could try something like:
If your input is always one word per line, you can make this run a little bit faster by removing the code in red above and changing every occurrence of $i to $1.
If you want to try this on a Solaris/SunOS system, use /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk instead of just awk.
Last edited by Don Cragun; 10-13-2013 at 06:50 AM..
Reason: Fix typo (code; not lines).
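For readers who want to experiment, here is a minimal sketch of one possible reading of the context-driven counting discussed in this thread. It is not the code posted above. It assumes each character is counted together with its immediate left and right neighbours, and it uses "^" and "$" as hypothetical word-start and word-end markers, which only works if those two characters never occur in the data. The function name context_ngrams is also just an illustration. The input file is passed as an argument, so no pipe is needed.

```shell
#!/bin/sh
# Sketch only: count every character together with its immediate context.
# "^" and "$" are assumed boundary markers, not part of the original request.
context_ngrams() {
    awk '
    {
        # Treat every whitespace-separated field as one word.
        for (i = 1; i <= NF; i++) {
            w = "^" $i "$"
            # j walks over the real characters; the key is the character
            # at position j plus one neighbour on each side.
            for (j = 2; j < length(w); j++)
                count[substr(w, j - 1, 3)]++
        }
    }
    END {
        # Output order of "for (k in count)" is unspecified in awk.
        for (k in count)
            print k, count[k]
    }' "$@"
}
```

Calling `context_ngrams words.txt > counts.txt` writes one context and its frequency per line. Memory use grows with the number of distinct contexts rather than with the size of the input, which matters for a large training file.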