The first file (FILE_INFO in my code) has four parameters on each line.
I want to write out the corresponding sentences from the 2nd file (M_IN in my code) and the 3rd file (E_IN in my code), based on the 3rd- and 4th-column parameters of the first file. M# and E# are the sentence numbers in the 2nd and 3rd files.
The format of the 2nd and 3rd files is: [where M# and E# are the sentence numbers in the 2nd and 3rd files]
EOS means the full stop "." (end of sentence).
Please correct the script I have written. E_OUT and M_OUT are the output files to which the corresponding sentences will be written.
Location: Saint Paul, MN USA / BSD, CentOS, Debian, OS X, Solaris
Posts: 2,288
Thanks Given: 430
Thanked 480 Times in 395 Posts
Hi.
Many people don't wish to slog through someone else's code to find logic errors. You can use intermediate prints or the debug facility of perl to see where your code is incorrect. You could also look at provably correct code to see how it works.
Here's a solution in shell:
producing:
So now that I think I understand the problem, I do a perl version and try to make sure that I avoid reading the data file more than once (as is done with grep in the shell script):
Using the same sample data files, produces:
Note the use of print statements if $debug is true. Simply swapping the position of the assignments turns on and off those debugging outputs. That's useful for a quick program, and the code can be left in, ready to turn on if and when the code is modified ...
cheers, drl
PS I eliminated the extra trailing space before the full stop; it looked better that way.
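The debug toggle described above can be sketched roughly as follows. This is only an illustration, not drl's script: the helper name `debug_msg` and the sample input line are made up for the example. For stepping through a script interactively, Perl's built-in debugger is started with `perl -d script.pl`.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The last assignment wins: swap these two lines to switch debugging on.
my $debug;
$debug = 1;
$debug = 0;

# Hypothetical helper: print a message to STDERR only when enabled.
# Returns 1 if it printed, 0 otherwise.
sub debug_msg {
    my ($enabled, @parts) = @_;
    return 0 unless $enabled;
    print STDERR "debug: @parts\n";
    return 1;
}

# Made-up sample line with four whitespace-separated parameters.
my @fields = split ' ', "a b M2 E1";
debug_msg($debug, "number of fields:", scalar(@fields));
print "M id: $fields[2], E id: $fields[3]\n";
```

Sending the debug text to STDERR keeps it out of any output files, so it can stay in the script permanently.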
---------- Post updated 08-09-09 at 04:32 AM ---------- Previous update was 08-08-09 at 03:17 PM ----------
Hi drl
The Perl script works well for a small number of sentences, but once the number reaches 15 or more, groups of sentences merge together into a single line. Also, some of the sentences get printed repeatedly. In fact, I want to run this program on thousands of sentences. Is this problem due to the array or something else?
Hi.
OK, I changed the way that the sentence data structure is handled. The memory use might be high for a very large file, but for the sample data you have provided, this produces the same output:
producing:
cheers, drl
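One way to hold the sentence data in memory, so that each sentence file is read only once, is a hash keyed by sentence ID. The sketch below is not drl's code, and it rests on assumptions about the file layout: each sentence is taken to sit on one line as `<ID> word word ... EOS`, and columns 3 and 4 of the info file are taken to hold the same ID strings (for example `M2` and `E1`). The sub names `index_sentences` and `select_pairs` are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# ASSUMED format: one sentence per line, "<ID> word word ... EOS".
# Builds a hash ref mapping ID => sentence text, with EOS turned into ".".
sub index_sentences {
    my (@lines) = @_;
    my %by_id;
    for my $line (@lines) {
        my ($id, @words) = split ' ', $line;
        next unless defined $id;            # skip blank lines
        my $stop = '';
        if (@words && $words[-1] eq 'EOS') {
            pop @words;
            $stop = '.';                    # no space before the full stop
        }
        $by_id{$id} = join(' ', @words) . $stop;
    }
    return \%by_id;
}

# Columns 3 and 4 of each info line name the sentences to pull
# (ASSUMED whitespace-separated). Unknown IDs are skipped.
sub select_pairs {
    my ($info_lines, $m_index, $e_index) = @_;
    my @pairs;
    for my $line (@$info_lines) {
        my @col = split ' ', $line;
        next unless @col >= 4;
        my ($m_id, $e_id) = @col[2, 3];
        next unless exists $m_index->{$m_id} && exists $e_index->{$e_id};
        push @pairs, [ $m_index->{$m_id}, $e_index->{$e_id} ];
    }
    return @pairs;
}

# Made-up sample data standing in for M_IN, E_IN and FILE_INFO.
my $m = index_sentences("M1 first m sentence EOS\n", "M2 second m sentence EOS\n");
my $e = index_sentences("E1 first e sentence EOS\n", "E2 second e sentence EOS\n");
my @pairs = select_pairs([ "a b M2 E1\n" ], $m, $e);
print "$_->[0]\t$_->[1]\n" for @pairs;
```

Both indexes live entirely in memory, so usage grows with the size of the sentence files, which is the trade-off mentioned above. If a sentence actually spans several lines in the real data, the indexing step would need to accumulate words until it sees EOS instead of treating each line as complete.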
After the code was corrected, some of the sentences are still printed repeatedly in the output. What could be the problem? I want each sentence to be printed only once. What can be done about this?
Last edited by my_Perl; 08-12-2009 at 07:22 AM..
Reason: Change in addressing
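The cause of the repeats cannot be determined without seeing the script and data, but a common guard in Perl is a `%seen` hash that lets each distinct sentence through only the first time it appears. A minimal sketch, with a hypothetical sub name `unique_in_order` and made-up sentences:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the input sentences with later duplicates dropped,
# keeping the original order of first appearance.
sub unique_in_order {
    my (@sentences) = @_;
    my %seen;
    return grep { !$seen{$_}++ } @sentences;
}

my @out = unique_in_order("A cat sat.", "A dog ran.", "A cat sat.");
print "$_\n" for @out;
```

If the same ID pair legitimately occurs more than once in the info file, deduplicating on the ID pair instead of the sentence text may be the better choice.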