Hi everyone, and thank you for your help with this. I am VERY new to Perl, so all of your help is appreciated. I have tried Google, but I don't know the proper terms to search for, and it is daunting for a newbie scripter. I know this is very easy for most of you! Thanks!
I have a multi-gig file of the repeated format:
I want to use Perl to read in this text file, "out1.txt", parse it into values (first name, last name, job title, company, etc.) and output them to a CSV file.
I know that each of these values occurs within a specific pattern, e.g. the "Company" value I want will always be <td colspan="2" class="row"><input type="text" maxlength="30" name="company" size="30" value="HERE" /></td>. The other patterns will occur in the same place in similar strings. I know ALL records will exist for each "person".
Is there a good script already written that is close, OR can someone help me translate this into Perl:
I am just looking for a basic framework for one or two sequential patterns, the while loop, etc.
The problem for me is matching values at a specific location in multiple known strings, in sequential order, and putting them into a CSV file.
Thanks for your help!
Last edited by sinusoid; 11-01-2010 at 03:50 PM..
Reason: making a little more clear
---------- Post updated at 09:00 AM ---------- Previous update was at 08:39 AM ----------
Quote:
Originally Posted by sinusoid
...
Can someone quickly explain
for
Is it indexing the last character position of "firstName" and then splitting on that, or is the split(/"/, ...) a regex expression?
The "rindex" function, called as rindex(str, substr), returns the position of the last (i.e. rightmost) occurrence of substr in str.
If substr doesn't exist in str, then it returns -1.
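To make the return values concrete, here is a minimal sketch. The sample line is made up for illustration (it is not taken from the original file); "index" is shown next to "rindex" only for comparison.

```perl
use strict;
use warnings;

# Hypothetical sample line containing "firstName" twice.
my $line = 'name="firstName" value="firstName"';

# Position of the first (leftmost) occurrence, for comparison.
my $first = index($line, 'firstName');
# Position of the last (rightmost) occurrence.
my $last  = rindex($line, 'firstName');
# A substring that does not occur gives -1.
my $none  = rindex($line, 'lastName');

print "first=$first last=$last none=$none\n";   # prints: first=6 last=24 none=-1
```

Positions are counted from 0, which is why -1 can safely mean "not found".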
So the condition -
rindex($line, "firstName") > -1
checks if the rightmost index of "firstName" in $line is greater than -1. In other words, it checks if "firstName" exists in $line.
If it does, then this statement -
@splitLine = split(/"/, $line);
splits $line on the literal double-quote character and assigns the tokens (or split elements) to the array "@splitLine".
As an example:
@x = split(/:/, "abc:def:ghijk:l");
will split the string "abc:def:ghijk:l" on the literal colon character (":") and assign the split elements to the array @x. So, after that operation, @x will have-
"abc" at index 0,
"def" at index 1,
"ghijk" at index 2 and
"l" at index 3.
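The colon example can be run as-is to check the indexes. This is only a small sketch; the variable name @x follows the explanation above.

```perl
use strict;
use warnings;

# Split on the literal colon character.
my @x = split(/:/, 'abc:def:ghijk:l');

# Print every element together with its index.
print "$_ => $x[$_]\n" for 0 .. $#x;
```

It should list four elements, with "abc" at index 0 and "l" at index 3.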
The "//" in the split function allows regexes to be used, not just literal characters. So, for instance, if the string you want to split is "a b c d e", and the number of spaces between the elements is variable, then you can use a regex such as /\s+/ as the split pattern.
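The snippet for the variable-whitespace case is not preserved here, so the following is a sketch of one way to do it, assuming the pattern /\s+/ (one or more whitespace characters). The sample string is made up.

```perl
use strict;
use warnings;

# Elements separated by a varying number of spaces (hypothetical data).
my $str = "a  b c    d e";

# /\s+/ matches one or more whitespace characters as a single separator.
my @parts = split(/\s+/, $str);

print join('|', @parts), "\n";   # prints: a|b|c|d|e
```

If the string can start with whitespace, split(' ', $str) is worth a look, since it also skips the leading whitespace instead of producing an empty first element.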
You could also write the pattern in double-quotes instead of "//", e.g. split(":", $str); Perl still treats the quoted string as a pattern.
After $line is split on double-quotes and assigned to @splitLine, the value of "firstName" is the 11 element of that array.
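Putting the pieces together, here is a rough sketch of the whole loop: check each line with rindex, split it on double-quotes, pick out the value, and print one CSV row per person. Several things are assumptions on my part: the sample input stands in for out1.txt, the "firstName" line is made up to mirror the "company" line quoted in the question, and the helper csv_quote is hypothetical. For that particular line layout the value lands at index 13 of the split array; count the tokens for your own lines, because the index depends on the layout.

```perl
use strict;
use warnings;

# Fields to collect, in output order (assumed; extend as needed).
my @fields = ('firstName', 'company');

# Sample input standing in for out1.txt (the firstName line is made up).
my $sample = <<'END';
<td colspan="2" class="row"><input type="text" maxlength="30" name="firstName" size="30" value="Jane" /></td>
<td colspan="2" class="row"><input type="text" maxlength="30" name="company" size="30" value="HERE" /></td>
END

# Read from the in-memory sample; for the real file use
# open(my $in, '<', 'out1.txt') instead.
open(my $in, '<', \$sample) or die "Cannot open sample: $!";

my %record;
my @rows;
while (my $line = <$in>) {
    for my $field (@fields) {
        # rindex returns -1 when the field name is absent from the line.
        next unless rindex($line, qq{name="$field"}) > -1;
        my @splitLine = split(/"/, $line);
        # For this layout the value is token 13; other layouts differ.
        $record{$field} = $splitLine[13];
    }
    # Once every field has been seen, store one CSV row and start over.
    if (keys(%record) == @fields) {
        push @rows, join(',', map { csv_quote($record{$_}) } @fields);
        %record = ();
    }
}
close($in);

print "$_\n" for @rows;

# Hypothetical helper: wrap a value in quotes, doubling embedded quotes.
sub csv_quote {
    my ($value) = @_;
    $value =~ s/"/""/g;
    return qq{"$value"};
}
```

Because the file is read one line at a time, memory use stays small even for a multi-gig input. Splitting on double-quotes only works while the values themselves contain no double-quote characters.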