I'm trying to write a Perl script that I can run as a cron job in the root of my web server. It should look for .shtml files, get their last-modified dates, and replace the corresponding dates in the sitemap_test.xml file. The problem is that the substitution doesn't work, and when I print to MYFILE it adds the lastmod to the end of the file. Any help would be appreciated.
The Script:
Code:
#!/usr/bin/perl
use warnings;
use strict;
use POSIX;
my @shtml_files = <*\.shtml>;
my $filename = "sitemap_test.xml";
foreach my $shtml_files (@shtml_files)
{
my $lastmod = strftime("%Y-%m-%dT%H:%M:%S-06:00", localtime((stat($shtml_files))[9]));
open (MYFILE, "+<$filename") or die "Cannot open file: $!";
while (<MYFILE>)
{
if (m/$shtml_files/)
{
my $nextLine = <MYFILE>;
my $nextLine1 = <MYFILE>;
my $sub = $nextLine1;
$sub =~ s/$nextLine1/<lastmod>$lastmod<\/lastmod>/;
print MYFILE "$sub\n";
}
}
close (MYFILE);
}
(a) Your program is quite inefficient since it populates the array "@shtml_files" and then parses the entire file for each array element. If your array has, say, 100 elements then you'll be opening, parsing and closing the "sitemap_test.xml" file 100 times.
(b) A more efficient way is to populate the array first and then walk through the XML file just once. Whenever you encounter a <loc> tag in the file, look up the filename in the array. You can use the "grep" operator for that. (Perl 6 will have an operator to emulate "in", i.e. "element x in <array>", as found in awk/pascal/sql etc.)
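(c) In-place modification of the XML file via update mode ("+<") is fragile. A print to the handle overwrites the bytes at the current file position rather than inserting them, so a replacement of a different length clobbers the text that follows, and without a seek between reading and writing, that position is not reliably where you expect it to be, which is probably why your lastmod ended up at the end of the file. Instead, write the modified lines to a temporary file and rename it over the original when done; that also lets you keep the original as a backup.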
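To illustrate the lookup mentioned in (b), here is a minimal sketch. The file names are made up for the example and in_array is just a hypothetical helper, not a built-in. It shows grep used as an "is this element in the array?" test, next to a hash lookup that gives the same answer without rescanning the array each time.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical file names standing in for the result of glob('*.shtml').
my @shtml_files = ('index.shtml', 'about.shtml', 'contact.shtml');

# grep in scalar context returns the number of matching elements,
# so it can serve as a membership test. (in_array is a made-up helper.)
sub in_array {
    my ($name, @list) = @_;
    return scalar grep { $_ eq $name } @list;
}

# A hash answers the same question without scanning the array each time.
my %is_shtml = map { $_ => 1 } @shtml_files;

for my $name ('about.shtml', 'missing.shtml') {
    my $via_grep = in_array($name, @shtml_files) ? 'yes' : 'no';
    my $via_hash = exists $is_shtml{$name}       ? 'yes' : 'no';
    print "$name: grep=$via_grep hash=$via_hash\n";
}
```

For a handful of files either form is fine; the hash version is probably the better choice once the list gets long, since grep walks the whole array on every lookup.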
Here's an updated script that implements the points above:
Code:
#!/usr/bin/perl
# process.pl
use warnings;
use strict;
use POSIX qw(strftime);

# Map each .shtml file name to its formatted last-modified time.
my %lastmod_for;
foreach my $file (glob '*.shtml') {
    $lastmod_for{$file} = strftime("%Y-%m-%dT%H:%M:%S-06:00", localtime((stat($file))[9]));
}

my $old      = "sitemap_test.xml";
my $old_orig = "sitemap_test_orig.xml";
my $new      = "sitemap_test_new.xml";

open(my $in,  '<', $old) or die "Can't open $old: $!";
open(my $out, '>', $new) or die "Can't open $new: $!";
my $lastmod = "";
while (<$in>) {
    if (/<loc>(.*?)<\/loc>/) {
        # Look up the last path component, so a bare file name and a full URL both match.
        (my $name = $1) =~ s{.*/}{};
        $lastmod = exists $lastmod_for{$name} ? $lastmod_for{$name} : "";
    } elsif ($lastmod ne "" and /<lastmod>.*?<\/lastmod>/) {
        # Replace only the element, keeping the line's indentation.
        s/<lastmod>.*?<\/lastmod>/<lastmod>$lastmod<\/lastmod>/;
        $lastmod = "";
    }
    print $out $_;
}
close($in)  or die "Can't close $old: $!";
close($out) or die "Can't close $new: $!";

# Keep the original as a backup, then move the new file into place.
rename($old, $old_orig) or die "Can't rename $old to $old_orig: $!";
rename($new, $old)      or die "Can't rename $new to $old: $!";
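As a side note on why the script writes to a separate file instead of printing to a "+<" handle, here is a small sketch of what update mode does. The scratch file and its three lines are made up for the demonstration, and the explicit seek is there because a read followed by a write on the same handle needs one in between.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Scratch file with three 9-byte lines (hypothetical content).
my ($tmp, $path) = tempfile(UNLINK => 1);
print $tmp "line one\nline two\nline six\n";
close $tmp or die "Can't close $path: $!";

open(my $fh, '+<', $path) or die "Can't open $path: $!";
my $first = <$fh>;                           # read "line one\n"
my $pos   = tell($fh);                       # offset where "line two" starts
seek($fh, $pos, 0) or die "Can't seek: $!";  # needed between a read and a write
print $fh "LINE 2 LONG\n";                   # 12 bytes written over a 9-byte line
close $fh or die "Can't close $path: $!";

# Read the file back to see what the overwrite did.
open($fh, '<', $path) or die "Can't reopen $path: $!";
my $result = do { local $/; <$fh> };
close $fh;
print $result;
```

The replacement is three bytes longer than the line it lands on, so it runs into the next line: the file ends up as "line one", "LINE 2 LONG", "e six". Nothing gets inserted, only overwritten, which is why copying to a new file and renaming is the safer route.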
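Code:
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Map each .shtml file name to its formatted last-modified time.
my %filemtime;
for my $file ( glob('*.shtml') ) {
    $filemtime{$file} = strftime("%Y-%m-%dT%H:%M:%S-06:00", localtime((stat($file))[9]));
}

# Read the whole sitemap into memory.
open( my $in, '<', "sitemap.xml" ) or die "Can't open sitemap.xml: $!\n";
my @xml = <$in>;
close $in;

my $new = '';
my $ON;    # name of the file whose <loc> was seen most recently
for ( @xml ) {
    for my $fn ( keys %filemtime ) {
        # \Q...\E keeps the dot in the file name from acting as a wildcard.
        $ON = $fn if /<loc>.*\Q$fn\E.*<\/loc>/;
    }
    # Replace the next <lastmod> after a matching <loc>, then reset.
    if ( $ON && s/(<lastmod>).*?(<\/lastmod>)/$1$filemtime{$ON}$2/ ) { $ON = undef; }
    $new .= $_;
}

# Write the result to a separate file, leaving the original untouched.
open( my $out, '>', "new.xml" ) or die "Can't open new.xml: $!\n";
print $out $new;
close $out or die "Can't close new.xml: $!\n";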
Submitted for your perusal. Tyler's is more stable. I was working on this while he posted his....
Thank you Tyler and Deindorfer. I knew there was a more efficient solution than my amateur effort. I'll try both of these, and I have already learned a lot about scripting in Perl. Thanks again for your time and effort helping me out.