To get the id value dynamically


# 1  
Old 2 Weeks Ago

Hi,

I have an .html file and need to get the ID of the entry with the latest "Last modified" date in that file.

Expected output: 1456

Thanks in advance.

Here is the .html file:

Code:
<html>
<head><title>Index</title>
</head>
<body>
<h1> BI EI Team</h1>
<pre>ID   Last modified      Size</pre><hr/>
<pre>
<a href="1456/">1456/</a>  01-Mar-2019 15:49    108MB
<a href="4561/">4561/</a>  28-Feb-2019 11:08    121MB
</pre>
<hr/>
</body>
 </html>



Moderator's Comments:
Mod Comment Please use CODE tags as required by forum rules!

Last edited by RudiC; 2 Weeks Ago at 05:41 AM.. Reason: Added CODE tags.
# 2  
Old 2 Weeks Ago
Welcome to the forum.

Please get into the habit of providing adequate context for your problem.

It always helps to phrase a request carefully and in detail, and to support it with:
- system info such as OS and shell,
- the related environment (variables, directory structures, options),
- preferred tools,
- representative sample input and the desired output, plus the logic connecting the two,
- your own attempts at a solution,
- any system (error) messages, verbatim.

This avoids ambiguities and keeps people from guessing.

So, what have you tried so far?
Do you have GNU date available, or another version that accepts the date to operate on as a parameter?
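For context on why the date version matters: GNU date can turn a timestamp like the ones in the listing into seconds since the epoch, which makes two entries directly comparable. A minimal sketch, assuming GNU date (other implementations may need a different option); to_epoch is a made-up helper name:

```shell
#!/bin/sh
# Assumes GNU date. to_epoch is a hypothetical helper, not from the thread.
to_epoch() {
    # -d parses the given date string instead of using the current time;
    # TZ=UTC keeps the result independent of the local time zone
    TZ=UTC date -d "$1" +%s
}

a=$(to_epoch "01-Mar-2019 15:49")
b=$(to_epoch "28-Feb-2019 11:08")

# Compare the two timestamps numerically
if [ "$a" -gt "$b" ]; then
    echo "first entry is newer"
else
    echo "second entry is newer"
fi
```

With the two timestamps from the sample file this should report the first entry as newer.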

Last edited by RudiC; 2 Weeks Ago at 05:52 AM..
# 3  
Old 2 Weeks Ago
Thanks RudiC,

I tried the command below, but it does not give me the exact value.

Code:
cat file.html |grep "<a href=" | head -2 |awk -F ">" '{print $2}

Output: 1456/</a





Last edited by RudiC; 2 Weeks Ago at 06:10 AM.. Reason: Added CODE tags.
# 4  
Old 2 Weeks Ago
A closing single quote is missing from your command pipeline. Once it is added, the output is the two data lines from your input file:
Code:
1456/</a
4561/</a

Please be aware that none of the usual *nix text tools is well suited to handling *ML or similar data; any solution based on them will be fragile.

What is the essential criterion for identifying the lines to extract from your HTML file? I used the field count (NF > 4) in this approach:
Code:
awk -F"[- :/]+" '
        {gsub (/<[^>]*>/, "")           # strip the HTML tags
        }
!NF     {next                           # skip empty lines
        }
NF > 4  {print $4, $3, $2, $5, $6, $1   # year month day hour minute id
        }
' file | LC_ALL=C sort -k1,1nr -k2,2Mr -k3,3nr -k4,4nr -k5,5nr | head -1 | cut -d" " -f6
1456

It relies on the M (month) ordering flag of sort, as provided by GNU coreutils, to sort the abbreviated month names.
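If a sort with month ordering is not available, the month names can also be mapped to numbers inside awk, so that no sort is needed at all. This is only a sketch, not the approach from the post above: it assumes English month abbreviations and zero-padded day, hour and minute fields as in the sample, and latest_id is a made-up function name.

```shell
#!/bin/sh
# Sketch: pick the newest entry in a single awk pass, without sort.
# latest_id is a hypothetical helper name, not from the thread.
latest_id() {
    awk -F'[- :/]+' '
    BEGIN {
        # map abbreviated English month names to 01..12
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", name, " ")
        for (i = 1; i <= n; i++) month[name[i]] = sprintf("%02d", i)
    }
    { gsub(/<[^>]*>/, "") }              # strip the HTML tags
    NF > 4 && ($3 in month) {
        # build a fixed-width key YYYYMMDDhhmm; string order equals date order
        key = $4 month[$3] $2 $5 $6
        if (key > best) { best = key; id = $1 }
    }
    END { if (id != "") print id }
    ' "$1"
}

# Usage with the two data lines from the first post
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<pre>
<a href="1456/">1456/</a>  01-Mar-2019 15:49    108MB
<a href="4561/">4561/</a>  28-Feb-2019 11:08    121MB
</pre>
EOF
latest_id "$tmp"    # prints 1456
rm -f "$tmp"
```

Because the key is compared as a string, this only works while every date component has a fixed width.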
# 5  
Old 2 Weeks Ago
Hi,

I got the result on my own. Is there a better way to do this?

Code:
cat file.html | grep "<a href=" | head -2 |awk -F ">" '{print $2}'| awk -F "/" '{print $1}'


Thanks
# 6  
Old 2 Weeks Ago
If you really want the ID of the newest match, take RudiC's snippet. If you can assume that the first match is always the newest, this simpler version will do:

Code:
awk 'match($0,/<a href="([0-9]+)/,res) {print res[1];exit}' data.html

If you want to use a parser for more robust operation, this can serve as a starting point:
Code:
xmlstarlet sel -t -v "/html/body/pre" data.html  | awk '/^\s*[0-9]+\// {print $1,$2}'


# output 

1456/ 01-Mar-2019
4561/ 28-Feb-2019

Note: I assume only GNU awk supports the three-parameter form match(subject,pattern,array). Other awk variants only support the two-parameter form match(subject,pattern), so the command above will cause a syntax error there.
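For awk variants without the array parameter, the same "first match" idea can be written with the two-parameter match() plus RSTART and RLENGTH, which are standard awk variables. A rough sketch under the same assumption that the first link is the newest; first_id is a made-up function name:

```shell
#!/bin/sh
# Sketch: portable variant using two-parameter match().
# first_id is a hypothetical helper name, not from the thread.
first_id() {
    awk '
    match($0, /<a href="[0-9]+/) {
        # RSTART and RLENGTH describe the matched text;
        # skip the 9-character prefix <a href=" to keep only the digits
        print substr($0, RSTART + 9, RLENGTH - 9)
        exit
    }
    ' "$1"
}

# Usage with the two data lines from the first post
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<pre>
<a href="1456/">1456/</a>  01-Mar-2019 15:49    108MB
<a href="4561/">4561/</a>  28-Feb-2019 11:08    121MB
</pre>
EOF
first_id "$tmp"    # prints 1456
rm -f "$tmp"
```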

Last edited by stomp; 2 Weeks Ago at 07:34 AM..