Post 302779679 by counfhou on Wednesday 13th of March 2013 08:15:06 AM
awk -- Extract data from html within multiple tags as reference

Hi, I'm trying to extract some data from an HTML file. The problem is that several patterns have to match, one after another, before the information I want can be extracted.

https://www.unix.com/shell-programmin...tml-files.html

The thread linked above describes a similar problem. The only difference is that I have to match more tags: within the <td> tag I first need to find a <p> tag, and so on. I googled around a bit but couldn't find an example with multiple patterns anywhere. Maybe that's not the way to go?
Anyway, if anyone could tell me whether it's possible to extend those ranges to multiple ones, I would be very grateful.
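
One possible way to handle several patterns at once is to track each tag with its own flag instead of relying on a single range. What follows is only a minimal sketch, not a tested solution for the actual file: the function name extract_td_p is made up for illustration, and it assumes lowercase tags with each opening and closing tag on a line of its own. It prints the lines inside a <p> only when that <p> sits inside a <td>.

```shell
#!/bin/sh
# Hypothetical helper (name invented for this example).
# Assumption: every <td>, </td>, <p> and </p> tag is on its own line.
extract_td_p() {
    awk '
        /<td[ >]/         { in_td = 1 }            # entered a <td> block
        /<\/td>/          { in_td = 0; in_p = 0 }  # left it; reset inner state too
        in_td && /<p[ >]/ { in_p = 1; next }       # entered a <p> inside the <td>
        /<\/p>/           { in_p = 0 }             # left the <p>
        in_td && in_p     { print }                # data line inside both tags
    ' "$@"
}
```

It can be called as `extract_td_p file.html` or fed from a pipe. A further level (say, a <b> inside the <p>) would need one more flag and one more pair of rules in the same style. For HTML that is not laid out one tag per line, a real HTML parser is probably the safer road.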
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

extract data from html tables

hi i need to use unix to extract data from several rows of a table coded in html. I know that rows within a table have the tags <tr> </tr> and so i thought that my first step should be to to delete all of the other html code which is not contained within these tags. i could then use this method... (8 Replies)
Discussion started by: Streetrcr
8 Replies

2. Shell Programming and Scripting

How to extract data from BNC xml with reference brackets?

I have data like the following pattern: <change date="2000-01-09" who="#OUCS">Updated all catrefs</change> <change date="2000-01-08" who="#OUCS">Manually updated tagcounts, titlestmt, and title in source</change> <change date="1999-09-13" who="#UCREL">POS codes revised for BNC-2; header... (14 Replies)
Discussion started by: Johnivy
14 Replies

3. Shell Programming and Scripting

SED to extract HTML text data, not quite right!

I am attempting to extract weather data from the following website, but for the Victoria area only: Text Forecasts - Environment Canada I use this: sed -n "/Greater Victoria./,/Fraser Valley./p" But that phrasing does not sometimes get it all and think perhaps the website has more... (2 Replies)
Discussion started by: lagagnon
2 Replies

4. UNIX for Dummies Questions & Answers

AWK, extract data from multiple files

Hi, I'm using AWK to try to extract data from multiple files (*.txt). The script should look for a flag that occurs at a specific position in each file and it should return the data to the right of that flag. I should end up with one line for each file, each containing 3 columns:... (8 Replies)
Discussion started by: Liverpaul09
8 Replies

5. UNIX for Dummies Questions & Answers

Using AWK: Extract data from multiple files and output to multiple new files

Hi, I'd like to process multiple files. For example: file1.txt file2.txt file3.txt Each file contains several lines of data. I want to extract a piece of data and output it to a new file. file1.txt ----> newfile1.txt file2.txt ----> newfile2.txt file3.txt ----> newfile3.txt Here is... (3 Replies)
Discussion started by: Liverpaul09
3 Replies

6. Shell Programming and Scripting

extract data with awk from html files

Hello everyone, I'm new to this forum and i am new as a shell scripter. my problem is to have html files in a directory and I would like to extract from these some data that lies between two different lines Here's my situation <td align="default"> oxidizability (mg / l): data_to_extract... (6 Replies)
Discussion started by: sbobotex
6 Replies

7. Shell Programming and Scripting

extract complex data from html table rows

I have bash, awk, and sed available on my portable device. I need to extract 10 fields from each table row from a web page that looks like this: </tr> <tr> <td>28 Apr</td> <td><a... (6 Replies)
Discussion started by: rickgtx
6 Replies

8. Shell Programming and Scripting

Awk/sed HTML extract

I'm extracting text between table tags in HTML <th><a href="/wiki/Buick_LeSabre" title="Buick LeSabre">Buick LeSabre</a></th> using this: awk -F "</*th>" '/<\/*th>/ {print $2}' auto2 > auto3 then this (text between a href): sed -e 's/\(<*>\)//g' auto3 > auto4 How to shorten this into one... (8 Replies)
Discussion started by: p1ne
8 Replies

9. Shell Programming and Scripting

Extract data using a reference

Gents, If there the possibility can to extract data using a reference from other file. input.txt ( big file which contends all data output.txt ( data extracted ) selection.txt ( information to extract the data Example In file input.txt there is big data each record have 56 lines like... (3 Replies)
Discussion started by: jiam912
3 Replies

10. UNIX for Beginners Questions & Answers

awk to extract value after keyword in html

Using awk to extract value after a keyword in an html, and store in ts. The awk does execute but ts is empty. I use the tag as a delimiter and the keyword as a pattern, but there probably is a better way. Thank you :). file <html><head><title>xxxxxx xxxxx</title><style type="text/css"> ... (4 Replies)
Discussion started by: cmccabe
4 Replies