Sponsored Content
Top Forums Shell Programming and Scripting Retrieve information Text/Word from HTML code using awk/sed Post 302897515 by sk2code on Monday 14th of April 2014 01:19:27 PM
Old 04-14-2014
Yoda and everyone thanks for your help.
The awk command still results in blank output but the following command helped me.

Code:
sed 's#.*txt>##;s#<.*##' file.html

Thank you!!
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

How to use sed to remove html tags including text between them

How to use sed to remove html tags including text between them? Example: User <b> rolvak </b> is stupid. It does not using <b>OOP</b>! and should output: User is stupid. It does not using ! Thank you.. (2 Replies)
Discussion started by: alphagon
2 Replies

2. UNIX for Dummies Questions & Answers

retrieve lines using sed, grep or awk

Hi, I'm looking for a command to retrieve a block of lines using sed or grep, probably awk if that can do the job. In below example, By searching for words "Third line2" i'm expecting to retrieve the full block starting with 'BEGIN' and ending with 'END' of the search. Example: ... (3 Replies)
Discussion started by: learning_linux
3 Replies

3. Shell Programming and Scripting

SED to extract HTML text data, not quite right!

I am attempting to extract weather data from the following website, but for the Victoria area only: Text Forecasts - Environment Canada I use this: sed -n "/Greater Victoria./,/Fraser Valley./p" But that phrasing does not sometimes get it all and think perhaps the website has more... (2 Replies)
Discussion started by: lagagnon
2 Replies

4. Shell Programming and Scripting

How to retrieve digital string using sed or awk

Hi, I have filename in the following format: YUENLONG_20070818.DMP HK_20070818_V0.DMP WANCHAI_20070820.DMP KWUNTONG_20070820_V0.DMP How to retrieve only the digital part with sed or awk and return the following format: 20070818 20070818 20070820 20070820 Thanks! Victor (3 Replies)
Discussion started by: victorcheung
3 Replies

5. Shell Programming and Scripting

sed/awk to retrieve max year in column

I am trying to retrieve that max 'year' in a text file that is delimited by tilde (~). It is the second column and the values may be in Char format (double quoted) and have duplicate values. Please help. (4 Replies)
Discussion started by: CKT_newbie88
4 Replies

6. Shell Programming and Scripting

Execute a C program and retrieve information

Hi I have the following script: #!/bin/sh gcc -o program program.c ./program & PID=$! where i execute a C program and i get its pid. I want to retrieve information about this program (e.g memory consumption) using command top. So far i have: top -d 1.0 -p $PID But i dont know how to... (6 Replies)
Discussion started by: nteath
6 Replies

7. Shell Programming and Scripting

cut, sed, awk too slow to retrieve line - other options?

Hi, I have a script that, basically, has two input files of this type: file1 key1=value1_1_1 key2=value1_2_1 key4=value1_4_1 ... file2 key2=value2_2_1 key2=value2_2_2 key3=value2_3_1 key4=value2_4_1 ... My files are 10k lines big each (approx). The keys are strings that don't... (7 Replies)
Discussion started by: fzd
7 Replies

8. Shell Programming and Scripting

Extract word from text (sed,awk, etc...)

Hello, I need some help extracting the number after the RBA e.g 15911688 from the below block of text (e.g: grep RBA |sed .......). The code should be valid for blocks if text generated at different times as well and not for the below text only. ... (2 Replies)
Discussion started by: drbiloukos
2 Replies

9. Shell Programming and Scripting

Perl code to retrieve text from website

perl -MLWP::Simple -le '$s=shift;$c=get("http://www.google.com/intl/en/chrome/devices/chromecast/$s/");$c=~/meta content=(.*?)name=\"Remote free\"/msg; print length($1),"\t$1"' ?gclid=CJDg27OdnL0CFcFlOgodFD8A6Q >output.txt output.txt should be: Chromecast works with devices you already own,... (9 Replies)
Discussion started by: cmccabe
9 Replies

10. Shell Programming and Scripting

Awk/sed HTML extract

I'm extracting text between table tags in HTML <th><a href="/wiki/Buick_LeSabre" title="Buick LeSabre">Buick LeSabre</a></th> using this: awk -F "</*th>" '/<\/*th>/ {print $2}' auto2 > auto3 then this (text between a href): sed -e 's/\(<*>\)//g' auto3 > auto4 How to shorten this into one... (8 Replies)
Discussion started by: p1ne
8 Replies
PYP(1)							      General Commands Manual							    PYP(1)

NAME
pyp - The Pyed Piper: A Modern Python Alternative to awk, sed and Other Unix Text Manipulation Utilities SYNOPSIS
pyp [options] files ... DESCRIPTION
pyp, the Pyed Piper, is a command line tool for text manipulation. It is similar to awk and sed in functionality, but its subcommands are Python based, and thus more familiar to many programmers. It can operate both on a per-line base and on the complete input stream. Different features can be pipelined in a single command by using the pipe character familiar from shell commands. pyp backs up its input for reruns with modified commands, and can save commands as macros. On the downside, the rerun feature makes it unsuitable for continuous pipe operation. OPTIONS
These programs follow the usual GNU command line syntax, with long options starting with two dashes (`-'). A summary of options is included below. For a complete description, use --manual. -h, --help Show this help message and exit. -m, --manual Prints out extended help. -l, --macro_list Lists all available macros. -s MACRO_SAVE_NAME, --macro_save=MACRO_SAVE_NAME Saves current command as macro. use "#" for adding comments EXAMPLE: pyp -s "great_macro # prints first letter" "p[1]". -f MACRO_FIND_NAME, --macro_find=MACRO_FIND_NAME Searches for macros with keyword or user name. -d MACRO_DELETE_NAME, --macro_delete=MACRO_DELETE_NAME Deletes specified public macro. -g, --macro_group Specify group macros for save and delete; default is user. -t TEXT_FILE, --text_file=TEXT_FILE Specify text file to load. For advanced users, you should typically cat a file into pyp. -x, --execute Execute all commands. -c, --turn_off_color Prints raw, uncolored output. -u, --unmodified_config Prints out generic PypCustom.py config file. -b BLANK_INPUTS, --blank_inputs=BLANK_INPUTS Generate this number of blank input lines; useful for generating numbered lists with variable 'n'. -n, --no_input Use with command that generates output with no input; same as --dummy_input 1. -k, --keep_false Print blank lines for lines that test as False. default is to filter out False lines from the output. -r, --rerun Rerun based on automatically cached data from the last run. Use this after executing "pyp", pasting input into the shell, and hitting CTRL-D. SEE ALSO
awk(1), grep(1), sed(1). AUTHOR
pyp was written by Toby Rosen <tobyrosen@gmail.com>. This manual page was written by Khalid El Fathi <khalid@elfathi.fr>, for the Debian project (and may be used by others). March 19, 2012 PYP(1)
All times are GMT -4. The time now is 11:24 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy