Shell Programming and Scripting: Post questions about KSH, CSH, SH, BASH, PERL, PHP, SED, AWK and OTHER shell scripts and shell scripting languages here.
I have a file I've already partially pruned with grep that has data like:
Code:
<a href="MasterDetailResults.asp?textfield=a&Application=3D Home Architect 4">3D Home Architect 4</a> </td>
Approved </td>
--
<a href="MasterDetailResults.asp?textfield=a&Application=3d Home Architect 6">3d Home Architect 6</a> </td>
Not Approved </td>
--
<a href="MasterDetailResults.asp?textfield=a&Application=A to Zap">A to Zap</a> </td>
Approved </td>
--
except much, much more of it ;-)
I want to get the application name (e.g. 3D Home Architect 4) and the status (Approved or Not Approved) and turn it into this:
Code:
3D Home Architect 4|Approved
3d Home Architect 6|Not Approved
A to Zap|Approved
etc., for use as a searchable database or for import into Excel.
I want to use bash scripting with sed or gawk to do this in the smallest number of lines (number of lines is not critical, of course ;-)
Thanks in advance for your help.
|
||||
|
Hi,
try
Code:
sed -n '/Application/{N;s/.*Application=\([^"]*\).*\n\(.*\)<.*/\1 | \2/p}' file
or, with an escaped literal newline in place of \n:
Code:
sed -n '/Application/{N;s/.*Application=\([^"]*\).*\
\(.*\)<.*/\1 | \2/p}' file
HTH
Chris
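For anyone who wants to see what the one-liner does piece by piece, here is a small sketch that runs it on two records shaped like the data in the question. The temporary file and the variable names `sample` and `result` are assumptions made for this demo; the sed program itself is the one posted above, unchanged.

```shell
# Build a two-record sample in the shape shown in the question.
# (sample and result are made-up names for this demo.)
sample=$(mktemp)
cat > "$sample" <<'EOF'
<a href="MasterDetailResults.asp?textfield=a&Application=3D Home Architect 4">3D Home Architect 4</a> </td>
Approved </td>
--
<a href="MasterDetailResults.asp?textfield=a&Application=A to Zap">A to Zap</a> </td>
Approved </td>
EOF

# -n            : print nothing unless a command asks for it (the trailing p flag)
# /Application/ : act only on lines that contain the link
# N             : append the next input line (the status) to the pattern space
# \([^"]*\)     : capture the name, i.e. everything up to the closing quote
# \n\(.*\)<     : capture the status, i.e. the second line up to its last "<"
result=$(sed -n '/Application/{N;s/.*Application=\([^"]*\).*\n\(.*\)<.*/\1 | \2/p}' "$sample")

printf '%s\n' "$result"
rm -f "$sample"
```

Note that the second capture group takes everything on the status line before the last "<", so any whitespace around the status text ends up in the output as well.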
|
||||
|
Code:
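awk -F"\"" '
/Application=/{
  sub(".*=","",$2); s=$2               # $2 is the href value; keep the text after the last "="
  getline                              # the status is on the next line
  sub(/^[ \t]+/,""); sub(/ *<.*/,"")   # trim leading whitespace and the trailing </td>
  print s "|" $0
}' file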
|
|
||||
|
Thank you all for your solutions. I'm going to use Christoph Spohr's because I'm more comfortable with sed than I am with awk (although I know awk is very powerful). I get output with spaces after the pipe because there are spaces at the beginning of the status line. How can I modify
Code:
sed -n '/Application/{N;s/.*Application=\([^"]*\).*\n\(.*\)<.*/\1 | \2/p}' file
to strip them?
Also, what if my input file has another line between the two lines in question:
Code:
<tr>
<td height="23" align="default" valign="top">
<a href="MasterDetailResults.asp?textfield=a&Application=3D Home Architect 4">3D Home Architect 4</a> </td>
<td align="default" valign="top">
Approved </td>
</tr>
I suppose I could first delete the <td align="default" valign="top"> line with sed before finishing things off with the sed code above.
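One possible way to handle both points in a single sed call, offered as a sketch rather than a definitive answer: save the name in the hold space, skip the intervening line, clean up the status line, then join the two. The function name `extract_apps` is hypothetical, and the sketch assumes exactly one line sits between the link line and the status line, as in the sample above.

```shell
# extract_apps is a made-up helper name, not something from the thread.
# Assumption: exactly one line separates the <a href=...> line from the status line.
#
# Steps inside the braces, in order:
#   s/.../\1/   reduce the link line to the application name
#   h           save the name in the hold space
#   n, n        step over the intervening <td ...> line to the status line
#   s/^.../     strip leading whitespace from the status line
#   s/...<.*//  strip trailing whitespace and everything from the first "<" on
#   H, g        append the status to the saved name and fetch both back
#   s/\n/|/p    join the two with a pipe and print
extract_apps() {
  sed -n '/Application=/{
    s/.*Application=\([^"]*\)".*/\1/
    h
    n
    n
    s/^[[:space:]]*//
    s/[[:space:]]*<.*//
    H
    g
    s/\n/|/p
  }' "$1"
}
```

It would be called as extract_apps followed by the name of the HTML file, with the output redirected wherever it is needed. If the number of lines between the link and the status varies, this fixed pair of n commands will not be enough and a small awk state machine may be the easier route.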