02-21-2010
noob question - is awk the tool to clean dirty text files?
Hi,
Never mind, I think I've found the answer. It appears I was looking for index, match, sub, and gsub.
I want to write a shell script that will clean the HTML out of a bunch of files and format the data for import into Excel.
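A minimal sketch of that kind of cleanup, assuming the data sits in simple one-row-per-line HTML tables. The sample markup and the function name strip_html are made up for illustration; gsub is standard awk.

```shell
#!/bin/sh
# Sketch: strip HTML tags and emit comma-separated lines.
# Assumes each table row sits on a single input line (hypothetical layout).
strip_html() {
    awk '
    {
        gsub(/<\/td>[ \t]*<td[^>]*>/, ",")   # boundary between two cells -> comma
        gsub(/<[^>]*>/, "")                  # drop every remaining tag
        gsub(/&nbsp;/, " ")                  # decode two common entities
        gsub(/&amp;/, "\\&")                 # "\\&" yields a literal ampersand
        gsub(/^[ \t]+|[ \t]+$/, "")          # trim leading/trailing blanks
        if (length($0) > 0) print            # skip lines that held only tags
    }'
}

# Demo on an invented sample
strip_html <<'EOF'
<table>
<tr><td>Name</td><td>Qty</td></tr>
<tr><td>Nuts &amp; bolts</td><td class="n">12</td></tr>
</table>
EOF
```

This does not quote cells that themselves contain commas and does not handle tags that span several lines, so it only suits fairly regular input.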
Awk seems like a powerful tool, but it seems oriented toward text that is already formatted and delimited. From my cursory study, awk only seems able to access lines and words (records and fields, in awk's terms). Is there a way to find and manipulate chunks of text within an awk "word"?
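For what it's worth, the standard string functions do work inside a single field. A small sketch, with an invented input line and a hypothetical function name inside_field:

```shell
#!/bin/sh
# Sketch: find and edit pieces of text inside one awk field ("word").
inside_field() {
    awk '
    {
        word = $2
        at = index(word, "@")                 # position of a literal substring, 0 if absent
        user = (at > 0) ? substr(word, 1, at - 1) : word
        digits = ""
        if (match(word, /[0-9]+/))            # sets RSTART and RLENGTH on success
            digits = substr(word, RSTART, RLENGTH)
        sub(/@.*/, "", $2)                    # edit the field in place; $0 is rebuilt
        print user, digits, $0
    }'
}

# Demo on an invented line
echo 'id: bob42@example.com active' | inside_field
```

Here index locates a fixed string, match locates a regular expression, substr extracts the piece, and sub rewrites the field itself.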
Or perhaps there are better tools...?
Last edited by yogert909; 02-21-2010 at 08:18 PM.
IGAWK(1) Utility Commands IGAWK(1)
NAME
igawk - gawk with include files
SYNOPSIS
igawk [ all gawk options ] -f program-file [ -- ] file ...
igawk [ all gawk options ] [ -- ] program-text file ...
DESCRIPTION
Igawk is a simple shell script that adds the ability to have ``include files'' to gawk(1).
AWK programs for igawk are the same as for gawk, except that, in addition, you may have lines like
@include getopt.awk
in your program to include the file getopt.awk from either the current directory or one of the other directories in the search path.
OPTIONS
See gawk(1) for a full description of the AWK language and the options that gawk supports.
EXAMPLES
cat << EOF > test.awk
@include getopt.awk
BEGIN {
while (getopt(ARGC, ARGV, "am:q") != -1)
...
}
EOF
igawk -f test.awk
SEE ALSO
gawk(1)
Effective AWK Programming, Edition 1.0, published by the Free Software Foundation, 1995.
AUTHOR
Arnold Robbins (arnold@skeeve.com).
Free Software Foundation Nov 3 1999 IGAWK(1)