awk: Print fields between two delimiters on separate lines and send to variables
Post 302685143 by tay9000 on Friday 10th of August 2012 09:37:32 PM
Thanks a lot for your help. Your script is about 4x faster than my version, and it makes me feel better about the amount of disk I/O and CPU I am using. I used it as-is but changed the "ID" variable to use the $FILE variable. The ID is really just the filename without the leading path, but since the script no longer runs from inside the folder, the full path gets printed. =[
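
One possible way to get the bare filename back into the ID, without an extra sed call, is shell parameter expansion. This is a minimal sketch, assuming a POSIX-style shell; `FILE` is the variable mentioned above and the sample path is the one from the posted output.

```shell
#!/bin/sh
# Sketch: derive the ID from $FILE by dropping everything up to the last "/".
# The path is only sample data taken from the posted output.
FILE=/home/tay/spam/spam-0fWSqXDpwom4.gz

ID=${FILE##*/}    # remove the longest prefix that matches "*/"
echo "ID:$ID"     # prints: ID:spam-0fWSqXDpwom4.gz
```

`basename "$FILE"` gives the same result at the cost of one extra process per file. If the ID is built inside awk instead, `sub(/.*\//, "", id)` on a copy of the path should do the equivalent.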

Now I am getting output like the samples below. I am tempted to go back to my old habits and use sed to remove the extra characters, but you'd probably want to smack me, haha. The major challenge now is to create a file for each user in the To: fields and redirect each block of output to those files so I can email them to the recipients. Again, thank you for all the help!
Code:
Processed /home/tay/spam/spam-0fWSqXDpwom4.gz
To: <user1@domain1.com<<<user2@domain1.com< 	<user3@domain1.com<<<user4@domain1.com< 	<user5@domain1.com<<<user6@domain2.com< 	<user7@domain2.com
From: <ret@your.schoolsearch.us
Subject:Your<education<information
Score:12.403
ID:/home/tay/spam/spam-0fWSqXDpwom4.gz
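
The stray `<` characters in output like the block above could be cleaned up in awk rather than sed. This is only a sketch under assumptions the thread does not confirm: that `<` in the Subject line stands in for a space, and that in the To: and From: lines the runs of `<` and whitespace are just leftover delimiters between addresses. `clean_report` is a made-up name.

```shell
#!/bin/sh
# Sketch: tidy a report block read from standard input.
# Assumption: "<" replaces spaces in Subject and separates addresses in To/From.
clean_report() {
    awk '
        /^Subject:/ {
            gsub(/</, " ")                     # "<" back to spaces
            sub(/^Subject: */, "Subject: ")    # one space after the label
        }
        /^(To|From):/ {
            gsub(/[< \t]+/, " ")               # collapse delimiter runs
            sub(/ +$/, "")                     # drop any trailing space
        }
        { print }                              # other lines pass through
    '
}

clean_report <<'EOF'
From: <ret@your.schoolsearch.us
Subject:Your<education<information
Score:12.403
EOF
```

With the three sample lines above, this prints `From: ret@your.schoolsearch.us`, `Subject: Your education information` and `Score:12.403`. The same `gsub` calls could be folded into the awk script that generates the report, which would avoid a second pass over the output.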

Code:
Processed /home/tay/spam/spam-0fycklYG3rfD.gz
To: <user@domain1.com
From: <searchdentalinsurance.net@beastertaps.com
Subject:Find<affordable<dental<insurance
Score:18.222
ID:/home/tay/spam/spam-0fycklYG3rfD.gz
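
For the per-user files, one possible approach is to buffer each block and append it to a file named after every address found on its To: line. This is a rough sketch, not the script from this thread: it assumes every block starts with a "Processed" line, and `split_report`, the temporary output directory and the `.txt` suffix are made-up names.

```shell
#!/bin/sh
# Sketch: split a report on standard input into one file per recipient.
# Assumptions: blocks start with "Processed"; recipients are the tokens on
# the "To:" line that look like plain addresses (no "/" allowed, so an
# address can never point outside the output directory).
outdir=$(mktemp -d)

split_report() {
    awk -v outdir="$1" '
        function emit(   i, out) {
            for (i = 1; i <= nrcpt; i++) {
                out = outdir "/" rcpt[i] ".txt"
                printf "%s\n", block >> out    # append block plus a blank line
                close(out)                     # avoid running out of open files
            }
            block = ""; nrcpt = 0
        }
        /^Processed / { emit() }               # a new block: write the old one
        /^To:/ {
            line = $0
            sub(/^To:/, "", line)
            n = split(line, tok, /[< \t]+/)    # split on delimiter runs
            for (i = 1; i <= n; i++)
                if (tok[i] ~ /^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+$/)
                    rcpt[++nrcpt] = tok[i]
        }
        { block = block $0 "\n" }
        END { emit() }                         # write the final block
    '
}

split_report "$outdir" <<'EOF'
Processed /home/tay/spam/spam-0fycklYG3rfD.gz
To: <user@domain1.com
From: <searchdentalinsurance.net@beastertaps.com
Subject:Find<affordable<dental<insurance
Score:18.222
ID:/home/tay/spam/spam-0fycklYG3rfD.gz
EOF
ls "$outdir"
```

For the sample block this leaves a single file, `user@domain1.com.txt`, in the output directory. A block addressed to several users is appended to each of their files. Mailing the files is left out because the thread does not say which mailer is in use.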

 

spamassassin-run(3)					User Contributed Perl Documentation				       spamassassin-run(3)

NAME
       spamassassin - simple front-end filtering script for SpamAssassin

SYNOPSIS
       spamassassin [options] [ < mailmessage | path ... ]

       spamassassin -d [ < mailmessage | path ... ]

       spamassassin -r [ < mailmessage | path ... ]

       spamassassin -k [ < mailmessage | path ... ]

       spamassassin -W|-R [ < mailmessage | path ... ]

       Options:

        -L, --local                       Local tests only (no online tests)
        -r, --report                      Report message as spam
        -k, --revoke                      Revoke message as spam
        -d, --remove-markup               Remove spam reports from a message
        -C path, --configpath=path, --config-file=path
                                          Path to standard configuration dir
        -p prefs, --prefspath=file, --prefs-file=file
                                          Set user preferences file
        --siteconfigpath=path             Path for site configs
                                          (def: /etc/mail/spamassassin)
        --cf='config line'                Additional line of configuration
        -x, --nocreate-prefs              Don't create user preferences file
        -e, --exit-code                   Exit with a non-zero exit code if the
                                          tested message was spam
        --mbox                            read in messages in mbox format
        --mbx                             read in messages in UW mbx format
        -t, --test-mode                   Pipe message through and add extra
                                          report to the bottom
        --lint                            Lint the rule set: report syntax errors
        -W, --add-to-whitelist            Add addresses in mail to persistent
                                          address whitelist
        --add-to-blacklist                Add addresses in mail to persistent
                                          address blacklist
        -R, --remove-from-whitelist       Remove all addresses found in mail from
                                          persistent address list
        --add-addr-to-whitelist=addr      Add addr to persistent address whitelist
        --add-addr-to-blacklist=addr      Add addr to persistent address blacklist
        --remove-addr-from-whitelist=addr Remove addr from persistent address list
        --ipv4only, --ipv4-only, --ipv4   Disable attempted use of ipv6 for DNS
        --progress                        Print progress bar
        -D, --debug [area=n,...]          Print debugging messages
        -V, --version                     Print version
        -h, --help                        Print usage message

DESCRIPTION
       spamassassin is a simple front-end filter for SpamAssassin.

       Using the SpamAssassin rule base, it uses a wide range of heuristic
       tests on mail headers and body text to identify "spam", also known as
       unsolicited bulk email. Once identified, the mail is then tagged as
       spam for later filtering using the user's own mail user-agent
       application.

       The default tagging operations that take place are detailed in
       "TAGGING" in spamassassin.

       By default, message(s) are read in from STDIN (< mailmessage), or from
       specified files and directories (path ...) STDIN and files are assumed
       to be in file format, with a single message per file. Directories are
       assumed to be in a format where each file in the directory contains
       only one message (directories are not recursed and filenames
       containing whitespace or beginning with "." or "," are skipped). The
       options --mbox and --mbx can override the assumed format, see the
       appropriate OPTION information below.

       Please note that SpamAssassin is not designed to scan large messages.
       Don't feed messages larger than about 500 KB to SpamAssassin, as this
       will consume a huge amount of memory.

OPTIONS
       -e, --error-code, --exit-code
           Exit with a non-zero error code, if the message is determined to
           be spam.

       -h, --help
           Print help message and exit.

       -V, --version
           Print version and exit.

       -t, --test-mode
           Test mode. Pipe message through and add extra report. Note that
           the report text assumes that the message is spam, since in normal
           use it is only visible in this case. Pay attention to the score
           instead.

           If you run this with -d, the message will first have SpamAssassin
           markup removed before being tested.

       -r, --report
           Report this message as manually-verified spam. This will submit
           the mail message read from STDIN to various spam-blocker
           databases. Currently, these are the Distributed Checksum
           Clearinghouse "http://www.rhyolite.com/anti-spam/dcc/", Pyzor
           "http://pyzor.sourceforge.net/", Vipul's Razor
           "http://razor.sourceforge.net/", and SpamCop
           "http://www.spamcop.net/".

           If the message contains SpamAssassin markup, the markup will be
           stripped out automatically before submission. The support modules
           for DCC, Pyzor, and Razor must be installed for spam to be
           reported to each service. SpamCop reports will have greater effect
           if you register and set the "spamcop_to_address" option.

           The message will also be submitted to SpamAssassin's learning
           systems; currently this is the internal Bayesian
           statistical-filtering system (the BAYES rules). (Note that if you
           only want to perform statistical learning, and do not want to
           report mail to third-parties, you should use the "sa-learn"
           command directly instead.)

       -k, --revoke
           Revoke this message. This will revoke the mail message read from
           STDIN from various spam-blocker databases. Currently, these are
           Vipul's Razor.

           Revocation support for the Distributed Checksum Clearinghouse,
           Pyzor, and SpamCop is not currently available.

           If the message contains SpamAssassin markup, the markup will be
           stripped out automatically before submission. The support modules
           for Razor must be installed for spam to be revoked from the
           service.

           The message will also be submitted as 'ham' (non-spam) to
           SpamAssassin's learning systems; currently this is the internal
           Bayesian statistical-filtering system (the BAYES rules). (Note
           that if you only want to perform statistical learning, and do not
           want to report mail to third-parties, you should use the
           "sa-learn" command directly instead.)

       --lint
           Syntax check (lint) the rule set and configuration files,
           reporting typos and rules that do not compile correctly. Exits
           with 0 if there are no errors, or greater than 0 if any errors are
           found.

       -W, --add-to-whitelist
           Add all email addresses, in the headers and body of the mail
           message read from STDIN, to a persistent address whitelist. Note
           that you must be running "spamassassin" or "spamd" with a
           persistent address list plugin enabled for this to work.

       --add-to-blacklist
           Add all email addresses, in the headers and body of the mail
           message read from STDIN, to the persistent address blacklist. Note
           that you must be running "spamassassin" or "spamd" with a
           persistent address list plugin enabled for this to work.

       -R, --remove-from-whitelist
           Remove all email addresses, in the headers and body of the mail
           message read from STDIN, from a persistent address list. STDIN
           must contain a full email message, so to remove a single address
           you should use --remove-addr-from-whitelist instead. Note that you
           must be running "spamassassin" or "spamd" with a persistent
           address list plugin enabled for this to work.

       --add-addr-to-whitelist
           Add the named email address to a persistent address whitelist.
           Note that you must be running "spamassassin" or "spamd" with a
           persistent address list plugin enabled for this to work.

       --add-addr-to-blacklist
           Add the named email address to a persistent address blacklist.
           Note that you must be running "spamassassin" or "spamd" with a
           persistent address list plugin enabled for this to work.

       --remove-addr-from-whitelist
           Remove the named email address from a persistent address
           whitelist. Note that you must be running "spamassassin" or "spamd"
           with a persistent address list plugin enabled for this to work.

       --ipv4only, --ipv4-only, --ipv4
           Do not use IPv6 for DNS tests. Normally, SpamAssassin will try to
           detect if IPv6 is available, using only IPv4 if it is not. Use if
           the existing tests for IPv6 availability produce incorrect results
           or crashes.

       -L, --local
           Do only the ''local'' tests, ones that do not require an internet
           connection to operate. Normally, SpamAssassin will try to detect
           whether you are connected to the net before doing these tests
           anyway, but for faster checks you may wish to use this.

           Note that SpamAssassin's network rules are run in parallel. This
           can cause overhead in terms of the number of file descriptors
           required if --local is not used; it is recommended that the
           minimum limit on fds be raised to at least 256 for safety.

       -d, --remove-markup
           Remove SpamAssassin markup (the "SpamAssassin results" report,
           X-Spam-Status headers, etc.) from the mail message. The resulting
           message, which will be more or less identical to the original,
           pre-SpamAssassin input, will be output to STDOUT.

           (Note: the message will not be exactly identical; some headers
           will be reformatted due to some features of the Mail::Internet
           package, but the body text will be.)

       -C path, --configpath=path, --config-file=path
           Use the specified path for locating the distributed configuration
           files. Ignore the default directories (usually
           "/usr/share/spamassassin" or similar).

       --siteconfigpath=path
           Use the specified path for locating site-specific configuration
           files. Ignore the default directories (usually
           "/etc/mail/spamassassin" or similar).

       --cf='config line'
           Add additional lines of configuration directly from the
           command-line, parsed after the configuration files are read.
           Multiple --cf arguments can be used, and each will be considered a
           separate line of configuration. For example:

                   spamassassin -t --cf="body NEWRULE /text/" --cf="score NEWRULE 3.0"

       -p prefs, --prefspath=prefs, --prefs-file=prefs
           Read user score preferences from prefs (usually
           "$HOME/.spamassassin/user_prefs").

       --progress
           Prints a progress bar (to STDERR) showing the current progress.
           This option will only be useful if you are redirecting STDOUT (and
           not STDERR). In the case where no valid terminal is found this
           option will behave very much like the --showdots option in other
           SpamAssassin programs.

       -D [area,...], --debug [area,...]
           Produce debugging output. If no areas are listed, all debugging
           information is printed. Diagnostic output can also be enabled for
           each area individually; area is the area of the code to
           instrument. For example, to produce diagnostic output on bayes,
           learn, and dns, use:

                   spamassassin -D bayes,learn,dns

           Higher priority informational messages that are suitable for
           logging in normal circumstances are available with an area of
           "info".

           For more information about which areas (also known as channels)
           are available, please see the documentation at:

                   <http://wiki.apache.org/spamassassin/DebugChannels>

       -x, --nocreate-prefs
           Disable creation of user preferences file.

       --mbox
           Specify that the input message(s) are in mbox format. mbox is a
           standard Unix message folder format.

       --mbx
           Specify that the input message(s) are in UW .mbx format. mbx is
           the mailbox format used within the University of Washington's IMAP
           implementation; see "http://www.washington.edu/imap/".

SEE ALSO
       sa-learn(1) spamd(1) spamc(1) Mail::SpamAssassin::Conf(3)
       Mail::SpamAssassin(3)

PREREQUISITES
       "Mail::SpamAssassin"

BUGS
       See <http://issues.apache.org/SpamAssassin/>

AUTHORS
       The SpamAssassin(tm) Project <http://spamassassin.apache.org/>

COPYRIGHT
       SpamAssassin is distributed under the Apache License, Version 2.0, as
       described in the file "LICENSE" included with the distribution.

perl v5.16.3                      2011-06-06                spamassassin-run(3)