Since the OP says that file1.txt and file2.txt actually do have BOLD tags (i.e., [B] and [/B]) in the data to mark the digit strings to be matched, and awk's match function treats those tags in the variable i as parts of a regular expression (bracket expressions) instead of as literal text, RudiC's code doesn't seem to work. I didn't get any output from pravin27's perl script either (I assume for the same reason, but I haven't dug into it). And given that we have been shown some lines with "VPN ACC", some with "VPNACC", some with "KCC script,", and some with "ID = <digit_string>,", I'm not confident that the bolded string will always be in field 25 in file1.txt. Note also that the OP said he was looking for an underscore followed by 3 or 4 digits (not just 3) after the matching bolded digit string in file2.txt, but none of the sample input had 4 digits.
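To illustrate the point about match() treating the tags as a regular expression, here is a minimal sketch. The sample line is made up for the demonstration (it is not taken from the OP's files), and only the variable name i comes from the discussion above. It compares three ways of looking for the tagged string: as an ERE, as literal text with index(), and as an ERE with the square brackets protected inside bracket expressions.

```shell
# Hypothetical sample line; not copied from the OP's data.
line='VPN ACC [B]15819[/B]_200'

# As an ERE, "[B]15819[/B]" means: the letter B, the digits 15819, then
# one character that is either "/" or "B". The tagged text does not match.
asregex=$(printf '%s\n' "$line" | awk -v i='[B]15819[/B]' '{ print match($0, i) }')

# index() compares literal text, so the tags are found as written.
literal=$(printf '%s\n' "$line" | awk -v i='[B]15819[/B]' '{ print index($0, i) }')

# Wrapping each square bracket in its own bracket expression makes the
# ERE match the literal "[" and "]" characters.
escaped=$(printf '%s\n' "$line" | awk '{ print match($0, "[[]B[]]15819[[]/B[]]") }')

echo "$asregex $literal $escaped"
```

With this sample line the three values should be 0 (no match), 9, and 9 (the tag starts at character 9).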
The following seems to work in my tests, but we could probably do something more efficient if we had a better definition of the input file record formats:
Code:
awk '
BEGIN { bds = "[[]B[]]([0-9])+[[]/B[]]"   # ERE to match a bolded digit string.
}
# If a line in the 1st file ...
FNR == NR {
    # contains a bolded string of digits, save that bolded string.
    if(match($0, bds)) {
        bs[substr($0, RSTART, RLENGTH)]
        # Debugging aid; remove this printf to get just the requested output.
        printf("NR=%d, bs[%s] created\n", NR, substr($0, RSTART, RLENGTH))
    }
    next
}
# If a line in the 2nd input file contains a bolded string followed by an
# underscore, 3 or 4 digits, and a space...
/[[]\/B[]]_([0-9]){3,4} / {
    # print the line if the bolded digit string in this line appeared in
    # the first file.
    if(match($0, bds) && substr($0, RSTART, RLENGTH) in bs) print
}' file1.txt file2.txt
With the sample data given in message #3 in this thread, it produces the output:
Code:
NR=1, bs[193708] created
NR=2, bs[15819] created
NR=3, bs[15820] created
< (VPN) 999993080285668, (VPNPRO) mtel-MVPN_(VPNACC) 15819_200 >
< (VPN) 999993080285669, (VPNPRO) mtel-MVPN_(VPNACC) 15820_201 >
Note that the digit strings 193708, 15819, and 15820 in the output above are displayed in bold on the forum because of the BOLD tags in file1.txt and file2.txt, not because I massaged the output to highlight the matched fields. The printf statement in the script is a debugging aid that produces the three "NR=..." lines at the start of the output. Remove the printf to get just the output requested.
As always, if you want to run this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk, /usr/xpg6/bin/awk, or nawk.
I am still not confident that this will work. If relatively long lines are split (as they were in the 1st message in this thread, but not in the 3rd message), there is nothing to prevent the bolded string from being split across two lines in file1.txt. I could modify the script to work even if this happens, but given the poor specification of the input file formats, I didn't take the extra time to do that.
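For what it's worth, one way such a modification could look is sketched below. It is only a sketch and rests on an assumption the thread does not confirm: that a split line can be repaired by concatenating consecutive lines of file1.txt with nothing in between. The sample data is invented for the test, and the 3-or-4-digit check is spelled out as [0-9][0-9][0-9][0-9]? so the sketch does not depend on the awk in use supporting {3,4} intervals.

```shell
# Hypothetical sample data: [B]15819[/B] is split across two lines in file1.txt.
tmp=$(mktemp -d)
printf '%s\n' 'ID = [B]158' '19[/B], KCC script' > "$tmp/file1.txt"
printf '%s\n' '< (VPNACC) [B]15819[/B]_200 >' \
              '< (VPNACC) [B]15820[/B]_201 >' > "$tmp/file2.txt"

result=$(awk '
BEGIN { bds = "[[]B[]]([0-9])+[[]/B[]]" }   # ERE to match a bolded digit string.
# Join every line of the 1st file into one string (assumption: nothing was
# lost or inserted where the long lines were split).
FNR == NR { all = all $0; next }
# On the first line of the 2nd file, pull every bolded digit string out of
# the joined text and remember it.
!done {
    done = 1
    while (match(all, bds)) {
        bs[substr(all, RSTART, RLENGTH)]
        all = substr(all, RSTART + RLENGTH)
    }
}
# Same test as before for lines in the 2nd file.
/[[]\/B[]]_[0-9][0-9][0-9][0-9]? / {
    if (match($0, bds) && substr($0, RSTART, RLENGTH) in bs) print
}' "$tmp/file1.txt" "$tmp/file2.txt")

echo "$result"
rm -r "$tmp"
```

Joining lines this way also picks up more than one bolded string per line, but it could produce a false match if one record happens to end with "[B]" plus digits and the next record begins with digits plus "[/B]", and it holds all of file1.txt in memory.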