I am processing a file with awk to get a few input variables that I'll use later in my script. I am learning to script with awk, so please point out any mistakes I made in my code. A sample of the file follows:
Code:
# cat junk1.jnk
Folder1 : test_file (File)
test1_file (File)
test2_file (File)
Lines (9):
00140 Li CHAR 188
00141 Li CHAR 188
00142 Li CHAR 188
00143 Li CHAR 188
00144 Li CHAR 188
00145 Li CHAR 375
00146 Li CHAR 375
00147 Li CHAR 375
I am trying to extract a comma-separated list of the file names identified by the last field in parentheses, (File), followed by the number of lines, which is (9), and a comma-separated list of the unique CHAR values (the last field of each line that starts with a hex value, after the line "Lines (9):"). I am using the code below. I get the file names and the line count, but I cannot get the comma-separated list of unique CHAR values; in this case it should be 188,375.
My current output is as follows. As you can see, the only value I get for CHAR is the last one, 375. Could you also help me understand why I am getting the file name test2_file,test2_file twice?
As usual, you guys are rock stars, and I would appreciate your help.
There is no reason to use cat to feed data to awk; awk is perfectly capable of reading files on its own. Using cat causes all of the data to be read and written an extra time, consumes more system resources, and slows down your script.
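A quick sketch of the difference, using a throwaway sample file (the file name and contents here are just for illustration):

```shell
# Create a small sample file for the demonstration.
printf 'a b 1\nc d 2\n' > sample.txt

# Wasteful: cat reads the file and writes it into a pipe, and awk
# then reads the same data a second time from standard input.
cat sample.txt | awk '{ print $NF }'

# Better: awk opens and reads the file itself.
awk '{ print $NF }' sample.txt
```

Both commands print the same two lines; the second one simply does it without the extra process and the extra copy of the data.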
Note that in the code I marked in red above, you are careful to print each filename value (followed by a comma) when you find one, but you then also print the last filename found when you get to the END clause of your awk script. That is why test2_file shows up twice.
You don't do that with the values you store in the CHR variable, so you just print the last value found instead of all of them. And there isn't any check in your code for matching values to eliminate duplicates.
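The usual awk idiom for that kind of on-the-fly duplicate elimination is an associative array, conventionally named seen; a minimal sketch on sample CHAR values:

```shell
# Print each value only the first time it appears: seen[$1] is 0
# (false) on first sight, so !seen[$1]++ is true once per value and
# the default action prints the record.
printf '188\n188\n188\n375\n375\n' | awk '!seen[$1]++'
# prints 188 then 375
```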
You might also have noticed that your two heading lines don't line up with each other or with the data line that you print at the end.
The code rdrtx1 suggested accumulates the comma-separated value strings by always appending a comma when a new value is added, then removes the trailing comma in the END clause. That code also lines up header and data columns as long as the list of filenames is no more than 40 characters long.
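rdrtx1's code is not quoted in this thread, but the approach described could look roughly like this (my reconstruction of the idea, not the original post):

```shell
# Recreate an abbreviated version of the sample file from the original post.
cat > junk1.jnk << 'EOF'
Folder1 : test_file (File)
test1_file (File)
test2_file (File)
Lines (9):
00140 Li CHAR 188
00141 Li CHAR 188
00145 Li CHAR 375
EOF

# Always append a trailing comma while accumulating, then strip it
# once in the END clause; headers use a fixed 40-character width.
awk '
$NF == "(File)" { files = files $(NF - 1) "," }
$1 == "Lines"   { gsub(/[^0-9]/, ""); lines = $0 }
END {
    sub(/,$/, "", files)          # remove the last comma
    printf("%-40s %s\n", "File Names", "Lines")
    printf("%-40s %s\n", files, lines)
}' junk1.jnk
```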
The following code self-adjusts the headings to match the data found in the file being processed. It takes a shortcut by assuming that no field will contain data longer than 61 characters. If your real data will have one or more fields longer than that, either add more dashes to the value of the DASHES variable, or replace the second printf in the END clause with three loops that print as many dashes as are needed for each of the three headings. (I will leave that adjustment as an exercise for the reader.)
It also uses a function to add values to the two string variables and only adds a comma as a subfield-separator when the string isn't empty to start with.
Code:
awk '
function AddVal(Value, String) {
    # Add "Value" to a comma-separated value string identified by "String"
    # or, if it does not already exist, create it.
    String = (String == "" ? "" : String ",") Value
    # Return the new value for "String".
    return(String)
}
$NF == "(File)" {
    # Add a filename to the CSG variable.
    CSG = AddVal($(NF - 1), CSG)
    next
}
$1 == "Lines" {
    # Grab the number of lines to be reported.
    match($0, /[[:digit:]]+/)   # I assume this is a decimal number.
    LNN = substr($0, RSTART, RLENGTH)
    next
}
$1 ~ /^[[:xdigit:]]{5}$/ {
    # We found a 5 hexadecimal digit string in $1; determine whether we
    # have seen the value in the last field before...
    if ($NF in seen)
        next    # We have seen it; move on to the next input record.
    # We have not seen it before.  Note that we have seen it now
    # (referencing the element is enough to create it)...
    seen[$NF]
    # and add this value to the CHR variable.
    CHR = AddVal($NF, CHR)
}
END {
    # Set DASHES to a long string of dashes...
    DASHES = "-------------------------------------------------------------"
    # Calculate the longest string to be printed in the filenames field...
    fnl = ((l1 = length("File Names")) > (l2 = length(CSG))) ? l1 : l2
    # and in the lines field...
    ll = ((l1 = length("Lines")) > (l2 = length(LNN))) ? l1 : l2
    # and in the CHARS field.
    vall = ((l1 = length("CHARS")) > (l2 = length(CHR))) ? l1 : l2
    # Print the two-line header adjusted to fit the actual data.
    printf("%-*.*s %-*.*s %-*.*s\n", fnl, fnl, "File Names",
        ll, ll, "Lines", vall, vall, "CHARS")
    printf("%-*.*s %-*.*s %-*.*s\n", fnl, fnl, DASHES,
        ll, ll, DASHES, vall, vall, DASHES)
    # Print the accumulated data.
    printf("%*.*s %*.*s %*.*s\n", fnl, fnl, CSG,
        ll, ll, LNN, vall, vall, CHR)
}' junk1.jnk