awk does not find ids with semi-colon in the name
Post 302971675 by Don Cragun on Saturday 23rd of April 2016 05:32:32 PM
There really isn't any need to count the number of times you have seen an ID in the lookup[] and seen[] arrays. Assuming that your sample input data isn't representative of the sizes of your real input files, the following suggestion might be faster or slower than Scrutinizer's suggestion, since it handles the list of IDs in lookup[] (from the 1st input file) and the list of IDs in seen[] (from the 2nd input file) differently:
  • Scrutinizer's code trims lookup[] and adds entries to seen[] as it processes each input line. So seen[] will only contain elements that had previously been in lookup[].
  • The following code doesn't look at lookup[] while it's reading the 2nd input file. It adds an element to seen[] for each ID found in field 5 of each line in the 2nd input file. It then makes a single pass through lookup[] at the end, removing entries for IDs that are also found in seen[]. (Note that this might be a little more portable to other versions of awk because it doesn't depend on being able to use length(array_name), which is an extension not required by the standards.)
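
As a rough sketch of the portability point: when a count of array elements is needed, it can be had without length(array_name) by walking the array with for (... in ...). The helper name count_ids and the one-ID-per-line file layout are assumptions made for illustration; they are not taken from either script in this thread.

```shell
# count_ids FILE
# Hypothetical helper: prints the number of distinct values found in
# field 1 of FILE without relying on the length(array) extension.
count_ids() {
	awk '
	# Referencing lookup[$1] creates the element if it does not exist yet.
	{ lookup[$1] }
	END {
		n = 0
		# Walk the array once; each distinct ID is visited exactly once.
		for (id in lookup)
			n++
		print n
	}' "$1"
}
```

Calling count_ids list would print the number of distinct IDs in a file named list, with duplicate lines counted once.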

You might want to compare the time taken by our two approaches with some of your real data.
Code:
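awk '
# 1st file (list): remember each ID from field 1.
FNR == NR {
	lookup[$1]
	next
}
# 2nd file (input): field 5 is a semicolon-separated list of IDs; note each one.
{	for(i = split($5, F, /;/); i; i--)
		seen[F[i]]
}
# Count and drop the IDs that were seen; whatever is left in lookup[] is missing.
END {	for(id in lookup)
		if(id in seen) {
			found++
			delete lookup[id]
		}
	# NR - FNR is the number of lines read from the 1st file;
	# found + 0 prints 0 instead of an empty string when nothing matched.
	print found + 0, "of", NR - FNR, "ids found"
	for(id in lookup)
		print id, "is missing"
}' list input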

which, with your sample input files, produces the output:
Code:
4 of 5 ids found
PRAF is missing
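
The loop over field 5 in the script above relies on split() returning the number of pieces it stored in the array, so the loop can start at that count and walk down to 1. A minimal sketch of just that idiom follows; the helper name split_ids and the sample string in the usage note are made up for illustration.

```shell
# split_ids STRING
# Hypothetical helper: prints each semicolon-separated piece of STRING on
# its own line, last piece first, using the same loop shape as the script.
split_ids() {
	printf '%s\n' "$1" | awk '{
		# split() fills F[1]..F[n] and returns n; the loop ends when i reaches 0.
		for (i = split($0, F, /;/); i; i--)
			print F[i]
	}'
}
```

For example, split_ids 'AAA;BBB;CCC' prints CCC, BBB and AAA on separate lines. The order doesn't matter in the script above, since each piece is only used as an index into seen[].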

If you don't want the additional information shown in red in the output above, remove the code shown in red in the above script.
 
