Post 302971660 by cmccabe on Saturday 23rd of April 2016 09:25:38 AM
awk does not find ids with semi-colon in the name

I am using awk to search $5 of the "input" file, using the "list" file as the search criteria. If an id from "list" is found in $5 of "input", it is counted among the ids found. If an id from "list" is not found in "input", it is printed as missing. The awk below runs and works for most ids, but ids that appear in a $5 containing a ; are reported as missing even though they can be found in the file manually. I tried to account for the ; (see the awk with error below), but I am not sure where to add this. Thank you.

input
Code:
chrX    48933012    48933134    chrX:48933012-48933134    PRAF2;WDR45
chrX    48934078    48934193    chrX:48934078-48934193    PRAF2;WDR45
chrX    48934293    48934422    chrX:48934293-48934422    PRAF2;WDR45
chr17    42426522    42426680    chr17:42426522-42426680    GRN;L01117
chr17    42426783    42426929    chr17:42426783-42426929    GRN;L01117
chr17    30814628    30815572    chr17:30814628-30815572    AK307275;CDK5R1
chr2    234668923    234669807    chr2:234668923-234669807    UGT1A1;UGT1A10;UGT1A3;UGT1A4;UGT1A5;UGT1A6;UGT1A7;UGT1A8;UGT1A9
chr2    234675669    234675821    chr2:234675669-234675821    UGT1A1;UGT1A10;UGT1A3;UGT1A4;UGT1A5;UGT1A6;UGT1A7;UGT1A8;UGT1A9
chr12    9221325    9221448    chr12:9221325-9221448    A2M
chr12    9222330    9222419    chr12:9222330-9222419    A2M

list
Code:
PRAF
GRN
CDK5R1
UGT1A1
A2M

current output
Code:
1 ids found
CDK5R1 is missing
PRAF is missing
GRN is missing
UGT1A1 is missing

desired output
Code:
5 ids found

Code:
awk '
    NR==FNR { lookup[$0]++; next }
    ($5 in lookup) { seen[$5]++ } 
    END {
      print length(seen)" ids found"; 
      for (id in seen) delete lookup[id]; 
      for (id in lookup) print id " is missing"
}' list input > count

awk with error
Code:
awk '
>     NR==FNR { lookup[$0]+|;++; next }
>     ($5 in lookup) { seen[$5]++ } 
>     END {
>       print length(seen)" ids found"; 
>       for (id in seen) delete lookup[id]; 
>       for (id in lookup) print id " is missing"
> }' list2 input > count
awk: cmd. line:2:     NR==FNR { lookup[$0]+|;++; next }
awk: cmd. line:2:                          ^ syntax error
awk: cmd. line:2:     NR==FNR { lookup[$0]+|;++; next }
awk: cmd. line:2:                              ^ syntax error
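
One possible way to handle the ; is to leave the `lookup` line alone and instead split $5 into its individual ids before testing them. The following is only a sketch under stated assumptions: `count_ids` is a hypothetical helper name, the sample data is a shortened copy of the files above, and each ;-separated part of $5 is assumed to need an exact match against a line of "list".

```shell
# Hypothetical helper (not from the original post): counts the ids in the
# list file ($1) that occur as a whole ;-separated part of $5 in the input
# file ($2), then reports the list ids that were never seen.
count_ids() {
    awk '
        NR==FNR { lookup[$0]; next }        # first file: remember every id in list
        {
            n = split($5, parts, ";")       # break $5 into its individual ids
            for (i = 1; i <= n; i++)
                if (parts[i] in lookup) seen[parts[i]]
        }
        END {
            found = 0
            for (id in seen) { found++; delete lookup[id] }
            print found " ids found"
            for (id in lookup) print id " is missing"
        }' "$1" "$2"
}

# Shortened copies of the "list" and "input" files, written to a temp directory.
tmp=$(mktemp -d)
printf '%s\n' PRAF GRN CDK5R1 UGT1A1 A2M > "$tmp/list"
cat > "$tmp/input" <<'EOF'
chrX 48933012 48933134 chrX:48933012-48933134 PRAF2;WDR45
chr17 42426522 42426680 chr17:42426522-42426680 GRN;L01117
chr17 30814628 30815572 chr17:30814628-30815572 AK307275;CDK5R1
chr2 234668923 234669807 chr2:234668923-234669807 UGT1A1;UGT1A10;UGT1A3
chr12 9221325 9221448 chr12:9221325-9221448 A2M
EOF

result=$(count_ids "$tmp/list" "$tmp/input")
echo "$result"
# prints:
# 4 ids found
# PRAF is missing
rm -r "$tmp"
```

With exact matching, PRAF in "list" does not match PRAF2 in "input", so this sketch reports 4 ids rather than the desired 5. If that list entry is meant to be PRAF2 (an assumption), the count becomes 5. If a partial match is intended instead, the `in` test would have to be replaced by a prefix or substring comparison.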


Last edited by cmccabe; 04-23-2016 at 10:28 AM. Reason: added awk error
 
