Checking data with source


 
# 1  
Old 04-27-2015

Each derived group is built from two source groups, as listed in the group table.

Group Table
Code:
Derived Source1 Source2
BAC GA1 TR
HGT FG1 RF2

If both sources have the same value (column 3) for a variable, the derived group must have that same value for the variable. For example, if

Code:
Group Variable Value
GA1 V1 AA
TR V1 AA

then we know

Code:
Group Variable Value
BAC V1 AA

If one or both sources are absent, we can't say anything about the derived group.

If the two source values differ, we also can't say anything about the derived group.

From the actual data, I need to identify which derived groups have values that do not match their sources.


Code:
BAC V1 AG
GA1 V1 AG
TR  V1 AA
BAC V2 AT
GA1 V2 TG
TR  V2 TG
BAC V3 AT
GA1 V3 TG
GH5 V2 TG
HGT V1 AC
FG1 V1 AC
RF2 V1 AC
HGT V4 AC
FG1 V4 AG
RF2 V4 AG



So the output I'm looking for is

Code:
Group Variable Value SHOULD_BE
BAC V2 AT TG
HGT V4 AC AG


I am stuck here, please help. I am happy to explain more if needed. My attempt so far:

Code:
awk 'NR==FNR{ g[$1]; s1[$1]=$2;s2[$1]=$3;next} $1 in g{ c[$1,$2]=$3; if (c[g,$2]!=c[s1[[$1],$2] && c[[s1[$1],$2]==c[[s2[$1],$2] ) print $0,c[[s2[$1],$2]}' group data


Last edited by abh.kumar; 04-27-2015 at 07:52 PM..
# 2  
Old 04-28-2015
You obviously didn't run your snippet even once, as it has syntax errors.

---------- Post updated at 13:20 ---------- Previous update was at 12:56 ----------

Anyhow, try
Code:
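awk '
FNR==1          {FC++}                  # count input files: 1 = group, 2 and 3 = data
FC==1           {s1[$1]=$2              # first source of each derived group
                 s2[$1]=$3              # second source of each derived group
                 next
                }
FC==2           {D[$1,$2]=$3            # first pass over data: remember every value
                 next
                }
FC==3           {TMP = D[s1[$1],$2]     # second pass: value held by the first source
                 if     (TMP &&
                         TMP == D[s2[$1],$2] &&
                         TMP != $3) print $0, TMP
                }

' group data data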
BAC V2 AT TG
HGT V4 AC AG

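You'll need to go through data twice because a derived group's line can appear before the lines of its sources. Keep the check for non-empty TMP: it filters out lines whose first source has no value for that variable, including every line that is not a derived group at all.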
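A variant of the same two-pass idea, offered only as a sketch: it tests array membership with `in` instead of relying on a missing element comparing as empty, and it prints the header row requested in post #1. The temporary directory and the names `pass`, `a`, `b`, `hdr` and `result` are assumptions made for illustration; the sample rows are the ones from post #1, with the group header row left out.

```shell
# Sample files from post #1, written to a scratch directory (an assumption).
workdir=$(mktemp -d)
cat > "$workdir/group" <<'EOF'
BAC GA1 TR
HGT FG1 RF2
EOF
cat > "$workdir/data" <<'EOF'
BAC V1 AG
GA1 V1 AG
TR  V1 AA
BAC V2 AT
GA1 V2 TG
TR  V2 TG
BAC V3 AT
GA1 V3 TG
GH5 V2 TG
HGT V1 AC
FG1 V1 AC
RF2 V1 AC
HGT V4 AC
FG1 V4 AG
RF2 V4 AG
EOF

result=$(awk '
FNR == 1  { pass++ }                           # 1 = group, 2 and 3 = data
pass == 1 { s1[$1] = $2; s2[$1] = $3; next }   # sources of each derived group
pass == 2 { D[$1, $2] = $3; next }             # remember every value
($1 in s1) {                                   # derived groups only
    a = s1[$1] SUBSEP $2                       # key of the first source value
    b = s2[$1] SUBSEP $2                       # key of the second source value
    # Report only when both sources exist, agree, and differ from this line.
    if ((a in D) && (b in D) && D[a] == D[b] && D[a] != $3) {
        if (!hdr++) print "Group Variable Value SHOULD_BE"
        print $0, D[a]
    }
}
' "$workdir/group" "$workdir/data" "$workdir/data")

printf '%s\n' "$result"
rm -rf "$workdir"
```

Because `in` never creates array elements, this version also leaves `s1`, `s2` and `D` untouched for groups that are not listed in the group file.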
This User Gave Thanks to RudiC For This Post:
# 3  
Old 04-28-2015
Yes, I am getting syntax errors, RudiC. I did run my script, and I tried some other variations of it as well, but I can't seem to apply the array-of-arrays logic. What I am trying to do is put the first column of the group file in array g, then store the two sources in arrays indexed by that first column.

I'm struggling with the syntax, because the second file probably has to be read entirely into memory before the if check can run.
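One possible way to get the "read everything first" behaviour is to load data once and postpone all checks to an `END` block. This is a sketch under assumptions, not the approach posted above: the function name `check_groups`, the arrays `val`, `grp` and `var`, and the scratch directory are all hypothetical, and the sample rows are again those from post #1.

```shell
# check_groups GROUPFILE DATAFILE
# Prints "group variable value should_be" for each derived group whose value
# differs from a value that both of its sources agree on.
check_groups() {
    awk '
    NR == FNR { s1[$1] = $2; s2[$1] = $3; next }   # first file: group table
    {
        val[$1, $2] = $3                           # keep every data value
        if ($1 in s1) {                            # queue derived-group lines
            n++; grp[n] = $1; var[n] = $2
        }
    }
    END {                                          # all of data is in memory now
        for (i = 1; i <= n; i++) {
            a = s1[grp[i]] SUBSEP var[i]
            b = s2[grp[i]] SUBSEP var[i]
            have = val[grp[i], var[i]]
            if ((a in val) && (b in val) && val[a] == val[b] && val[a] != have)
                print grp[i], var[i], have, val[a]
        }
    }' "$1" "$2"
}

# Usage with the sample data from post #1.
workdir=$(mktemp -d)
printf '%s\n' 'BAC GA1 TR' 'HGT FG1 RF2' > "$workdir/group"
cat > "$workdir/data" <<'EOF'
BAC V1 AG
GA1 V1 AG
TR  V1 AA
BAC V2 AT
GA1 V2 TG
TR  V2 TG
BAC V3 AT
GA1 V3 TG
GH5 V2 TG
HGT V1 AC
FG1 V1 AC
RF2 V1 AC
HGT V4 AC
FG1 V4 AG
RF2 V4 AG
EOF
out=$(check_groups "$workdir/group" "$workdir/data")
printf '%s\n' "$out"
rm -rf "$workdir"
```

The trade-off is memory for a single read: every value in data is held in `val`, whereas reading data twice only needs the file to be readable a second time.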
# 4  
Old 04-28-2015
Getting what syntax errors?

Show exactly what you did, word for word, letter for letter, keystroke for keystroke.