Contention Identifier Script


 
# 1  
Old 08-16-2011

Hi Experts,

Need some help with a script which is definitely beyond my scripting skills.

Here is the flat file that I have, with 4 key columns:

Code:
KEYCOLUMN1              KEYCOLUMN2              KEYCOLUMN3      KEYCOLUMN4

123ABC                  AEG                     MANCHESTER      BIGBOX
123ABC                  SAMSUNG                 HEYWOOD         BIGBOX
123ABC                  SIEMENS                 BOLTON          BIGBOX
123ABC                  GORENJE                 STOCKPORT       BIGBOX
123ABC                  HAIER                   DUKINFIELD      BIGBOX
123ABC                  WHIRLPOOL               ALTRINCHAM      RETAIL

234BCD                  SAMSUNG                 DREWSBURY       BIGBOX
234BCD                  SIEMENS                 WAKEFIELD       BIGBOX
234BCD                  GORENJE                 CASTLEFORD      BIGBOX
234BCD                  AEG                     BRADFORD        BIGBOX
234BCD                  HAIER                   BRIGHOUSE       BIGBOX
234BCD                  SAMSUNG                 DREWSBURY       RETAIL

345CDE                  SAMSUNG                 DREWSBURY       BIGBOX
345CDE                  ASKO                    WAKEFIELD       BIGBOX
345CDE                  GORENJE                 CASTLEFORD      BIGBOX
345CDE                  AEG                     BRADFORD        BIGBOX
345CDE                  HAIER                   BRIGHOUSE       BIGBOX
345CDE                  ASKO                    WAKEFIELD       BIGBOX

In the above set of data, I need to find all retailers that are dealing in the same brand with contention. Contention here is defined as the same brand being sold by more than one BIGBOX store, or the same brand being sold by a BIGBOX store as well as in another RETAIL location.

So when you apply this contention rule to the above data, product ID 123ABC has no contention.

However, for product 234BCD, SAMSUNG is being sold at the same DREWSBURY location under a RETAIL umbrella as well as by a BIGBOX chain. So this record needs to be identified.

In the same fashion, product ID 345CDE has the ASKO brand being sold and serviced in WAKEFIELD by more than one BIGBOX store. This needs to be identified as well.

I know this can be done by importing the data into a MySQL database, but is this even possible using a script?

My skills are so limited on the Unix side that I can't seem to get anywhere with this problem.

Please help!
Thanks,
P Gonzalez

Last edited by Scott; 08-16-2011 at 04:39 AM.. Reason: Code tags
# 2  
Old 08-16-2011
What output do you want, exactly?

This "identifies" the scenario you defined:
Code:
awk '{arr[$1 $2]++; next} 
     END{for (i in arr) 
     {
        if(arr[i]>1)
        {print i, "other retail=", $4 } 
     }'  inputfile
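
As posted, the snippet above is missing a closing brace, and `$4` inside `END` does not refer to the row that matched (at best it holds the last record read), which is why the store type can come out empty. Below is a minimal sketch of one way to apply the contention rule from post #1. It assumes contention is keyed on product ID plus brand (columns 1 and 2), that the first line is a header, and that blank lines only separate groups. The sample file and the names `sample`, `contended`, `count`, `types` and `bigbox` are made up for illustration.

```shell
#!/bin/sh
# Hypothetical sample file, trimmed from the data in post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
KEYCOLUMN1  KEYCOLUMN2  KEYCOLUMN3  KEYCOLUMN4

123ABC      AEG         MANCHESTER  BIGBOX
123ABC      WHIRLPOOL   ALTRINCHAM  RETAIL

234BCD      SAMSUNG     DREWSBURY   BIGBOX
234BCD      AEG         BRADFORD    BIGBOX
234BCD      SAMSUNG     DREWSBURY   RETAIL

345CDE      ASKO        WAKEFIELD   BIGBOX
345CDE      HAIER       BRIGHOUSE   BIGBOX
345CDE      ASKO        WAKEFIELD   BIGBOX
EOF

contended=$(awk '
    NR == 1 || NF < 4 { next }          # skip header and blank lines
    {
        key = $1 " " $2                 # product id + brand (assumed key)
        count[key]++
        types[key] = types[key] " " $4  # remember every store type seen
        if ($4 == "BIGBOX") bigbox[key] = 1
    }
    END {
        # Contended: more than one row for the key, at least one of them BIGBOX.
        for (key in count)
            if (count[key] > 1 && (key in bigbox))
                print key, "store types:" types[key]
    }
' "$sample" | sort)                     # for-in order is unspecified, so sort

printf '%s\n' "$contended"
rm -f "$sample"
```

Joining the key with a space (rather than `$1 $2` with no separator) also avoids two different product/brand pairs collapsing into the same string.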

This User Gave Thanks to jim mcnamara For This Post:
# 3  
Old 08-17-2011
Mr. McNamara,

That was sheer brilliance! Absolutely mind-boggling.

Thank you, and a super big thank you!

Now, here are my questions. First, I edited your code by adding another bracket, which you might have overlooked, and it worked in two different ways.

Code:
$awk '{arr[$1 $2]++; next}
END{ for (i in arr)
{
if(arr[i]>1)}   <--
{print i, "other retail=", $4 } 
}' RETAILERSSAMPLEFILE

and I got the following result:
345CDEHAIER other retail=

Also, when I edited it as below:

Code:
$awk '{arr[$1 $2]++; next}
END{for (i in arr)
{
if(arr[i]>1)
{print i, "other retail=", $4 }
}}' <---- RETAILERSSAMPLEFILE

and here is what I see in the resulting output:

other retail=
234BCDSAMSUNG other retail=
345CDEASKO other retail=


And I do want both these records, as they are both contentious records. Now, here are my big questions: how scalable is this? Can this script handle a million rows?
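
The blank `other retail=` line in the output above most likely comes from the empty separator lines in the input: on each of them `$1 $2` is the empty string, so they all share one key whose count exceeds 1. A small sketch of that effect, and of a guard that skips short lines, is shown here. The `NF >= 4` test and the variable names `unguarded` and `guarded` are assumptions for illustration, not part of the original script.

```shell
#!/bin/sh
# Three lines with no fields share the empty key "", so its count exceeds 1.
unguarded=$(printf '\n\n234BCD SAMSUNG DREWSBURY BIGBOX\n\n' |
    awk '{ arr[$1 $2]++ } END { for (i in arr) if (arr[i] > 1) print "[" i "]" }')

# Same input, but lines with fewer than four fields are never counted.
guarded=$(printf '\n\n234BCD SAMSUNG DREWSBURY BIGBOX\n\n' |
    awk 'NF >= 4 { arr[$1 $2]++ } END { for (i in arr) if (arr[i] > 1) print "[" i "]" }')

printf 'unguarded: %s\nguarded: %s\n' "$unguarded" "$guarded"
```

The brackets around the key are only there to make the empty key visible in the output.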

regards,
Gonzalez.

---------- Post updated 08-17-11 at 02:52 AM ---------- Previous update was 08-16-11 at 11:03 AM ----------

Jim McNamara,

Thank you for taking the time to answer my question.

I have a follow-up question: how do I print the whole line? When I do the following, it does not print the entire original record:

Code:
awk '{arr[$1 $2]++; next} END{for (i in arr) { if(arr[i]>1) {print i, $0 }}}' RETAILERSSAMPLEFILE

and this returns the following output:

Code:
234BCDSAMSUNG
345CDEASKO

This is not the complete output I want, and it is also not in the format of the original file.

Is there any tweak that would print the entire record, I mean the whole line exactly as it appears in the input? As far as scalability goes, I ran this on a 50,000-line file and it was as fast as I could expect, but I'm apprehensive about a 1,000,000-record file, as I don't want this to use up all of the CPU time.

Once again, thanks for the big help!

regards,
P Gonzalez.

Last edited by PG3; 08-17-2011 at 05:01 AM..
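
One common way to get the original lines back unchanged is to read the file twice: count keys on the first pass, then print every line whose key was flagged on the second. The sketch below is hedged in the same way as the thread's other snippets: it assumes the key is product ID plus brand, that the first line is a header, and that the file can be read twice (so not a pipe). The sample data and the names `sample`, `full_lines`, `count` and `bigbox` are illustrative.

```shell
#!/bin/sh
# Hypothetical sample file, trimmed from the data in post #1.
sample=$(mktemp)
cat > "$sample" <<'EOF'
KEYCOLUMN1  KEYCOLUMN2  KEYCOLUMN3  KEYCOLUMN4

234BCD      SAMSUNG     DREWSBURY   BIGBOX
234BCD      AEG         BRADFORD    BIGBOX
234BCD      SAMSUNG     DREWSBURY   RETAIL

345CDE      ASKO        WAKEFIELD   BIGBOX
345CDE      HAIER       BRIGHOUSE   BIGBOX
345CDE      ASKO        WAKEFIELD   BIGBOX
EOF

# The file is named twice: NR == FNR is true only while reading the first copy.
full_lines=$(awk '
    FNR == 1 || NF < 4 { next }     # skip header and blank lines in both passes
    NR == FNR {                     # pass 1: count rows per product + brand
        count[$1, $2]++
        if ($4 == "BIGBOX") bigbox[$1, $2] = 1
        next
    }
    # pass 2: a pattern with no action prints the matching line as it is
    count[$1, $2] > 1 && (($1, $2) in bigbox)
' "$sample" "$sample")

printf '%s\n' "$full_lines"
rm -f "$sample"
```

On scalability: this does one linear scan per pass and keeps one counter per distinct product/brand pair rather than the rows themselves, so memory grows with the number of distinct keys, not with the line count. A million rows should normally be well within reach, though that is worth confirming on the real file.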