Appending code in a directory recursively based on a certain criteria


 
# 1  
Old 11-28-2012

I am stuck with this strange problem; maybe you can help.

I have a master_file with two columns, username and id_number, separated by a comma, like this:
Code:
cat master_file :
sample,1234567
javacode,4567891
companion,23456719
adamsandler,1237681
tomcruise,56328910
bradpitt,901236781
sample,782109432
tomcruise,89210321
adamsandler,90812314
bradpitt,78325610
..............
.....and so on

Now there is a directory named sample that contains multiple directories whose names match the first column of "master_file", like this:
Code:
ls -l sample

javacode
companion
sample
bradpitt
tomcruise
joseadams
adamsandler
...... and so on

Each of these directories contains two files, primery.txt and secondry.txt, like this:
Code:
ls -l /sample/javacode/

primery.txt
secondry.txt

My problem is to take each record from master_file, find the corresponding directory in sample, and append the id_number (second column of master_file) to primery.txt or secondry.txt, so that primery.txt contains at most five entries and the rest get appended to secondry.txt.

I came up with this script, but it's not working. I also know it's not standard scripting style, so maybe you can help me with a better solution.
Code:
#!/bin/sh
while read line
do
   user_name=$(echo $line | cut -d, -f1)
   id_number=$(echo $line | cut -d, -f2)
   main_directory=$(find /usr/myproject/sample -type d -name $user_name)
   lines_in_popular=$(wc -l $main_directory/popular.txt | awk '{print $1 ;}')
   if [ $lines_in_popular -lt 5 ] ; then
      echo $id_number >> $main_directory/primery.txt ;
   else
      echo $id_number >> $main_directory/secondry.txt ;
   fi
done < /usr/myproject/content_to_add/master_file

getting error :[: -lt: unary operator expected

Last edited by mukulverma2408; 11-28-2012 at 03:31 PM.. Reason: typo
# 2  
Old 11-28-2012
If "find" returns more than one directory, then "wc" will return more than one line, which will break the -lt test. Try adding " | tail -1" after the find command inside the parentheses.
# 3  
Old 11-28-2012
Quote:
Originally Posted by mukulverma2408
Code:
#!/bin/sh
while read line
do
   user_name=$(echo $line | cut -d, -f1)
   id_number=$(echo $line | cut -d, -f2)
   main_directory=$(find /usr/myproject/sample -type d -name $user_name)
   lines_in_popular=$(wc -l $main_directory/popular.txt | awk '{print $1 ;}')
   if [ $lines_in_popular -lt 5 ] ; then
      echo $code >> $main_directory/popular.txt ;
   else
      echo $code >> $main_directory/notpopular.txt ;
   fi
done < /usr/myproject/content_to_add/master_file

getting error :[: -lt: unary operator expected
I suppose your master file has unique lines and does not contain duplicates like "adamsandler" in your example. Is that so?

Your script has several rather weak spots. Let's go over them one by one:

The first thing is you do not explicitly state a shell to be used. This is not an error, but why take chances? Always state in the first line your intended shell with a "shebang": "#! /path/to/your/shell". I will use "/bin/ksh" in my examples, but change that to whatever you really want to use.

Then:
Code:
   user_name=$(echo $line | cut -d, -f1)
   id_number=$(echo $line | cut -d, -f2)

You first read in a line, then spend several commands to split this line. You can save an awful lot of execution time by using shell variable expansion instead of "cut":

Code:
user_name="${line%%,*}"
id_number="${line##*,}"

but even faster and saving even more would be to let the shell itself do the splitting by redefining the IFS so that word splitting is done implicitly:

Code:
while IFS=',' read user_name id_number ; do
    ....
done < /path/to/master.file
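
A minimal sketch of the IFS-based splitting, assuming a POSIX shell and a throwaway file standing in for master_file (the mktemp file and the "records" variable are only for illustration). The "-r" option is an addition that keeps backslashes in the input literal:

```shell
#!/bin/sh
# Hypothetical stand-in for master_file, created in a temporary location.
master=$(mktemp)
printf '%s\n' 'sample,1234567' 'javacode,4567891' > "$master"

records=""
# IFS=',' applies only to this read, so the rest of the script keeps the
# default word splitting; -r stops read from interpreting backslashes.
while IFS=',' read -r user_name id_number; do
    records="$records$user_name:$id_number "
done < "$master"

rm -f "$master"
printf '%s\n' "$records"
```

Because the loop reads from a redirection and not from a pipe, variables set inside it are still visible after "done".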

Another point is:

Code:
main_directory=$(find /usr/myproject/sample -type d -name $user_name)

The output of "find" would be several lines if "/usr/myproject/sample/$user_name" contained a subdirectory (or several subdirectories) with the same name as the user. Consider the following directory structure:

/usr/myproject/sample/adamsandler
/usr/myproject/sample/adamsandler/adamsandler
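
The effect can be reproduced with a small sketch, assuming a scratch directory created by mktemp in place of /usr/myproject/sample (the names "base", "found" and "match_count" are illustrative only):

```shell
#!/bin/sh
# Build the nested layout from the example under a throwaway directory.
base=$(mktemp -d)
mkdir -p "$base/adamsandler/adamsandler"

# find matches the directory at both levels, so the result holds two lines.
found=$(find "$base" -type d -name adamsandler)
match_count=$(printf '%s\n' "$found" | grep -c .)

printf 'find returned %s paths\n' "$match_count"
rm -rf "$base"
```

With two paths in the variable, "$main_directory/popular.txt" no longer names a single file, and the commands that follow in the script receive arguments they were not written for.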

Furthermore you do not take any precautions against the directory missing at all. Instead of finding a directory you already know to be there you could just construct its name and then test against it (see "-d" option of "test"):

Code:
#! /bin/ksh

while IFS=',' read user_name id_number ; do
     main_dir="/usr/myproject/sample/$user_name"
     if [ -d "$main_dir" ] ; then
          print - "The directory $main_dir exists."
     else
          print - "something went wrong, $main_dir does not exist."
     fi
     ...
done < /path/to/master.file


Next point:

Code:
lines_in_popular=$(wc -l $main_directory/popular.txt | awk '{print $1 ;}')

This is an awfully complicated way of getting the number of lines, isn't it? The following is a bit shorter and probably faster:

Code:
lines_in_popular=$(sed -n '$ =' $main_directory/popular.txt)
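
A short comparison sketch, assuming two temporary files in place of popular.txt. It shows one behaviour worth knowing: for an empty file, sed prints nothing at all, while "wc -l" reading from a redirection (an alternative not mentioned above) prints 0:

```shell
#!/bin/sh
# Throwaway files standing in for popular.txt: one with three lines, one empty.
three=$(mktemp)
empty=$(mktemp)
printf '%s\n' 111 222 333 > "$three"

count_sed=$(sed -n '$ =' "$three")     # prints the number of the last line
count_wc=$(($(wc -l < "$three")))      # redirect, so wc prints no file name
empty_sed=$(sed -n '$ =' "$empty")     # no last line, so nothing is printed
empty_wc=$(($(wc -l < "$empty")))      # $(( )) strips padding some wc add

printf 'sed=%s wc=%s empty_sed=[%s] empty_wc=%s\n' \
    "$count_sed" "$count_wc" "$empty_sed" "$empty_wc"
rm -f "$three" "$empty"
```

An empty result has to be handled before it reaches a numeric test, either by declaring the variable as an integer or by supplying a default such as "${lines_in_popular:-0}".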

It would also help to declare "lines_in_popular" to be of type integer before:

Code:
#! /bin/ksh

typeset -i lines_in_popular=0

while IFS=',' read user_name id_number ; do
     main_dir="/usr/myproject/sample/$user_name"
     if [ -d "$main_dir" ] ; then
          print - "The directory $main_dir exists."
     else
          print - "something went wrong, $main_dir does not exist."
          continue
     fi

     lines_in_popular=$(sed -n '$ =' "$main_dir/popular.txt")
     ...
done < /path/to/master.file


Finally, you use "code" but it is neither declared nor given any content. Should that read "id_number" instead?

Code:
   if [ $lines_in_popular -lt 5 ] ; then
      echo $code >> $main_directory/popular.txt ;
   else
      echo $code >> $main_directory/notpopular.txt ;
   fi

And you speak about "primery.txt" (sic!) and "secondry.txt" (sic!) but here you use "popular" and "notpopular" - is that a typo too or intended?
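
Putting the points above together, one possible shape for the whole script is sketched below. It assumes the file names are really "primery.txt" and "secondry.txt", that "code" was meant to be "id_number", and that plain POSIX sh is acceptable; the function name "append_ids" and its two arguments are made up for this example:

```shell
#!/bin/sh
# append_ids MASTER_FILE BASE_DIR   (hypothetical helper, not from the thread)
# For each "username,id_number" record, append the id to
# BASE_DIR/username/primery.txt until that file holds five lines,
# and to BASE_DIR/username/secondry.txt afterwards.
append_ids() {
    master=$1
    base=$2
    while IFS=',' read -r user_name id_number; do
        # Skip blank or incomplete records.
        [ -n "$user_name" ] && [ -n "$id_number" ] || continue

        main_dir="$base/$user_name"
        if [ ! -d "$main_dir" ]; then
            printf 'skipping %s: %s does not exist\n' "$user_name" "$main_dir" >&2
            continue
        fi

        primary="$main_dir/primery.txt"
        # A missing file counts as zero lines, so the test below never sees
        # an empty value.
        if [ -f "$primary" ]; then
            lines=$(($(wc -l < "$primary")))
        else
            lines=0
        fi

        if [ "$lines" -lt 5 ]; then
            printf '%s\n' "$id_number" >> "$primary"
        else
            printf '%s\n' "$id_number" >> "$main_dir/secondry.txt"
        fi
    done < "$master"
}

# Example call with the paths used in this thread:
# append_ids /usr/myproject/content_to_add/master_file /usr/myproject/sample
```

Duplicate usernames in the master file need no special handling here, because the line count is taken again for every record.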

I hope this helps.

bakunin
# 4  
Old 11-28-2012
Thanks Bakunin,
Quote:
I suppose your master file has unique lines and does not contain duplicates like "adamsandler" in your example. Is that so?
No, my master file does contain duplicate usernames, but I don't think it should make any difference, since the script runs once for each individual line. Please correct me if I am wrong.
Quote:
Always state in the first line your intended shell with a "shebang"
Thanks, I will take care of it in future.
Quote:
You can save an awful lot of execution time by using shell variable expansion
I am not very good at shell variable expansion. Could you suggest some references to read?
Quote:
The output of "find" would be several lines if "/usr/myproject/sample/$user_name" contained a subdirectory (or several subdirectories) with the same name as the user. Consider the following directory structure:
My directory structure does not have nested directories like that, but yes, there was no precaution in case the directory does not exist. I am going to adopt your method from now on.
Quote:
Finally, you use "code" but it is neither declared nor given any content. Should that read "id_number" instead?
Yes, that's a typo and it stands for id_number; the same goes for popular.txt and notpopular.txt.
I am going to try changing the script as per your suggestions. Could you please also help with the error that I am getting while testing the number of lines? i.e.

:[: -lt: unary operator expected

Thanks again for your help.

Last edited by mukulverma2408; 11-29-2012 at 07:03 AM.. Reason: changed color to normal
# 5  
Old 11-29-2012
First off, could you please use the standard colour for your normal text, instead of red? If you use the proper tags ("quote", "code", like you did) it is easy to differentiate between question and answer, and other colours may make it hard to read when using different colour schemes. Thank you.

Quote:
Originally Posted by mukulverma2408
No, my master file does contain duplicate usernames, but I don't think it should make any difference, since the script runs once for each individual line. Please correct me if I am wrong.
You are right. I just wanted to get a better picture of what can be expected as input for the script.

Quote:
Originally Posted by mukulverma2408
I am not very good at shell variable expansion. Could you suggest some references to read?
Just read the man page for ksh/bash (in this regard they work the same). You will find it under "variable expansion" or "parameter expansion" and it is basically a set of string-related functions with which you can extract sub-strings from variables. Example: I wrote in my first answer

Code:
user_name="${line%%,*}"
id_number="${line##*,}"

This means: "$user_name" is loaded with the contents of "$line" up to the first "," - the same as your "cut -d',' -f1" - and "id_number" is loaded with everything in "$line" after the last ",", which for a two-column line is the same as your "cut -d',' -f2". This will do the same as your "echo ... | cut" but without the necessity to start an external program. The difference might not be much in absolute time, but calling an external program weighs in by a factor of about 100 compared to an internal shell function. Do it often enough and you will see a very noticeable difference.
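
A small sketch of the two expansions, assuming a record in the master_file layout from this thread. The three-column value "a,b,c" is a made-up input that is not part of the original file format, included only to show where the expansions cut when a line holds more than one comma:

```shell
#!/bin/sh
# Sample record in the master_file layout: username,id_number
line="tomcruise,56328910"

# %%,*  removes the longest suffix matching ",*" -> text before the first comma
user_name="${line%%,*}"
# ##*,  removes the longest prefix matching "*," -> text after the last comma
id_number="${line##*,}"

printf 'user_name=%s id_number=%s\n' "$user_name" "$id_number"

# Hypothetical three-column record: the middle field is dropped by both.
extra="a,b,c"
first="${extra%%,*}"
last="${extra##*,}"
printf 'first=%s last=%s\n' "$first" "$last"
```

The single-character forms "%" and "#" remove the shortest match instead of the longest, which gives different results as soon as a line contains more than one comma.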

Quote:
Originally Posted by mukulverma2408
I am going to try changing the script as per your suggestions. Could you please also help with the error that I am getting while testing the number of lines?

:[: -lt: unary operator expected
That is probably coming from the file "primery.txt" not being there. You once call the two files in this directory "primery.txt" and "secondry.txt" and the other time you call them "popular.txt" and "notpopular.txt" - what is it gonna be?
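
The mechanism can be reproduced in isolation. The sketch below assumes a path that does not exist ("/nonexistent/popular.txt" is a placeholder): "wc" fails, the command substitution returns nothing, and the unquoted variable disappears from the test, leaving only "[ -lt 5 ]":

```shell
#!/bin/sh
# Simulate the original pipeline on a missing file: wc prints nothing to
# stdout, so awk has no input and the variable ends up empty.
lines_in_popular=$(wc -l /nonexistent/popular.txt 2>/dev/null | awk '{print $1}')

# Unquoted, the empty value vanishes and the test becomes [ -lt 5 ],
# which bash reports as "unary operator expected".
unquoted_status=0
[ $lines_in_popular -lt 5 ] 2>/dev/null || unquoted_status=$?

# Quoting plus a default value keeps the test well-formed.
if [ "${lines_in_popular:-0}" -lt 5 ]; then
    verdict="fewer than five lines"
else
    verdict="five or more lines"
fi

printf 'unquoted test status: %s, guarded test: %s\n' "$unquoted_status" "$verdict"
```

The same empty value appears when "find" matches nothing, because the path then collapses to "/popular.txt".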

I hope this helps.

bakunin
# 6  
Old 11-29-2012
Thanks bakunin
The color coding has been changed to normal

Quote:
That is probably coming from the file "primery.txt" not being there. You once call the two files in this directory "primery.txt" and "secondry.txt" and the other time you call them "popular.txt" and "notpopular.txt" - what is it gonna be?
That was a typo while posting, but my file does exist and I am still getting the same error. Any ideas or suggestions?
# 7  
Old 11-29-2012
Quote:
Originally Posted by mukulverma2408
Thanks bakunin
The color coding has been changed to normal


That was a typo while posting, but my file does exist and I am still getting the same error. Any ideas or suggestions?
As it is: no. Insert "set -xv" into the script and run it again. It will produce a lot of output on <stderr>, so start it like this (with "script" standing for the name of your script):

Code:
script 2>&1 | more
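
As a rough illustration of what the trace reveals, the sketch below runs a one-line reproduction in a child shell with "-x" and captures the output. The one-liner and the path "/nonexistent/file" are assumptions made for the example, and the exact trace format differs between shells:

```shell
#!/bin/sh
# -x writes each command to stderr after expansion, so 2>&1 is needed to
# capture it. The child shell fails on purpose, hence the "|| true".
trace=$(sh -x -c 'n=$(cat /nonexistent/file 2>/dev/null); [ $n -lt 5 ]' 2>&1) || true
printf '%s\n' "$trace"
```

In the trace, the test line shows up without any value in front of "-lt", which points straight at the empty variable.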

I hope this helps.

bakunin