Picking up files conditionally


 
# 1  
Old 08-25-2015

Hi
I have a scenario:
I have a directory, say DIR1 (no subdirectories), containing a few files, as listed below:

Code:
app-cnd-imp-20150820.txt
app-cxyzm-imp-20150820.txt
app-petco-imp-20150820.txt
app-mobility-imp-20150820.txt
app-mobility-imp-20150821.txt
app-mobility-imp-20150822.txt
app-cellular-imp-20150824.txt

I have pseudocode in a script, something like the below, that grabs each of the filenames above and passes it to another script.

Code:
for filename in `ls -tr app-*-imp-*.txt`
do
    # Script2.sh runs in parallel, consumes the file, and removes it
    # from the directory once processing is complete.
    ksh Script2.sh "${filename}" &
done

wait

.....
......
Issue: I need to pass each filename to Script2.sh as shown above. But if the same file arrives more than once with different dates (for example, app-mobility-imp-*.txt), I need to process those files one at a time, in successive passes of the FOR loop. So when the same file name exists more than once, the earliest file must be processed first, and each later one in its own pass.

For example:
In the first pass of the FOR loop, I want to pass the filenames below to Script2.sh:
Code:
app-cnd-imp-20150820.txt
app-cxyzm-imp-20150820.txt
app-petco-imp-20150820.txt
app-mobility-imp-20150820.txt
app-cellular-imp-20150824.txt

Once a file is processed by Script2.sh, it will get deleted by Script2.sh. So in the next pass of the FOR loop the files above will no longer be in the directory.

In the second pass of the FOR loop, I want to pass the filename below to Script2.sh:
app-mobility-imp-20150821.txt

In the third pass of the FOR loop, I want to pass the filename below to Script2.sh:
app-mobility-imp-20150822.txt

I would really appreciate your help and guidance.
Thanks
Moderator's Comments:
Please DO NOT use FONT and SIZE tags in your posts and please use CODE tags for all sample input, output, and code segments.

Last edited by Don Cragun; 08-25-2015 at 03:42 PM.. Reason: Get rid of FONT, COLOR, and SIZE tags; add CODE tags.
# 2  
Old 08-25-2015
Quote:
Code:
for filename in `ls -tr app-*-imp-*.txt`
do
    # Script2.sh runs in parallel, consumes the file, and removes it
    # from the directory once processing is complete.
    ksh Script2.sh ${filename} &
done

This is a useless use of backticks.
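
For illustration, the shell can expand the pattern itself, with no `ls` and no command substitution. This is only a minimal sketch: the function name `list_imp_files` is made up for the example, the `echo` stands in for the real call to Script2.sh, and the files are visited in the shell's sorted (alphabetical) order rather than the modification-time order that `ls -tr` gives.

```shell
# Hypothetical helper: loop over the glob directly instead of parsing ls.
list_imp_files() {
    for filename in app-*-imp-*.txt
    do
        # If nothing matches, the pattern is left unexpanded; skip it.
        [ -f "$filename" ] || continue
        echo "$filename"
    done
}

list_imp_files
```

If the time ordering matters, this form is not a drop-in replacement for the original loop.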

As for separating the dates, how about list them all, extract the dates, and sort -u:

Code:
ls app-*-imp-*.txt | sed 's/[^0-9]//g' | sort -u | while read DATE
do
        for FILE in *"${DATE}.txt"
        do
                echo "Processing $FILE"
        done
done

As an aside, being able to launch 30 simultaneous processes does not mean your machine or disk can handle 30 simultaneous processes.

Last edited by Don Cragun; 08-25-2015 at 03:46 PM.. Reason: add missing close quote tag
# 3  
Old 08-25-2015
Script2.sh actually kicks off external software that identifies the process from the first part of the filename, before the date. For example, app-mobility-imp will kick off the mobility app process.

If we process the files sequentially as suggested, it may run for days.
I am looking for logic that identifies the files with the same name and runs only those one by one, while the remaining ones run in parallel in the first pass of the loop.

Thanks
# 4  
Old 08-25-2015
What kind of load do they put on the machine/disk/network? Overloading them will waste more time, not less.

Does having to process those files sequentially mean you have to wait for everything to stop, before you launch more? Otherwise, one of your background ones might finish in-between files A, B, C, and D.
# 5  
Old 08-25-2015
It is one of the ETL servers that load these files into databases. Each file pertains to a different load process, so passing the filenames in parallel loads various tables, based on filename, simultaneously.
Currently it is working fine in production without any issues. But occasionally we now receive multiple files (not many) with the same name but different dates, and kicking them off together runs the same ETL code concurrently, causing it to fail. I want to avoid that failure while keeping parallel execution for the individual files; files that share a name need to be loaded sequentially, one after the other.

Code:
 
 1.    app-cnd-imp-20150820.txt
 2.    app-cxyzm-imp-20150820.txt
 3.    app-petco-imp-20150820.txt
 4.    app-mobility-imp-20150820.txt
 5.    app-mobility-imp-20150821.txt
 6.    app-mobility-imp-20150822.txt
 7.    app-cellular-imp-20150824.txt


So in the file list above, I can run files 1, 2, 3, 4, and 7 in one pass of the loop and wait for completion, then file 5 in the second pass and wait for completion, then file 6 in the third pass. Files 4, 5, and 6 pertain to the same ETL code, so running them together would make the load process fail.
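
The pass-by-pass behaviour described above could be sketched roughly as follows. This is an illustration under stated assumptions, not a tested production script: `process_file`, `first_per_prefix`, and `run_passes` are hypothetical names, `process_file` is a stand-in for `ksh Script2.sh`, file names are assumed to contain no whitespace, and the shell is assumed to support POSIX parameter expansion (ksh does).

```shell
# Hypothetical stand-in for "ksh Script2.sh <file>": it reports the file
# and then removes it, as the real script is described as doing.
process_file() {
    echo "processing $1"
    rm -f "$1"
}

# Print the earliest remaining file for each app-NAME-imp prefix.
# The glob expands in sorted order, so with YYYYMMDD dates the first
# file seen for a prefix is also the oldest one.
first_per_prefix() {
    seen=" "
    for f in app-*-imp-*.txt
    do
        [ -f "$f" ] || continue
        prefix=${f%-*}              # strip the trailing -DATE.txt
        case $seen in
            *" $prefix "*) ;;       # this prefix already has a file in the pass
            *) seen="$seen$prefix "
               echo "$f" ;;
        esac
    done
}

# Run one pass per round: one file per prefix in parallel, then wait.
run_passes() {
    pass=1
    while batch=$(first_per_prefix); [ -n "$batch" ]
    do
        echo "pass $pass"
        for f in $batch             # assumes no whitespace in file names
        do
            process_file "$f" &
        done
        wait                        # finish the whole pass before the next one
        pass=$((pass + 1))
    done
}
```

Calling `run_passes` from inside DIR1 would start the passes. The loop ends only when no matching files remain, so it relies on the worker really removing each file it handles; a worker that fails without deleting its file would be picked up again on the next pass.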

Thanks

Last edited by Saanvi1; 08-25-2015 at 05:11 PM.. Reason: spelling correction
# 6  
Old 08-25-2015
Another option might be something like this, if the file names may not have spaces in them:

Code:
ls app-*-imp-*.txt | awk '{$NF="*.txt"}!A[$0]++' FS=- OFS=- |
while read subpattern
do
  for i in $subpattern
  do
    echo "processing file $i"
  done &
done
wait

What happens is that for each sub-pattern, a for loop runs in the background. Each sub-pattern expands to one or more related files in alphabetical order, which is the right order because of the way the files are named. So different files are processed in parallel, and if a sub-pattern matches more than one file, those files are processed sequentially.
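
To make the sub-pattern step concrete, the awk stage can be run on its own against a few of the sample names. The wrapper name `to_subpatterns` is made up for this illustration, and the sketch assumes a POSIX-style awk (gawk, mawk, nawk).

```shell
# Hypothetical wrapper around the awk stage from the pipeline above:
# replace the last "-"-separated field with "*.txt", print each result once.
to_subpatterns() {
    awk '{$NF="*.txt"}!A[$0]++' FS=- OFS=-
}

printf '%s\n' \
    app-cnd-imp-20150820.txt \
    app-mobility-imp-20150820.txt \
    app-mobility-imp-20150821.txt |
to_subpatterns
# prints:
#   app-cnd-imp-*.txt
#   app-mobility-imp-*.txt
```

Each printed line is then used unquoted in the `for` loop, where the shell expands it to the matching files.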

Last edited by Scrutinizer; 08-25-2015 at 05:12 PM..
# 7  
Old 08-25-2015
Thanks Scrutinizer. Let me try that out

---------- Post updated at 03:22 PM ---------- Previous update was at 03:12 PM ----------

Hi,
I tried the script below:

Code:
 
#!/bin/ksh
ls app-*-imp-*.txt | awk '{$NF="*.txt"}!A[$0]++' FS=- OFS=- |
while read pattern
do
  for i in $pattern
  do
    echo "processing file $i"
  done &
done

I am getting the error below. I am using a Sun Solaris box.

awk: syntax error near line 1
awk: bailing out near line 1
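
A likely cause, though the thread does not confirm it: on Solaris, /usr/bin/awk is the legacy awk, which is known to reject syntax that newer awks accept, and nawk or /usr/xpg4/bin/awk is the usual substitute. As an alternative that avoids awk altogether, the sub-patterns could be derived with sed. The sketch below is illustrative only: `subpatterns` and `process_groups` are hypothetical names, the `echo` stands in for the real call to Script2.sh, and file names are assumed to contain no whitespace.

```shell
# Derive one glob per app-NAME-imp prefix with sed instead of awk:
# replace everything after the last "-" with "*.txt", then de-duplicate.
subpatterns() {
    ls app-*-imp-*.txt 2>/dev/null |
    sed 's/-[^-]*$/-*.txt/' |
    sort -u
}

# One background loop per sub-pattern; files within a loop run in order.
process_groups() {
    subpatterns | {
        while read -r subpattern
        do
            for i in $subpattern    # unquoted on purpose: the glob expands here
            do
                echo "processing file $i"
            done &
        done
        wait                        # wait inside the group that started the jobs
    }
}

process_groups
```

The `wait` sits inside the braces because some shells run the `while` loop of a pipeline in a subshell, in which case a `wait` placed after the pipeline would not see those background jobs.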