Hello,
I need to grep/read files from multiple directories, one by one.
I mean something like shuffling the cards uniformly.
Code:
cd /directoryA
for i in *.txt;
do
some codes
cd ../directoryB
for i in *.txt;
do
some codes
cd ../directoryC
for i in *.txt;
do
some codes
done
done
done
Directory A includes (let's say 100 files in total):
Code:
1.txt
2.txt
3.txt
4.txt
5.txt
..
..
..
100.txt
Directory B includes (let's say 30 files in total):
Code:
a.txt
b.txt
c.txt
d.txt
e.txt
..
..
..
z.txt
Directory C includes (let's say 901 files in total).
I would first read the content of each directory into a separate array, in this case 3 arrays (A, B, C). Then I would set up an associative array (let's call it SEEN), where I enter the basename from each file which has already been processed. Checking SEEN before processing a file allows me to skip the files which have already been processed. I then would have a single loop, ranging over the indexes of the longest array. Inside the loop, I would use the loop index to access the arrays A, B and C.
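The approach described above could be sketched in bash roughly as follows. This is only an illustration, assuming bash 4 or later (for the associative array). The function name round_robin and the throwaway demo directories are made up for the example, and the printf stands in for the real processing.

```bash
#!/usr/bin/env bash
# Sketch: visit *.txt files from three directories in turn (A, B, C, A, B, C, ...).
# Requires bash 4+ for the associative array. All names here are illustrative.

round_robin() (                        # subshell body: shopt stays local
    declare -a A B C
    declare -A SEEN=()                 # basenames that were already processed
    shopt -s nullglob                  # an empty directory gives an empty array
    A=("$1"/*.txt) B=("$2"/*.txt) C=("$3"/*.txt)

    # The single loop ranges over the indexes of the longest array.
    max=${#A[@]}
    (( ${#B[@]} > max )) && max=${#B[@]}
    (( ${#C[@]} > max )) && max=${#C[@]}

    for (( i = 0; i < max; i++ )); do
        # An exhausted (shorter) array expands to an empty string here.
        for f in "${A[i]}" "${B[i]}" "${C[i]}"; do
            [[ -n $f ]] || continue
            name=${f##*/}
            [[ -n ${SEEN[$name]} ]] && continue   # skip a basename seen before
            SEEN[$name]=1
            printf '%s\n' "$f"                    # placeholder for the real work
        done
    done
)

# Demo on throwaway directories.
tmp=$(mktemp -d)
mkdir "$tmp/A" "$tmp/B" "$tmp/C"
touch "$tmp/A/1.txt" "$tmp/A/2.txt" "$tmp/B/a.txt" \
      "$tmp/C/101.txt" "$tmp/C/102.txt" "$tmp/C/103.txt"
round_robin "$tmp/A" "$tmp/B" "$tmp/C"
rm -rf "$tmp"
```

Note that the glob sorts names as text, not numerically, so 10.txt comes before 2.txt.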
One design decision is whether the number of directories is always constant (3) or can vary. If there is no inherent reason why it must be 3 directories, and not 2 or 4, I would make this variable too.
Next comes the choice of programming language. You need a language which supports arrays and associative arrays. For shell scripting, this means you can use Zsh or bash or - I think - ksh.
If you decide to make the number of directories variable too, you **can** do it in a shell script, but I find it a bit inconvenient. For this type of task, I would consider a more general programming language, such as Ruby or Perl.
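For what it is worth, here is one way the variable-number-of-directories case might look in bash. It is a sketch only: bash has no arrays of arrays, so this example stores every file in one associative array under a "directory index,file index" key. The name round_robin_n is invented for the example, and the SEEN check is left out to keep it short.

```bash
#!/usr/bin/env bash
# Sketch: round-robin over any number of directories given as arguments.
# Requires bash 4+. Names are illustrative only.

round_robin_n() (                      # subshell body: shopt stays local
    declare -A FILE=()                 # FILE[d,i] = i-th *.txt of directory d
    max=0 d=0
    shopt -s nullglob
    for dir in "$@"; do
        i=0
        for f in "$dir"/*.txt; do
            FILE[$d,$i]=$f
            i=$((i + 1))
        done
        (( i > max )) && max=$i        # remember the longest list
        d=$((d + 1))
    done

    for (( i = 0; i < max; i++ )); do          # one round per file index
        for (( k = 0; k < d; k++ )); do        # one pick per directory
            f=${FILE[$k,$i]}
            if [[ -n $f ]]; then
                printf '%s\n' "$f"             # placeholder for the real work
            fi
        done
    done
)

# Demo on throwaway directories.
tmp=$(mktemp -d)
mkdir "$tmp/A" "$tmp/B"
touch "$tmp/A/1.txt" "$tmp/A/2.txt" "$tmp/B/a.txt"
round_robin_n "$tmp/A" "$tmp/B"
rm -rf "$tmp"
```

The bookkeeping with composite keys is the sort of inconvenience that makes a language with nested arrays attractive for this task.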
I think I understand the principle, but as a starting point, let me indent your code for clarity:-
Code:
cd /directoryA
for loopa in *.txt;
do
   some codes
   cd ../directoryB
   for loopb in *.txt;
   do
      some codes
      cd ../directoryC
      for loopc in *.txt;
      do
         some codes
      done
   done
done
As you can see, I have given each loop its own variable, otherwise the results will be unpredictable.
With this, you would be trying to read everything in directoryC 3,000~ish times (once for every file in directoryA multiplied by every file in directoryB). Is this really what you want?
You also have the problem that you change directory just before each loop, but do not change back on leaving it. So for the second and later iterations over directoryB (e.g. file b.txt), your shell would be in directoryC. For the second and later iterations over directoryA (e.g. 2.txt), your shell would also be in directoryC, so your some codes statement would have to handle being in various places. The file list of each for loop is expanded when that loop starts, so the loop as a whole will process the names you expect, but very likely in the wrong directory.
If you just want to process each file once, you need to move the done statements, which would give you this:-
Code:
cd /directoryA
for i in *.txt;
do
   some codes
done
cd ../directoryB
for i in *.txt;
do
   some codes
done
cd ../directoryC
for i in *.txt;
do
   some codes
done
If you really do want to process every file in directoryC 3,000~ish times, consider using pushd & popd to handle directory transitions like this:-
Code:
pushd /directoryA
for loopa in *.txt;
do
   some codes
   pushd /directoryB
   for loopb in *.txt;
   do
      some codes
      pushd /directoryC
      for loopc in *.txt;
      do
         some codes
      done
      popd
   done
   popd
done
popd
They will handle the moving in and out of directories safely. It is better to use a fully qualified directory path rather than trying to assume where you are and issuing cd ../directoryX.
Sorry I've gone on for a while, but I hope that this helps.
Can you tell us what you are actually trying to achieve? Give us some sort of logical steps you want to do, and we can work on it a bit better.
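Taking the advice about fully qualified paths one step further, the "process each file once" case can avoid changing directory altogether. The following is a minimal sketch, not a drop-in solution: the function name process_dirs is invented here, and the printf is a placeholder for some codes.

```bash
#!/usr/bin/env bash
# Sketch: process every *.txt file once, without any cd, pushd or popd.
# The function name is illustrative only.

process_dirs() {
    local dir f
    for dir in "$@"; do
        for f in "$dir"/*.txt; do
            [ -e "$f" ] || continue          # the glob matched nothing
            printf 'processing %s\n' "$f"    # placeholder for "some codes"
        done
    done
}

# Prints nothing if these directories do not exist or hold no .txt files.
process_dirs /directoryA /directoryB /directoryC
```

Because the shell never leaves its starting directory, there is nothing to undo when a loop ends.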
Kind regards,
Robin
Hi.
Apologies for the lengthy post.
This is probably less an improvement than a meta-answer: it shows how we went about solving problems like this by generalizing.
I'll assume that you can list the contents of each directory; one such command might be:
Code:
ls -1 A > data1
and so on for B, C, etc.
We then would use a local code gather to obtain at least one line from each of the files data1, data2, etc.
The companion program is scatter.
As I have noted before we have not yet decided to publish our codes, but when we post meta-solutions like this, we can post the documentation. Then folks can decide whether this is a reasonable approach for them to pursue.
Here is a demonstration that exercises code gather:
Code:
#!/usr/bin/env bash
# @(#) s1 Demonstrate intersperse, shuffle of lines, gather.
# Utility functions: print-as-echo, print-line-with-visual-space, debug.
# export PATH="/usr/local/bin:/usr/bin:/bin"
LC_ALL=C ; LANG=C ; export LC_ALL LANG
pe() { for _i;do printf "%s" "$_i";done; printf "\n"; }
pl() { pe;pe "-----" ;pe "$*"; }
em() { pe "$*" >&2 ; }
db() { ( printf " db, ";for _i;do printf "%s" "$_i";done;printf "\n" ) >&2 ; }
db() { : ; }
C=$HOME/bin/context && [ -f $C ] && $C gather dixf scatter
FILE=${1-data1}
pl " Input data files data*:"
head data*
pl " Results, 1 item from each list:"
gather data*
pl " Results, 2 from column A, one from B and C:"
gather data1:2 data2 data3
pl " Results, as previous, but separator is newline:"
gather -s '\n' data1:2 data2 data3
pl " Help from gather:"
gather -h
pl " Details about gather, scatter:"
dixf gather scatter
exit 0
producing:
Code:
$ ./s1
Environment: LC_ALL = C, LANG = C
(Versions displayed with local utility "version")
OS, ker|rel, machine: Linux, 3.16.0-4-amd64, x86_64
Distribution : Debian 8.8 (jessie)
bash GNU bash 4.3.30
gather (local) 1.1
dixf (local) 1.49
scatter (local) 1.4
-----
Input data files data*:
==> data1 <==
1.txt
2.txt
3.txt
4.txt
5.txt
==> data2 <==
a.txt
b.txt
c.txt
d.txt
e.txt
==> data3 <==
101.txt
102.txt
103.txt
104.txt
105.txt
-----
Results, 1 item from each list:
1.txt
a.txt
101.txt
2.txt
b.txt
102.txt
3.txt
c.txt
103.txt
4.txt
d.txt
104.txt
5.txt
e.txt
105.txt
-----
Results, 2 from column A, one from B and C:
1.txt 2.txt
a.txt
101.txt
3.txt 4.txt
b.txt
102.txt
5.txt 103.txt
c.txt
104.txt
d.txt
105.txt
e.txt
-----
Results, as previous, but separator is newline:
1.txt
2.txt
a.txt
101.txt
3.txt
4.txt
b.txt
102.txt
5.txt
103.txt
c.txt
104.txt
d.txt
105.txt
e.txt
-----
Help from gather:
gather - Read, shuffle, weave, intersperse lines from multiple inputs to STDOUT.
usage: gather [options] -- [files]
options:
--separator=SEP
Set output token separator to SEP, default " ", and "\n" is
accepted as NEWLINE. The SEP is used when more than one line is
read from a file. This allows one to easily capture short lines
into a single output line.
--help (or -h)
print this message and quit.
[files]
filename1:count1 filename2:count2 ... filename3:count3
Each filenamei will be read in succession, and will be written
to standard output, continuing until EOF on every file.
-----
Details about gather, scatter:
gather Read, shuffle, weave, intersperse lines from multiple input files. (what)
Path : ~/bin/gather
Version : 1.1
Length : 224 lines
Type : Perl script, ASCII text executable
Shebang : #!/usr/bin/env perl
Help : probably available with -h,--help
Modules : (for perl codes)
strict 1.08
warnings 1.23
English 1.09
Carp 1.3301
Getopt::Long 2.42
feature 1.36_01
scatter Write, deal, unravel disperse lines to multiple output files. (what)
Path : ~/bin/scatter
Version : 1.4
Length : 190 lines
Type : Perl script, ASCII text executable
Shebang : #!/usr/bin/env perl
Modules : (for perl codes)
strict 1.08
warnings 1.23
English 1.09
Carp 1.3301
Data::Dumper 2.151_01
Getopt::Long 2.42
feature 1.36_01
You can also, with the right shell, use embedded commands like this:
Code:
$ gather <( ls -1 A ) <( ls -1 B )
1.txt
a.txt
2.txt
b.txt
3.txt
c.txt
4.txt
d.txt
5.txt
e.txt
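Since gather is a local, unpublished utility, here is a rough approximation of its simplest mode (one line from each list in turn) using only standard tools. This is not gather itself, just a sketch; the function name interleave is invented for the example, and it assumes the lists contain no empty lines.

```bash
#!/usr/bin/env bash
# Sketch: interleave lines from several lists with paste and grep.
# paste -d '\n' writes line 1 of every input, then line 2 of every input, ...
# An exhausted input contributes empty lines, which grep -v '^$' removes
# (so genuinely empty lines in the input would be dropped as well).

interleave() {
    paste -d '\n' "$@" | grep -v '^$'
}

# Demo with process substitution, as in the gather example above.
interleave <( printf '%s\n' 1.txt 2.txt 3.txt ) <( printf '%s\n' a.txt b.txt )
# Output: 1.txt a.txt 2.txt b.txt 3.txt (one per line)

# With real directories it might be called as:
#   interleave <( ls -1 A ) <( ls -1 B ) <( ls -1 C )
```

Unlike gather, this sketch has no per-file counts such as data1:2.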
I'd like to inform you that MadeInGermany's script worked out as expected.
As I am not familiar with different kinds of software and scripts, I am unable to give feedback on the output of the other alternatives.
Quote:
Originally Posted by MadeInGermany
The following bash script is typed from a mobile device and untested.
It uses 3 extra file handles, so it can read in a round-robin order.
Code:
while
  read a <&3; aexit=$?
  read b <&4; bexit=$?
  read c <&5; cexit=$?
  [ $aexit -eq 0 ] || [ $bexit -eq 0 ] || [ $cexit -eq 0 ]
do
  [ $aexit -eq 0 ] && printf "%s\n" "$a"
  [ $bexit -eq 0 ] && printf "%s\n" "$b"
  [ $cexit -eq 0 ] && printf "%s\n" "$c"
done 3< <( ls A/ ) 4< <( ls B/ ) 5< <( ls C/ )
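The quoted loop prints bare file names. To actually grep or read each file, the names need their directory attached. Below is a sketch of a variant that does this, under my own assumptions: the helper names list_txt and round_robin_fd are invented, a glob replaces ls so that full paths come out, and read uses IFS= and -r so unusual file names survive (names containing newlines still would not).

```bash
#!/usr/bin/env bash
# Sketch: the three-file-descriptor round robin, emitting full paths.
# Helper names are illustrative only.

list_txt() {                           # print dir/*.txt, one path per line
    local f
    for f in "$1"/*.txt; do
        if [ -e "$f" ]; then printf '%s\n' "$f"; fi
    done
}

round_robin_fd() {
    local a b c aexit bexit cexit
    while
        IFS= read -r a <&3; aexit=$?
        IFS= read -r b <&4; bexit=$?
        IFS= read -r c <&5; cexit=$?
        [ $aexit -eq 0 ] || [ $bexit -eq 0 ] || [ $cexit -eq 0 ]
    do
        if [ $aexit -eq 0 ]; then printf '%s\n' "$a"; fi
        if [ $bexit -eq 0 ]; then printf '%s\n' "$b"; fi
        if [ $cexit -eq 0 ]; then printf '%s\n' "$c"; fi
    done 3< <( list_txt "$1" ) 4< <( list_txt "$2" ) 5< <( list_txt "$3" )
}

# Demo on throwaway directories: feed the paths to a processing loop.
tmp=$(mktemp -d)
mkdir "$tmp/A" "$tmp/B" "$tmp/C"
touch "$tmp/A/1.txt" "$tmp/A/2.txt" "$tmp/B/a.txt" "$tmp/C/101.txt"
round_robin_fd "$tmp/A" "$tmp/B" "$tmp/C" |
while IFS= read -r file; do
    printf 'processing %s\n' "$file"   # placeholder, e.g. grep PATTERN "$file"
done
rm -rf "$tmp"
```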