No, you can't look at the names of files and magically guess how many hyphens are in the 3rd lines of those files. And, as noted before, using find | ... | mv ... may miss files, depending on the filesystem type, when a directory has this many files in it...
If I have correctly understood what you want to do, the following script will move *.txt files with names that do not contain any space characters from /Users/Nexeu/Documents/Dict to subdirectories under /Users/Nexeu/Documents/Syllable. The target directory and subdirectories will be created if they do not already exist. The script will report errors if the files being moved have more than 33 different values for the number of hyphens on the 3rd line. If you save the output containing those errors, extract the lines from that output that start and end with a ' character, and feed those lines into a modified script that runs in $SRCDIR and runs just the 2nd awk script, it will create list files for another 17 target subdirectories, and the last part of the script will then use those list files to move the files into the proper target directories. Or, you can just run the entire script again to process up to 33 more different hyphen counts (but that will take longer if there are still lots of files to process).
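To preview which subdirectory a single file would end up in, the counting rule the script applies to line 3 can be tried on its own. This is only a minimal sketch, assuming the same "[word, ...]" line format; count_hyphens is a hypothetical helper name and is not part of the script.

```shell
# Hypothetical helper (not part of the script): print the number of
# hyphens in the 1st word on line 3 of the file named by $1.
# Exits non-zero if the file has fewer than 3 lines or line 3 does not
# match the '^[[].*[]]' pattern the script checks for.
count_hyphens() {
    awk 'NR == 3 {
        if ($0 !~ /^[[].*[]]/) exit 1   # same format check as the script
        sub(/[],].*/, "")               # drop everything from the 1st "," or "]"
        n = gsub(/-/, "")               # gsub returns the number of hyphens removed
        print n
        found = 1
        exit 0
    }
    END { if (!found) exit 1 }' "$1"
}
```

For example, a file whose 3rd line is [syl-la-ble, noun] gives a count of 2, so the script would move it to the 2H subdirectory.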
Code:
#!/bin/ksh
# USAGE: mvhyphen
# DESCRIPTION:
#   This script depends on having two variables defined:
#       SRCDIR:  Absolute pathname of directory containing files to be
#                processed. Results are unspecified if there are any
#                subdirectories in this directory.
#       DESTDIR: Destination base directory. This can be an absolute
#                pathname or a pathname relative to $SRCDIR (whichever
#                of these is shorter is preferred).
#   This script moves to $SRCDIR and processes files with names
#   ending with ".txt". Files with names containing a space are
#   ignored. The 3rd line of each file is read. If that line does
#   not match the pattern '^[[].*[]]', the file is also ignored.
#   Otherwise, the number of hyphens between the '[' and the first
#   comma or ']' after that is counted. For each unique count
#   value, a list file is created in $DESTDIR named "listNH" where
#   "N" is the count value. After the lists have been created,
#   files in each list will be moved from $SRCDIR to $DESTDIR/"N"H.
#   $DESTDIR and $DESTDIR/*H will be created if they are not already
#   present.
#   This script is OS X specific. It is tuned to work within the
#   number of files awk can have open at once (stdin, stdout, and 17
#   more files) and uses the non-standard xargs -J option. This
#   script should be able to handle up to 33 different values for
#   the number of hyphens found in the 1st word in the 3rd line in
#   the files being processed.
# Initialize variables...
DESTDIR=../Syllable
SRCDIR=/Users/Nexeu/Documents/Dict
# Move to source directory and process the files found there...
cd "$SRCDIR" || exit 1
mkdir -p "$DESTDIR" || exit 2
find . -name '*.txt' ! -name '* *' |
awk -v sq="'" -v dest="$DESTDIR" '
{   # Open and read the 1st three lines of the file named on the input line.
    f = substr($0, 3)       # Discard the leading "./" from find.
    getline x < f
    getline x < f
    rc = getline x < f
    close(f)                # Close the file.
    if(rc != 1) {
        printf("Cannot read 3 lines from file: %s\n", f)
        next
    }
    if(x !~ /^[[].*[]]/) {
        printf("File line 3 bad format: %s\n", f)
        next
    }
    sub(/[],].*/, "", x)    # Discard all but 1st word...
    nh = gsub(/-/, "", x)   # count hyphens remaining on the line.
    if(!(nh in flist)) {
        # Add to the list of known counts.
        flist[nh] = sprintf("%s/list%dH", dest, nh)
        fd[nh] = ++nfd
    }
    if(fd[nh] <= 16) {
        # Write this filename directly to the appropriate list file.
        print sq f sq > flist[nh]
    } else {
        # Write the list file filename and this filename to stdout
        # to be processed by the 2nd awk in the pipeline...
        print sq flist[nh] sq f sq
    }
}' | \
# The following awk script interprets lines of the form:
#     'listfile_filename'file_to_be_moved_filename'
# (without the double quotes) as a request to add the 2nd filename to
# the list of files in the 1st filename. Other lines are copied
# directly to stdout assuming that they are diagnostics from the
# previous awk script. If more than 17 different listfile pathnames are
# found in the input, lines for those listfiles will also be copied to
# stdout (so another copy of this script can be used to create up to 17
# more listfiles without running the find and the 1st awk again).
awk -F "'" -v sq="'" -v pat="^'[^']*'[^']*'$" '
$0 ~ pat {  # Process listfile data lines...
    if(!($2 in flist) && nfd++ < 17) {
        # Add to known file list file array.
        flist[$2]
    }
    if($2 in flist) {
        print sq $3 sq > $2
        next
    }
    print "Too many list files to process..."
}
1'
# Now that all of the list files have been created, create the destination
# directories and move the files included in the list files into them.
for listpath in "$DESTDIR"/list*H
do  dirpath="${listpath%%list*}${listpath##*list}"
    printf 'Processing list file: "%s"\n' "$listpath"
    mkdir -p "$dirpath"
    xargs -J '#' mv '#' "$dirpath" < "$listpath" && rm "$listpath"
#   xargs -J '#' -t mv '#' "$dirpath" < "$listpath" && rm "$listpath"
done
When tested on a MacBook Pro running OS X Yosemite 10.10.3, it did what I expected with a couple of hundred files with 35 different hyphen counts. Obviously, it has not been tested in an environment with 690,000 files.
If you want it to provide a verbose list of the mv commands it runs while moving files from .../Dict to subdirectories of .../Syllable, uncomment the next-to-last line in the script and comment out the line before it.
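The recovery step described earlier (saving the output and feeding the overflow lines to just the 2nd awk script) could be sketched roughly as follows. It has not been tried on the real data. The function name process_overflow and the file name mvhyphen.out are hypothetical, and the sketch assumes the output of the earlier run was saved with something like mvhyphen > mvhyphen.out.

```shell
# Sketch only: rebuild list files from overflow lines saved by an
# earlier run. $1 is the saved output file (hypothetical name, e.g.
# mvhyphen.out). Must be run from $SRCDIR so that relative list file
# pathnames such as ../Syllable/list40H resolve correctly.
process_overflow() {
    # Keep only lines that start and end with a single quote; the
    # other lines are diagnostics from the earlier run.
    grep "^'.*'\$" "$1" |
    # This is the 2nd awk script from the original pipeline, unchanged.
    awk -F "'" -v sq="'" -v pat="^'[^']*'[^']*'$" '
    $0 ~ pat {
        if(!($2 in flist) && nfd++ < 17) {
            flist[$2]
        }
        if($2 in flist) {
            print sq $3 sq > $2
            next
        }
        print "Too many list files to process..."
    }
    1'
}
```

After running process_overflow mvhyphen.out from $SRCDIR, the final for loop from the script can be run as-is to create the directories and move the listed files. Anything the function still prints belongs to list files beyond the next 17 and needs another pass.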
Good luck!