Move File Containing More Than two "-" at 3rd Line To New Directory
Post 302944193 by Don Cragun on Sunday 17th of May 2015 03:18:21 AM
No, you can't look at the names of files and magically guess how many hyphens are in the 3rd lines of those files. And, as noted before, using find | ... | mv ... may miss files, depending on the filesystem type, when you have a directory with this many files in it.

If I have correctly understood what you want to do, the following script will move *.txt files whose names do not contain any space characters from /Users/Nexeu/Documents/Dict to subdirectories under /Users/Nexeu/Documents/Syllable. The target directory and its subdirectories will be created if they do not already exist.

The script will print errors if the files being processed have more than 33 different values for the number of hyphens on the 3rd line. If you save the output containing those errors, extract the lines from that output that start and end with a ' character, and feed those lines into a modified script that runs in $SRCDIR and runs only the 2nd awk script, it will create list files for another 17 target subdirectories, and the last part of the script will then use those list files to move those files into the proper target directories. Or, you can just run the entire script again to process up to 33 more different hyphen counts (but that will take longer if there are still lots of files to process).

Code:
#!/bin/ksh
# USAGE: mvhyphen
# DESCRIPTION:
#	This script depends on having two variables defined:
#	SRCDIR:	Absolute pathname of directory containing files to be
#		processed.  Results are unspecified if there are any
#		subdirectories in this directory.
#	DESTDIR:Destination base directory.  This can be an absolute
#		pathname or a pathname relative to $SRCDIR.  (whichever
#		of these is shorter is preferred.)

#	This script moves to $SRCDIR and processes files with names
#	ending with ".txt".  Files with names containing a space are
#	ignored.  The 3rd line of each file is read.  If that line does
#	not match the pattern '^[[].*[]]', the file is also ignored.
#	Otherwise, the number of hyphens between the '[' and the first
#	comma or ']' after that are counted.  For each unique count
#	value, a list file is created in $DESTDIR named "listNH" where
#	"N" is the count value.  After the lists have been created,
#	files in each list will be moved from $SRCDIR to $DESTDIR/"N"H.
#	$DESTDIR and $DESTDIR/*H will be created if they are not already
#	present.

#	This script is OS X specific.  It is tuned to work within the
#	number of files awk can have open at once (stdin, stdout, and 17
#	more files) and uses the non-standard xargs -J option.  This
#	script should be able to handle up to 33 different values for
#	the number of hyphens found in the 1st word in the 3rd line in
#	the files being processed.

# Initialize variables...
DESTDIR=../Syllable
SRCDIR=/Users/Nexeu/Documents/Dict

# Move to source directory and process the files found there...
cd "$SRCDIR" || exit 1
mkdir -p "$DESTDIR" || exit 2
find . -name '*.txt' ! -name '* *' |
awk -v sq="'" -v dest="$DESTDIR" '
{	# Open and read the 1st three lines of the file named on the input line.
	f = substr($0, 3)	# Discard the leading "./" from find.
	getline x < f
	getline x < f
	rc = getline x < f
	close(f)		# Close the file.
	if(rc != 1) {
		printf("Cannot read 3 lines from file: %s\n", f)
		next
	}
	if(x !~ /^[[].*[]]/) {
		printf("File line 3 bad format: %s\n", f)
		next
	}
	sub(/[],].*/, "", x)	# Discard all but 1st word...
	nh = gsub(/-/, "", x)	# count hyphens remaining on the line.
	if(!(nh in flist)) {
		# Add to the list of known counts.
		flist[nh] = sprintf("%s/list%dH", dest, nh)
		fd[nh] = ++nfd
	}
	if(fd[nh] <= 16) {
		# Write this filename directly to the appropriate list file.
		print sq f sq > flist[nh]
	} else {# Write the list file filename and this filename to stdout
		# to be processed by the 2nd awk in the pipeline...
		print sq flist[nh] sq f sq
	}
}' | \
# The following awk script interprets lines of the form:
#	'listfile_filename'file_to_be_moved_filename'
# as a request to add the 2nd filename to the list of files in the 1st
# filename.  Other lines are copied directly to stdout assuming that
# they are diagnostics from the previous awk script.  If more than 17
# different listfile pathnames are found in the input, lines for those
# listfiles will also be copied to stdout (so another copy of this
# script can be used to create up to 17 more listfiles without running
# the find and the 1st awk again).
awk -F "'" -v sq="'" -v pat="^'[^']*'[^']*'$" '
$0 ~ pat { # Process listfile data lines...
	if(!($2 in flist) && nfd++ < 17) {
		# Add to known file list file array.
		flist[$2]
	}
	if($2 in flist) {
		print sq $3 sq > $2
		next
	}
	print "Too many list files to process..."
}
1'

# Now that all of the list files have been created, create the destination
# directories and move the files included in the list files into them.
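for listpath in "$DESTDIR"/list*H
do	[ -f "$listpath" ] || continue	# Skip the unexpanded pattern if no list files exist.
	# Strip the "list" prefix from the last pathname component only.
	dirpath="${listpath%/list*}/${listpath##*/list}"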
	printf 'Processing list file: "%s"\n' "$listpath"
	mkdir -p "$dirpath"
	xargs -J '#' mv '#' "$dirpath" < "$listpath" && rm "$listpath"
#	xargs -J '#' -t mv '#' "$dirpath" < "$listpath" && rm "$listpath"
done

When tested on a MacBook Pro running OS X Yosemite 10.10.3, it did what I expected with a couple of hundred files with 35 different hyphen counts. Obviously, it has not been tested in an environment with 690,000 files.
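
If you want to check the counting rule by hand before turning the script loose on real data, the per-line logic from the 1st awk script can be pulled out into a small function. This is only a sketch: the function name count_hyphens and the sample lines are made up for illustration, and I am assuming your 3rd lines look something like the first sample.

```shell
# count_hyphens LINE
# Print the number of hyphens between the leading '[' and the first
# ',' or ']' on LINE, using the same regular expressions as the 1st
# awk script.  Prints "bad format" and returns non-zero if LINE does
# not match the pattern '^[[].*[]]'.
count_hyphens() {
	printf '%s\n' "$1" | awk '
	{
		if($0 !~ /^[[].*[]]/) {
			print "bad format"
			exit 1
		}
		sub(/[],].*/, "")	# Discard all but 1st word...
		print gsub(/-/, "")	# count hyphens remaining on the line.
	}'
}

# Hypothetical sample lines:
#	count_hyphens '[syl-la-ble, noun]'
#	count_hyphens '[word]'
```

The first sample keeps only "[syl-la-ble" after the sub() and so counts two hyphens; hyphens that appear after the first comma or ']' are never seen by the gsub().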

If you want it to provide a verbose list of the mv commands it uses while moving files from .../Dict to subdirectories of .../Syllable, uncomment the next to the last line in the script and comment out the line before that.
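
For the case with more than 33 different hyphen counts, the second pass described before the script (feeding the saved overflow lines back through only the 2nd awk script) might look roughly like this. It is a sketch, not something I have run against your data: the function name overflow_pass and the saved output filename are hypothetical, and it assumes the list file pathnames in the saved output are relative to $SRCDIR as they are in the script above.

```shell
# overflow_pass SAVED_OUTPUT
# Keep only lines of the form 'listfile'filename' from SAVED_OUTPUT
# (dropping diagnostics) and add each filename to its list file.
# Must be run in $SRCDIR so relative list file pathnames resolve.
# Requests for more than 17 list files are copied to stdout so they
# can be saved and processed by yet another pass.
overflow_pass() {
	grep "^'[^']*'[^']*'\$" "$1" |
	awk -F "'" -v sq="'" '
	{
		if(!($2 in flist) && nfd++ < 17) {
			# Add to known list file array.
			flist[$2]
		}
		if($2 in flist) {
			print sq $3 sq > ($2)
			next
		}
		print	# Still too many list files; pass the request on.
	}'
}

# Hypothetical usage (the saved output filename is an assumption):
#	cd "$SRCDIR" || exit 1
#	overflow_pass /tmp/mvhyphen.out > /tmp/mvhyphen.out2
# and then run the "for listpath in ..." loop from the end of the
# script to move the files.
```

Since the list files from the first run are removed after a successful move, this pass creates fresh ones, and the final loop of the script can be reused unchanged.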

Good luck!