Count of files to delete doesn't match

 
# 1  
Old 11-19-2016

Good evening, I need your help, please.

I need to delete certain files dated before October 1, 2016, so I need to know how many files I'm going to delete. For instance:

Code:
ls -lrt file_20160*.lis!wc -l

But running grep -c against another file called bplist, which contains the list of all files backed up, doesn't give the same count:

Code:
grep -c file_20160 bplist.txt

The first query gives me 568 files and the second query returns 1120 records.

So before deleting, I've got to make sure I've got the right number of files to delete. What am I doing wrong?

After matching the number of files to delete, I need to create a new file by running grep against bplist.txt to classify the files to delete, this way:


Code:
for file in $(ls -lrt file_20160*.lis!awk '{print $9}')
do
grep $file bplist.txt >>filetodelete.txt
done

Is there a better and faster way to do that?


I'd appreciate your help in advance.

Last edited by Don Cragun; 11-19-2016 at 11:01 PM..
# 2  
Old 11-19-2016
Moderator's Comments:
Mod Comment To keep the forums high quality for all users, please take the time to format your posts correctly.

Please take the time while your account is in read-only mode to review the tutorial below that explains how to correctly use CODE tags. We know that you have seen this tutorial many times before, but if you continue to post without correctly marking sample input, sample output, and code segments with CODE tags, you may be permanently banned from this site...

First of all, use Code Tags when you post any code or data samples so others can easily read your code. You can easily do this by highlighting your code and then clicking on the # in the editing menu. (You can also type code tags [code] and [/code] by hand.)



Second, avoid adding color or different fonts and font size to your posts. Selective use of color to highlight a single word or phrase can be useful at times, but using color, in general, makes the forums harder to read, especially bright colors like red.

Third, be careful when you cut-and-paste, edit any odd characters and make sure all links are working properly.

Thank You.

The UNIX and Linux Forums
# 3  
Old 11-20-2016
First: Note that elements of a pipeline are separated by pipe symbols (|); not exclamation points (!). So the code you showed us in post #1 in this thread can't possibly produce the output you described.

Second: We have absolutely no idea what the format is for the data in bplist (or bplist.txt, depending on which part of your post we are to believe). We have absolutely no idea what the format is for the filenames (or pathnames) being processed.

Third: You have not explained why you need to count files to be removed instead of just identifying files to be removed and removing them.

Fourth: You have not given us any indication whether there are duplicates in one or both of your lists, whether files in one list are different than files in the other list, nor if there is any indication that there is a problem with the contents of either list (other than that the line counts are different).

Fifth: Why use the complicated:
Code:
for file in $(ls -lrt file_20160*.lis!awk '{print $9}')

which involves creating a subshell and invoking two utilities and can fail miserably if there are any whitespace characters in any of your filenames, when:
Code:
for file in file_20160*.lis

would be MUCH faster and, if you properly quoted the expansion (i.e., "$file") in your for loop, suffers none of the problems possible in your current loop.
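To illustrate the difference, here is a minimal sketch that counts the matching files both ways. The scratch directory and the file names are made up for the demo (one name deliberately contains a space), and it assumes a POSIX-style sh or bash:

```shell
# Sketch: count matching files without parsing ls output.
# The scratch directory and file names are invented for this demo.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file_20160101.lis "file_20160102 copy.lis" file_20161101.lis

# The shell expands the glob into one argument per file, spaces included.
set -- file_20160*.lis
if [ -e "$1" ]; then
  glob_count=$#
else
  glob_count=0        # nothing matched; the pattern was left unexpanded
fi

# Word-splitting the output of ls breaks the name that contains a space.
split_count=0
for word in $(ls file_20160*.lis); do
  split_count=$((split_count + 1))
done

echo "glob: $glob_count  ls+split: $split_count"
cd / && rm -rf "$demo_dir"
```

Only two files match file_20160*.lis here, yet the ls-based loop runs three times, because the name with a space is split into two words.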
# 4  
Old 11-20-2016
Further to Don's remarks, if you are using file name expansion with the for loop, the first attempt might look like this:

Code:
for file in file_20160*.lis
do
  grep "$file" bplist.txt >>filetodelete.txt
done

It is important to test for the case when zero files fit the pattern; otherwise the variable ends up containing the literal string file_20160*.lis, which grep would then interpret as a regular expression, like so:

Code:
grep file_20160*.lis bplist.txt >>filetodelete.txt

which would then put into filetodelete.txt any line of bplist.txt that contains "file_2016" followed by zero or more zeroes, any single character, and "lis".

Now those files probably do not exist in your case, but it is best to avoid a possible loophole altogether by testing whether the file exists and by using string matching instead of regex matching, with grep's -F option. To avoid partial file name matches (where the string that grep is looking for is only part of a longer file name), another important option is -x, which forces whole-line matches. A third thing is to avoid the possibility that file names starting with a - sign are interpreted as options to grep. One way to prevent this is the -- flag, which marks the end of the options. Because your file pattern starts with "file", that will not be an issue here, but it is good practice to do it anyway, so that if you ever change the pattern so that it starts with an *, this will not break things.

So then it becomes:
Code:
for file in file_20160*.lis
do
  if [ -f "$file" ]; then 
    grep -Fx -- "$file" bplist.txt >>filetodelete.txt
  fi
done
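A small sketch of what -F and -x change, run against an invented three-line list (the file names and the temporary file are assumptions for the demo, and a grep that accepts -F and -x is assumed):

```shell
# Sketch: regex match vs. fixed-string match vs. fixed-string whole-line match.
# The list contents below are made up for illustration.
list=$(mktemp)
printf '%s\n' 'file_20160101.lis' 'file_20160101.lis.bak' 'file_20160101xlis' > "$list"
name='file_20160101.lis'

regex_hits=$(grep -c -- "$name" "$list")      # "." matches any character
fixed_hits=$(grep -cF -- "$name" "$list")     # literal string, anywhere in the line
exact_hits=$(grep -cFx -- "$name" "$list")    # literal string, must be the whole line

echo "regex=$regex_hits fixed=$fixed_hits exact=$exact_hits"
rm -f "$list"
```

With this list the three counts come out as 3, 2 and 1: the plain regex also matches "file_20160101xlis", -F still matches the ".bak" line, and only -Fx selects the single exact entry.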

Now the last thing here is that you are appending to the file, probably out of necessity, since otherwise the file would be overwritten on every loop iteration. An alternative would be to redirect the output of the loop itself, so the file is only opened once and you do not have to delete the file prior to running the loop:

Code:
for file in file_20160*.lis
do
  if [ -f "$file" ]; then
    grep -Fx -- "$file" bplist.txt
  fi
done > filetodelete.txt

One last thing. This is still an expensive way to do it because an external program in a subshell is used to perform the operations for every iteration in the for loop, which is resource intensive.

An alternative would be to use a pipe (|) and grep's - operand for stdin, which most greps (but not all) will honor, together with the pattern-file option -f:
Code:
ls file_20160*.lis | grep -Fxf -  -- bplist.txt > filetodelete.txt

if there are not too many files in the directory.

Or use the more robust:
Code:
for file in file_20160*.lis
do
  if [ -f "$file" ]; then
    printf "%s\n" "$file"
  fi
done |
grep -Fxf -  -- bplist.txt > filetodelete.txt

Since the - operand for stdin is not universally supported in grep, another way would be to use process substitution ( <( ... ) ), which is available in modern bash, ksh93 or zsh:
Code:
grep -Fxf <(ls file_20160*.lis) -- bplist.txt > filetodelete.txt

if there are not too many files in the directory.

Or again the more robust variety:
Code:
grep -Fxf <(
  for file in file_20160*.lis
  do
    if [ -f "$file" ]; then
      printf "%s\n" "$file"
    fi
  done
) -- bplist.txt > filetodelete.txt
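If neither "-f -" nor process substitution is available, a temporary pattern file is a more portable fallback. This is only a sketch: the scratch directory and sample names are invented, and it assumes bplist.txt holds one bare file name per line:

```shell
# Sketch: write the names to a temporary file, then run grep once with -f.
# Directory, file names and list contents are made up for this demo.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch file_20160101.lis file_20160202.lis
printf '%s\n' file_20160101.lis file_20160202.lis file_20161231.lis > bplist.txt

names=$(mktemp)
for file in file_20160*.lis
do
  if [ -f "$file" ]; then
    printf '%s\n' "$file"
  fi
done > "$names"

# One grep process handles all names, however many there are.
grep -Fx -f "$names" -- bplist.txt > filetodelete.txt
matched=$(wc -l < filetodelete.txt | tr -d ' ')
echo "$matched"

rm -f "$names"
cd / && rm -rf "$demo_dir"
```

Comparing the resulting line count with the number of files the glob matched gives the sanity check asked for in post #1.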

# 5  
Old 11-26-2016
OK, I will take your recommendations into account. Thank you very much, every one of you, for the options you gave me to get what I want.

First, you were right: the bplist file had duplicated lines, and that's why the number of records didn't match. To remove the duplicates I typed:


Code:
sort List-prosclbt00c-xpbatch-01112016_Q607965.txt|uniq > List-prosclbt00c-xpbatch-01112016_Q607965_nd.txt
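As a rough illustration of how duplicates inflate a line count, the following sketch compares total, distinct and repeated lines on an invented five-line list (the contents and the temporary file are assumptions for the demo):

```shell
# Sketch: total lines vs. distinct lines vs. lines that occur more than once.
# The list contents are made up for illustration.
list=$(mktemp)
printf '%s\n' 'a.lis' 'b.lis' 'a.lis' 'c.lis' 'b.lis' > "$list"

total=$(wc -l < "$list" | tr -d ' ')
unique=$(sort -u "$list" | wc -l | tr -d ' ')         # same result as sort | uniq
repeated=$(sort "$list" | uniq -d | wc -l | tr -d ' ') # names that appear more than once

echo "total=$total unique=$unique repeated=$repeated"
rm -f "$list"
```

Here the list has 5 lines but only 3 distinct names, 2 of which are repeated.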

The deduplicated file has this format:


Code:
-r   1452 Nov 01 10:10 /produccion/explotacion/xpbatch/SHELL_PLAN_FAMILIA-SEA_MOVIMIENTO/logs/LogMonitoreoSeaMovimiento20150601_220000.log
-r--r--r-- xpbatch   explotaci          40 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/LICENSE
-r--r--r-- xpbatch   explotaci          40 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/LICENSE
-r--r--r-- xpbatch   explotaci          46 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/README
-r--r--r-- xpbatch   explotaci         113 Nov 01 10:10 /produccion/explotacion/xpbatch/local.cshrc
-r--r--r-- xpbatch   explotaci         159 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/README.html
-r--r--r-- xpbatch   explotaci         580 Nov 01 10:10 /produccion/explotacion/xpbatch/local.profile
-r--r--r-- xpbatch   explotaci         607 Nov 01 10:10 /produccion/explotacion/xpbatch/local.login
-r--r--r-- xpbatch   explotaci         632 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/lib/cmm/GRAY.pf
-r--r--r-- xpbatch   explotaci         955 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/Welcome.html
-r--r--r-- xpbatch   explotaci        1044 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/lib/cmm/LINEAR_RGB.pf
-r--r--r-- xpbatch   explotaci        2856 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/lib/management/jmxremote.password.template
-r--r--r-- xpbatch   explotaci        3144 Nov 01 10:10 /produccion/explotacion/xpbatch/CreacionEnvioContratos/jdk1.8.0_45/jre/lib/cmm/sRGB.pf

Now I wanted to search for some files in bplist, but it seems these grep options are not supported (Sun operating system); it yields errors, i.e.:

Code:
for Archivo in log_HistoricoRecargas??062016*.log*
do
  if [ -f "$Archivo" ]; then
    grep  -Fx -- "$Archivo" List-prosclbt00c-xpbatch-01112016_Q607965_nd.txt
  fi
done > Archivotodelete.txt

grep: illegal option -- F
grep: illegal option -- x
Usage: grep -hblcnsviw pattern file . . .
grep: illegal option -- F

So I removed the grep options and ran the script, but it didn't work because it didn't find any records:

Code:
for Archivo in log_HistoricoRecargas??062016*.log*
do
  if [ -f "$Archivo" ]; then
    grep  "$Archivo" List-prosclbt00c-xpbatch-01112016_Q607965_nd.txt
  fi
done > Archivotodelete.txt

Code:
SCEL /SCEL/logs1/xpbatch #ls -lrt Archivotodelete.txt
-rw-r--r--   1 xpbatch  explotacion       0 Nov 25 21:47 Archivotodelete.txt

That's because grep is looking for the ?? and * characters of the pattern literally.

But listing the files shows that they really do exist:

Code:
ls log_HistoricoRecargas??062016*.log*|wc -l
24220

I appreciate your help in advance.
# 6  
Old 11-26-2016
On Solaris/SunOS systems use /usr/xpg4/bin/grep -Fx or fgrep -x instead of grep -Fx.
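A minimal sketch of picking the right grep in a script that has to run on both kinds of system. The GREP variable name and the sample list are assumptions for the demo; the only thing taken from above is the /usr/xpg4/bin/grep path:

```shell
# Sketch: prefer the XPG4 grep where it exists (Solaris/SunOS),
# otherwise fall back to whatever grep is first in PATH.
if [ -x /usr/xpg4/bin/grep ]; then
  GREP=/usr/xpg4/bin/grep
else
  GREP=grep
fi

# Invented sample list to show that -F and -x are accepted.
list=$(mktemp)
printf '%s\n' 'log_a.log' 'log_a.log.gz' > "$list"
exact=$("$GREP" -Fx -- 'log_a.log' "$list")
echo "$exact"
rm -f "$list"
```

Only the line that is exactly "log_a.log" is printed; the ".gz" entry is skipped because of -x.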
# 7  
Old 11-27-2016
Thanks, but grep and fgrep don't interpret special characters like * and ??.
For example, this search doesn't list anything:

Code:
fgrep -x log_HistoricoRecargas??062016*.log*  List-prosclbt00c-logs1-01112016_Q607965_sindupli.txt

I appreciate your help in advance.

---------- Post updated 11-27-16 at 01:39 AM ---------- Previous update was 11-26-16 at 09:35 PM ----------

Or should I use egrep, which supports wildcard patterns, instead?

Do Solaris/SunOS systems support egrep?

Thanks for your support in advance.
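For what it is worth, none of grep, fgrep or egrep understands shell wildcards: egrep uses extended regular expressions, in which ? and * are repetition operators, and an unquoted pattern such as log_HistoricoRecargas??062016*.log* is expanded by the shell into file names before the command even starts. Also, because each line of the list shown in post #5 ends in a full path, a whole-line match (-x) against a bare file name cannot succeed. One possible approach, sketched below under stated assumptions, is to let the shell expand the pattern into a name list and then make a single awk pass that compares the last path component of each list line. The directory, file names, list.txt and names.txt are all invented for the demo, paths are assumed to contain no spaces, and on Solaris nawk or /usr/xpg4/bin/awk would probably be needed instead of the default awk:

```shell
# Sketch: match list entries by the base name of their last field.
# All names and paths below are made up for this demo.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
touch log_HistoricoRecargas01062016_a.log log_HistoricoRecargas02062016_b.log.gz
cat > list.txt <<'EOF'
-r--r--r-- xpbatch   explotaci         113 Nov 01 10:10 /SCEL/logs1/xpbatch/log_HistoricoRecargas01062016_a.log
-r--r--r-- xpbatch   explotaci         580 Nov 01 10:10 /SCEL/logs1/xpbatch/log_HistoricoRecargas02062016_b.log.gz
-r--r--r-- xpbatch   explotaci         607 Nov 01 10:10 /SCEL/logs1/xpbatch/local.login
EOF

# The shell, not grep, expands the wildcards into real file names.
for Archivo in log_HistoricoRecargas??062016*.log*
do
  if [ -f "$Archivo" ]; then
    printf '%s\n' "$Archivo"
  fi
done > names.txt

# First file: remember every wanted name. Second file: print the lines
# whose last field, stripped of its directory part, is a wanted name.
awk '
  NR == FNR { wanted[$0] = 1; next }
  {
    n = split($NF, part, "/")
    if (part[n] in wanted) print
  }
' names.txt list.txt > Archivotodelete.txt

matched=$(wc -l < Archivotodelete.txt | tr -d ' ')
echo "$matched"
cd / && rm -rf "$demo_dir"
```

This reads the list once instead of starting one grep per file, which matters with some 24,000 matching files. Note that the NR == FNR idiom misbehaves if names.txt is empty, so a real script should check for that first.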