awk error when increasing number of files in folder


 
# 1  
Old 07-01-2015

I have a folder with several files, and I want to remove from each file every term that all of the files have in common, using `awk`.
Here is the script that I have been using:

Code:
    awk '                
    FNR==1 {
        if (seen[FILENAME]++) {
            firstPass = 0
            outfile = FILENAME "_new"
        }
        else {
            firstPass = 1
            numFiles++
            ARGV[ARGC++] = FILENAME
        }
    }
    firstPass { count[$2]++; next }
    count[$2] != numFiles { print > outfile }
    ' *


An example of the information in the files would be:

Code:
File1

    3	oil 
    4	and  
    8	vinegar

Code:
File2

    4	hot  
    2	and  
    9	cold

The output should be:

Code:
    File1_new
    
        3	oil   
        8	vinegar

Code:
    File2_new
    
        4	hot  
        9	cold

It works when I use a small number of files (e.g. 10), but when I increase that number, I get the following error message:

awk: file20_new makes too many open files input record number 27, file file20_new source line number 14


Why does this error occur when I use a larger number of files?

Last edited by owwow14; 07-01-2015 at 11:58 AM..
# 2  
Old 07-01-2015
Try using the close function once you are done with each file.
See...

regex - awk error "makes too many open files" - Stack Overflow
# 3  
Old 07-01-2015
Quote:
Originally Posted by blackrageous
Try using the close function once you are done with each file.
See...

regex - awk error "makes too many open files" - Stack Overflow
I followed your suggestion and closed the file as described in that post, but I get the same error:

Code:
awk '                
FNR==1 {
    if (seen[FILENAME]++) {
        firstPass = 0
        close(outfile = FILENAME "_new")
    }
    else {
        firstPass = 1
        numFiles++
        ARGV[ARGC++] = FILENAME
    }
}
firstPass { count[$2]++; next }
count[$2] != numFiles { print > outfile }
' *
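
Note that `close(outfile = FILENAME "_new")` assigns the new name first and then closes that name, which has not been opened for output yet, so the previously written file stays open. Below is a minimal sketch of a variant that closes the previous output file before switching to the next one. The temporary directory and the variable `demo_dir` are assumptions for illustration; the two sample files are the ones from the first post.

```shell
# Hypothetical demo setup: a scratch directory holding the two sample files.
demo_dir=$(mktemp -d)
printf '3\toil\n4\tand\n8\tvinegar\n' > "$demo_dir/File1"
printf '4\thot\n2\tand\n9\tcold\n' > "$demo_dir/File2"

(
  cd "$demo_dir" && awk '
  FNR==1 {
      if (seen[FILENAME]++) {
          # Second pass: close the previous output file (if any)
          # before pointing outfile at the next one.
          if (outfile != "") close(outfile)
          firstPass = 0
          outfile = FILENAME "_new"
      }
      else {
          # First pass: queue the file to be read a second time.
          firstPass = 1
          numFiles++
          ARGV[ARGC++] = FILENAME
      }
  }
  firstPass { count[$2]++; next }
  count[$2] != numFiles { print > outfile }
  ' File1 File2
)

cat "$demo_dir/File1_new" "$demo_dir/File2_new"
```

With this ordering, at most one output file is open at a time. When the script is run with `*` as in the original, keep in mind that `*` also matches any `_new` files left in the folder by an earlier run.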

# 4  
Old 07-01-2015
Please help me out - what's the purpose of
Code:
            ARGV[ARGC++] = FILENAME

?

---------- Post updated at 17:14 ---------- Previous update was at 17:13 ----------

I'm afraid it will blow up your input file list...

---------- Post updated at 18:33 ---------- Previous update was at 17:14 ----------


OK, I get it now. It appends every file name exactly once to the argument list, so awk reads each file a second time once the word counts from the first pass are complete.
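
The two-pass effect of `ARGV[ARGC++] = FILENAME` can be seen in isolation with a small sketch. The file names `one` and `two`, the scratch directory, and the variables `demo_dir` and `pass_log` are made up for illustration.

```shell
# Hypothetical demo setup: two tiny input files in a scratch directory.
demo_dir=$(mktemp -d)
printf 'a\nb\n' > "$demo_dir/one"
printf 'c\n' > "$demo_dir/two"

pass_log=$(cd "$demo_dir" && awk '
FNR==1 {
    if (seen[FILENAME]++) {
        pass = 2                    # file name already seen: second read
    }
    else {
        pass = 1                    # first read: queue the file once more
        ARGV[ARGC++] = FILENAME
    }
    print "pass " pass ": " FILENAME
}
' one two)

printf '%s\n' "$pass_log"
# pass 1: one
# pass 1: two
# pass 2: one
# pass 2: two
```

Each file is appended only on its first read, so the argument list grows by exactly one entry per file and the run ends after the second pass.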

Last edited by RudiC; 07-01-2015 at 01:36 PM..
# 5  
Old 07-01-2015
Is the given algorithm correct?
If only the words unique to each file should be printed, shouldn't it be:
Code:
awk '
FNR==1 {
  # close the previous file
  if (NR!=1) close(fname)
  fname=FILENAME
}
# main code
{ total[$2]++; perfile[fname,$2]++ }
END {
  for (fw in perfile) {
    split (fw,idx,SUBSEP)
    f=idx[1]; w=idx[2]
    if (perfile[fw]==total[w]) print f,w
  }
}
' *

The fix for the error is the first block, which closes the previous file; in the next block, simply replace every FILENAME with fname.
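
To make the difference between the two algorithms concrete, here is a small sketch that runs the script above on three made-up files (`A`, `B`, `C`, in a hypothetical scratch directory). The word "hot" appears in two of the three files: the original script would keep it, because it only drops words found in every file, while this version drops it, because it only prints words found in a single file. The output is piped through `sort` because the order of a `for (... in ...)` loop is unspecified.

```shell
# Hypothetical demo setup: "and" is in all three files, "hot" in two.
demo_dir=$(mktemp -d)
printf '1\tand\n1\toil\n1\thot\n' > "$demo_dir/A"
printf '1\tand\n1\thot\n' > "$demo_dir/B"
printf '1\tand\n1\tcold\n' > "$demo_dir/C"

unique_words=$(cd "$demo_dir" && awk '
FNR==1 {
  # close the previous file
  if (NR!=1) close(fname)
  fname=FILENAME
}
# count each word overall and per file
{ total[$2]++; perfile[fname,$2]++ }
END {
  for (fw in perfile) {
    split (fw,idx,SUBSEP)
    f=idx[1]; w=idx[2]
    # print the word only if all of its occurrences are in this file
    if (perfile[fw]==total[w]) print f,w
  }
}
' A B C | LC_ALL=C sort)

printf '%s\n' "$unique_words"
# A oil
# C cold
```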