Bash-awk to process thousands of files: Post 302926845 by Don Cragun on Thursday 27th of November 2014 09:20:51 PM
What OS (including version) are you using? (If you don't know, show us the output from the command: uname -a).

What have you tried to solve this problem?

Are the target directories you mentioned to be created in the directory that contains these files or in a different directory?

Are the output files from running the awk commands to be placed in the directory that originally contained the files, in the directory where the files being processed by each awk command have been moved, or in some other directory?

Do the directories to which the files are to be moved already exist? If so, are other files (that are not to be processed by the awk command for the files to be moved to that directory) in those directories?

What is the maximum number of files that could be moved into one of these target directories? (Or, more importantly, will invoking awk with the awk script and the absolute pathnames of all of the moved files run into ARG_MAX limits? If there are enough files that this could be an issue, will the output from the commands:
Code:
cat FILE-2014-10-30-* | awk 'your awk script' > Codes-2014-10-30.txt

and:
Code:
awk 'your awk script' FILE-2014-10-30-* > Codes-2014-10-30.txt

be different?)
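
To make the last question concrete, here is a minimal sketch of one way the two forms can disagree. It assumes an awk script that depends on per-file state (here `FNR == 1`, chosen purely for illustration; it is not the script from this thread) and two small hypothetical input files created in a temporary directory. It also prints the local ARG_MAX value so it can be compared against the total length of the pathnames involved.

```shell
#!/bin/sh
# Illustrative only: the file contents and the awk script are made up for this demo.
tmpdir=$(mktemp -d)
trap 'rm -rf "$tmpdir"' EXIT

printf 'a\nb\n' > "$tmpdir/FILE-2014-10-30-1"
printf 'c\nd\n' > "$tmpdir/FILE-2014-10-30-2"

# Limit (in bytes) on the combined size of arguments and environment
# passed to a newly executed program on this system.
getconf ARG_MAX

# Piped form: awk sees one continuous stream on standard input,
# so FNR is reset only once and FILENAME carries no per-file name.
piped=$(cat "$tmpdir"/FILE-2014-10-30-* | awk 'FNR == 1')

# File-operand form: awk resets FNR at the start of every file.
direct=$(awk 'FNR == 1' "$tmpdir"/FILE-2014-10-30-*)

printf 'piped:\n%s\n' "$piped"
printf 'direct:\n%s\n' "$direct"
```

If the awk script never looks at FNR, FILENAME, or file boundaries, the two forms should produce the same output. Note also that any workaround that splits the file list into batches (for example with xargs) runs awk more than once, so scripts that rely on NR or an END block would need the same kind of check.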
 
