Post 302926844 by Ophiuchus on Thursday 27th of November 2014 07:55:45 PM
Bash-awk to process thousands of files

Hi to all,


I have thousands of files in a folder, named in the format "FILE-YYYY-MM-DD-HHMM", on which I want to run the following AWK command:
Code:
awk '/Code.*/' FILE-2014*

I'd like to move all files that share the same date into a folder named after that date. For example, if I have these files:


FILE-2014-10-30-1750
FILE-2014-10-30-2130
FILE-2014-10-31-2330
FILE-2014-11-02-0520
FILE-2014-11-02-1500
FILE-2014-11-02-1815
FILE-2014-11-12-1345


- I want to send "FILE-2014-10-30-1750" and "FILE-2014-10-30-2130" to folder "FILES-2014-10-30"
- I want to send "FILE-2014-10-31-2330" to folder "FILES-2014-10-31"
- I want to send "FILE-2014-11-02-0520", "FILE-2014-11-02-1500" and "FILE-2014-11-02-1815" to folder "FILES-2014-11-02"
- I want to send "FILE-2014-11-12-1345" to folder "FILES-2014-11-12"
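
A minimal sketch of this grouping step in bash, assuming every name follows "FILE-YYYY-MM-DD-HHMM" exactly and all files sit in one folder. The function name group_by_date and the temporary demo folder are only illustrative, not part of the original setup. The date is cut out of the file name with shell parameter expansion, so no external tool is needed per file.

```shell
#!/usr/bin/env bash
# Sketch: move FILE-YYYY-MM-DD-HHMM files into FILES-YYYY-MM-DD folders.
# group_by_date is a hypothetical helper name used for illustration.
group_by_date() {
    local dir=$1 f name date
    for f in "$dir"/FILE-????-??-??-????; do
        [ -f "$f" ] || continue      # skip when the glob matched nothing
        name=${f##*/}                # strip the path: FILE-2014-10-30-1750
        date=${name#FILE-}           # strip the prefix: 2014-10-30-1750
        date=${date%-*}              # strip the time:   2014-10-30
        mkdir -p "$dir/FILES-$date"
        mv -- "$f" "$dir/FILES-$date/"
    done
}

# Demo on empty throwaway files in a temporary folder (assumed names).
demo=$(mktemp -d)
touch "$demo"/FILE-2014-10-30-1750 "$demo"/FILE-2014-10-30-2130 \
      "$demo"/FILE-2014-10-31-2330 "$demo"/FILE-2014-11-12-1345
group_by_date "$demo"
ls "$demo"
```

Because the loop expands the glob inside the shell instead of passing every name to a single external command, it should not run into the "argument list too long" limit even with thousands of files.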


Once the files are stored in their respective folders, I want to run the AWK command above and generate one output file per date, for example:


- The lines matched by the awk command in the files dated "2014-10-30" should be stored in the file "Codes-2014-10-30.txt"
- The lines matched by the awk command in the files dated "2014-10-31" should be stored in the file "Codes-2014-10-31.txt", and so on.
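
A rough sketch of this per-date step, assuming the "FILES-YYYY-MM-DD" folders already exist and that each "Codes-YYYY-MM-DD.txt" should be written next to those folders (the post does not say where the output goes). The function name collect_codes and the sample file contents are made up for the demo. The awk pattern is the one from the command above.

```shell
#!/usr/bin/env bash
# Sketch: for each FILES-YYYY-MM-DD folder, write the lines matched by
# awk '/Code.*/' to Codes-YYYY-MM-DD.txt in the parent folder.
# collect_codes is a hypothetical helper name used for illustration.
collect_codes() {
    local dir=$1 d name date
    for d in "$dir"/FILES-????-??-??/; do
        [ -d "$d" ] || continue      # skip when the glob matched nothing
        d=${d%/}                     # drop the trailing slash
        name=${d##*/}                # FILES-2014-10-30
        date=${name#FILES-}          # 2014-10-30
        # One awk call per folder; a per-date folder holds far fewer
        # files than the whole set, which keeps the argument list short.
        awk '/Code.*/' "$d"/FILE-* > "$dir/Codes-$date.txt"
    done
}

# Demo with made-up file contents in a temporary folder.
demo=$(mktemp -d)
mkdir -p "$demo/FILES-2014-10-30" "$demo/FILES-2014-10-31"
printf 'Code A1\nother\n' > "$demo/FILES-2014-10-30/FILE-2014-10-30-1750"
printf 'noise\nCode B2\n' > "$demo/FILES-2014-10-30/FILE-2014-10-30-2130"
printf 'Code C3\n'        > "$demo/FILES-2014-10-31/FILE-2014-10-31-2330"
collect_codes "$demo"
cat "$demo/Codes-2014-10-30.txt"
```

The files in each folder are read in glob (alphabetical) order, which for these names is also chronological order.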


Could somebody help me achieve this, please?


Thanks in advance.
 
