Full Discussion: awk to parse huge files
Forum: Shell Programming and Scripting
Post 302852739 by ctsgnb, Thursday 12 September 2013, 10:47:46 AM
When you need to process a huge amount of data, it is advisable to use the tool that was designed for exactly that kind of task: a database.
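
A minimal sketch of that approach, assuming the data is a plain pipe-delimited text file (the names data.txt and huge.db and the column layout below are made up for illustration): load the file into SQLite once, index it, and run the queries there instead of rescanning the flat file with awk.

# Load a pipe-delimited file into an SQLite database, index it, and query
# it there; data.txt, huge.db and the column names are hypothetical.
sqlite3 huge.db <<'EOF'
CREATE TABLE records (id TEXT, name TEXT, ref TEXT, dt TEXT);
.mode list
.separator |
.import data.txt records
CREATE INDEX idx_records_id ON records(id);
SELECT COUNT(*) FROM records WHERE dt LIKE '%2008';
EOF

The one-time cost of importing and indexing pays off as soon as the same data has to be queried more than once.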
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Compare 2 huge files wrt to a key using awk

Hi Folks, I need to compare two huge files (i.e. the files would contain a minimum of 70k records each) using awk or sed. The comparison needs to be done with respect to a 'key'. For example: File1 ********** 1234|TONY|Y75634|20/07/2008 1235|TINA|XCVB56|30/07/2009... (13 Replies)
Discussion started by: Ranjani
13 Replies
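
For a key-based comparison like the one above, a common awk pattern is to load the keys of the first file into an array and then test each record of the second file against it. A sketch, assuming pipe-delimited records with the key in field 1 and that the goal is to print File2 records whose key does not appear in File1 (the exact output wanted in that thread is truncated):

# First pass (NR==FNR) records every key from File1; second pass prints
# File2 lines whose key was never seen.
awk -F'|' 'NR==FNR { seen[$1] = 1; next } !($1 in seen)' File1 File2

With 70k records per file this fits comfortably in memory, since only the keys of File1 are held, not the whole file.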

2. Shell Programming and Scripting

Comparing two huge files

Hi, I have two files, file A and file B. File A is an error file and file B is the source file. In the error file, the first line is the actual error and the second line gives information about the record (client ID) that throws the error. I need to compare the first field (which doesn't start with '//') of... (11 Replies)
Discussion started by: kmkbuddy_1983
11 Replies
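
The details of that thread are truncated, but the general shape of the task, collecting client IDs from the error file and looking them up in the source file, can be sketched as follows; the pipe delimiter, the file names and the assumption that ID lines do not start with '//' are all guesses:

# Gather client IDs from the error file, skipping '//' lines, then print
# the records of the source file that match one of those IDs.
awk -F'|' 'NR==FNR { if ($0 !~ /^\/\//) ids[$1] = 1; next } $1 in ids' fileA fileB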

3. UNIX for Advanced & Expert Users

Huge files manipulation

Hi, I need a fast way to delete duplicate entries from very huge files (>2 GB); these files are plain text. I tried all the usual methods (awk / sort / uniq / sed / grep ...) but it always ended with the same result (memory core dump). I'm using large HP-UX servers. Any advice will... (8 Replies)
Discussion started by: Klashxx
8 Replies
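
One way to keep memory use bounded on multi-gigabyte files is to let sort(1) do an external merge sort on disk rather than holding everything in RAM; a sketch, where hugefile and the temporary directory are placeholders:

# External merge sort: -T points the temporary run files at a filesystem
# with enough free space, -u drops the duplicate lines.
sort -T /var/tmp -u hugefile > hugefile.dedup

Note that this does not preserve the original line order; awk '!seen[$0]++' does, but it keeps every distinct line in memory, which is exactly what blows up on files of this size.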

4. Shell Programming and Scripting

Compare 2 folders to find several missing files among huge amounts of files.

Hi, all: I've got two folders, say, "folder1" and "folder2". Under each, there are thousands of files. It's quite obvious that there are some files missing in each, and I would just like to find them. I believe this can be done with the "diff" command. However, if I change the above question a... (1 Reply)
Discussion started by: jiapei100
1 Reply
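
A sketch of one way to do that comparison without diffing file contents, assuming a shell with process substitution (bash or ksh93) and that only the file names matter:

# List the relative file names of each folder, sorted, and show the names
# that are unique to one side; names common to both are suppressed by -3.
comm -3 <(cd folder1 && find . -type f | sort) \
        <(cd folder2 && find . -type f | sort)

Names only in folder1 appear in the first column, names only in folder2 in the second (tab-indented) column.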

5. Shell Programming and Scripting

awk script to parse results from TWO files

I am trying to parse two files and get the data that does not match in one of the columns (column 3 in my case). The data for the two files is as follows: A.txt ===== abc 10 5 0 1 16 xyz 16 1 1 0 18 efg 30 8 0 2 40 ijk 22 2 0 1 25 B.txt ===== abc... (6 Replies)
Discussion started by: roger67
6 Replies
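
For the column-3 comparison above, a sketch that keys on the first field, assuming that is how the rows of A.txt and B.txt correspond to each other:

# Remember column 3 of A.txt per key (column 1), then print the B.txt rows
# whose column 3 differs for the same key.
awk 'NR==FNR { c3[$1] = $3; next } ($1 in c3) && c3[$1] != $3' A.txt B.txt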

6. Shell Programming and Scripting

AWK failing to parse on certain files

Dear Unix Gurus, I need your expertise to help troubleshoot a certain problem I'm having. I created a shell script which will ftp-get one crash log from multiple servers (listed in a text file). Each log is then parsed by calling an awk script. The problem is that, for certain logs, its parsing... (7 Replies)
Discussion started by: tarj
7 Replies
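
That thread is truncated, so the actual cause is unknown, but one frequent reason an awk script parses some FTP-fetched logs and not others is stray DOS carriage returns from a binary-mode transfer. A quick, hedged way to check for and strip them (crash.log is a placeholder name):

# Carriage returns show up as ^M when viewed with cat -v.
head crash.log | cat -v
# Strip them before handing the log to awk.
tr -d '\r' < crash.log > crash.clean.log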

7. Shell Programming and Scripting

How to parse a huge 600MB zipped file?

I'm new to Unix and am trying to parse a huge 600MB zipped file... I need to bzcat this file once and do some calculations (word counts) on the lines based on certain criteria (see script). The correct result/output should be: column1=6, column2=4. The problem is that I'm getting column2=0 (see... (16 Replies)
Discussion started by: DeltaComp
16 Replies
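
The general pattern for that kind of job is a single decompression pass piped straight into one awk program that counts both conditions at once; the two test patterns below are placeholders for whatever criteria the real script uses:

# One pass over the decompressed stream, counting two separate conditions;
# c1+0 and c2+0 force a numeric 0 if a counter was never incremented.
bzcat huge.bz2 | awk '$1 ~ /foo/ { c1++ }
                      $2 ~ /bar/ { c2++ }
                      END { print "column1=" c1+0, "column2=" c2+0 }'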

8. Shell Programming and Scripting

awk does not work well with huge data?

Dear all, I found that if we work with thousands of lines of data, awk does not work perfectly. It will cut hundreds of lines (the others are deleted) and work only on the remaining data. I used this command: awk '$1==1{$1="Si"}{print>FILENAME}' coba.xyz to change the value of the first column whose value is 1... (4 Replies)
Discussion started by: ariesto
4 Replies
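
awk itself has no trouble with thousands of lines; the truncation described above is most likely the effect of print > FILENAME, which reopens and truncates the very file awk is still reading. A safer sketch writes to a temporary file and swaps it in afterwards:

# Never redirect output onto the file that is still being read; write to a
# temporary file first, then replace the original.
awk '$1 == 1 { $1 = "Si" } { print }' coba.xyz > coba.xyz.tmp &&
mv coba.xyz.tmp coba.xyz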

9. Shell Programming and Scripting

awk Parse And Create Multiple Files Based on Field Value

Hello: I am working on parsing a large input file which will be broken down into multiple files based on the second field in the file, in this case: STORE. The idea is to create each file with the corresponding store number, for example: Report_$STORENUM_$DATETIMESTAMP, and obtaining the... (7 Replies)
Discussion started by: ec012
7 Replies
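
A sketch of the splitting step, assuming whitespace-separated fields with the store number in field 2 and the Report_$STORENUM_$DATETIMESTAMP naming from the post (input.txt and the timestamp format are placeholders):

# Route each record to a per-store file; closing after each write avoids
# running out of file descriptors when there are many distinct stores.
awk -v ts="$(date +%Y%m%d%H%M%S)" '{
    out = "Report_" $2 "_" ts
    print >> out
    close(out)
}' input.txt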

10. Shell Programming and Scripting

Parse input of two files to be the same in awk

I have two files on which I am going to use diff to find the differences, but I need to parse them before I do that. I have included the format of each of file1 and file2 with the desired output of each (the first 5 fields in each file). The first file has a "chr" before the # that needs to be removed. I... (1 Reply)
Discussion started by: cmccabe
1 Reply
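
The formats in that post are truncated, but the normalisation step it describes, dropping the "chr" prefix in file1 and keeping only the first five fields of each file, might be sketched like this before running diff (the tab output separator is an assumption):

# Strip the leading "chr" from field 1 of file1, keep the first five fields
# of both files, then diff the normalised copies.
awk 'BEGIN { OFS = "\t" } { sub(/^chr/, "", $1); print $1, $2, $3, $4, $5 }' file1 > file1.norm
awk 'BEGIN { OFS = "\t" } { print $1, $2, $3, $4, $5 }' file2 > file2.norm
diff file1.norm file2.norm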
File::Read(3pm)                User Contributed Perl Documentation                File::Read(3pm)

NAME
       File::Read - Unique interface for reading one or more files

VERSION
       Version 0.0801

SYNOPSIS
           use File::Read;

           # read a file
           $file = read_file($path);

           # read several files
           @files = read_files(@paths);

           # aggregate several files
           $file = read_files(@paths);

           # read a file as root, skip comments and blank lines
           $file = read_file({ as_root => 1, skip_comments => 1, skip_blanks => 1 }, $path);

DESCRIPTION
       This module mainly provides functions for reading one or more files, with different
       options. See below for more details and examples.

   Rationale
       This module was created to address a quite specific need: reading many files, some as a
       normal user and others as root, eventually doing a little more processing, while staying
       compatible with Perl 5.004. "File::Slurp" addresses the first point, but not the others,
       hence the creation of "File::Read". If you don't need to read files as root or use the
       post-processing features, it's faster to use "File::Slurp" directly.

EXPORT
       By default, this module exports all the functions documented below. It also recognizes
       import options. For example,

           use File::Read 'err_mode=quiet';

       sets "read_file()"'s "err_mode" option default value to "quiet".

FUNCTIONS
   read_file()
       Reads the files given as arguments and returns their content: as a list, one element per
       file, when called in list context, or as one big chunk of text when called in scalar
       context. Options can be set using a hashref as first parameter.

       Options

       o   "aggregate" controls how the function returns the content of the files that were
           successfully read. When set to true (the default), the content is returned as a
           scalar; when set to false, it is returned as a list.

       o   "as_root" tells the function to read the given file(s) as root, using the command
           indicated by the "cmd" option.

       o   "cmd" sets the shell command used for reading files as root. Default is "sudo cat".
           Therefore you need sudo(8) and cat(1) on your system, and sudoers(5) must be set up
           so the user can execute cat(1).

       o   "err_mode" controls how the function behaves when an error occurs. Available values
           are "croak", "carp" and "quiet". Default value is "croak".

       o   "skip_comments" tells the function to remove all comment lines from the read files.

       o   "skip_blanks" tells the function to remove all blank lines from the read files.

       o   "to_ascii" tells the function to convert the text to US-ASCII using
           "Text::Unidecode". If this module is not available, non-ASCII data are deleted.

       Examples

       Just read a file:

           my $file = read_file($path);

       Read a file, returning it as a list:

           my @file = read_file({ aggregate => 0 }, $path);

       Read a file, skipping comments:

           my $file = read_file({ skip_comments => 1 }, $path);

       Read several files, skipping blank lines and comments:

           my @files = read_file({ skip_comments => 1, skip_blanks => 1 }, @paths);

   read_files()
       "read_files()" is just an alias for "read_file()" so that calls read more naturally when
       several files are involved.

DIAGNOSTICS
       "Bad value '%s' for option '%s'"
           (E) You gave a bad value for the indicated option. Please check the documentation
           for the valid values.

       "This function needs at least one path"
           (E) You called a function without giving it any argument.

SEE ALSO
       File::Slurp, IO::All

AUTHOR
       Sebastien Aperghis-Tramoni, "<sebastien at aperghis.net>"

BUGS
       Please report any bugs or feature requests to "bug-file-read at rt.cpan.org", or through
       the web interface at <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=File-Read>. I will
       be notified, and then you'll automatically be notified of progress on your bug as I make
       changes.

SUPPORT
       You can find documentation for this module with the perldoc command.

           perldoc File::Read

       You can also look for information at:

       o   AnnoCPAN: Annotated CPAN documentation - <http://annocpan.org/dist/File-Read>

       o   CPAN Ratings - <http://cpanratings.perl.org/d/File-Read>

       o   RT: CPAN's request tracker - <http://rt.cpan.org/NoAuth/Bugs.html?Dist=File-Read>

       o   Search CPAN - <http://search.cpan.org/dist/File-Read>

COPYRIGHT & LICENSE
       Copyright (C) 2006, 2007 Sebastien Aperghis-Tramoni, all rights reserved.

       This program is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.

perl v5.10.0                                 2007-10-18                          File::Read(3pm)