Sponsored Content
Top Forums Shell Programming and Scripting Efficiently altering and merging files in perl Post 302905511 by sam05121988 on Thursday 12th of June 2014 03:06:51 AM
Old 06-12-2014
Lightbulb Efficiently altering and merging files in perl

I have two files

Code:
fileA
HEADER LINE A
CommentLine A
Content A
....
....
....
TAILER A

Code:
fileB
HEADER LINE B
CommentLine B
Content B
....
....
....
TAILER B

I want to merge these two files as
Code:
HEADER LINE A
CommentLine A
Content A
....
....
....
Content B
....
....
....
TAILER B

i.e. skip the TAILER line of file A and skip the HEADER and Comment Line of fileB

I am able to do it using the below perl code
Code:
        open ( FA, "$fileA" ) || die("can't open fileA $!");
        open ( FB, "$fileB" ) || die("can't open fileB $!");
        open ( TMP, ">> tmp_file" ) || die("can't open tmp_file $!");

        #reading both files in array
        my @fileA = <FA>;
        my @fileB = <FB>;

        #getting rid of HEADER, Comment line, in fileB
        shift @fileB;
        shift @fileB;

        #getting rid of TAILER in fileA
        pop @fileA;

        my @tmp_file=(@fileA,@fileB);

        foreach ( @tmp_file ){
            print TMP $_;
        }

        close(FA);
        close(FB);
        close(TMP);
        
        rename tmp_file, fileA || die("can't rename tmp_file to fileA);

This code works fine, however I doubt it's efficiency if fileA and fileB are going to be millions of lines (which is the case)
i.e. why read whole file in arrays just to get rid of three lines (will end up using lots of memory)

Can someone suggest a more efficient way of doing this
(answers in perl only)

Last edited by sam05121988; 06-12-2014 at 04:15 AM..
 

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Issue altering end data

I have an inventory program that I would like to have the ability to go and change or alter the field data based on the item number as a key. I have the menu option set but at the end of the script process it just appends the changed data to the database rather than what I would like; which is to... (5 Replies)
Discussion started by: stlitguru
5 Replies

2. UNIX Desktop Questions & Answers

how to search files efficiently using patterns

hi friens, :) if i need to find files with extension .c++,.C++,.cpp,.Cpp,.CPp,.cPP,.CpP,.cpP,.c,.C wat is the pattern for finding them :confused: (2 Replies)
Discussion started by: arunsubbhian
2 Replies

3. Shell Programming and Scripting

altering numbers in files

I want to change a number in a file into number -1.. for instance file_input is fdisdlf_s35 fdjsk_s27 fsdf_s42 jkljllljkkl_s57 ... etc now i want the output to be fdisdlf_s34 fdjsk_s26 fdsf_s41 jkljllljkkl_s56 ... etc I was think of using "sed -e 's/2/1/g' -e 's/3/2/g' -e... (4 Replies)
Discussion started by: bigboizvince
4 Replies

4. Shell Programming and Scripting

Scripting question: Altering 2 field.

Hi Experts, I want to alter two filed of my data file: The _new should come to 2nd column, and _new to be removed from 4rth column, please advise, datafile.txt aa /dev/vgAA/lvol1 bb /dev/vgAA_new/lvol1 aa /dev/vgAA1/lvol2 bb /dev/vgAA1_new/lvol2 aa /dev/vgAC/lvol1 bb... (5 Replies)
Discussion started by: rveri
5 Replies

5. Shell Programming and Scripting

perl : merging two arrays on basis of common parameter

I have 2 arrays, @array1 contains records in the format 1|_|X|_|ssd|_| 4|_|H|_|hbd|_| 9|_|Y|_|u8gjdfg|_| @array2 contains records in the format X|_|asdf|_| Y|_|qwer|_| A|_|9kdkf|_| @array3 should contain records in the PLz X|_|ssd|_|asdf|_| Y|_|hdb|_|qwer|_| PLZ dont use... (2 Replies)
Discussion started by: centurion_13
2 Replies

6. Shell Programming and Scripting

Algorithm to load files efficiently without missing or accidently archiving....

We have a requirement where we get the Delta Files in every one hour and we need to load them into Oracle database every one hour using Powercenter. To efficiently do this we need to build an File management system. Here is our process: we get 6 files for 6 tables with a timestamp appended... (2 Replies)
Discussion started by: okkadu
2 Replies

7. Shell Programming and Scripting

merging two files

file1.txt 1 2 10 11 56 57 7 8 43 44 and let's suppose that there is a file called file2.txt with 100 columns I want to produce a file3.txt with columns specified in file1.txt in that order (1,2,10,11,56,57,7,8,43,44) Thanks! (2 Replies)
Discussion started by: johnkim0806
2 Replies

8. Shell Programming and Scripting

Perl - multiple keys and merging two files

Hi, I'm not a regular coder but some times I write some basic perl script, hence Perl is bit difficult for me :). I'm merging two files a.txt and b.txt into c.txt: a.txt ------ x001;frtb70;xyz;109 x001;frvt65;sec;239 x003;wqax34;jul;659 x004;yhud43;yhn;760 b.txt ------... (8 Replies)
Discussion started by: Lokesha
8 Replies

9. Shell Programming and Scripting

Altering a variable

Can I take an argument input, lets say it's, hg0000_xy1_v2, in the script it becomes f ... then hack off the end of the filename to change the variable to hg0000 only. I tried using sed but can't figure it out. f="$f" | sed 's/_fg_v//' I could change the variable label if necessary to... (4 Replies)
Discussion started by: scribling
4 Replies

10. Programming

Altering a jar file

I have a script I am trying to test and run but it runs against a jar file. I wrote an external property file so it would redirect with my script, but it keeps going in search of the previous property file. Is there any way to externally over write the jar file and if not how do you go about... (7 Replies)
Discussion started by: risarose87
7 Replies
File::Copy(3pm) 					 Perl Programmers Reference Guide					   File::Copy(3pm)

NAME
File::Copy - Copy files or filehandles SYNOPSIS
use File::Copy; copy("file1","file2") or die "Copy failed: $!"; copy("Copy.pm",*STDOUT); move("/dev1/fileA","/dev2/fileB"); use File::Copy "cp"; $n = FileHandle->new("/a/file","r"); cp($n,"x"); DESCRIPTION
The File::Copy module provides two basic functions, "copy" and "move", which are useful for getting the contents of a file from one place to another. copy The "copy" function takes two parameters: a file to copy from and a file to copy to. Either argument may be a string, a FileHandle reference or a FileHandle glob. Obviously, if the first argument is a filehandle of some sort, it will be read from, and if it is a file name it will be opened for reading. Likewise, the second argument will be written to (and created if need be). Trying to copy a file on top of itself is an error. If the destination (second argument) already exists and is a directory, and the source (first argument) is not a filehandle, then the source file will be copied into the directory specified by the destination, using the same base name as the source file. It's a failure to have a filehandle as the source when the destination is a directory. Note that passing in files as handles instead of names may lead to loss of information on some operating systems; it is recommended that you use file names whenever possible. Files are opened in binary mode where applicable. To get a consistent behaviour when copying from a filehandle to a file, use "binmode" on the filehandle. An optional third parameter can be used to specify the buffer size used for copying. This is the number of bytes from the first file, that will be held in memory at any given time, before being written to the second file. The default buffer size depends upon the file, but will generally be the whole file (up to 2MB), or 1k for filehandles that do not reference files (eg. sockets). You may use the syntax "use File::Copy "cp"" to get at the "cp" alias for this function. The syntax is exactly the same. The behavior is nearly the same as well: as of version 2.15, "cp" will preserve the source file's permission bits like the shell utility cp(1) would do, while "copy" uses the default permissions for the target file (which may depend on the process' "umask", file ownership, inherited ACLs, etc.). If an error occurs in setting permissions, "cp" will return 0, regardless of whether the file was successfully copied. move The "move" function also takes two parameters: the current name and the intended name of the file to be moved. If the destination already exists and is a directory, and the source is not a directory, then the source file will be renamed into the directory specified by the destination. If possible, move() will simply rename the file. Otherwise, it copies the file to the new location and deletes the original. If an error occurs during this copy-and-delete process, you may be left with a (possibly partial) copy of the file under the destination name. You may use the "mv" alias for this function in the same way that you may use the "cp" alias for "copy". syscopy File::Copy also provides the "syscopy" routine, which copies the file specified in the first parameter to the file specified in the second parameter, preserving OS-specific attributes and file structure. For Unix systems, this is equivalent to the simple "copy" routine, which doesn't preserve OS-specific attributes. For VMS systems, this calls the "rmscopy" routine (see below). For OS/2 systems, this calls the "syscopy" XSUB directly. For Win32 systems, this calls "Win32::CopyFile". Special behaviour if "syscopy" is defined (OS/2, VMS and Win32): If both arguments to "copy" are not file handles, then "copy" will perform a "system copy" of the input file to a new output file, in order to preserve file attributes, indexed file structure, etc. The buffer size parameter is ignored. If either argument to "copy" is a handle to an opened file, then data is copied using Perl operators, and no effort is made to preserve file attributes or record structure. The system copy routine may also be called directly under VMS and OS/2 as "File::Copy::syscopy" (or under VMS as "File::Copy::rmscopy", which is the routine that does the actual work for syscopy). rmscopy($from,$to[,$date_flag]) The first and second arguments may be strings, typeglobs, typeglob references, or objects inheriting from IO::Handle; they are used in all cases to obtain the filespec of the input and output files, respectively. The name and type of the input file are used as defaults for the output file, if necessary. A new version of the output file is always created, which inherits the structure and RMS attributes of the input file, except for owner and protections (and possibly timestamps; see below). All data from the input file is copied to the output file; if either of the first two parameters to "rmscopy" is a file handle, its position is unchanged. (Note that this means a file handle pointing to the output file will be associated with an old version of that file after "rmscopy" returns, not the newly created version.) The third parameter is an integer flag, which tells "rmscopy" how to handle timestamps. If it is < 0, none of the input file's timestamps are propagated to the output file. If it is > 0, then it is interpreted as a bitmask: if bit 0 (the LSB) is set, then timestamps other than the revision date are propagated; if bit 1 is set, the revision date is propagated. If the third parameter to "rmscopy" is 0, then it behaves much like the DCL COPY command: if the name or type of the output file was explicitly specified, then no timestamps are propagated, but if they were taken implicitly from the input filespec, then all timestamps other than the revision date are propagated. If this parameter is not supplied, it defaults to 0. Like "copy", "rmscopy" returns 1 on success. If an error occurs, it sets $!, deletes the output file, and returns 0. RETURN
All functions return 1 on success, 0 on failure. $! will be set if an error was encountered. AUTHOR
File::Copy was written by Aaron Sherman <ajs@ajs.com> in 1995, and updated by Charles Bailey <bailey@newman.upenn.edu> in 1996. perl v5.18.2 2014-01-06 File::Copy(3pm)
All times are GMT -4. The time now is 01:59 PM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy