Efficiently altering and merging files in perl Post: 302905511

Posted by sam05121988 on Thursday 12th of June 2014 03:06:51 AM

I have two files:

Code:

fileA
HEADER LINE A
CommentLine A
Content A
....
....
....
TAILER A

Code:

fileB
HEADER LINE B
CommentLine B
Content B
....
....
....
TAILER B

I want to merge these two files into:

Code:

HEADER LINE A
CommentLine A
Content A
....
....
....
Content B
....
....
....
TAILER B

i.e. skip the TAILER line of fileA, and skip the HEADER and comment lines of fileB.

I am able to do this with the Perl code below:

Code:

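        # $fileA and $fileB hold the paths of the two input files
        open( my $fa,  '<', $fileA )     or die "can't open $fileA: $!";
        open( my $fb,  '<', $fileB )     or die "can't open $fileB: $!";
        # '>' truncates, so a stale tmp_file cannot leak into the result
        open( my $tmp, '>', 'tmp_file' ) or die "can't open tmp_file: $!";

        # read both files into arrays
        my @fileA = <$fa>;
        my @fileB = <$fb>;

        # drop the HEADER and comment line of fileB
        shift @fileB;
        shift @fileB;

        # drop the TAILER of fileA
        pop @fileA;

        # write what is left of fileA, then what is left of fileB
        print {$tmp} @fileA, @fileB;

        close($fa);
        close($fb);
        close($tmp) or die "can't close tmp_file: $!";

        # 'or' (unlike '||') applies to the result of rename, not to $fileA
        rename( 'tmp_file', $fileA ) or die "can't rename tmp_file to $fileA: $!";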

This code works fine, but I doubt its efficiency when fileA and fileB each have millions of lines (which is the case here),
i.e. why read both files into arrays just to get rid of three lines? That will end up using a lot of memory.

Can someone suggest a more efficient way of doing this?
(Answers in Perl only, please.)

Last edited by sam05121988; 06-12-2014 at 04:15 AM..
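
One possible way to avoid holding either file in memory is to stream both files line by line. The sketch below is only an illustration under a few assumptions: the sub name merge_files and its three path arguments are made up for this example, the TAILER is assumed to be exactly the last line of fileA, and the HEADER and comment are assumed to be exactly the first two lines of fileB. It holds back one line of fileA at a time so the last line is never written, and uses $. to skip the first two lines of fileB.

Code:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# merge_files is a hypothetical helper name chosen for this sketch.
# It writes fileA minus its last line, then fileB minus its first two
# lines, to $tmp, and finally renames $tmp over fileA.
sub merge_files {
    my ( $file_a, $file_b, $tmp ) = @_;

    open my $out, '>', $tmp    or die "can't open $tmp: $!";
    open my $fa,  '<', $file_a or die "can't open $file_a: $!";

    # Hold back one line: a line is only printed once a later line has
    # been read, so the final line (the TAILER) is never printed.
    my $held;
    while ( my $line = <$fa> ) {
        print {$out} $held if defined $held;
        $held = $line;
    }
    close $fa;

    open my $fb, '<', $file_b or die "can't open $file_b: $!";
    while ( my $line = <$fb> ) {
        next if $. <= 2;    # $. is the current line number of $fb
        print {$out} $line;
    }
    close $fb;

    close $out or die "can't close $tmp: $!";
    rename $tmp, $file_a or die "can't rename $tmp to $file_a: $!";
    return;
}

# Run only when called with three paths: fileA fileB tmp_file
merge_files(@ARGV) if @ARGV == 3;
```

If saved as, say, merge.pl, it could be run as "perl merge.pl fileA fileB tmp_file". Memory use stays at roughly one line per file regardless of file size. The temporary file should live on the same filesystem as fileA, since rename generally cannot move a file across filesystems.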
