If you sort your data, however, you can use the comm utility, which does not need to load either file completely into memory. Since the lines are in sorted order, it can tell when a line matches and when a line is skipped by whether the next line is greater, less, or equal...
sort should be smart enough to process its input in blocks and not run out of memory. Be sure you have enough space in /tmp/, or use the -T option to point it at another folder for temporary files where you have the room. See man sort for details.
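A minimal sketch of that approach, assuming plain-text files with one record per line. The file names, the sample records, and the scratch directory are made up for illustration and are not from your data. LC_ALL=C is set so that sort and comm agree on the ordering.

```shell
#!/bin/sh
# Hypothetical sample data; substitute your real files.
workdir=$(mktemp -d)
printf '%s\n' 'a nsd1' 'b nsd2' 'f nsd1' 'g nsd6' > "$workdir/file1"
printf '%s\n' 'b nsd2' 'a nsd1' 'f nsd9' > "$workdir/file2"

# -T points sort's temporary files at a directory with enough room.
LC_ALL=C sort -T "$workdir" "$workdir/file1" > "$workdir/file1.sorted"
LC_ALL=C sort -T "$workdir" "$workdir/file2" > "$workdir/file2.sorted"

# -23 suppresses columns 2 and 3, leaving lines found only in file1.
only_in_file1=$(LC_ALL=C comm -23 "$workdir/file1.sorted" "$workdir/file2.sorted")
# -13 suppresses columns 1 and 3, leaving lines found only in file2.
only_in_file2=$(LC_ALL=C comm -13 "$workdir/file1.sorted" "$workdir/file2.sorted")

printf 'only in file1:\n%s\n' "$only_in_file1"
printf 'only in file2:\n%s\n' "$only_in_file2"
rm -rf "$workdir"
```

A changed line shows up in both lists (old version in the first, new version in the second), while a missing line shows up in only one.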
Note that it might be possible to run comm once to get both sets of data, if only I knew what your data looks like -- which I still don't, after asking several times...
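One way a single comm run could produce both sets, sketched under the assumption that no record begins with a tab character. With -3, comm prints lines unique to the first file with no prefix and lines unique to the second file behind one leading tab, so awk can split them. The file names and sample records here are hypothetical.

```shell
#!/bin/sh
# Hypothetical, already-sorted sample inputs (byte-wise order).
workdir=$(mktemp -d)
printf '%s\n' 'a nsd1' 'b nsd2' 'f nsd1' 'g nsd6' > "$workdir/old.sorted"
printf '%s\n' 'a nsd1' 'b nsd2' 'f nsd9' > "$workdir/new.sorted"

# Create both outputs up front so they exist even if one set is empty.
: > "$workdir/only_old"
: > "$workdir/only_new"

# -3 suppresses the common column; a leading tab marks "only in new".
LC_ALL=C comm -3 "$workdir/old.sorted" "$workdir/new.sorted" |
awk -v out_old="$workdir/only_old" -v out_new="$workdir/only_new" '
    /^\t/ { sub(/^\t/, ""); print > out_new; next }
          { print > out_old }
'

removed=$(cat "$workdir/only_old")
added=$(cat "$workdir/only_new")
printf 'only in old:\n%s\n' "$removed"
printf 'only in new:\n%s\n' "$added"
rm -rf "$workdir"
```

If records can start with a tab, this split is ambiguous and two separate comm runs are the safer choice.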
seek(n) Tcl Built-In Commands seek(n)
NAME
seek - Change the access position for an open channel
SYNOPSIS
seek channelId offset ?origin?
DESCRIPTION
Changes the current access position for channelId.
ChannelId must be an identifier for an open channel such as a Tcl standard channel (stdin, stdout, or stderr), the return value from an
invocation of open or socket, or the result of a channel creation command provided by a Tcl extension.
The offset and origin arguments specify the position at which the next read or write will occur for channelId. Offset must be an integer
(which may be negative) and origin must be one of the following:
start    The new access position will be offset bytes from the start of the underlying file or device.
current  The new access position will be offset bytes from the current access position; a negative offset moves the access position backwards in the underlying file or device.
end      The new access position will be offset bytes from the end of the file or device. A negative offset places the access position before the end of file, and a positive offset places the access position after the end of file.
The origin argument defaults to start.
The command flushes all buffered output for the channel before the command returns, even if the channel is in nonblocking mode. It also
discards any buffered and unread input. This command returns an empty string. An error occurs if this command is applied to channels
whose underlying file or device does not support seeking.
Note that offset values are byte offsets, not character offsets. Both seek and tell operate in terms of bytes, not characters, unlike
read.
EXAMPLES
Read a file twice:
set f [open file.txt]
set data1 [read $f]
seek $f 0
set data2 [read $f]
close $f
# $data1 == $data2 if the file wasn't updated
Read the last 10 bytes from a file:
set f [open file.data]
# This is guaranteed to work with binary data but
# may fail with other encodings...
fconfigure $f -translation binary
seek $f -10 end
set data [read $f 10]
close $f
SEE ALSO
file(n), open(n), close(n), gets(n), tell(n), Tcl_StandardChannels(3)
KEYWORDS
access position, file, seek
Tcl 8.1 seek(n)