08-30-2007
Hi ilan,
Thanks for taking this up. I have the first piece figured out: I can generate a random number using a small Perl script that I downloaded off the net. But I have a problem with the second part, so I'll try to describe it better.
I have 50 million records with 2 columns. Both columns are present in every record.
Step 1: Generate a random value (this is the part I figured out above).
Step 2: Locate a random record among the 50 million.
Step 3: Replace the value in the second column with the value generated in Step 1.
Step 4: Go back to Step 1, generate a new value, pick another random record, replace its second column with this value, and so on for about a million iterations.
I want to be able to do this in place, because every time I replace a record using awk, it outputs the whole 50 million records including that change, and I have to redirect the output to another file, rename it to the original, and start over for the next iteration.
What I need is a way to edit the file in place in a loop identifying random records and changing the second column a million times.
The high level requirement is:
Given a file of 50 million records, I have to generate a file that also has 50 million records, of which 1 million differ from the first file in the second column. Maybe there is an easier way to do this, but I am stumped right now.
Thanks,
V
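One possible way to meet the high-level requirement above without a million separate rewrites is to stream the file once and decide, record by record, whether to change it (selection sampling). The sketch below is only an illustration under assumptions the post does not state: fields are whitespace-separated, the record count is known beforehand (for example from `wc -l`), and the replacement is a random integer below 1000000. The function name `randomize_col2` and its argument order are hypothetical.

```shell
#!/bin/sh
# Sketch: change the second column of exactly K randomly chosen records
# in a single pass, instead of rewriting the whole file K times.
# Assumptions (not from the post): whitespace-separated fields, record
# count N known in advance, replacement = random integer below 1000000.
# Usage: randomize_col2 INPUT OUTPUT N K SEED   (hypothetical helper)
randomize_col2() {
    src=$1 dst=$2 total=$3 count=$4 seed=$5
    awk -v left="$total" -v need="$count" -v seed="$seed" '
        BEGIN { srand(seed) }
        {
            # Select this record with probability need/left. Over the
            # whole file this picks exactly "need" distinct records,
            # provided "left" starts at the true record count.
            if (need > 0 && rand() * left < need) {
                # Assigning to $2 rebuilds the line with single spaces.
                $2 = int(rand() * 1000000)
                need--
            }
            left--
            print
        }
    ' "$src" > "$dst"
}
```

A call might look like `randomize_col2 data.txt data.new "$(wc -l < data.txt)" 1000000 "$$"`, where the file names are placeholders. The output still goes to a new file, but only once. Note that a random replacement can coincide with the old value, so if every chosen record must really differ, the awk block would need to redraw in that case.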