If you choose a random starting point and then select one record at every fixed interval m, repeated a reasonable number of times (1000 is beyond what is needed), you have a statistically valid sample, i.e. a random sample of the population.
Since you have 200000 records, start somewhere between 1 and 200, then step forward by 200 records 1000 times.
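A minimal sketch of that systematic sample in awk; the file names are placeholders, and `RANDOM` is a bash/ksh feature:

```shell
# Demo input: 200000 one-line records standing in for the real data file.
seq 1 200000 > records.txt

# Systematic sample: random start in 1..200, then every 200th line.
start=$(( (RANDOM % 200) + 1 ))
awk -v s="$start" 'NR >= s && (NR - s) % 200 == 0' records.txt > sample.txt
wc -l < sample.txt   # always exactly 1000 lines
```

Because 200000 / 200 = 1000, the sample size comes out the same no matter which starting point is drawn.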
Hello,
I have one file with more than 120 million records (35 GB in size). I have to extract some relevant data from the file based on some parameter and generate another output file.
What will be the best and fastest way to generate the new file?
sample file format :--... (2 Replies)
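Since the sample format is not shown, here is only a generic sketch; "keep" is a placeholder for the real selection parameter. For a single fixed-string filter over a file this size, grep in the C locale is usually the fastest single pass:

```shell
# Demo input standing in for the 35 GB file.
printf '%s\n' 'A,keep,1' 'B,skip,2' 'C,keep,3' > big.txt

# Fixed-string (-F) matching in the byte-oriented C locale avoids
# regex and multibyte overhead on large inputs.
LC_ALL=C grep -F 'keep' big.txt > out.txt
cat out.txt
```

If the parameter is a test on a specific field rather than a substring, an awk condition such as `$2 == "keep"` replaces the grep.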
Let's say I want to pick a random file when I do an "ls" command. I don't have a set number of files in each directory.
ls | head -1
This gives me the first one in each directory; is there a way to do the same but pick a random one? (3 Replies)
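One portable way, sketched with awk: remember every name, then print one at random.

```shell
# Collect the listing, then pick a uniformly random entry from it.
ls | awk 'BEGIN { srand() } { names[NR] = $0 }
          END { if (NR) print names[int(rand() * NR) + 1] }'
```

Where GNU coreutils is available, `ls | shuf -n 1` does the same in one step.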
Hello gurus,
I am new to "awk" and am trying to break a large file of 4 million records into several output files of half a million each, but at the same time I want to keep records with the same key in the same output file, not split across files.
e.g. my data is like:
Row_Num,... (6 Replies)
I have a file that needs to be parsed into multiple files every time a line contains the number 1. The problem I face is that the lines are random and the file size is random. For example, lines 4, 65, 187, 202 & 209 are number 1's, so there have to be file breaks between all those to create 4... (6 Replies)
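Assuming the marker lines consist of exactly the digit 1 (the post doesn't show the format, so that is a guess), GNU csplit can cut at every match:

```shell
# Demo input: "1" lines mark where the file should break.
printf '%s\n' head a b 1 c d 1 e > data.txt

# Break before each /^1$/ match; '{*}' repeats until end of input,
# -z drops empty pieces. Output lands in xx00, xx01, xx02, ...
csplit -z data.txt '/^1$/' '{*}'
```

If the marker is "a line containing a 1 anywhere", the pattern would be `/1/` instead.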
Need to use dd to generate a large file from a sample file of random data. This is because I don't have /dev/urandom.
I create a named pipe then:
dd if=mynamed.fifo of=myfile.fifo bs=1024 count=1024
but when I cat a file of 1024 random bytes to the fifo:
cat randomfile.txt >... (7 Replies)
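If the goal is simply a big file built from a small sample of random bytes, a plain loop sidesteps the fifo entirely. File names and sizes here are placeholders:

```shell
# Build a 1 KiB sample, then repeat it 1024 times for a 1 MiB file.
seq 1 1000 | head -c 1024 > sample.bin
for i in $(seq 1 1024); do cat sample.bin; done > big.bin
wc -c < big.bin   # 1048576
```

The repeated data is of course not cryptographically random, but for exercising disk or I/O paths it is usually enough.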
Hello All,
I have a large file, more than 50,000 lines, and I want to split it into even chunks of 5000 records, which I can do using
sed '1d;$d;' <filename> | awk 'NR%5000==1{x="F"++i;}{print > x}'
Now I need to add one more condition: do not break the file at the 5000th record if the 5000th record... (20 Replies)
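One way to sketch that extra condition, assuming the key is the first comma-separated field and a chunk is simply allowed to run past the limit until the key changes (a 5-record chunk here stands in for 5000):

```shell
# Demo rows: key 3 straddles the 5-line boundary and must stay together.
printf '%s\n' 1,a 1,b 2,a 2,b 3,a 3,b 3,c 4,a > rows.csv

awk -F, '
    count >= 5 && $1 != prev { close(out); out = "" }   # chunk full AND key changed
    out == ""                { out = "F" ++i; count = 0 }
    { print > out; prev = $1; count++ }
' rows.csv
```

Here F1 gets seven lines (the three key-3 rows stay together past the boundary) and F2 starts at the next key.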
Hi,
Does anybody know how to use awk or any other command to randomly print 1000 numbers from the range 1 to 150000?
I know that "rand" in awk can do a similar random selection,
but I have no idea how to write code that randomly picks 1000 numbers from the range 1 to 150000 :confused:
... (1 Reply)
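A direct awk sketch, drawing with replacement (duplicates are possible):

```shell
# 1000 pseudo-random integers in 1..150000.
awk 'BEGIN { srand()
             for (i = 0; i < 1000; i++) print int(rand() * 150000) + 1 }' > picks.txt
wc -l < picks.txt   # 1000
```

If duplicates must be excluded, GNU coreutils' `shuf -i 1-150000 -n 1000` samples without replacement.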
I have a file, named records.txt, containing a large number of records (around 0.5 million) in the format below:
28433005 1 1 3 2 2 2 2 2 2 2 2 2 2 2
28433004 0 2 3 2 2 2 2 2 2 1 2 2 2 2
...
Another file is a key file, named key.txt, which is the list of some numbers in the first column of... (5 Replies)
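The classic two-file awk lookup fits this: load the keys into a set on the first pass, then print every record whose first column is in the set. Demo files stand in for the real records.txt and key.txt:

```shell
# Small stand-ins for the two input files.
printf '%s\n' '28433005 1 1 3' '28433004 0 2 3' '28433003 9 9 9' > records.txt
printf '%s\n' 28433005 28433003 > key.txt

# NR==FNR is true only while reading the first file (the key list);
# afterwards, records with a matching first field are printed.
awk 'NR == FNR { keys[$1]; next } $1 in keys' key.txt records.txt > matched.txt
cat matched.txt
```

This is one sequential read of each file, so it scales comfortably to half a million records.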
Dear folks
I have a large data set which contains 400K columns. I want to select 50K specific columns from the whole 400K. Is there any Unix command which could do this for me? I should also mention that I store all of the column ids in one file, which may help to select... (5 Replies)
Discussion started by: sajmar
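Assuming the stored "column ids" are 1-based field positions in a whitespace-separated file (the post doesn't say), an awk sketch that keeps only the listed columns, in list order:

```shell
# Demo: a wide file plus the list of column indexes to keep
# (the real files would hold 400K columns and 50K indexes).
printf '%s\n' 'a b c d e' 'f g h i j' > data.txt
printf '%s\n' 2 5 > cols.txt

# First pass loads the wanted indexes; second pass emits those fields.
awk 'NR == FNR { want[++n] = $1; next }
     { for (j = 1; j <= n; j++)
           printf "%s%s", $(want[j]), (j < n ? OFS : ORS) }' cols.txt data.txt > slim.txt
cat slim.txt
```

If the ids are header names rather than positions, a first pass over the header row would map names to positions before the same loop applies.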
PX_GET_RECORD2(3) Library Functions Manual PX_GET_RECORD2(3)
NAME
PX_get_record2 -- Returns record in Paradox file
SYNOPSIS
#include <paradox.h>
int PX_get_record2(pxdoc_t *pxdoc, int recno, char *data, int *deleted, pxdatablockinfo_t *pxdbinfo)
DESCRIPTION
This function is similar to PX_get_record(3) but takes two extra parameters. If *deleted is set to 1 the function will consider any record
in the database, even those which are deleted. If *pxdbinfo is not NULL, the function will return some information about the data block
where the record has been read from. You will have to allocate memory for pxdbinfo before calling PX_get_record2.
On return *deleted will be set to 1 if the requested record is deleted or 0 if it is not deleted. The struct pxdatablockinfo_t has the following fields:
blockpos (long)
File position where the block starts. The first six bytes of the block contain the header, followed by the record data.
recordpos (long)
File position where the requested record starts.
size (int)
Size of the data block without the six bytes for the header.
recno (int)
Record number within the data block. The first record in the block has number 0.
numrecords (int)
The number of records in this block.
number (int)
The number of the data block.
This function may return records with invalid data, because records are not explicitly marked as deleted; rather, the size of a valid
data block is modified. A data block is a fixed-size area in the file which holds a certain number of records. If for some reason a data
block has never been completely filled with records, the algorithm assumes deleted records in this data block which are not actually there.
This often happens with the last data block in a file, which is likely not to be completely filled with records.
If you access several records, do it in ascending order, because this is the most efficient way.
Note:
This function is deprecated. Use PX_retrieve_record(3) instead.
RETURN VALUE
Returns 0 on success and -1 on failure.
SEE ALSO
PX_get_field(3), PX_get_record(3)
AUTHOR
This manual page was written by Uwe Steinmann uwe@steinmann.cx.
PX_GET_RECORD2(3)