I have a 1.2G file that contains no newline characters. It is essentially a log file, with each entry exactly 78 bits long. The basic format is /DATE/USER/MISC/. The single uniform thing about the file is that the 8th character is always ":".
I worked with smaller files of the same data before, using the following command,
but the problem with this particular file is its size: at 1.2G, ggrep runs out of memory.
I'm looking for a way to break up the file or get around the memory limit.
Having an entry that is 78 bits long yet contains characters is very strange; a file is normally a stream of 8-bit bytes. So, to split your entries (each of which is 9.75 bytes) into 11-byte lines (your 9.75 bytes per entry, plus 2 bits of padding so each record ends on a byte boundary, plus a newline so the output is a text file), you are probably going to find writing a C program that reads bytes and rotates bits into the proper positions easier than doing it in a shell script.
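If the records really are packed 78-bit fields, the bit-rotation filter described above is short in C. The sketch below is purely illustrative (it is not from this thread) and makes two assumptions the thread itself leaves open: bits are taken MSB-first, and each record is padded with two trailing zero bits.

    #include <stdio.h>

    static int bitbuf, bitcnt;

    /* Return the next input bit (MSB first), or -1 at end of file. */
    static int nextbit(void)
    {
        if (bitcnt == 0) {
            int c = getchar();
            if (c == EOF)
                return -1;
            bitbuf = c;
            bitcnt = 8;
        }
        bitcnt--;
        return (bitbuf >> bitcnt) & 1;
    }

    int main(void)
    {
        for (;;) {
            unsigned char rec[10] = {0};  /* 80 bits: 78 data + 2 zero pad bits */
            int i, bit = 0;

            for (i = 0; i < 78; i++) {
                if ((bit = nextbit()) < 0)
                    break;                /* ran out of input */
                rec[i / 8] |= (unsigned char)(bit << (7 - i % 8));
            }
            if (bit < 0)
                break;  /* EOF: any partial trailing record is dropped */
            fwrite(rec, 1, sizeof rec, stdout);  /* 10 data bytes */
            putchar('\n');                       /* line terminator */
        }
        return 0;
    }

Compiled and run as, say, ./unpack78 < bigfile > lines, it writes one 11-byte line per record; the data bytes are copied as-is, so the result is only really a "text" file if the unpacked bytes happen to be printable.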
What two bits should be added to each entry to turn your 78 bits into 10 characters (assuming ASCII or EBCDIC)?
If your entries are all 78 bits long, why is your grep looking for a varying number of characters before and after the colon? And why does the string it matches vary from 1 to 76 characters (not bits or bytes) inclusive, instead of the 78 bits you specified?
Please show us the first 200 bytes of your input file piped through the command:
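The command itself did not survive in the quote above; a typical choice for this kind of inspection (a guess, not the poster's actual command) would be:

    dd if=yourfile bs=200 count=1 2>/dev/null | od -c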
If the records are all of a fixed size, dd can be used to insert a newline after each one. An example with 4-byte fixed-size records:
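The example command was lost from this post; judging from the reply below (which questions the bs=3), it looked roughly like this reconstruction:

    $ printf 'aaaabbbbcccc' | dd bs=3 cbs=4 conv=unblock 2>/dev/null
    aaaa
    bbbb
    cccc

conv=unblock treats the input as fixed cbs-byte records, strips trailing spaces from each record, and appends a newline.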
dd is unaffected by line-length limitations. You could chain this before awk or grep or what have you.
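For the file in this thread, the chain might look like the line below, assuming the entries turn out to be 78 bytes rather than 78 bits (the file name and pattern are placeholders):

    dd if=bigfile cbs=78 conv=unblock 2>/dev/null | ggrep 'pattern'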
I assume you meant bs=4 instead of bs=3, but when processing a 1.2 GB file, dd will run noticeably faster with its default block size (512 bytes) or a larger size like bs=1024000. The dd bs=n operand specifies how many bytes dd will read at a time from its input file and how many bytes at a time it will write to its output file.
With conv=unblock, it is just the conversion buffer size (specified by cbs=n) that determines the output line length produced by the dd utility.
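Putting the two points together, something like the following keeps the 78-byte output lines while doing its I/O in large chunks (again assuming 78-byte records; the file names are placeholders):

    dd if=bigfile bs=1024000 cbs=78 conv=unblock of=withnewlines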