Parsing large files in Solaris 11 (Shell Programming and Scripting)
Post 302952511 by Don Cragun on Wednesday 19th of August 2015 12:53:31 PM
Quote:
Originally Posted by Corona688
If the records are all of fixed size, dd can be used to insert a newline after them. An example with 4 byte fixed size records:

Code:
# bs is 1 minus the record size, cbs is the record size.
$ printf "AAA:BBB:CCC:DDD:" | dd bs=3 cbs=4 conv=unblock

AAA:
BBB:
CCC:
DDD:

$

dd is unaffected by line length limitations. You could chain this before an awk or grep or what have you.

Code:
dd if=filename ... | grep whatever

I assume you meant bs=4 instead of bs=3, but when processing a 1.2 GB file, dd will run noticeably faster with its default block size (512 bytes) or a larger size like bs=1024000. The bs=n parameter specifies how many bytes dd reads at a time from its input file and how many bytes it writes at a time to its output file.

With conv=unblock, only the conversion buffer size (specified by cbs=n) determines the length of the output lines dd produces; the bs=n value has no effect on where the newlines are inserted.
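As a rough sketch of the point above (not the original poster's exact command), the following keeps cbs at the record size and uses a large bs only for I/O efficiency. The function name unblock_records is made up for this illustration, and the 4-byte record size is taken from the quoted example; dd's record counts on stderr are discarded.

```shell
#!/bin/sh
# Hypothetical helper (name invented for illustration): split fixed-size
# records read from stdin into newline-terminated lines on stdout.
#   $1 = record size in bytes (passed to dd as cbs)
# bs only controls how much dd reads and writes per I/O call; with
# conv=unblock the line length comes from cbs alone.
unblock_records() {
    dd bs=1024000 cbs="$1" conv=unblock 2>/dev/null
}

# Same 4-byte records as in the quoted example, chained into grep.
printf 'AAA:BBB:CCC:DDD:' | unblock_records 4 | grep -n 'CCC'
# prints: 3:CCC:
```

For a real file, something like `unblock_records 4 < filename | grep whatever` should behave the same way, since grep then only ever sees lines of at most cbs characters.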
 

DD(1)                        General Commands Manual                        DD(1)

NAME
       dd - convert and copy a file

SYNOPSIS
       dd [option=value] ...

DESCRIPTION
       Dd copies the specified input file to the specified output with possible conversions. The standard input and output are used by default. The input and output block size may be specified to take advantage of raw physical I/O.

       option         values
       if=            input file name; standard input is default
       of=            output file name; standard output is default
       ibs=n          input block size n bytes (default 512)
       obs=n          output block size (default 512)
       bs=n           set both input and output block size, superseding ibs and obs; also, if no conversion is specified, it is particularly efficient since no copy need be done
       cbs=n          conversion buffer size
       skip=n         skip n input records before starting copy
       files=n        copy n input files before terminating (makes sense only where input is a magtape or similar device).
       seek=n         seek n records from beginning of output file before copying
       count=n        copy only n input records
       conv=ascii     convert EBCDIC to ASCII
            ebcdic    convert ASCII to EBCDIC
            ibm       slightly different map of ASCII to EBCDIC
            block     convert variable length records to fixed length
            unblock   convert fixed length records to variable length
            lcase     map alphabetics to lower case
            ucase     map alphabetics to upper case
            swab      swap every pair of bytes
            noerror   do not stop processing on an error
            sync      pad every input record to ibs
            ... , ... several comma-separated conversions

       Where sizes are specified, a number of bytes is expected. A number may end with k, b or w to specify multiplication by 1024, 512, or 2 respectively; a pair of numbers may be separated by x to indicate a product.

       Cbs is used only if ascii, unblock, ebcdic, ibm, or block conversion is specified. In the first two cases, cbs characters are placed into the conversion buffer, any specified character mapping is done, trailing blanks trimmed and new-line added before sending the line to the output. In the latter three cases, characters are read into the conversion buffer, and blanks added to make up an output record of size cbs.

       After completion, dd reports the number of whole and partial input and output blocks.

       For example, to read an EBCDIC tape blocked ten 80-byte EBCDIC card images per record into the ASCII file x:

              dd if=/dev/rmt0 of=x ibs=800 cbs=80 conv=ascii,lcase

       Note the use of raw magtape. Dd is especially suited to I/O on the raw physical devices because it allows reading and writing in arbitrary record sizes.

SEE ALSO
       cp(1), tr(1)

DIAGNOSTICS
       f+p records in(out): numbers of full and partial records read(written)

BUGS
       The ASCII/EBCDIC conversion tables are taken from the 256 character standard in the CACM Nov, 1968. The `ibm' conversion, while less blessed as a standard, corresponds better to certain IBM print train conventions. There is no universal solution.

       One must specify ``conv=noerror,sync'' when copying raw disks with bad sectors to insure dd stays synchronized.

       Certain combinations of arguments to conv= are permitted. However, the block or unblock option cannot be combined with ascii, ebcdic or ibm. Invalid combinations silently ignore all but the last mutually-exclusive keyword.

4th Berkeley Distribution            April 29, 1985                        DD(1)
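
The block and unblock conversions described in the manual page above are roughly inverses of each other, which a small round trip can illustrate. This is only a sketch: the helper names to_fixed and to_lines are invented for the example, the 4-byte record size is arbitrary, and the behavior shown assumes a dd that follows the usual POSIX semantics for conv=block and conv=unblock.

```shell
#!/bin/sh
# Hypothetical helpers (names invented for illustration).
# to_fixed: pad each input line with blanks to $1 bytes, dropping the newline.
to_fixed() { dd cbs="$1" conv=block 2>/dev/null; }
# to_lines: cut input into $1-byte records, trim trailing blanks, add a newline.
to_lines() { dd cbs="$1" conv=unblock 2>/dev/null; }

# Two variable-length lines become two 4-byte records with no newlines.
printf '[%s]\n' "$(printf 'ab\ncdef\n' | to_fixed 4)"
# prints: [ab  cdef]

# Unblocking with the same cbs restores the original lines.
printf 'ab\ncdef\n' | to_fixed 4 | to_lines 4
```

Note that a line longer than cbs would be truncated by conv=block, so the round trip is only lossless when cbs is at least the longest line length.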