Parsing out records from one huge record
Post 302173733 by jim mcnamara on Friday 7th of March 2008 04:57:14 PM
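Many line-oriented UNIX tools have line-length limits and cannot read a 9MB record. dd can, because it works on fixed-size blocks rather than lines. If you look at the man page for dd, you can specify a conversion block size, for example cbs=10: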

Code:
dd cbs=10 conv=unblock < oldfile > newfile

This splits the input into 10-character records, trims trailing spaces from each one, and ends each record with a newline.
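
To make the effect concrete, here is a minimal sketch of the same dd call wrapped in a small function. The name split_records and the sample input are made up for illustration; only the cbs and conv=unblock operands come from the post and the man page.

```shell
# split_records is a hypothetical helper name, not part of dd.
# Usage: split_records WIDTH < oldfile > newfile
split_records() {
    # conv=unblock treats each WIDTH-byte chunk as one record, trims its
    # trailing spaces, and appends a newline. dd reports its record counts
    # on standard error, so that is discarded here.
    dd cbs="$1" conv=unblock 2>/dev/null
}

# Demo: 30 bytes with no newlines become three 10-character lines.
printf 'AAAAAAAAAABBBBBBBBBBCCCCCCCCCC' | split_records 10
```

For a different record width, pass that width instead of 10. Because trailing spaces are trimmed, the output lines can be shorter than the record width, which matters if a later step expects fixed-width lines.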
 

dd(1)							      General Commands Manual							     dd(1)

NAME
dd - Converts and copies a file

SYNOPSIS
dd [option=value...]

STANDARDS
Interfaces documented on this reference page conform to industry standards as follows:

dd: XCU5.0, SVID 4

Refer to the standards(5) reference page for more information about industry standards and associated tags.

OPTIONS
None

OPERANDS
The option=value operand set may take any of the following forms:

Specifies the input file name; standard input is the default.

Specifies the output file name; standard output is the default.

For the above operands, the application must supply an expression specifying a size in bytes. The expression, expr, can be:

    a positive decimal number
    a positive decimal number followed by k, specifying multiplication by 1024
    a positive decimal number followed by b, specifying multiplication by 512
    two or more positive decimal numbers, with or without k or b, separated by x, specifying the product of the indicated values

All the operands will be processed before any input is read.

Skips number input records before starting copy.

[Tru64 UNIX] Copies number input files before terminating (makes sense only where input is a magnetic tape or similar device).

[Tru64 UNIX] Seeks to the numberth record from the beginning of input file before copying.

[Tru64 UNIX] Seeks to the numberth record from the beginning of output file before copying. Same as seek=number.

Seeks to the numberth record from the beginning of output file before copying. Same as oseek=number.

Copies only number input records.

Specifies one or more of the following conversions:

    Converts EBCDIC to ASCII.
    Converts variable-length records to fixed-length.
    Converts ASCII to EBCDIC.
    Converts IBM-EBCDIC to ASCII.
    Performs a slightly different map of ASCII to EBCDIC.
    Converts fixed-length records to variable-length.
    Makes all alphabetic characters lower case.
    Makes all alphabetic characters upper case.
    Swaps every pair of bytes.
    Does not stop processing on an error.
    Pads every input record to ibs.
    [Tru64 UNIX] Creates a sparse output file as described in AdvFS Administration.
    Does not truncate the output file. Preserves blocks in the output file not explicitly written by this invocation of the dd utility. (See the of=output_file operand.)

Allows several comma-separated conversions.

DESCRIPTION
The dd command reads the specified input file or standard input, does the specified conversions, and copies it to the specified output file or standard output. The input and output block size may be specified to take advantage of raw physical I/O. The terms block and record refer to the quantity of data read or written by dd in one operation and are not necessarily the same size as a disk block.

Where sizes are specified, a number of bytes is expected. A number may end with w, b, or k to specify multiplication by 2, 512, or 1024, respectively; a pair of numbers can be separated by an x to indicate a product.

The cbs specification is used if one of the following conversions is specified: ascii, unblock, ebcdic, ibm, or block. For the first two conversions, dd places characters in a conversion buffer of size cbs, converts these characters to ASCII, trims trailing spaces, and adds newline characters before sending data to the specified output. For the latter three cases, dd places ASCII characters in the conversion buffer, converts these characters to EBCDIC, and adds trailing spaces to create an output record of size cbs.

After it finishes, dd reports the number of whole and partial input and output blocks.

NOTES
[Tru64 UNIX] To copy to a raw disk, the disk label must first be zeroed using the disklabel -z command. For example:

    disklabel -z disk/cdrom17

[Tru64 UNIX] If you do not zero out the disk label, dd fails with the following error message:

    dd write error: Read-only file system

The ASCII/EBCDIC conversion tables are taken from the 256 character standard in the CACM November, 1968. There is no universal standard for EBCDIC/ASCII translation.

[Tru64 UNIX] One must specify conv=noerror,sync when copying raw disks with bad sectors to ensure dd stays synchronized.

[Tru64 UNIX] Certain combinations of arguments to conv= are permitted. However, the block or unblock option cannot be combined with ascii, ebcdic, or ibm. Invalid combinations silently ignore all but the last mutually exclusive keyword.

[Tru64 UNIX] If you need to use dd to copy to a streaming tape and the data is an odd length (not a multiple of 512 bytes), you must use the conv=sync flag to fill the last record. Streaming tape devices permit only multiples of 512 bytes.

[Tru64 UNIX] If option bs is used (or bs is equal to obs) and no conversion is specified, then dd is particularly efficient since fewer memory copies are done.

[Tru64 UNIX] The dd command does not support floppy disk multivolumes, but it does support tape multivolumes. This means that when ENOSPC is returned while reading or writing a tape, dd will prompt the user for a new tape.

[Tru64 UNIX] In order to make use of tape multivolumes, the files option must be used.

Security Note
[Tru64 UNIX] Any file system archive that contains ACLs (access control lists) that was created using dd is not exportable unless the target system has the exact same password and group files. If there is a mismatch, incorrect access may be granted to a file or directory.

EXIT STATUS
The following exit values are returned:

    The input file was successfully copied.
    An error occurred.

DIAGNOSTICS
This message specifies the number of full and partial records both read and written:

    f+p records in
    f+p records out

The number of full records read or written (f) refers to the blocks of data of size ibs or obs. The number of partial records read or written (p) refers to the blocks of data smaller than ibs or obs.

EXAMPLES
To read an EBCDIC tape blocked ten 80-byte EBCDIC card images per record into the ASCII file x, enter:

    dd if=/dev/tape/tape0_d0 of=x ibs=800 cbs=80 conv=ascii,lcase

Note the use of raw magnetic tape. The dd command is especially suited to I/O on the raw physical devices because it allows reading and writing in arbitrary record sizes.

To convert an ASCII text file to EBCDIC, enter:

    dd if=text.ascii of=text.ebcdic conv=ebcdic

This converts text.ascii to EBCDIC representation, storing this in text.ebcdic.

ENVIRONMENT VARIABLES
The following environment variables affect the execution of dd:

Provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the default locale is used. If any of the internationalization variables contain an invalid setting, the utility behaves as if none of the variables had been defined.

If set to a non-empty string value, overrides the values of all the other internationalization variables.

Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multibyte characters in arguments).

Determines the locale for the format and contents of diagnostic messages written to standard error.

Determines the location of message catalogues for the processing of LC_MESSAGES.

SEE ALSO
Commands: cp(1), cpio(1), sed(1), tar(1), tr(1), trbsd(1)
Functions: lseek(2)
Routines: fseek(3)
Files: ascii(5)
Standards: standards(5)
Command and Shell User's Guide
AdvFS Administration
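
As a rough illustration of the cbs handling in the DESCRIPTION section of the man page above, the sketch below packs short lines into fixed-width records with conv=block and splits them back with conv=unblock. The function names pack_records and unpack_records and the sample data are hypothetical; the sketch assumes a dd that implements both conversions as the man page describes.

```shell
# pack_records and unpack_records are hypothetical helper names.
pack_records() {
    # conv=block: each newline-terminated line becomes one record padded
    # with trailing spaces to WIDTH bytes; the newline itself is dropped.
    dd cbs="$1" conv=block 2>/dev/null
}

unpack_records() {
    # conv=unblock: each WIDTH-byte record has its trailing spaces trimmed
    # and a newline appended.
    dd cbs="$1" conv=unblock 2>/dev/null
}

# Two short lines are packed into two 5-byte records (10 bytes in total),
# then split back into the original lines.
printf 'ab\ncde\n' | pack_records 5 | wc -c
printf 'ab\ncde\n' | pack_records 5 | unpack_records 5
```

The round trip only restores the input when no line is longer than the record width and no line ends in spaces, since conv=block truncates long lines and conv=unblock trims trailing spaces.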