ConCATenating binaries but excluding last bytes from each file
Post 302877743 by Don Cragun on Tuesday 3rd of December 2013 08:43:01 AM
The script below isn't highly efficient: it copies the redundant (overlapping) bytes to the output file multiple times, doesn't try to align block sizes to disk block boundaries, and uses small output block sizes if you have a large file that starts at a small (but non-zero) offset, because obs is set to the starting offset. But it seems to do what you want. You can make it considerably more complex to verify that the ranges of bytes specified by the input files don't leave any holes in the output file and to avoid copying duplicated data more than once. The checks for names that contain a "-" but are not just a string of digits followed by a "-" followed by a string of digits could be simplified with ksh- and bash-specific constructs. But the constructs used here should be portable to any shell that handles basic POSIX shell parameter expansion requirements correctly.
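
For illustration only, the file-name test described above (a string of digits, a single "-", a string of digits) can also be written as a portable case statement instead of chained parameter expansions. This is a sketch, not the code used in the script; is_range_name is a made-up helper name.

```shell
#!/bin/sh
# is_range_name NAME: succeed only if NAME is digits, one "-", digits.
# Hypothetical helper for illustration; not part of the original script.
is_range_name() {
        case $1 in
        *[!0-9-]*)      return 1 ;;     # a character other than a digit or "-"
        -*|*-)          return 1 ;;     # empty start or empty end
        *-*-*)          return 1 ;;     # more than one "-"
        *-*)            return 0 ;;     # digits, one "-", digits
        *)              return 1 ;;     # no "-" at all
        esac
}

# Usage: classify a few sample names.
for name in 0-1023 1000-2047 final.123 a-5 1-2-3 -7 7-
do      if is_range_name "$name"
        then    printf 'keep %s\n' "$name"
        else    printf 'skip %s\n' "$name"
        fi
done
```

Only the first two sample names are reported as "keep". Every pattern used here is plain POSIX shell pattern matching, so no ksh or bash extension is needed.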

The script below is too simplistic to work if you want to process a file with a starting offset that is close to your process's maximum available address space, since the whole starting offset is passed to dd as the output block size. (In other words, it probably won't work for terabyte-sized files where the starting offset in one or more of your input files is relatively large.) But it should give you a starting point for a more advanced script:
Code:
#!/bin/ksh
IAm=${0##*/}
of="final.$$"
> $of   # create zero-length temporary output file
maxe=0
for i in *-*
do      printf "Looking at \"%s\"\n" "$i"
        b=${i%%-*}
        e=${i##*-}
        if [ -z "$b" ] || [ -z "$e" ] || [ "$i" != "$b-$e" ] ||
                [ "$b" != "${b#*[!0-9]}" ] || [ "$e" != "${e#*[!0-9]}" ]
        then    continue
        fi
        if [ $e -gt $maxe ]
        then    maxe=$e
        fi
        if [ $b -gt 0 ]
        then    seek="ibs=102400 obs=$b seek=1"
        else    seek='bs=10240'
        fi
        echo starting dd if=$i of=$of $seek conv=notrunc
        dd if=$i of=$of $seek conv=notrunc
done
if [ $maxe -gt 0 ]
then    printf "Creating 0-%d\n" $maxe
        mv $of 0-$maxe
        exit
fi
rm $of
printf "%s: No input files found; no output file created.\n" "$IAm" >&2
exit 1

I use the Korn shell, but the script above will work with any POSIX-conforming shell without changing anything other than the first line of the script to specify your shell.
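
One possible way around the large-offset limitation mentioned above is to avoid passing the whole offset as obs, and instead split it into a capped block size times a block count. The sketch below is an assumption-laden illustration, not part of the original post: pick_obs and the 1 MiB cap are invented for the example, the sample files assume START-END names with an inclusive END, and mktemp -d is widely available but not required by POSIX.

```shell
#!/bin/sh
# pick_obs OFFSET: print "OBS SEEK" such that OBS * SEEK = OFFSET and
# OBS is at most 1 MiB.  OFFSET must be greater than 0.
# Hypothetical helper for illustration; not part of the original script.
pick_obs() {
        off=$1
        max=1048576
        blk=$(( off & -off ))           # largest power of two dividing off
        if [ "$blk" -gt "$max" ]
        then    blk=$max
        fi
        echo "$blk $(( off / blk ))"
}

# Demo: rebuild 15 bytes from two tiny overlapping pieces.
tmp=$(mktemp -d) || exit 1
printf 'ABCDEFGHIJ' > "$tmp/0-9"        # bytes 0 through 9
printf 'fghijKLMNO' > "$tmp/5-14"       # bytes 5 through 14 (overlaps 5-9)
: > "$tmp/out"                          # zero-length output file

# The piece starting at offset 0 needs no seek.
dd if="$tmp/0-9" of="$tmp/out" bs=10240 conv=notrunc 2>/dev/null

# The piece starting at offset 5 is written after seeking OBS * SEEK bytes.
set -- $(pick_obs 5)                    # $1 = obs, $2 = seek
dd if="$tmp/5-14" of="$tmp/out" ibs=102400 obs="$1" seek="$2" conv=notrunc 2>/dev/null

result=$(cat "$tmp/out")
rm -r "$tmp"
printf '%s\n' "$result"
```

The demo prints ABCDEfghijKLMNO: the later piece overwrites the overlapping bytes, just as in the script above. The trade-off is that an odd offset degrades to obs=1, which is slow because dd then writes one byte at a time; GNU dd's oflag=seek_bytes avoids that but is not portable.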
DD(1)							    BSD General Commands Manual 						     DD(1)

NAME
dd -- convert and copy a file

SYNOPSIS
dd [operands ...]

DESCRIPTION
The dd utility copies the standard input to the standard output. Input data is read and written in 512-byte blocks. If input reads are short, input from multiple reads are aggregated to form the output block. When finished, dd displays the number of complete and partial input and output blocks and truncated input records to the standard error output.

The following operands are available:

bs=n
        Set both input and output block size to n bytes, superseding the ibs and obs operands. If no conversion values other than noerror, notrunc or sync are specified, then each input block is copied to the output as a single block without any aggregation of short blocks.

cbs=n
        Set the conversion record size to n bytes. The conversion record size is required by the record oriented conversion values.

count=n
        Copy only n input blocks.

files=n
        Copy n input files before terminating. This operand is only applicable when the input device is a tape.

ibs=n
        Set the input block size to n bytes instead of the default 512.

if=file
        Read input from file instead of the standard input.

iseek=n
        Seek on the input file n blocks. This is synonymous with skip=n.

obs=n
        Set the output block size to n bytes instead of the default 512.

of=file
        Write output to file instead of the standard output. Any regular output file is truncated unless the notrunc conversion value is specified. If an initial portion of the output file is seeked past (see the oseek operand), the output file is truncated at that point.

oseek=n
        Seek on the output file n blocks. This is synonymous with seek=n.

seek=n
        Seek n blocks from the beginning of the output before copying. On non-tape devices, an lseek(2) operation is used. Otherwise, existing blocks are read and the data discarded. If the user does not have read permission for the tape, it is positioned using the tape ioctl(2) function calls. If the seek operation is past the end of file, space from the current end of file to the specified offset is filled with blocks of NUL bytes.

skip=n
        Skip n blocks from the beginning of the input before copying. On input which supports seeks, an lseek(2) operation is used. Otherwise, input data is read and discarded. For pipes, the correct number of bytes is read. For all other devices, the correct number of blocks is read without distinguishing between a partial or complete block being read.

conv=value[,value ...]
        Where value is one of the symbols from the following list.

        ascii, oldascii
                The same as the unblock value except that characters are translated from EBCDIC to ASCII before the records are converted. (These values imply unblock if the operand cbs is also specified.) There are two conversion maps for ASCII. The value ascii specifies the recommended one which is compatible with AT&T System V UNIX. The value oldascii specifies the one used in historic AT&T UNIX and pre-4.3BSD-Reno systems.

        block
                Treats the input as a sequence of newline or end-of-file terminated variable length records independent of input and output block boundaries. Any trailing newline character is discarded. Each input record is converted to a fixed length output record where the length is specified by the cbs operand. Input records shorter than the conversion record size are padded with spaces. Input records longer than the conversion record size are truncated. The number of truncated input records, if any, are reported to the standard error output at the completion of the copy.

        ebcdic, ibm, oldebcdic, oldibm
                The same as the block value except that characters are translated from ASCII to EBCDIC after the records are converted. (These values imply block if the operand cbs is also specified.) There are four conversion maps for EBCDIC. The value ebcdic specifies the recommended one which is compatible with AT&T System V UNIX. The value ibm is a slightly different mapping, which is compatible with the AT&T System V UNIX ibm value. The values oldebcdic and oldibm are maps used in historic AT&T UNIX and pre-4.3BSD-Reno systems.

        lcase
                Transform uppercase characters into lowercase characters.

        noerror
                Do not stop processing on an input error. When an input error occurs, a diagnostic message followed by the current input and output block counts will be written to the standard error output in the same format as the standard completion message. If the sync conversion is also specified, any missing input data will be replaced with NUL bytes (or with spaces if a block oriented conversion value was specified) and processed as a normal input buffer. If the sync conversion is not specified, the input block is omitted from the output. On input files which are not tapes or pipes, the file offset will be positioned past the block in which the error occurred using lseek(2).

        notrunc
                Do not truncate the output file. This will preserve any blocks in the output file not explicitly written by dd. The notrunc value is not supported for tapes.

        osync
                Pad the final output block to the full output block size. If the input file is not a multiple of the output block size after conversion, this conversion forces the final output block to be the same size as preceding blocks for use on devices that require regularly sized blocks to be written. This option is incompatible with use of the bs=n block size specification.

        sparse
                If one or more output blocks would consist solely of NUL bytes, try to seek the output file by the required space instead of filling them with NULs, resulting in a sparse file.

        swab
                Swap every pair of input bytes. If an input buffer has an odd number of bytes, the last byte will be ignored during swapping.

        sync
                Pad every input block to the input buffer size. Spaces are used for pad bytes if a block oriented conversion value is specified, otherwise NUL bytes are used.

        ucase
                Transform lowercase characters into uppercase characters.

        unblock
                Treats the input as a sequence of fixed length records independent of input and output block boundaries. The length of the input records is specified by the cbs operand. Any trailing space characters are discarded and a newline character is appended.

Where sizes are specified, a decimal, octal, or hexadecimal number of bytes is expected. If the number ends with a ``b'', ``k'', ``m'', ``g'', or ``w'', the number is multiplied by 512, 1024 (1K), 1048576 (1M), 1073741824 (1G) or the number of bytes in an integer, respectively. Two or more numbers may be separated by an ``x'' to indicate a product.

When finished, dd displays the number of complete and partial input and output blocks, truncated input records and odd-length byte-swapping blocks to the standard error output. A partial input block is one where less than the input block size was read. A partial output block is one where less than the output block size was written. Partial output blocks to tape devices are considered fatal errors. Otherwise, the rest of the block will be written. Partial output blocks to character devices will produce a warning message. A truncated input block is one where a variable length record oriented conversion value was specified and the input line was too long to fit in the conversion record or was not newline terminated.

Normally, data resulting from input or conversion or both are aggregated into output blocks of the specified size. After the end of input is reached, any remaining output is written as a block. This means that the final output block may be shorter than the output block size.

If dd receives a SIGINFO (see the status argument for stty(1)) signal, the current input and output block counts will be written to the standard error output in the same format as the standard completion message. If dd receives a SIGINT signal, the current input and output block counts will be written to the standard error output in the same format as the standard completion message and dd will exit.

DIAGNOSTICS
The dd utility exits 0 on success, and >0 if an error occurs.

SEE ALSO
cp(1), tr(1)

STANDARDS
The dd utility is expected to be a superset of the IEEE Std 1003.2 (``POSIX.2'') standard. The files operand and the ascii, ebcdic, ibm, oldascii, oldebcdic and oldibm values are extensions to the POSIX standard.

BSD							     January 13, 1994							       BSD