duplicated lines not recognized by sort and uniq
Post 302250060 by redoubtable on Wednesday 22nd of October 2008 04:20:55 PM
The quickest way to confirm that the two files are not identical is to diff them: diff ID_file1.txt ID_file2.txt reports that the files differ.

To find out where they differ, I ran hexdump on both files, and the difference shows up quite easily at the end of each string:
Code:
redoubtable@Tsunami ~ $ hexdump ID_file2.txt |head -n1
0000000 6f43 746e 6769 0d31 430a 6e6f 6974 3267
redoubtable@Tsunami ~ $ hexdump ID_file1.txt |head -n1
0000000 6f43 746e 6769 0a31 6f43 746e 6769 0a32
redoubtable@Tsunami ~ $

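As you can see, the first line of ID_file2.txt ends with 0x0d followed by 0x0a (CR LF, the DOS/Windows line ending), while the lines of ID_file1.txt end with just 0x0a (LF). Because of that extra carriage return, sort and uniq do not treat the otherwise identical lines as duplicates.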
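
The effect of the stray carriage return can be reproduced with a small sketch. The file names under the temporary directory and the sample IDs are made up for illustration; they are not the original ID_file1.txt and ID_file2.txt. Deleting every CR with tr before sorting is one common fix, assuming the files contain no carriage returns that need to be kept:

```shell
# Hypothetical sample data: the same two IDs, once with LF and once with CRLF endings.
tmpdir=$(mktemp -d)
printf 'Contig1\nContig2\n'     > "$tmpdir/unix.txt"
printf 'Contig1\r\nContig2\r\n' > "$tmpdir/dos.txt"

# The trailing CR makes the lines differ, so uniq keeps all four of them.
raw_count=$(cat "$tmpdir/unix.txt" "$tmpdir/dos.txt" | sort | uniq | wc -l | tr -d ' ')

# Deleting every CR first lets sort and uniq see identical lines.
clean_count=$(cat "$tmpdir/unix.txt" "$tmpdir/dos.txt" | tr -d '\r' | sort | uniq | wc -l | tr -d ' ')

echo "distinct lines before stripping CR: $raw_count"
echo "distinct lines after stripping CR:  $clean_count"
rm -rf "$tmpdir"
```

On the real files the same idea would be tr -d '\r' < ID_file2.txt > some_new_file, followed by the usual sort and uniq.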

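PS: by default hexdump prints the data as two-byte words in host byte order, so on a little-endian machine the two bytes of each word appear swapped (hexdump -C prints the bytes in file order instead). The default output should be read as follows:
1234 5678 9123 4567 -> 34, 12, 78, 56, 23, 91, 67, 45.
So, 6f43 746e 6769 0d31 430a 6e6f 6974 3267 is 0x43 0x6f 0x6e 0x74 0x69 0x67 0x31 0x0d 0x0a 0x43 0x6f ..., that is "Contig1", CR, LF, "Co" and so on.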
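
The reading rule above can be checked with a short sketch. The sample file is an assumption: it holds the 16 bytes that the hexdump line for ID_file2.txt decodes to, not the original file. The sed expression swaps the two bytes of each four-digit word, and od -An -tx1 prints the same data one byte at a time in file order, so the two results should agree:

```shell
# First line of the hexdump output for ID_file2.txt, as quoted in the post.
words='6f43 746e 6769 0d31 430a 6e6f 6974 3267'

# Swap the two bytes inside every four-digit word: "6f43" becomes "43 6f".
swapped=$(echo "$words" | sed 's/\([0-9a-f]\{2\}\)\([0-9a-f]\{2\}\)/\2 \1/g')
echo "$swapped"

# Hypothetical sample file containing the decoded bytes.
sample=$(mktemp)
printf 'Contig1\r\nContig2' > "$sample"

# od -An -tx1 dumps one byte per column, in file order, with no word swapping.
bytes=$(od -An -tx1 "$sample" | tr '\n' ' ' | tr -s ' ' | sed 's/^ //; s/ $//')
echo "$bytes"
rm -f "$sample"
```

Both echo lines should show 0d 0a right after the 31 of "Contig1", which is the CR LF pair discussed above.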
 
