Text File with Binary Values processing
Shell Programming and Scripting, post 302994754 by Zam_1234 on Monday 27th of March 2017 03:26:32 PM

Hello all,
I have a text file containing millions of lines. Below is an example:

Code:
{tx:be} head -50 file.txt 
Instr1: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 

Instr1: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000000000000000000000001100001010000000010011101001111000000000100010100111111110010000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000000000000000000000001100001010000000010010101001111000000000100010100111111111000000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000000000000000000000001100001010000000000000101000011000000000100010101000000000010000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000110110000000000000000100001010000000010011100101000000000000100010101000000001110000000000000000000000000000000000000001 

Instr1: 000000000100000000000000000001111110000000000000000000010000000000100010101000000010110000000000000010001001111011011100000111 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000110111000111110000100100001010000000011100100101000000000000100010011110110111000000000000000000000000000000000000000001 

Instr1: 000001010110000000000100000100000000101001011101100110100000000000100010011110111000000000000000000000000000000000000000000001 

Instr1: 000001010110000000000011000100000000100101011101100110100000000000100010011110111001000000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000110111000111101110100100001010000000011100100101000000000000100010011110111010100000000000000000000000000000000000000001 

Instr1: 000000110111000111101110100100001010000000011100100101000000000000100010011110111010100000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000100110000000000001011100001110000000010011100100010000000000100010100111111110110000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000000000000000000000001100001010000000000000101000011000000000100010101000000000010000000000000000000000000000000000000001 

Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 

Instr1: 000000110110000000000000000100001010000000010011100101000000000000100010101000000001110000000000000000000000000000000000000001 

Instr1: 000000000100000000000000000001111110000000000000000000010000000000100010101000000010110000000000000010001001111011011100000111

There are empty lines, which I remove using "sed '/^$/d' file.txt".

Now the problem is that I want to count the unique values in part of the binary field. Specifically:
in the binary values, I want to find how many times each unique value in field [57:50] occurs (MSB -> 125, LSB -> 0). Each line has 126 bits in total.
I have sorted the file using sort:
Code:
sort -k2.50,2.57 file.txt

output:
{tx:be} tail -50 file.txt 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000011111111110000000000100010100000001101110000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000100111111110000000000100010100000100101100000000000000000000000000000000000000001 
Instr1: 000000000010000000000000000010000010100111000100111111110000000000100010100000100101100000000000000000000000000000000000000001 
Instr1: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx 
Instr1: xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

As you can see, the file is sorted on the field that I am interested in. Now I am not sure how to count the occurrences of each unique value in that field.

I have tried the uniq command, but it doesn't give what I need:
Code:
uniq -c -f1 -s75 -w69 file.txt

Output: (truncated)
2751026 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000 
     23 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000001 
     23 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000001 
     23 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000001 
     24 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000011000000000000000000000000000000000000000001 
     24 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000100000000000000000000000000000000000000000000 
     22 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000101000000000000000000000000000000000000000001 
     19 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000110000000000000000000000000000000000000000001 
     17 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000000111000000000000000000000000000000000000000000 
     18 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001000000000000000000000000000000000000000000001 
     18 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001001000000000000000000000000000000000000000001 
     17 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001010000000000000000000000000000000000000000001 
     14 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001011000000000000000000000000000000000000000001 
      8 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001100000000000000000000000000000000000000000001 
     11 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001101000000000000000000000000000000000000000001 
      6 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001110000000000000000000000000000000000000000001 
      5 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000001111000000000000000000000000000000000000000001 
      1 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000000 
      2 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010000000000000000000000000000000000000000000001 
      4 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010001000000000000000000000000000000000000000001 
      4 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010010000000000000000000000000000000000000000001 
      4 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010011000000000000000000000000000000000000000001 
      4 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010100000000000000000000000000000000000000000001 
      3 Instr1: 000000000000000000000000000000000000000000000000000000000000000000000000000000010101000000000000000000000000000000000000000001 
     11 Instr1: 000000000100000000000000000000000000000000000000000000001000000000100001110001111111000000000000000010000111001000010010000001 
     12 Instr1: 0000000001000000000000000000000000000000000000000000000010

What I am looking for in the output is something like this (the values here are made up):
Code:
2000 Instr1[or any suitable text]: '00000000'
150 Instr1: '10001100'
120 Instr1: '00100000'
and so on

I think the 'uniq' command should be ok, but I am open to anything.

Thanks in advance.
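
For illustration, here is a minimal sketch of one way such counts could be produced with awk instead of uniq. It assumes the leftmost character of the binary string is bit 125, so bit b sits at column 126 - b and bits [57:50] are columns 69 to 76 of the second field (which is not the same range as the character offsets 50 to 57 used in the sort key above). The function name count_bits and its optional arguments are made up for this example. Empty lines are skipped by the NF test, so the sed step is not needed, and the all-x lines are simply counted as 'xxxxxxxx'.

```shell
# count_bits FILE [HI LO [WIDTH]]
# Hypothetical helper: counts how often each distinct value of bits [HI:LO]
# occurs in the second field. Defaults: HI=57, LO=50, WIDTH=126.
count_bits() {
    file=$1
    hi=${2:-57}
    lo=${3:-50}
    width=${4:-126}
    awk -v hi="$hi" -v lo="$lo" -v width="$width" '
        NF >= 2 {                        # ignore empty lines
            start = width - hi           # 1-based column of bit HI (bit WIDTH-1 is column 1)
            len = hi - lo + 1            # number of bits in the slice
            count[substr($2, start, len)]++
        }
        END {
            # \047 is a single quote, printed around each value
            for (v in count) printf "%d Instr1: \047%s\047\n", count[v], v
        }
    ' "$file" | LC_ALL=C sort -k1,1nr -k3,3   # highest count first, ties by value
}
```

Running count_bits file.txt prints one line per distinct 8-bit value in the same shape as the wished-for output above, with the most frequent value first. No pre-sorting of the input is needed, because awk keeps the counts in an array rather than relying on adjacent duplicate lines.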
 
