How to replicate Ruby's binary file reading with Java?
Hello everyone,
Maybe some expert could help me.
I have a working Ruby script that reads a big binary file (more than 2 GB). The chunks of data I want to analyze
are separated by the byte sequence FF47 within the binary. So the Ruby script defines the "line separator" as FF47 ($/="\xff\x47")
in order to read the file "line by line" and avoid loading the entire file into memory.
The program works great, and now I'm trying to apply this algorithm in Java. I've seen built-in ways in Java to read small binary files,
but I don't know how to set the sequence FF47 as the line separator.
I am no expert in Java, but I don't think this is possible. You probably have to do it yourself, like in good old C: open the file (fopen()) and use fseek(), fread() and ftell() to find what you are searching for. These functions belong to the C standard library; Java does not have them under those names, but its standard library offers comparable operations (for example java.io.RandomAccessFile with seek(), read() and getFilePointer()).
I hope this helps.
bakunin
AFAIK there's no way to use the stdio-based family of library calls (fopen(), etc.) and have them treat the binary sequence "FF47" as a "line" separator.
Even if you could set your locale environment variables to select a character set in which "FF47" is a 16-bit newline character (if one even exists), the fact that it's a binary file could break things - the "newline" might not always fall on a 16-bit boundary.
The only way to do what the OP asked is to read the file as binary data and search for the "FF47" byte sequence, and hope that the file wasn't written in an endian-dependent way. That matters especially when using Java on a little-endian machine (x86, most ARM systems), as Java tends to read and write multi-byte values in network byte order - big endian - for portability.
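To illustrate the byte-order point, here is a minimal sketch using java.nio.ByteBuffer. The class and method names (ByteOrderDemo, asUnsignedShort) are made up for this example. It shows that the same two bytes give a different 16-bit value depending on the byte order chosen, which is why searching byte by byte for FF followed by 47 is the safer approach if the file is really a byte stream.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical demo class, not part of the original thread.
class ByteOrderDemo {
    /** Interprets two bytes as an unsigned 16-bit value in the given byte order. */
    static int asUnsignedShort(byte first, byte second, ByteOrder order) {
        ByteBuffer buf = ByteBuffer.wrap(new byte[] {first, second}).order(order);
        return buf.getShort() & 0xFFFF; // mask to drop the sign extension
    }

    public static void main(String[] args) {
        byte ff = (byte) 0xFF;
        byte x47 = 0x47;
        // Bytes FF 47 read as one 16-bit word:
        System.out.printf("big-endian:    %04X%n",
                asUnsignedShort(ff, x47, ByteOrder.BIG_ENDIAN));    // prints FF47
        System.out.printf("little-endian: %04X%n",
                asUnsignedShort(ff, x47, ByteOrder.LITTLE_ENDIAN)); // prints 47FF
    }
}
```

If the separator was written as a 16-bit word by a little-endian program, the bytes on disk would be 47 FF rather than FF 47, so it is worth checking a hex dump of the file first.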
You do not treat it as a "line separator"; you simply search for the sequence and then read what comes after it. The stdio function calls have no "line separators" because there is no such thing as a "line" that could be separated. Sorry for not mentioning that explicitly, I thought it was obvious.
bakunin
Last edited by bakunin; 11-21-2014 at 11:34 AM..
Reason: typo
Actually, there are two stdio-based calls that process input line-by-line - gets() and fgets().
And yes, the only way to do what the OP wants in Java is to search through the data looking for the binary separator sequence.
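A minimal sketch of that search in Java, assuming the separator is the two bytes FF 47 in that order and that the file is read as a plain byte stream. The class name SeparatorChunkReader and its nextChunk() method are made up for this example; only the java.io classes are real. It reads through a BufferedInputStream, so only the current chunk is held in memory.

```java
import java.io.BufferedInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: splits a byte stream on the two-byte separator FF 47.
class SeparatorChunkReader {
    private static final int SEP_FIRST = 0xFF;
    private static final int SEP_SECOND = 0x47;

    private final InputStream in;
    private boolean eof = false;

    SeparatorChunkReader(InputStream in) {
        this.in = in;
    }

    /** Returns the next chunk without its separator, or null at end of input. */
    byte[] nextChunk() throws IOException {
        if (eof) {
            return null;
        }
        ByteArrayOutputStream chunk = new ByteArrayOutputStream();
        boolean pendingFF = false; // an FF was read but not yet stored
        int b;
        while ((b = in.read()) != -1) {
            if (pendingFF) {
                if (b == SEP_SECOND) {
                    return chunk.toByteArray(); // FF 47 found: chunk is complete
                }
                chunk.write(SEP_FIRST); // the held FF was ordinary data
                pendingFF = false;
            }
            if (b == SEP_FIRST) {
                pendingFF = true; // may be the start of a separator
            } else {
                chunk.write(b);
            }
        }
        eof = true;
        if (pendingFF) {
            chunk.write(SEP_FIRST); // trailing FF with nothing after it
        }
        return chunk.size() > 0 ? chunk.toByteArray() : null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java SeparatorChunkReader <file>");
            return;
        }
        try (InputStream in = new BufferedInputStream(new FileInputStream(args[0]), 1 << 16)) {
            SeparatorChunkReader reader = new SeparatorChunkReader(in);
            byte[] chunk;
            long count = 0;
            while ((chunk = reader.nextChunk()) != null) {
                count++; // process the chunk here
            }
            System.out.println("chunks read: " + count);
        }
    }
}
```

Unlike Ruby's gets, which keeps the separator at the end of each "line", this sketch drops it. Reading one byte at a time from the buffered stream keeps the code short; scanning larger blocks would likely be faster on a 2 GB file but needs extra care when a separator straddles two blocks.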
Thanks for your answers. An option that reads line by line from a binary file in C using gets() and fgets(), as you said, sounds great. But the "lines" or chunks are separated by FF47, and in my original Ruby code I process the chunks very well with regular expressions, so I'm afraid I cannot use C for this task: I think it doesn't support Perl-style regular expressions, though I'm not sure.
Regards
The C language doesn't have the regular expressions that Perl uses, but POSIX systems provide regular expressions in the C library, in the same style as sed and awk, so look up the man pages for regcomp() / regexec() etc.
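Since the original question targets Java, note that java.util.regex offers Perl-like patterns there. A minimal sketch of applying one to a binary chunk follows, assuming the chunk is first decoded as ISO-8859-1 so that each byte becomes exactly one char with the same value. The class name, the method name and the pattern itself (a 0x01 byte followed by ASCII digits) are invented for illustration and say nothing about the OP's real data.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the original thread.
class ChunkRegexDemo {
    // Made-up pattern: a 0x01 byte followed by one or more ASCII digits.
    private static final Pattern RECORD = Pattern.compile("\\x01([0-9]+)");

    /** Returns the digits after the first 0x01 byte, or null if there is no match. */
    static String firstDigits(byte[] chunk) {
        // ISO-8859-1 maps every byte 0x00-0xFF to the char with the same
        // value, so no byte is lost or merged during decoding.
        String text = new String(chunk, StandardCharsets.ISO_8859_1);
        Matcher m = RECORD.matcher(text);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] chunk = {(byte) 0xFF, 0x01, '4', '2', (byte) 0x80};
        System.out.println(firstDigits(chunk)); // prints 42
    }
}
```

Decoding with a multi-byte charset such as UTF-8 would not be safe here, because arbitrary binary bytes are not valid UTF-8 and would be replaced during decoding.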