tailing a file which contains control characters
Post 302543932 by gregoryp on Tuesday 2nd of August 2011 12:26:25 PM

Hi.
I have a log file that is updated by a Java process, which uses the ASCII STX and ETX characters (i.e. CTRL-B and CTRL-C) to demarcate each XML message logged.
So the format of the file is something like:
Code:
 
STX XML_MESSAGE1
..
..
ETX STX XML_MESSAGE2
..
..
ETX

Each XML message spans multiple lines as shown in ...
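
For illustration, the framing described above can be parsed incrementally. The following is a minimal sketch in Python (not the Java program mentioned in this post); the class name `MessageSplitter` and its `feed` method are hypothetical names chosen for this example. It assumes every message starts with one STX byte and ends with one ETX byte, and that neither byte occurs inside the XML itself.

```python
STX = b"\x02"  # ASCII start-of-text (CTRL-B)
ETX = b"\x03"  # ASCII end-of-text (CTRL-C)


class MessageSplitter:
    """Accumulates raw bytes and returns each complete STX...ETX payload.

    Hypothetical helper for illustration only; it assumes STX and ETX
    never appear inside a message body.
    """

    def __init__(self):
        self._buf = bytearray()

    def feed(self, chunk):
        """Add a chunk of bytes and return the messages completed by it."""
        self._buf.extend(chunk)
        messages = []
        while True:
            start = self._buf.find(STX)
            if start < 0:
                # No frame start buffered: nothing here can become a message.
                self._buf.clear()
                break
            end = self._buf.find(ETX, start + 1)
            if end < 0:
                # Frame not finished yet: keep it (from STX on) for later.
                del self._buf[:start]
                break
            # Payload is everything between STX and ETX, minus outer whitespace.
            messages.append(bytes(self._buf[start + 1:end]).strip())
            del self._buf[:end + 1]
        return messages
```

Because a partial message stays buffered until its ETX arrives, the same object can be fed data read in arbitrary chunks from a file that is still growing. The file should be opened in binary mode so that the control bytes reach the splitter unchanged.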

I don't have control over the process which writes these log files. I now need to write a utility which reads from one of these files and extracts each XML message. While doing this I ran into an issue when using tail.
So basically I did something like:
Code:
tail -100f <file> > out_file &

and then, after some time, compared out_file to the original file. I noticed that parts of some messages are missing and have been replaced by ASCII NUL characters.
It looks like I only get this issue while the file is being updated with new content. If I did
Code:
tail +0 <file> > out_file

then I can't find any stray NUL characters, and the two files compare as identical in content.
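
To pin down exactly where the NUL characters appear in out_file, their byte offsets can be listed and then checked against the same offsets in the original file. This is a rough diagnostic sketch in Python; the function names `find_nul_runs` and `nul_runs_in_file` are hypothetical, and it reads the whole file into memory, which assumes the file is of modest size.

```python
import re

# One or more consecutive NUL (0x00) bytes.
NUL_RUN = re.compile(rb"\x00+")


def find_nul_runs(data):
    """Return a list of (offset, length) for each run of NUL bytes in data."""
    return [(m.start(), m.end() - m.start()) for m in NUL_RUN.finditer(data)]


def nul_runs_in_file(path):
    """Read a file in binary mode and report its NUL runs (whole file in memory)."""
    with open(path, "rb") as fh:
        return find_nul_runs(fh.read())
```

Calling `nul_runs_in_file("out_file")` returns an empty list when no NUL bytes are present; otherwise each pair gives the starting offset and the length of one run.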
It looks like the issue has something to do with the control characters, STX and ETX, but I am not in control of this file and I need some way of reading and processing it. Initially I wrote a Java program to do it using the Java IO API, but when I got these NUL characters I tried tail, which gave me the same issue.
I am running tail on a SunOS machine:
SunOS <host_name> 5.10 Generic_142901-08 i86pc i386 i86pc
and the file I am tailing is on NAS storage, mounted using NFS.

Appreciate any help in trying to resolve this issue.
 
