Post 302399806 by seanwpaul on Monday 1st of March 2010 04:05:21 PM
Escaping non-readable characters using grep, sed or awk

I'm trying to parse DNS logs from dozens of different domain controllers covering a long period of time. The logs are rolled into individual text files by size, so a file may contain only part of a day's activity or several days' worth, depending on the amount of activity. I'm splitting them by date and transforming the information into a .CSV file to be parsed again later.

Splitting the files by date is fairly easy, but every once in a while errors cause the script to fail. Every crash so far has come from special characters showing up in the results, which cause awk to crash.

I've been trying to find a way to skip lines that contain these kinds of characters, but haven't found one yet.

Example: (yes, it has been altered to conceal IP Addresses and domains queried)
Code:
20100207 03:52:40 3F8 PACKET  04073FC0 UDP Rcv 192.168.40.200  a61c   Q [0001   D   NOERROR] A     (7)èˆ5(3)217(3)123(2)11(3)dis(2)ds(4)test(3)com(0)

20100207 03:52:40 924 PACKET  04CCC250 UDP Rcv 192.168.40.200  06d4   Q [0001   D   NOERROR] A     (7)èˆ5(3)217(3)123(2)11(2)ds(4)test(3)com(0)

20100207 03:52:40 EFC PACKET  045E80A0 UDP Rcv 192.168.40.115  35bc   Q [0001   D   NOERROR] PTR   (3)200(2)40(3)138(3)160(7)in-addr(4)test(0)

20100207 03:52:40 DB4 PACKET  03D02080 UDP Rcv 192.168.40.200  265b   Q [0001   D   NOERROR] A     (7)èˆ5(3)217(3)123(2)11(4)test(3)com(0)

Of the 4 lines above, I only want to pass the 3rd to awk. I don't know where, when, or even whether these characters will appear in a file, so I can't simply tell it which line to skip (about 1 file in every 700 or so has the special characters, out of 90,000 files).
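One way to express "skip any line containing a non-readable character" is an inverted grep match. The following is a minimal sketch in POSIX shell syntax rather than Windows cmd quoting. It assumes a GNU-style grep that supports POSIX character classes and the `-a` (treat input as text) option. The function name `filter_printable` is hypothetical, and forcing the C locale is an assumption made so that every byte above 0x7E counts as non-printable.

```shell
# Hypothetical helper: pass through only lines made entirely of printable
# ASCII and whitespace. [:space:] is allowed so that tabs and the carriage
# returns of CRLF line endings do not cause a line to be rejected.
filter_printable() {
    # -a : treat the input as text even if it looks binary (assumes GNU grep)
    # -v : invert the match, i.e. drop lines containing a disallowed byte
    LC_ALL=C grep -a -v '[^[:print:][:space:]]' "$@"
}

# Example: the second line contains the raw bytes 0xE8 0x88 and is dropped.
printf 'clean line\nbad \350\210 line\n' | filter_printable
```

The helper reads standard input or any files named as arguments, so it can sit between the date search and awk. Dropping the whole line, rather than stripping the bad bytes, matches the stated goal of only passing clean records on.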

Someone in my office suggested I try sed, but I'm not familiar with it beyond simple find/replace. When I run "sed l", it translates the first line with special characters, prints the line a second time, and then crashes.
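On the duplicated output: sed prints every line automatically after running its commands, and the `l` command prints its own escaped copy as well, so each line shows up twice unless `-n` is given. A small sketch for inspecting suspect lines follows. The function name `show_raw` is hypothetical, and the C locale is an assumption made so that high bytes are shown as octal escapes.

```shell
# Hypothetical helper: show each input line in sed's unambiguous form.
# -n suppresses sed's automatic print, so only the output of `l` appears.
# Non-printable bytes are written as backslash escapes and each line
# ends with a `$` marker.
show_raw() {
    LC_ALL=C sed -n l "$@"
}

# Example: make the raw bytes 0xE8 0x88 in a line visible.
printf 'A (7)\350\2105(3)217\n' | show_raw
```

This is only a diagnostic aid for seeing which bytes are present. It does not filter anything by itself.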

The code I'm using to find and transform my data is below, where "date" is the date I'm searching for, "file" is the domain controller's DNS log, and "domain-controller" is the name of the domain controller the log came from:
Code:
grep.exe -F "date" file | awk "BEGIN{OFS=\",\"};{print $1,$2,$8,$14,$15 >> \"date_domain-controller_DailyRollup.csv\"}"
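For comparison, here is a sketch of the same pipeline with a filtering step added before awk. It is written in POSIX shell syntax, so the quoting would need adapting for Windows cmd. The function name `rollup` and its three arguments are hypothetical, and it assumes a GNU-style grep with `-a` and POSIX character classes.

```shell
# Hypothetical helper: rollup DAY LOGFILE DC_NAME
# Appends the selected fields of every clean line for DAY to
# DAY_DC_NAME_DailyRollup.csv in the current directory.
rollup() {
    day=$1
    log=$2
    dc=$3
    # 1. keep lines that mention the date (fixed-string match)
    # 2. drop lines containing bytes outside printable ASCII and whitespace
    # 3. strip a trailing carriage return, then print the fields as CSV
    LC_ALL=C grep -a -F -- "$day" "$log" |
        LC_ALL=C grep -a -v '[^[:print:][:space:]]' |
        awk -v OFS=',' '{ sub(/\r$/, ""); print $1, $2, $8, $14, $15 }' \
            >> "${day}_${dc}_DailyRollup.csv"
}
```

A call such as `rollup 20100207 dns.log dc01` (file and controller names made up) would append to `20100207_dc01_DailyRollup.csv`. Doing the redirection in the shell instead of inside awk keeps the awk program free of nested quotes.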

BTW, yes, that's ugly, but part of the reason is that I have to do it on Windows (yes, I know, believe me).
 
