awk command not working as expected
Post 302970465 by Don Cragun on Wednesday 6th of April 2016 09:12:28 PM
In a UTF-8 locale, the two-byte sequence specified by the C string "\347\03" is not a valid character, and it looks like gawk is discarding the contents of that line (or at least the characters from the invalid byte sequence through the end of the line) without printing a diagnostic. The standards only specify the behavior of awk when the files it reads are text files, so it is allowed to do anything it wants in this case. (By definition, file1 is not a text file since it contains byte sequences that are not valid characters in the current locale.)

As RudiC suggested, using the C locale (which uses a single-byte character set in which every byte sequence is a valid character) instead of the en_US.UTF-8 locale (which uses a variable number of bytes to encode a character, and in which some byte sequences do not form valid characters) should do what you want in this case.

(Note, however, that even in the C locale, NUL bytes are not valid in a text file; if the file is not empty, its last byte must be a <newline> character; and no line in the file can be longer than LINE_MAX bytes, counting its terminating <newline>. On most systems, LINE_MAX is 2048; the standards don't allow LINE_MAX to be less than 2048. The command:
Code:
getconf LINE_MAX

will give you the limit on your system.) From what we have seen so far (i.e., the first three lines), file1 appears to be a valid text file in the C locale.
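As a rough sketch of the points above, the script below builds a small sample file containing the same "\347\003" byte pair, checks it against the three C-locale text-file requirements (no NUL bytes, a trailing <newline>, no line longer than LINE_MAX), and then runs awk on it with LC_ALL=C. The function name is_c_text_file and the sample file are made up for illustration and are not from the thread. The 2048 fallback is only an assumption used when getconf is unavailable, and the line-length check relies on awk, which may itself mishandle lines that contain NUL bytes.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): check the C-locale text-file
# requirements for the file named by $1. Prints one line per problem found
# and returns 0 only if no problem was found.
is_c_text_file() {
    f=$1
    rc=0

    if [ ! -f "$f" ]; then
        echo "$f: not a regular file"
        return 2
    fi

    # An empty file is a valid text file.
    [ -s "$f" ] || return 0

    # 1. No NUL bytes: deleting NULs must not change the byte count.
    total=$(( $(LC_ALL=C wc -c < "$f") ))
    kept=$(( $(LC_ALL=C tr -d '\000' < "$f" | wc -c) ))
    if [ "$total" -ne "$kept" ]; then
        echo "$f: contains NUL bytes"
        rc=1
    fi

    # 2. The last byte must be a <newline> (octal 012).
    last=$(tail -c 1 "$f" | od -An -to1 | tr -d ' \n')
    if [ "$last" != 012 ]; then
        echo "$f: last byte is not a <newline>"
        rc=1
    fi

    # 3. No line longer than LINE_MAX bytes, counting the <newline>.
    #    Assumption: fall back to 2048 if getconf cannot be run.
    max=$(getconf LINE_MAX 2>/dev/null) || max=2048
    longest=$(LC_ALL=C awk '{ n = length($0) + 1; if (n > m) m = n }
                            END { print m + 0 }' "$f")
    if [ "$longest" -gt "$max" ]; then
        echo "$f: a line is $longest bytes long (LINE_MAX is $max)"
        rc=1
    fi

    return "$rc"
}

# Sample data: the first line holds the bytes \347\003, which do not form
# a valid character in a UTF-8 locale.
sample=$(mktemp) || exit 1
printf 'abc\347\003def\nsecond line\n' > "$sample"

if is_c_text_file "$sample"; then
    echo "valid text file in the C locale"
fi

# In the C locale every byte is one character, so awk sees whole lines.
LC_ALL=C awk '{ print NR ": " length($0) " bytes" }' "$sample"

rm -f "$sample"
```

With the sample data shown, the awk command should report 8 bytes for the first line and 11 for the second. Setting LC_ALL only on the awk command line, rather than exporting it, leaves the locale of the rest of the script untouched.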
 

cmp(1)							      General Commands Manual							    cmp(1)

NAME
       cmp - Compares two files

SYNOPSIS
       cmp [-l | -s] file1 file2

STANDARDS
       Interfaces documented on this reference page conform to industry standards as follows:

       cmp:  XCU5.0

       Refer to the standards(5) reference page for more information about industry standards and associated tags.

OPTIONS
       -l     Prints the byte number (decimal) and the differing bytes (octal) for each difference.

       -s     Does not print data for differing files; returns only an exit value.

OPERANDS
       file1  The path name of a file to be compared.

       file2  The path name of a file to be compared.

DESCRIPTION
       The cmp command compares two files. If file1 or file2 is - (dash), standard input is used for that file. It is an error to specify - for both files.

       By default, the cmp command prints no information if the files are the same. If the files differ, cmp prints the byte and line number where the difference occurred. The cmp command also specifies whether one file is an initial subsequence of the other (that is, if the cmp command reads an End-of-File character in one file before finding any differences).

       Usually, you use the cmp command to compare nontext files and the diff command to compare text files.

       Note that bytes and lines reported by cmp are numbered from 1.

EXIT STATUS
       The following exit values are returned:

       0      The files are identical.

       1      The files differ. This includes files of different lengths that are identical in the first part of both files.

       >1     An error occurred.

EXAMPLES
       1. To determine whether two files are identical, enter:

              cmp prog.o.bak prog.o

          The preceding command compares the files prog.o.bak and prog.o. If the files are identical, a message is not displayed. If the files differ, the location of the first difference is displayed. For instance:

              prog.o.bak prog.o differ: byte 5, line 1

          If the message cmp: EOF on prog.o.bak is displayed, then the first part of prog.o is identical to prog.o.bak, but there is additional data in prog.o. If the message cmp: EOF on prog.o is displayed, it is prog.o.bak that is the same as prog.o but also contains additional data.

       2. To display each pair of bytes that differ, enter:

              cmp -l prog.o.bak prog.o

          This compares the files and then displays the byte number (in decimal) and the differing bytes (in octal) for each difference. For example, if the fifth byte is octal 101 in prog.o.bak and 141 in prog.o, then the cmp command displays:

              5 101 141
              . . .

ENVIRONMENT VARIABLES
       The following environment variables affect the execution of cmp:

       LANG   Provides a default value for the internationalization variables that are unset or null. If LANG is unset or null, the corresponding value from the default locale is used. If any of the internationalization variables contain an invalid setting, the utility behaves as if none of the variables had been defined.

       LC_ALL If set to a non-empty string value, overrides the values of all the other internationalization variables.

       LC_CTYPE
              Determines the locale for the interpretation of sequences of bytes of text data as characters (for example, single-byte as opposed to multibyte characters in arguments).

       LC_MESSAGES
              Determines the locale for the format and contents of diagnostic messages written to standard error.

       NLSPATH
              Determines the location of message catalogues for the processing of LC_MESSAGES.

SEE ALSO
       Commands: comm(1), bdiff(1), diff(1), diff3(1), sdiff(1)

       Standards: standards(5)
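The -s option and the exit values on the cmp(1) page above (0 for identical files, 1 for differing files, greater than 1 for an error, as POSIX also specifies) lend themselves to scripting. The following is only a minimal sketch: the compare_files function and the temporary file names are hypothetical, chosen so that the fifth byte differs in the same way as the page's octal 101 versus 141 example.

```shell
#!/bin/sh
# Hypothetical helper: classify two files using only cmp's exit status.
compare_files() {
    rc=0
    cmp -s "$1" "$2" || rc=$?
    case $rc in
        0) echo "identical" ;;
        1) echo "differ" ;;
        *) echo "error" ;;
    esac
}

dir=$(mktemp -d) || exit 1
printf 'abcdA\n' > "$dir/old"    # fifth byte is 'A' (octal 101)
printf 'abcda\n' > "$dir/new"    # fifth byte is 'a' (octal 141)

compare_files "$dir/old" "$dir/old"        # same file on both sides
compare_files "$dir/old" "$dir/new"        # fifth byte differs
compare_files "$dir/old" "$dir/missing"    # second file does not exist

# List every differing byte: byte number (decimal), then both byte
# values (octal). cmp exits with status 1 here because the files differ,
# so that status is deliberately ignored.
cmp -l "$dir/old" "$dir/new" || :

rm -rf "$dir"
```

Testing the exit status rather than parsing the message text keeps the script independent of the exact wording cmp uses, which varies between implementations.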