awk command not working as expected
Post 302970465 by Don Cragun on Wednesday 6th of April 2016 09:12:28 PM
In a UTF-8 locale, the two-byte sequence specified by the C string "\347\03" is not a valid character, and it looks like gawk is discarding the contents of that line (or at least the invalid byte sequence and everything following it up to the end of the line) without printing a diagnostic. The standards only specify the behavior of awk when the files it reads are text files, so it is allowed to do anything it wants in this case. (By definition, file1 is not a text file since it contains byte sequences that are not valid characters in the current locale.)

As RudiC suggested, using the C locale (which uses a single-byte character set in which every byte sequence is a valid character) instead of the en_US.UTF-8 locale (which uses a variable number of bytes to encode a character, and in which some byte sequences do not form valid characters) should do what you want in this case. (Note, however, that even in the C locale, NUL bytes are not valid in a text file; if the file is not empty, its last byte must be a <newline> character; and no line in the file can be longer than LINE_MAX bytes. On most systems, LINE_MAX is 2048. The standards don't allow LINE_MAX to be less than 2048. The command:
Code:
getconf LINE_MAX

will give you the limit on your system.) From what we have seen so far (i.e., the first three lines), file1 appears to be a valid text file in the C locale.
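
The text-file requirements described above can be checked roughly as follows. This is only a sketch: the sample file is a hypothetical stand-in for file1 (which is not shown in full in this thread), the variable names are made up for illustration, and the fallback to 2048 when getconf is unavailable is an assumption based on the minimum the standards allow.

```shell
#!/bin/sh
# Hypothetical stand-in for file1: three lines, the second containing the
# bytes \347\003, which do not form a valid character in a UTF-8 locale.
sample=$(mktemp)
printf 'first\nbad\347\003bytes\nthird\n' > "$sample"

# 1. No NUL bytes: deleting NULs must not change the byte count.
total_bytes=$(( $(wc -c < "$sample") ))
non_nul_bytes=$(( $(tr -d '\000' < "$sample" | wc -c) ))

# 2. Last byte is a <newline>: command substitution strips a trailing
#    newline, so the result is empty when the file ends with one.
last_byte=$(tail -c 1 "$sample")

# 3. Longest line, counted in bytes and including its <newline>.
longest=$(LC_ALL=C awk '{ n = length($0) + 1; if (n > max) max = n }
                        END { print max + 0 }' "$sample")
# Assumption: fall back to the standard minimum if getconf is missing.
limit=$(getconf LINE_MAX 2>/dev/null || echo 2048)

if [ "$total_bytes" -eq "$non_nul_bytes" ] && [ -z "$last_byte" ] &&
   [ "$longest" -le "$limit" ]; then
    is_text=yes
else
    is_text=no
fi
echo "text file in the C locale: $is_text"

# With the C locale in effect for this one command, awk should see every line.
lines_seen=$(LC_ALL=C awk 'END { print NR }' "$sample")
echo "lines seen by awk: $lines_seen"
rm -f "$sample"
```

Setting LC_ALL=C on the awk command line changes the locale only for that one invocation, so the rest of the script keeps running in whatever locale the user has configured.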
 

CHARMAP(5)                 Linux Programmer's Manual                CHARMAP(5)

NAME
       charmap - character set description file

DESCRIPTION
       A character set description (charmap) defines all available characters and their encodings in a character set. localedef(1) can use charmaps to create locale variants for different character sets.

   Syntax
       The charmap file starts with a header that may consist of the following keywords:

       <code_set_name>
              is followed by the name of the character map.

       <comment_char>
              is followed by a character that will be used as the comment character for the rest of the file. It defaults to the number sign (#).

       <escape_char>
              is followed by a character that should be used as the escape character for the rest of the file to mark characters that should be interpreted in a special way. It defaults to the backslash (\).

       <mb_cur_max>
              is followed by the maximum number of bytes for a character. The default value is 1.

       <mb_cur_min>
              is followed by the minimum number of bytes for a character. This value must be less than or equal to <mb_cur_max>. If not specified, it defaults to <mb_cur_max>.

       The character set definition section starts with the keyword CHARMAP in the first column.

       The following lines may have one of the two following forms to define the character set:

       <character> byte-sequence comment
              This form defines exactly one character and its byte sequence, comment being optional.

       <character>..<character> byte-sequence comment
              This form defines a character range and its byte sequence, comment being optional.

       The character set definition section ends with the string END CHARMAP.

       The character set definition section may optionally be followed by a section to define widths of characters.

       The WIDTH_DEFAULT keyword can be used to define the default width for all characters not explicitly listed. The default character width is 1.

       The width section for individual characters starts with the keyword WIDTH in the first column.

       The following lines may have one of the two following forms to define the widths of the characters:

       <character> width
              This form defines the width of exactly one character.

       <character>...<character> width
              This form defines the width for all the characters in the range.

       The width definition section ends with the string END WIDTH.

FILES
       /usr/share/i18n/charmaps
              Usual default character map path.

CONFORMING TO
       POSIX.2.

EXAMPLE
       The Euro sign is defined as follows in the UTF-8 charmap:

       <U20AC>     /xe2/x82/xac EURO SIGN

SEE ALSO
       iconv(1), locale(1), localedef(1), locale(5), charsets(7)

COLOPHON
       This page is part of release 4.15 of the Linux man-pages project. A description of the project, information about reporting bugs, and the latest version of this page, can be found at https://www.kernel.org/doc/man-pages/.

GNU                               2016-07-17                        CHARMAP(5)
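
The byte sequence given for the Euro sign in the EXAMPLE section can be cross-checked from the shell. This is a small sketch, not part of the man page: it writes the same three bytes in octal (\342\202\254 is hex e2 82 ac) and dumps them with od; the variable name euro_hex is made up for illustration.

```shell
# Emit the three bytes listed for <U20AC> in the UTF-8 charmap and dump
# them in hex, then strip the spaces and newline that od adds.
euro_hex=$(printf '\342\202\254' | od -An -tx1 | tr -d ' \n')
echo "EURO SIGN bytes: $euro_hex"
```

In a UTF-8 locale those three bytes form one character; in the C locale the same bytes are read as three separate single-byte characters.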