awk substr fails


 
# 1  
Old 01-11-2012
Hi all,

I want to extract 8 characters, starting at position 464, from each line of a data file. I tried two different ways and the results were different. I'd like to know why.

First method, using awk:
Code:
awk '{print substr($0,464,8)}' CONCIL_VUELTA_ALF_100112_0801.ok

Second method, using a shell loop with substring expansion:
Code:
while read line; do echo ${line: 463 :8}; done < CONCIL_VUELTA_ALF_100112_0801.ok

As far as I know, awk counts character positions from 1, which is why I start at position 464. Shell substring expansion counts from 0, so there I start at 463. First of all, is that true?
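
To check the indexing question above, here is a minimal sketch, assuming bash (the `${var:offset:length}` expansion is not part of plain POSIX sh). The string `s` and the variable names are made up for illustration and do not come from the data file.

```shell
#!/usr/bin/env bash
# Hypothetical sample string; not taken from the original data file.
s='abcdefghij'

# awk: substr() counts from 1, so this takes characters 4 to 6.
from_awk=$(printf '%s\n' "$s" | awk '{ print substr($0, 4, 3) }')

# bash: ${var:offset:length} counts from 0, so offset 3 is the 4th character.
from_bash=${s:3:3}

printf 'awk:  %s\nbash: %s\n' "$from_awk" "$from_bash"
```

Both lines print `def`, which suggests that pairing 464 in awk with 463 in the shell expansion is right, at least for plain single-byte text.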

Well, results are slightly different. Here's a little sample (diff command):
Quote:
75,79c75,79
< 00000000
< 00000000
< 00000000
< 00000000
< 00000000
---
> 0000000
> 0000000
> 0000000
> 0000000
> 0000000
219c219
< 00000000
---
> 0000000
221a222,223
> 0000000
> 0000000
223,225c225
< 00000000
< 00000000
< 00000000
---
> 0000000
237a238
> 0000000
240,241c241
< 00000000
< 00000000
---
I don't know why, but awk sometimes fails: each line should have exactly 8 characters ('00000000'). The shell loop works perfectly (<), but with awk there are lines (>) that have only 6 or 7 characters.
Does anybody know a reasonable explanation?

PS: I'm using cygwin 1.7.9(0.237/5/3) 2011-03-29 10:10

Thanks a lot.

Albert.
# 2  
Old 01-11-2012
Hi Albert,
please post a sample of your input file.
# 3  
Old 01-11-2012
Hi,

This is the first line that fails (awk prints only 7 zeros, although its output is 8 characters long: 7 zeros and a blank):
Quote:
00221009853102000020150XXXXXXXXX 0000018346077 XXXXXX XXXXXXXXXXX XXXXXX, S.A. 020010614 002ESP 00000000000000000000000000ESP 1JAV. XXXXXXXX, 621 XXXXXXXXXX GESTION TO 2 PL 7 08028XXXXXXXXX ESP 00100001000000000
The whole line is 546 characters long, but I don't know how to post a line with all its blanks. Once I submit this post (or preview it), the blanks disappear. So I uploaded that line in the attached sample1.txt.

Other things I tried:
Code:
echo $text | awk '{print substr($0, 464, 8)}'

where $text is the line I posted above. It works, it prints 8 zeros.
I also tried it on the file I'm uploading, running
Code:
awk '{print substr($0, 464, 8)}' sample1.txt

It also works.
However, when I run the command over the whole file (which has more than 20 thousand lines, and which I cannot post because it contains personal data), it fails:
Code:
awk '{print substr($0, 464, 8)}' CONCIL_VUELTA_ALF_100112_0801.ok

It prints:
Quote:
[...]
00000000
00000000

0000000 <-- This is the line
0000000
0000000
0000000
0000000

00000000
00000000
00000000
[...]
It's strange behaviour, because the marked line is not the first occurrence. The blank lines are 100 characters long and I thought that could be the problem, but it cannot be: sometimes awk prints correctly right after the 100-character lines, sometimes it doesn't. I cannot find any pattern that explains why awk doesn't work properly while the shell built-in expansion does.

Thanks a lot, and sorry for my English xD
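
One way to look for such a pattern is to print, for every line, its number, its total length, and the extracted field between brackets so that blanks become visible. This is only a sketch: the two input lines are invented stand-ins, and against the real file the field would be `substr($0, 464, 8)` rather than `substr($0, 3, 8)`.

```shell
# Hypothetical two-line input standing in for the real file. Against the real
# data the field would be substr($0, 464, 8) instead of substr($0, 3, 8).
report=$(
  printf '%s\n' 'xx00000000' 'xx0000000 ' |
  awk '{ printf "%d len=%d [%s]\n", NR, length($0), substr($0, 3, 8) }'
)
printf '%s\n' "$report"
```

The second line comes out as `2 len=10 [0000000 ]`, so a field made of 7 zeros and a blank is easy to tell apart from one that is really short, and the line number shows where to look in the input.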

Albert.
# 4  
Old 01-11-2012
We need sample data to reproduce the issue ...
Could you modify or remove the sensitive data in the input file
and post it here as an attachment?
# 5  
Old 01-11-2012
Uff, that's too much work.
There are more than 20 thousand lines in that file, and not all of them have the same pattern.

I just wonder why similar commands gave different output. I thought the awk command and the shell loop I posted before should work the same way.
However, what I was attempting to do is done.

Thanks for your time.

Albert.
# 6  
Old 01-11-2012
Quote:
Originally Posted by AlbertGM
[...]
I just wonder why similar commands gave different output. I thought the awk command and the shell loop I posted before should work the same way.
Depends on what you mean by similar ...
Consider the following:

Code:
zsh-4.3.14[t]% printf '\t 42\t \t \n'
         42              
zsh-4.3.14[t]% printf '\t 42\t \t \n' | awk '{ print length }'
8
zsh-4.3.14[t]% printf '\t 42\t \t \n' | { read; echo ${#REPLY};}
2
zsh-4.3.14[t]% printf '\t 42\t \t \n' | { IFS= read; echo ${#REPLY};}
8

And these are only some of the possible corner cases ...
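
Applied to the loop from the first post, the difference shown above comes down to how `read` is called. Here is a small sketch with a made-up sample line (the names `sample`, `plain` and `exact` are illustrative only), comparing a plain `read` with `IFS= read -r`:

```shell
# Hypothetical input: a line with two leading and two trailing blanks.
sample='  ab  cd  '

# Plain read: leading and trailing IFS whitespace is stripped from the line.
plain=$(printf '%s\n' "$sample" | { read line; printf '%s' "${#line}"; })

# IFS= read -r: the line is kept as is, and backslashes are left alone.
exact=$(printf '%s\n' "$sample" | { IFS= read -r line; printf '%s' "${#line}"; })

printf 'plain read:   %s characters\nIFS= read -r: %s characters\n' "$plain" "$exact"
```

The plain `read` reports 6 characters and `IFS= read -r` reports 10. On that basis, a whitespace-preserving version of the original loop in bash would be `while IFS= read -r line; do printf '%s\n' "${line:463:8}"; done < CONCIL_VUELTA_ALF_100112_0801.ok`. Whether stripped blanks alone account for the differences in this particular file cannot be confirmed from the thread.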

Quote:
However, what I was attempting to do is done.

Thanks for your time.
Glad you've solved it.