Don't understand how RS functions in awk


# 1  
Don't understand how RS functions in awk

I learned in this forum how to use RS in awk to extract a portion of a file, which is a wonderful solution to the problem. However, I don't understand exactly how it operates.

I don't quite understand how searching for /DATA2/ can result in extracting the whole section under "DATA2".

sample
=====
Code:
DATA1
data11
data12

DATA2
data21
data22

DATA3
data31
data32

Code:
$ cat sample | awk 'BEGIN {RS=""} /DATA2/'
DATA2
data21
data22

Since RS is set to the empty string, I assumed each line would now be regarded as a field, so I expected that printing $1 and $2 would give me just DATA1 and data11, but it didn't. Instead, it returned what is shown below:

Code:
$ cat sample | awk 'BEGIN {RS=""} { print $1 }'
DATA1
DATA2
DATA3
$ cat sample | awk 'BEGIN {RS=""} { print $2 }'
data11
data21
data31

So, can someone explain to me why it behaved this way?? Thanks!


Moderator's Comments:
Mod Comment Please use code tags, thank you!

Last edited by Franklin52; 08-28-2010 at 10:14 AM..
# 2  
It looks fine to me. In your latter examples you did not specify a pattern to select a record, so the action runs for every record and prints that field from each of them. For comparison:
Code:
$ awk 'BEGIN {RS=""} /DATA2/{ print $1,$2 }' infile
DATA2 data21
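
A pattern such as /DATA2/ is tested against the whole record, and with RS="" a record is an entire blank-line-separated block. A minimal sketch of that, assuming a POSIX awk and that the sample file from the first post is recreated in the current directory (the pattern /data22/ and the shell variable name are only for illustration):

```shell
# Recreate the sample file from the first post.
printf 'DATA1\ndata11\ndata12\n\nDATA2\ndata21\ndata22\n\nDATA3\ndata31\ndata32\n' > sample

# Match a string that appears only on the LAST line of the second block.
# The pattern is tested against the whole record, and with no action
# given awk prints the whole matching record.
block=$(awk 'BEGIN { RS = "" } /data22/' sample)
printf '%s\n' "$block"
```

This should print all three lines of the DATA2 block, which is why searching for /DATA2/ extracts the whole section and not just the one line containing the match.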

# 3  
Quote:
Originally Posted by joe228
So, can someone explain to me why it behaved this way?? Thanks!
It behaves like that because setting RS to the empty string puts AWK into a special mode (often called paragraph mode), in which records are separated by one or more empty lines. So in your example you end up with three records:
Code:
DATA1        |
data11       |   1st record (NR=1)
data12       |

DATA2        |
data21       |   2nd record (NR=2)
data22       |

DATA3        |
data31       |   3rd record (NR=3)
data32       |
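
The record split can be checked directly by printing NR. This is only a sketch, assuming a POSIX awk and the sample file recreated as below (the shell variable name is made up for the example):

```shell
# Recreate the sample file from the first post.
printf 'DATA1\ndata11\ndata12\n\nDATA2\ndata21\ndata22\n\nDATA3\ndata31\ndata32\n' > sample

# With RS="" each blank-line-separated block is one record.
# Print the record number with the first field, then the total.
records=$(awk 'BEGIN { RS = "" } { print NR ": " $1 } END { print "total records: " NR }' sample)
printf '%s\n' "$records"
```

With the sample above this should report three records, one per block, each starting with its DATAn header.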

One more thing changes in that mode: newline always acts as a field separator, in addition to the usual FS (by default, runs of spaces and tabs). So inside each of those records you end up with three fields:
Code:
DATA1        <=  1st field ($1)
data11       <=  2nd field ($2)   
data12       <=  3rd field ($3)
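
The field split inside one record can be listed with a loop over NF. Again just a sketch, assuming a POSIX awk, the default FS, and the sample file recreated as below; selecting the second record with NR == 2 is an arbitrary choice for the example:

```shell
# Recreate the sample file from the first post.
printf 'DATA1\ndata11\ndata12\n\nDATA2\ndata21\ndata22\n\nDATA3\ndata31\ndata32\n' > sample

# In paragraph mode each line of the block becomes a field.
# List every field of the second record with its field number.
fields=$(awk 'BEGIN { RS = "" } NR == 2 { for (i = 1; i <= NF; i++) print "$" i " = " $i }' sample)
printf '%s\n' "$fields"
```

For the DATA2 record this should list three fields, one per line of the block.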

I hope it cleared things up for you.
# 4  
The default RS is "\n" (newline), which is used to separate the records. So by default, each line is a record.

If RS="", then empty lines are used as the record separator.
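
To see the difference side by side, you can count records in both modes. A small sketch, assuming a POSIX awk and the sample file recreated as below (the variable names are only for illustration):

```shell
# Recreate the sample file from the first post.
printf 'DATA1\ndata11\ndata12\n\nDATA2\ndata21\ndata22\n\nDATA3\ndata31\ndata32\n' > sample

# Default RS: every line, including the two empty ones, is a record.
default_count=$(awk 'END { print NR }' sample)

# RS="": each blank-line-separated block is a record.
paragraph_count=$(awk 'BEGIN { RS = "" } END { print NR }' sample)

printf 'default RS: %s records\nRS="": %s records\n' "$default_count" "$paragraph_count"
```

With this sample the default mode counts 11 records (9 data lines plus 2 empty lines), while RS="" counts 3.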