awk and substr


 
# 8  
Old 03-28-2016
Quote:
Originally Posted by Akshay Hegde
I added a few records to your file to test the command, as follows:

Code:
[akshay@localhost tmp]$ cat abc.txt
512345977,213458,100021
512345978,213454,100031
512345979,213452,100051
512345980,213455,100061
512345981,213456,100071
512345982,213456,100091
512345983,213457,100041
512345984,213451,100011
512345910,213451,100011
512345909,213451,100011
512345917,213451,100011
512345922,213451,100011
512345927,213451,100011
512345939,213451,100011

Code:
[akshay@localhost tmp]$ cat map.txt
0,15,file1.txt
16,25,file2.txt
26,40,file3.txt

One-liner command, as you expected:
Code:
[akshay@localhost tmp]$ awk -F, 'FNR==NR{a[$3]=$1 FS $2;next}{for(i in a){c=substr($1,length($1)-1);split(a[i],d); if(c>=d[1]&& c<=d[2])printf("General,%s,123,0,%s,0\n",$1,$3)>i }}' map.txt abc.txt


Output


file1.txt
Code:
[akshay@localhost tmp]$ cat file1.txt 
General,512345910,123,0,100011,0
General,512345909,123,0,100011,0

file2.txt
Code:
[akshay@localhost tmp]$ cat file2.txt 
General,512345917,123,0,100011,0
General,512345922,123,0,100011,0

file3.txt
Code:
[akshay@localhost tmp]$ cat file3.txt 
General,512345927,123,0,100011,0
General,512345939,123,0,100011,0

Readable version
Code:
awk -F, '
      # Read map.txt into array a: the index is column 3 (the output file name)
      # and the value is column 1 and column 2 joined by the input field separator.
      # FNR==NR is true only while awk reads the first file
      FNR==NR{
                   a[$3]=$1 FS $2
                   next
             }

      # Read second file abc.txt
             {
                     # Loop through array elements
                     for(i in a)
                     {
                           # Extract last 2 digits from column1 of current line read
                           c=substr($1,length($1)-1);
                      
                           # split array value a[i] into array d, using FS as the separator,
                           # so that d[1] is the min value and d[2] is the max value
                           split(a[i],d) 

                           # if the last 2 digits lie within the range, write to file i
                           if(c>=d[1]&& c<=d[2])
                                  printf("General,%s,123,0,%s,0\n",$1,$3) > i 
                     }
              }
        ' map.txt abc.txt

In its current form the command never calls close(i). If you are writing to many files, you should make some provision to close them,
otherwise you may get a "too many open files" error.
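
A minimal sketch of one way to add that provision, using a cut-down copy of the sample data above in a scratch directory. It appends with >> and calls close(i) after every write, so only one output file is open at a time; because of the append, old output files have to be removed before a re-run. The +0 on c is an addition that is not in the original one-liner: substr() returns a string, so without it the range test is, as far as I know, a string comparison, which happens to work for this map but would go wrong for a range such as 5,15.

```shell
# Scratch copies of the sample inputs, so the sketch can be run on its own.
dir=$(mktemp -d) && cd "$dir" || exit 1
printf '%s\n' '0,15,file1.txt' '16,25,file2.txt' '26,40,file3.txt' > map.txt
printf '%s\n' '512345910,213451,100011' '512345917,213451,100011' \
              '512345939,213451,100011' '512345977,213458,100021' > abc.txt

awk -F, '
FNR == NR { a[$3] = $1 FS $2; next }      # map.txt: file name -> "min,max"
{
    c = substr($1, length($1) - 1) + 0    # last 2 digits, forced to a number
    for (i in a) {
        split(a[i], d)
        if (c >= d[1] && c <= d[2]) {
            # >> appends, so reopening the file after close() keeps old lines
            printf("General,%s,123,0,%s,0\n", $1, $3) >> i
            close(i)                      # release the file descriptor at once
        }
    }
}' map.txt abc.txt
```

Closing after every line is slow on large inputs; closing only when the target file changes would be a reasonable refinement.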

I am trying to understand your code - it's a really good piece of work.

But I am getting an error.

Code:
awk -F, 'FNR==NR{a[$3]=$1 FS $2;next}{for(i in a){c=substr($1,length($1)-1);split(a[i],d); if(c>=d[1]&& c<=d[2])printf("GENERAL,%s,,,,,%s,,,,,0\n",$1,$3)>i }}' map.txt abc.txt

awk: cmd. line:1: (FILENAME=abc.txt FNR=31) fatal: expression for `>' redirection has null string value

Sharing the first few lines of my sample input file abc.txt
Code:
512345678,200001,10234
512345679,200001,10234
512345680,200001,10234
512345681,200001,10234
512345682,200001,10234
512345683,200001,10234
512345684,200001,10234
512345685,200001,10234
512345686,200001,10234
512345687,200001,10234
512345688,200001,10234
512345689,200001,10234
512345690,200001,10234
512345691,200001,10234
512345692,200001,10234
512345693,200001,10234
512345694,200001,10234
512345695,213456,10001
512345696,213456,10001
512345697,213456,10001
512345698,213456,10001
512345699,213456,10001
512345700,213456,10001
512345701,213456,10001
512345702,213456,10001
512345703,213456,10001
512345704,213456,10001
512345705,213456,10001
512345706,213456,10001
512345707,213456,10001
512345708,213456,10001
512345709,213456,10001
512345710,213456,10001
512345711,213456,10001
512345712,213456,10001
512345713,213456,10001
512345714,213456,10001
512345715,213456,10001
512345716,213456,10001
512345717,213456,10001
512345718,213456,10001
512345719,213456,10001
512345720,213456,10001
512345721,213456,10001
512345722,213456,10001
512345723,213456,10001
512345724,213456,10001
512345725,213456,10001
512345726,213456,10001
512345727,213456,10001
512345728,213456,10001
512345729,213456,10001
512345730,213456,10001
512345731,213456,10001
512345732,213456,10001
512345733,213456,10001
512345734,213456,10001
512345735,213456,10001
512345736,213456,10001
512345737,213456,10001
512345738,213456,10001
512345739,213456,10001
512345740,213456,10001
512345741,213456,10001
512345742,213456,10001
512345743,213456,10001
512345744,213456,10001
512345745,213456,10001
512345746,213456,10001
512345747,213456,10001
512345748,213456,10001
512345749,213456,10001
512345750,213456,10001
512345751,213456,10001
512345752,213456,10001
512345753,213456,10001
512345754,213456,10001
512345755,213456,10001
512345756,213456,10001
512345757,213456,10001
512345758,213456,10001
512345759,213456,10001
512345760,213456,10001
512345761,213456,10001
512345762,213456,10001
512345763,213456,10001
512345764,213456,10001
512345765,213456,10001
512345766,213456,10001
512345767,213456,10001
512345768,213456,10001
512345769,213456,10001
512345770,213456,10001
512345771,213456,10001
512345772,213456,10001
512345773,213456,10001
512345774,213456,10001
512345775,213456,10001
512345776,213456,10001
512345777,213456,10001
512345778,213456,10001
512345779,213456,10001
512345780,213456,10001
512345781,213456,10001
512345782,213456,10001
512345783,213456,10001
512345784,213456,10001
512345785,213456,10001
512345786,213456,10001
512345787,213456,10001
512345788,213456,10001
512345789,213456,10001
512345790,213456,10001
512345791,213456,10001
512345792,213456,10001
512345793,213456,10001
512345794,213456,10001
512345795,213456,10001
512345796,213456,10001
512345797,213456,10001
512345798,213456,10001
512345799,213456,10001
512345800,213456,10001

Basically, the problem shows up once the last two digits of Field1 start to repeat. I am going to prepare a more randomized file and try again.
# 9  
Old 03-28-2016
Code:
fatal: expression for `>' redirection has null string value

The above error occurs when the redirection target (here the variable i) is unset or is an empty string.

For example:

No error, since the variable is defined:
Code:
[akshay@localhost tmp]$ awk 'BEGIN{s=123; print 12345>s}'
[akshay@localhost tmp]$ cat 123 
12345

Error since variable s is not defined
Code:
[akshay@localhost tmp]$ awk 'BEGIN{print 12345>s}'
awk: fatal: expression for `>' redirection has null string value
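
One way the target could end up empty in the map-driven one-liner, offered as a guess rather than a diagnosis: a line in map.txt with no third field (for example a truncated 26,40) creates the entry a[""], and the first record that falls in that range is then redirected to an empty name. A sketch that skips such map lines, using made-up sample data in a scratch directory:

```shell
dir=$(mktemp -d) && cd "$dir" || exit 1
# Hypothetical map whose last line is truncated (no file name).
printf '%s\n' '0,15,file1.txt' '16,25,file2.txt' '26,40' > map.txt
printf '%s\n' '512345708,213456,10001' '512345730,213456,10001' > abc.txt

awk -F, '
FNR == NR {
    if ($3 != "") {
        a[$3] = $1 FS $2                  # usable map line
    } else {
        # report and ignore map lines that have no target file
        print "map.txt line " FNR ": no file name, skipped" > "/dev/stderr"
    }
    next
}
{
    c = substr($1, length($1) - 1)
    for (i in a) {
        split(a[i], d)
        if (c >= d[1] && c <= d[2])
            printf("GENERAL,%s,,,,,%s,,,,,0\n", $1, $3) > i
    }
}' map.txt abc.txt
```

With the guard in place the record ending in 30 is simply not written anywhere, instead of stopping awk with the fatal error.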


Please check your map.txt file.

I get the result without any error, like this:

Code:
[akshay@localhost tmp]$ awk -F, 'FNR==NR{a[$3]=$1 FS $2;next}{for(i in a){c=substr($1,length($1)-1);split(a[i],d); if(c>=d[1]&& c<=d[2])printf("GENERAL,%s,,,,,%s,,,,,0\n",$1,$3)>i }}' map.txt abc.txt

[akshay@localhost tmp]$ for i in file*.txt; do echo $i; cat $i; done
file1.txt
GENERAL,512345700,,,,,10001,,,,,0
GENERAL,512345701,,,,,10001,,,,,0
GENERAL,512345702,,,,,10001,,,,,0
GENERAL,512345703,,,,,10001,,,,,0
GENERAL,512345704,,,,,10001,,,,,0
GENERAL,512345705,,,,,10001,,,,,0
GENERAL,512345706,,,,,10001,,,,,0
GENERAL,512345707,,,,,10001,,,,,0
GENERAL,512345708,,,,,10001,,,,,0
GENERAL,512345709,,,,,10001,,,,,0
GENERAL,512345710,,,,,10001,,,,,0
GENERAL,512345711,,,,,10001,,,,,0
GENERAL,512345712,,,,,10001,,,,,0
GENERAL,512345713,,,,,10001,,,,,0
GENERAL,512345714,,,,,10001,,,,,0
GENERAL,512345715,,,,,10001,,,,,0
GENERAL,512345800,,,,,10001,,,,,0
file2.txt
GENERAL,512345716,,,,,10001,,,,,0
GENERAL,512345717,,,,,10001,,,,,0
GENERAL,512345718,,,,,10001,,,,,0
GENERAL,512345719,,,,,10001,,,,,0
GENERAL,512345720,,,,,10001,,,,,0
GENERAL,512345721,,,,,10001,,,,,0
GENERAL,512345722,,,,,10001,,,,,0
GENERAL,512345723,,,,,10001,,,,,0
GENERAL,512345724,,,,,10001,,,,,0
GENERAL,512345725,,,,,10001,,,,,0
file3.txt
GENERAL,512345726,,,,,10001,,,,,0
GENERAL,512345727,,,,,10001,,,,,0
GENERAL,512345728,,,,,10001,,,,,0
GENERAL,512345729,,,,,10001,,,,,0
GENERAL,512345730,,,,,10001,,,,,0
GENERAL,512345731,,,,,10001,,,,,0
GENERAL,512345732,,,,,10001,,,,,0
GENERAL,512345733,,,,,10001,,,,,0
GENERAL,512345734,,,,,10001,,,,,0
GENERAL,512345735,,,,,10001,,,,,0
GENERAL,512345736,,,,,10001,,,,,0
GENERAL,512345737,,,,,10001,,,,,0
GENERAL,512345738,,,,,10001,,,,,0
GENERAL,512345739,,,,,10001,,,,,0
GENERAL,512345740,,,,,10001,,,,,0

# 10  
Old 03-28-2016
Quote:
Originally Posted by RudiC
Code:
awk -F, '
T = (T = $1%100)<16?1:T<26?2:T<41?3:""  {print "General", $1, 123, 0, $3, 0  > ("FILE" T ".txt")}
' OFS="," file

Works like magic. I only edited the range and it worked smoothly for me.

Can you please explain the code?

----------------------------

My understanding is that you are dividing Field1 by 100 to get the remainder, and mapping it to a file number accordingly.
But what does the '?' mark do in this case?

Last edited by mystition; 03-28-2016 at 10:05 AM..
# 11  
Old 03-28-2016
% is not the division operator but the modulo (remainder) operator, so $1%100 yields the last two digits of Field1.
expr1 ? expr2 : expr3 is the conditional operator: it evaluates the first expression and, if that is true, yields the second expression, otherwise the third. The code above stacks three of these conditional operators.
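
To make the stacked conditional easier to follow, here is a sketch of the same mapping written out with if statements. The function name bucket and the sample numbers are only for illustration, and it prints the file name instead of writing to the file:

```shell
out=$(printf '%s\n' 512345708 512345716 512345739 512345777 | awk '
# Same thresholds as the stacked ?: above, spelled out one test at a time.
function bucket(n,    t) {
    t = n % 100                 # remainder of n / 100, i.e. the last two digits
    if (t < 16) return 1
    if (t < 26) return 2
    if (t < 41) return 3
    return ""                   # empty string counts as false: line is skipped
}
{
    T = bucket($1)
    if (T) print $1, "->", "FILE" T ".txt"
}')
printf '%s\n' "$out"
```

The record ending in 77 produces no output, which mirrors how the original pattern evaluates to "" and therefore does not run the print action for values of 41 and above.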