awk and substr


 
# 1  
Old 03-28-2016
Question [SOLVED] awk and substr

Hello All;

I have an input file 'abc.txt' with below text:
Code:
512345977,213458,100021
512345978,213454,100031
512345979,213452,100051
512345980,213455,100061
512345981,213456,100071
512345982,213456,100091
512345983,213457,100041
512345984,213451,100011

I need to write the first and third fields to different files, depending on the last two digits of the first field.

--> If the last two digits of the first field are between 0 and 15, the output will be
Code:
General,<Field1>,123,0,<Field3>,0

in File1.txt
--> If the last two digits of the first field are between 16 and 25, the output will be
Code:
General,<Field1>,123,0,<Field3>,0

in File2.txt
--> If the last two digits of the first field are between 26 and 40, the output will be
Code:
General,<Field1>,123,0,<Field3>,0

in File3.txt
and so on...

I could do this with an awk command containing a chain of if statements, but I am worried about the processing time because the input file is huge, with millions of records.

So, how can I do it with a single awk command?

Last edited by mystition; 03-29-2016 at 04:04 AM.. Reason: no parse tags -> code tags
# 2  
Old 03-28-2016
Please use code tags as required by forum rules!

And, show us your attempts.

---------- Post updated at 11:35 ---------- Previous update was at 11:33 ----------

None of the first fields in your sample input matches your criteria (assuming the field separator is a comma).
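A quick way to check this, as a minimal sketch: print `$1 % 100` for the sample rows from post #1. The shell variable names `sample` and `suffixes` are only for illustration.

```shell
# Sample rows copied from post #1.
sample='512345977,213458,100021
512345978,213454,100031
512345979,213452,100051
512345980,213455,100061
512345981,213456,100071
512345982,213456,100091
512345983,213457,100041
512345984,213451,100011'

# $1 % 100 yields the last two digits of field 1 as a number;
# the values are joined with spaces on a single line.
suffixes=$(printf '%s\n' "$sample" |
    awk -F, '{ printf "%s%d", (NR > 1 ? " " : ""), $1 % 100 } END { print "" }')
echo "$suffixes"
```

Every value printed is above 40, so none of the three stated ranges (0-15, 16-25, 26-40) can match.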
# 3  
Old 03-28-2016
Hello mystition,

Could you please try the following and let me know if it works for you? Nothing in the Input_file you posted matches your criteria, so I could not test it against your data.
Code:
awk -F, -v OFS=, '{len=length($1);Q=substr($1,len-1)+0;if(Q>=0 && Q<=15){print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File1.txt"} else if(Q>=16 && Q<=25){print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File2.txt"} else if(Q>=26 && Q<=40){print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File3.txt"}}'  Input_file

If this does not match your requirement, please show us more of the Input_file along with the expected output.
EDIT: Adding a non-one-liner form of the above solution as follows.
Code:
awk -F, -v OFS=, '{
    len = length($1)
    # Last two digits of field 1; +0 forces a numeric (not string) comparison
    Q = substr($1, len-1) + 0
    if (Q >= 0 && Q <= 15) {
        print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File1.txt"
    } else if (Q >= 16 && Q <= 25) {
        print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File2.txt"
    } else if (Q >= 26 && Q <= 40) {
        print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File3.txt"
    }
}' Input_file

Thanks,
R. Singh

Last edited by RavinderSingh13; 03-28-2016 at 06:53 AM.. Reason: Added a non-one liner form of solution now.
# 4  
Old 03-28-2016
You say "and so on..." after specifying the matching criteria, which suggests that similar logic applies to numbers greater than 40 (and up to 99, I assume). However, I can't see what increment pattern is being applied: the first range covers 16 numbers (including 0), the second 10 numbers and the third 15 numbers. Can you please either list all ranges up to 99 (if applicable) or confirm that only these 3 ranges apply?
# 5  
Old 03-28-2016
I added a few records to your file to test the command, as follows:

Code:
[akshay@localhost tmp]$ cat abc.txt
512345977,213458,100021
512345978,213454,100031
512345979,213452,100051
512345980,213455,100061
512345981,213456,100071
512345982,213456,100091
512345983,213457,100041
512345984,213451,100011
512345910,213451,100011
512345909,213451,100011
512345917,213451,100011
512345922,213451,100011
512345927,213451,100011
512345939,213451,100011

Code:
[akshay@localhost tmp]$ cat map.txt
0,15,file1.txt
16,25,file2.txt
26,40,file3.txt

One-liner command, as you asked for
Code:
[akshay@localhost tmp]$ awk -F, 'FNR==NR{a[$3]=$1 FS $2;next}{c=substr($1,length($1)-1)+0;for(i in a){split(a[i],d); if(c>=d[1]+0 && c<=d[2]+0)printf("General,%s,123,0,%s,0\n",$1,$3)>i }}' map.txt abc.txt


Output


file1.txt
Code:
[akshay@localhost tmp]$ cat file1.txt 
General,512345910,123,0,100011,0
General,512345909,123,0,100011,0

file2.txt
Code:
[akshay@localhost tmp]$ cat file2.txt 
General,512345917,123,0,100011,0
General,512345922,123,0,100011,0

file3.txt
Code:
[akshay@localhost tmp]$ cat file3.txt 
General,512345927,123,0,100011,0
General,512345939,123,0,100011,0

Readable version
Code:
awk -F, '
      # First file (map.txt): index array a by column 3 (the output file name);
      # the value is column 1 and column 2 joined by the input field separator.
      # FNR==NR is true only while awk reads the first file.
      FNR==NR{
                   a[$3]=$1 FS $2
                   next
             }

      # Second file (abc.txt)
             {
                     # Extract the last 2 digits of column 1 of the current line;
                     # +0 forces a numeric comparison below
                     c=substr($1,length($1)-1)+0

                     # Loop through the ranges
                     for(i in a)
                     {
                           # Split a[i] on FS into array d,
                           # where d[1] is the min value and d[2] is the max value
                           split(a[i],d)

                           # If the last 2 digits lie within the range, write to file i
                           if(c>=d[1]+0 && c<=d[2]+0)
                                  printf("General,%s,123,0,%s,0\n",$1,$3) > i
                     }
              }
        ' map.txt abc.txt

The code above never calls close(), so every output file stays open until awk exits. If you write to many files, you should make some provision to close the ones you no longer need (here, close(i)); otherwise you may get a "too many open files" error.
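One possible way to add close(), as a rough sketch rather than a tested production version: keep only the most recently used output file open and close it when the target changes. A file reopened with `>` would be truncated, so the sketch appends with `>>` and empties each output file once while reading map.txt. The names `lo`, `hi`, `prev` and `out` are introduced here for illustration, and the sample rows are a subset of the test data above.

```shell
# Work in a scratch directory with a small subset of the test data.
cd "$(mktemp -d)"
printf '%s\n' 0,15,file1.txt 16,25,file2.txt 26,40,file3.txt > map.txt
printf '%s\n' 512345910,213451,100011 512345917,213451,100011 \
              512345909,213451,100011 > abc.txt

awk -F, '
    # First file (map.txt): remember each range and start its output file empty.
    FNR == NR {
        out = $3
        lo[out] = $1 + 0
        hi[out] = $2 + 0
        printf "" > out
        close(out)
        next
    }
    # Second file (abc.txt)
    {
        c = $1 % 100                       # last two digits as a number
        for (f in lo)
            if (c >= lo[f] && c <= hi[f]) {
                # Close the previously used file when the target changes,
                # so that only one output file is open at any time.
                if (f != prev) {
                    if (prev != "") close(prev)
                    prev = f
                }
                # Append: reopening a closed file with ">" would truncate it.
                printf("General,%s,123,0,%s,0\n", $1, $3) >> f
                break
            }
    }
' map.txt abc.txt

result=$(cat file1.txt)
echo "$result"
```

If consecutive records keep switching between files, this closes and reopens a file for almost every line, which is slow. With only a handful of output files, leaving them all open, as in the original command, is probably the better trade-off.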
# 6  
Old 03-28-2016
Quote:
Originally Posted by pilnet101
You say "and so on..." after specifying the matching criteria, which leads us to believe that a similar logic applies for numbers greater than 40 (and less than 99 I assume). However, I can't see what increment pattern is being applied here as the first match is a range of 16 numbers (including 0), the second being 10 numbers and the third being 15 numbers. Can you please either confirm all ranges up till 99 (if applicable) or confirm that only these 3 ranges apply?
Sorry for being vague. I used ranges of different lengths to show that there can be any number of ranges, which I will specify manually. I need help with the logic to do so.
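For manually specified ranges on a very large file, one option is to expand the ranges once into a 100-entry lookup table, so that each record costs a single array lookup instead of a chain of comparisons. The sketch below assumes a hypothetical range format, `low-high:file` entries separated by spaces, passed in through a variable named `ranges`. It prints the routing decision instead of writing files, to keep the example self-contained.

```shell
# Hypothetical range spec: "low-high:file" entries separated by spaces.
ranges='0-15:File1.txt 16-25:File2.txt 26-40:File3.txt'

routed=$(printf '%s\n' 512345910,213451,100011 512345922,213451,100011 \
                       512345977,213458,100021 |
awk -F, -v ranges="$ranges" '
    BEGIN {
        # Expand every range once into the lookup table target[0..99].
        n = split(ranges, spec, " ")
        for (i = 1; i <= n; i++) {
            # part[1] = low, part[2] = high, part[3] = file name
            split(spec[i], part, /[-:]/)
            for (k = part[1] + 0; k <= part[2] + 0; k++)
                target[k] = part[3]
        }
    }
    {
        k = $1 % 100                 # last two digits as a number
        # One array lookup per record; suffixes without a range are skipped.
        if (k in target)
            print $1 " -> " target[k]
    }
')
echo "$routed"
```

To produce the real output, the `print` line would be replaced with something like `print "General", $1, 123, 0, $3, 0 > target[k]`, with `OFS` set to a comma.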

---------- Post updated at 05:12 PM ---------- Previous update was at 05:08 PM ----------

Quote:
Originally Posted by RudiC
Please use code tags as required by forum rules!

And, show us your attempts.

---------- Post updated at 11:35 ---------- Previous update was at 11:33 ----------

There's no match for your criteria in ALL first fields of your sample input (assumed the field separator is , ).
I have been a member for the past 7 years, so this is certainly not homework. I did not post my attempts because I am working on a script for the first time in several years and keep getting syntax errors.

Sorry for leaving out the code tags. I tried using NP, but it did not do the job.

---------- Post updated at 05:21 PM ---------- Previous update was at 05:12 PM ----------

Thanks Mr Singh,

I tried your code, but it's not working as expected.

If I use just the first part:
Code:
awk -F, '{
                len=length($1);
                Q=substr($1,len-1);
                if(Q>=0 && Q<=07){
print "General" OFS $1 OFS "123,0" OFS $3 OFS "0" >> "File1.txt"}}' abc.txt

But somehow it also picks up values outside 00 to 07:

Code:
$ cat File1.txt
General 512345700 123,0 10001 0
General 512345701 123,0 10001 0
General 512345702 123,0 10001 0
General 512345703 123,0 10001 0
General 512345704 123,0 10001 0
General 512345705 123,0 10001 0
General 512345706 123,0 10001 0
General 512345707 123,0 10001 0
General 512345708 123,0 10001 0
General 512345709 123,0 10001 0
General 512345710 123,0 10001 0
General 512345711 123,0 10001 0
General 512345712 123,0 10001 0
General 512345713 123,0 10001 0
General 512345714 123,0 10001 0
General 512345715 123,0 10001 0
General 512345716 123,0 10001 0
General 512345717 123,0 10001 0
General 512345718 123,0 10001 0
General 512345719 123,0 10001 0
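A likely explanation, sketched below: substr() returns a string, so `Q<=07` is not a numeric test. awk converts the constant 07 to the string "7" and compares character by character, and "19" sorts before "7". Adding 0 to the substr() result forces a numeric comparison. The input row and the variable names `as_string` and `as_number` are made up for the demonstration.

```shell
# Q is a string here, so "Q <= 07" compares "19" with the string "7".
as_string=$(echo 512345719,x,10001 |
    awk -F, '{ Q = substr($1, length($1) - 1)
               print ((Q >= 0 && Q <= 07) ? "match" : "no match") }')

# Adding 0 makes Q a number, so the test becomes 19 <= 7.
as_number=$(echo 512345719,x,10001 |
    awk -F, '{ Q = substr($1, length($1) - 1) + 0
               print ((Q >= 0 && Q <= 07) ? "match" : "no match") }')

echo "string comparison:  $as_string"
echo "numeric comparison: $as_number"
```

The spaces in File1.txt have a separate cause: the command sets the input separator with -F, but never sets OFS, which defaults to a single space. Adding `-v OFS=,` gives comma-separated output.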

# 7  
Old 03-28-2016
Code:
awk -F, '
T = (T = $1%100)<16?1:T<26?2:T<41?3:""  {print "General", $1, 123, 0, $3, 0  > ("FILE" T ".txt")}
' OFS="," file
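As I read it, the pattern first stores `$1%100` in T, then reassigns T to 1, 2, 3 or the empty string through the nested `?:` chain. An empty string counts as false, so records outside all ranges are skipped. Below is a longer but roughly equivalent form, written as a sketch with a made-up variable `n` and a few sample rows. Note that the output names are FILE1.txt, FILE2.txt and FILE3.txt, as in the one-liner.

```shell
# Work in a scratch directory with a few sample rows.
cd "$(mktemp -d)"
printf '%s\n' 512345910,213451,100011 512345917,213451,100011 \
              512345939,213451,100011 512345977,213458,100021 > file

awk -F, -v OFS=, '
{
    T = $1 % 100                  # last two digits of field 1 as a number
    if      (T < 16) n = 1
    else if (T < 26) n = 2
    else if (T < 41) n = 3
    else             next         # no range matches: skip the record
    print "General", $1, 123, 0, $3, 0 > ("FILE" n ".txt")
}' file

first=$(cat FILE1.txt)
third=$(cat FILE3.txt)
echo "$first"
echo "$third"
```

Because `$1 % 100` is already numeric, this avoids the string comparison problem that substr() can cause.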
