Apache2 logs analysis


 
# 1  
Old 08-22-2016
Apache2 logs analysis

hi there,

need some improvement on this. thanks.


The purpose is to:

Identify illegal access attempts in the Apache2 logs, such as system commands, shell hacks, malware, bots, and other hacking attempts. Most of these share a common goal: gaining access through the weak parts of the www side. I got a pretty interesting set of results, mostly from India, China, Italy, Southern America and the Midlands of Africa (somebody trying to hack while sitting on safari), and of course the USA as well.


1) Re-engineer the processing of Apache2's other_vhosts_access.log files; the script could also be extended to analyse other log formats.
2) Make the counters smarter: at present I have to create and initialise every counter in the BEGIN block, and I'm interested in something smarter that needs less code.
3) Is there any variable in AWK/GAWK that contains the value of the "searched string", or let's call it a search-pattern placeholder?
4) The END block contains an individual statement for every counter; this needs improving as well. Thanks.

Regards,
Nasir Mahmood



Code:
#!/usr/bin/awk -f
#
#
# version 1: Counters added to show count of matches at the end.
# 1.2:     changed and displayed the resulting match at the end of every line. Added color code to string matched,

BEGIN { FS="\""; SHOWLOG=1; IGNORECASE=1; CurlynumberNF=0; azAZ09NF=0; UnameNF=0; ExprNF=0; WgetNF=0; DecodeNF=0; EvalNF=0; Base64NF=0; azAZ09NF=0; DisconnectNF=0; ConnectNF=0; FunctionNF=0; ExitNF=0; DocRootNF=0; chrNF=0; DelayNF=0; WaitforNF=0;  PrintNF=0; CgiBinNF=0; PasswdNF=0; BinShNF=0; PerlNF=0; BashNF=0; SelectNF=0; zhCNNF=0; WordPress=0; WpCron=0; WpAdmin=0; CgiBin=0; Passwd=0; WpLogin=0; Echo2=0; Eval2=0; Base64=0; DOCROOT=0; SetTimeLimit=0; SetMagicQuotes=0; FilePutContent=0; Magento=0; PhpAdmin=0; PhpMyAdmin=0; FCKEditor=0; System2=0; Sqlite=0; SQLManager=0; WebEdit=0; WpContent=0; WebSQL=0; MySQLDumper=0; webdb=0; WebConsole=0; Digit200=0; azAZ300=0; WebManage=0; }

$2 ~ /webmanage/ { WebManage++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t[0-9]{100} !200"  }; printf("%s\t\033[1;32m%s\033[0m\t\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /[a-zA-Z_-]{300,}/ && $3 !~ /200/ { azAZ300++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t[0-9]{100} !200"  }; printf("%s\t\033[1;32m%s\033[0m\t\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /[0-9]{200,}/ && $3 !~ /200/ { Digit200++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t[0-9]{100} !200"  }; printf("%s\t\033[1;32m%s\033[0m\t\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /web-console/ { WebConsole++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mweb-console\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /webdb/ { webdb++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwebdb\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /mysqldumper/ { MySQLDumper++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mmysqldumper\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /websql/ { WebSQL++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwebsql\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /wp-content/ { WpContent++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwp-content\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /webedit/ { WebEdit++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwebedit\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /sqlmanager/ { SQLManager++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31msqlmanager\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /sqlite/ { Sqlite++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31msqlite\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /system/ { System2++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31msystem\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /fckeditor/ { FCKEditor++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mfckeditor\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /phpmyadmin/ { PhpMyAdmin++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mphpmyadmin\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /phpadmin/ { PhpAdmin++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mphpadmin\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /magento/ { Magento++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mmagento\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"file_put_content"/ { FilePutContent++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mFilePutContent\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"set_magic_quotes"/ { SetMagicQuotes++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mSetMagicQuotes\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"set_time_limit"/ { SetTimeLimit++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mSetTimeLimit\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"DOCUMENT_ROOT"/ { DOCROOT++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mDOCROOT\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"base64"/ { Base64++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mbase64\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"eval"/ { Eval2++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31meval\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"echo"/ { Echo2++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mecho\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /\/wp-login/ { WpLogin++; split($1,a," "); x[a[2]]++;if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwp-login\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /passwd/ { Passwd++; split($1,a," "); x[a[2]]++;if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mpasswd\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"cgi-bin"/ { CgiBin++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mcgi-bin\033[0m"  };  printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"wp-admin"/ { WpAdmin++ ;split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwp-admin\033[0m"  };  printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"wp-cron"/ { WpCron++;  split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwp-cron\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$2 ~ /"wordpress"/ { WordPress++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwordpress\033[0m"  };  printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
$(NF-1)  ~ /"zh_CN"/ { zhCNNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mBase64_Decode\033[0m"  }; printf("%s\t\033[1;32m%s\033[0m\t%s\n",a[2],$2,$(NF-1)); }
( $(NF-1) !~ /Mozilla/ && $(NF-1) ~ /\\x[a-fA-F0-9]+/ ) { Hexa++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mx[a-z0-9]\033[0m"  }; printf("%s\t%s\t\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"select"/ { SelectNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mselect\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"bash"/ { BashNF++;  split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mbash\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"perl"/ { PerlNF++;  split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mperl\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /bin\/sh/ { BinShNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mbin/sh\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"passwd"/ { PasswdNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mpasswdNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"cgi-bin"/ { CgiBinNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mcgi-binNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"print"/ { PrintNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mprintNF\033[0m"  };  printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"waitfor"/ { WaitforNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwaitforNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"delay"/ { DelayNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mdelay\033[0m"  };  printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /\<chr\([0-9a-zA-Z]+\)\>/ { chrNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mchrNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"DOCUMENT_ROOT"/ { DocRootNF++; split($1,a," "); if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mDOCUMENT_ROOT\033[0m"  }; x[a[2]]++; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"exit"/ { ExitNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mexitNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1 )  ~ /"function"/ { FunctionNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mfunctionNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
( $(NF-1) !~ /Mozilla/ &&  $(NF-1) !~ /Outlook/ && $(NF-1) !~ /internal dummy connection/ && $3 !~ /200/ && $(NF-1) ~ /connect/ )  { ConnectNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mconnectNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"disconnect"/ { DisconnectNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mdisconnectNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /[0-9a-zA-Z]{300,}/ { azAZ09NF++;  split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31ma-zA-Z0-9-300\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"base64"/ { Base64NF++;  split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mbase64NF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /\<eval\>/ { EvalNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mevalNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"decode"/ { DecodeNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mdecodeNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"wget([0-9]+)"/ { WgetNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mwgetNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"expr"/ { ExprNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31mexprNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /"uname"/ { UnameNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31munameNF\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /\$\([a-zA-Z0-9]+\)/ { azAZ09NF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31m$(a-zA-Z0-9)\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
$(NF-1) ~ /\$\{[0-9]+\}/    { CurlynumberNF++; split($1,a," "); x[a[2]]++; if ( SHOWLOG ) {  $(NF-1)=$(NF-1)"\t\033[1;31m$(0-9)\033[0m"  }; printf("%s\t%s\t\033[1;32m%s\033[0m\n",a[2],$2,$(NF-1));}
END {
printf("%-20s\t%d\n","azAZ09NF",azAZ09NF);
printf("%-20s\t%d\n","UnameNF",UnameNF);
printf("%-20s\t%d\n","ExprNF",ExprNF);
printf("%-20s\t%d\n","WgetNF",WgetNF);
printf("%-20s\t%d\n","DecodeNF",DecodeNF);
printf("%-20s\t%d\n","EvalNF",EvalNF);
printf("%-20s\t%d\n","Base64NF",Base64NF);
printf("%-20s\t%d\n","azAZ09NF",azAZ09NF);
printf("%-20s\t%d\n","DisconnectNF",DisconnectNF);
printf("%-20s\t%d\n","ConnectNF",ConnectNF);
printf("%-20s\t%d\n","FunctionNF",FunctionNF);
printf("%-20s\t%d\n","ExitNF",ExitNF);
printf("%-20s\t%d\n","DocRootNF",DocRootNF);
printf("%-20s\t%d\n","chrNF",chrNF);
printf("%-20s\t%d\n","DelayNF",DelayNF);
printf("%-20s\t%d\n","WaitforNF",WaitforNF);
printf("%-20s\t%d\n","PrintNF",PrintNF);
printf("%-20s\t%d\n","CgiBinNF",CgiBinNF);
printf("%-20s\t%d\n","PasswdNF",PasswdNF);
printf("%-20s\t%d\n","BinShNF",BinShNF);
printf("%-20s\t%d\n","PerlNF",PerlNF);
printf("%-20s\t%d\n","BashNF",BashNF);
printf("%-20s\t%d\n","SelectNF",SelectNF);
printf("%-20s\t%d\n","zhCNNF",zhCNNF);
printf("%-20s\t%d\n","WordPress",WordPress);
printf("%-20s\t%d\n","WpCron",WpCron);
printf("%-20s\t%d\n","WpAdmin",WpAdmin);
printf("%-20s\t%d\n","CgiBin",CgiBin);
printf("%-20s\t%d\n","Passwd",Passwd);
printf("%-20s\t%d\n","WpLogin",WpLogin);
printf("%-20s\t%d\n","Echo2",Echo2);
printf("%-20s\t%d\n","Eval2",Eval2);
printf("%-20s\t%d\n","Base64",Base64);
printf("%-20s\t%d\n","DOCROOT",DOCROOT);
printf("%-20s\t%d\n","SetTimeLimit",SetTimeLimit);
printf("%-20s\t%d\n","SetMagicQuotes",SetMagicQuotes);
printf("%-20s\t%d\n","FilePutContent",FilePutContent);
printf("%-20s\t%d\n","Magento",Magento);
printf("%-20s\t%d\n","PhpAdmin",PhpAdmin);
printf("%-20s\t%d\n","PhpMyAdmin",PhpMyAdmin);
printf("%-20s\t%d\n","FCKEditor",FCKEditor);
printf("%-20s\t%d\n","System2",System2);
printf("%-20s\t%d\n","Sqlite",Sqlite);
printf("%-20s\t%d\n","SQLManager",SQLManager);
printf("%-20s\t%d\n","WebEdit",WebEdit);
printf("%-20s\t%d\n","WpContent",WpContent);
printf("%-20s\t%d\n","WebSQL",WebSQL);
printf("%-20s\t%d\n","MySQLDumper",MySQLDumper);
printf("%-20s\t%d\n","webdb",webdb);
printf("%-20s\t%d\n","WebConsole",WebConsole);
printf("%-20s\t%d\n","Digit200",Digit200);
printf("%-20s\t%d\n","azAZ300",azAZ300);
printf("%-20s\t%d\n","WebManage",WebManage);

        for ( j in x )  {
                print j
                        }
    }

# 2  
Old 08-22-2016
Sorry, but this code is as unreadable as probably possible. You might want to start by bringing it into a form a human can actually understand.

Inotherwordsitisquitehardtounderstandwhatyouaremeaning
ifthereisnostructureinyourcodeonecanrecognizeandyoumight
wanttostarttherebeforeevenattemptingtochangeyourcode.

I hope this helps.

bakunin
# 3  
Old 08-22-2016
Fully supporting what bakunin says: at first glance one can see that there are many, many repeated (almost) identical operations, so with adequate data structures you could dramatically simplify the entire script and make it far more maintainable at the same time.
On top, a few lines of sample input data would help as well...
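
As a rough illustration of the data-structure idea, here is a minimal sketch that keeps the patterns in one array and the counters in another, so that each new test is one more word in a list rather than another rule plus another printf. The sample line, the address 192.0.2.10 and the five patterns are assumptions picked for the demo, not values from the real logs. It lowercases the request with tolower() instead of relying on gawk's IGNORECASE, so it should run under any POSIX awk.

```shell
#!/bin/sh
# Hypothetical sample line in the other_vhosts_access.log layout; host and IP are placeholders.
line='example.com:80 192.0.2.10 - - [07/Aug/2016:02:03:42 +0100] "GET /wp-login.php HTTP/1.1" 404 512 "-" "Mozilla/5.0"'

summary=$(printf '%s\n' "$line" | awk -F'"' '
BEGIN {
    # One list of patterns replaces one hand-written rule per pattern.
    n = split("webmanage websql wp-login phpmyadmin passwd", pat, " ")
}
{
    split($1, a, " ")          # a[2] is the client IP
    req = tolower($2)          # portable stand-in for gawk IGNORECASE
    for (i = 1; i <= n; i++)
        if (req ~ pat[i]) {    # dynamic regex: the string is used as the pattern
            count[pat[i]]++
            seen[a[2]]++
        }
}
END {
    for (i = 1; i <= n; i++)   # indexed loop keeps the report order stable
        printf("%-20s\t%d\n", pat[i], count[pat[i]])
    for (ip in seen) print ip
}')
printf '%s\n' "$summary"
```

With this layout neither BEGIN nor END needs a new statement when a test is added; counters that never fire still print as 0 because awk treats an unset array element as zero.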
# 4  
Old 08-22-2016
My apologies.


I only have production logs from my boxes available, which is what I ran the script above against.

Reverse engineering the code is not a big problem for those who dare.

For everyone else, here are the details.

The input is something like this:

Code:
example.com:80 IP_ADDRESS - - [07/Aug/2016:02:03:42 +0100] "GET /extracted-Request HTTP/1.1" 200 9638 "-" "Mozilla/5.0 (compatible; trovitBot 1.0; +http://www.trovit.com/bot.html)"

Repeating lines like the one above will give the script enough input to run.

The resulting output will be something like:


Code:
WpLogin                 784
Echo2                   0
Eval2                   0
Base64                  0
DOCROOT                 0
SetTimeLimit            0
SetMagicQuotes          0
FilePutContent          0
Magento                 0
PhpAdmin                0
PhpMyAdmin              1
FCKEditor               283
System2                 46
Sqlite                  0
SQLManager              0
WebEdit                 2
WpContent               2850
WebSQL                  0
MySQLDumper             0
webdb                   0
WebConsole              0
Digit200                0
azAZ300                 0
WebManage               4

<IP LIST>
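
To make the field layout concrete, the following sketch splits the sample input line from this post on the double quote, the way the script's FS="\"" does, and prints what ends up in each field the script tests. Only the variable names (line, fields, a, s) are made up for the demo; the log line is the one shown above.

```shell
#!/bin/sh
# The sample line from the post, unchanged (IP_ADDRESS is the poster's placeholder).
line='example.com:80 IP_ADDRESS - - [07/Aug/2016:02:03:42 +0100] "GET /extracted-Request HTTP/1.1" 200 9638 "-" "Mozilla/5.0 (compatible; trovitBot 1.0; +http://www.trovit.com/bot.html)"'

fields=$(printf '%s\n' "$line" | awk -F'"' '{
    split($1, a, " ")                 # a[1] = vhost:port, a[2] = client IP
    split($3, s, " ")                 # s[1] = status code, s[2] = response size
    printf("NF=%d\n", NF)
    printf("ip=%s\n", a[2])
    printf("request=%s\n", $2)
    printf("status=%s\n", s[1])
    printf("agent=%s\n", $(NF-1))
}')
printf '%s\n' "$fields"
```

Two things follow from this split. $3 holds both the status and the byte count, so a test such as `$3 !~ /200/` also looks at the size. And because the fields are cut at every double quote, no field can contain one, so patterns written with literal quotes such as /"eval"/ cannot match; that may be why several of the counters above show 0.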

Moderator's Comments:
Mod Comment Please use CODE tags for sample input, sample output, and sample code.

Last edited by Don Cragun; 08-22-2016 at 06:00 PM.. Reason: Add CODE tags.
# 5  
Old 08-22-2016
Quote:
Originally Posted by busyboy
Reverse Engineering the code is not a big problem for those who dare.
It is not about "reverse engineering": if you are not the person who wrote this code, I'd say throw it away, carefully analyse what you need and then implement that. Personally I think the script you have shown us is beyond repair.

I hope this helps.

bakunin
# 6  
Old 08-22-2016
We can all analyze what each of the lines in that script is doing if we waste the time to make it readable by humans as well as by awk. But without a clear specification of what you are trying to do and without a representative sample of the data being processed, we have no way to know what parts of the code work correctly by design, what parts of the code work correctly by accident, and what parts of the code are broken. And showing us output with no input from which it was derived is an extremely small help.

With what you have given us, there is no reason for us to waste time trying to guess at what might be done better (other than to make the code much easier to read).
# 7  
Old 08-22-2016
I have had a quick try at simplifying this script for you.

I managed to identify 3 different kinds of test you are doing and created a check() function that covers these cases. It checks for a match and returns 0 if there is no match; otherwise it logs the hit when required and returns 1. The return value is added to each of your counters.

I'm sure there could be much more simplification if you specified your expressions and counter names in a separate config file. But you would still need to edit the config file to change the tests, so I doubt much more would be gained by going that way.

Below, I use the check() function to increment counters for your 3 different test cases; your job is to extend this to the full set of tests. Note that there is no need to initialise the counters, because awk treats an unset variable as zero the first time it is used as a number.

Code:
#!/usr/bin/awk -f
function check(Fld, mtch, ex) {
   # ex will always be null (false) if it is not passed in,
   # otherwise it must equate to true to continue
   if(!ex && (Fld !~ mtch)) return 0

   x[IP]++
   if (SHOWLOG) printf("%s\t\033[1;32m%s\033[0m\t\t\033[1;32m%s\033[0m\n", IP, $2, mtch)
   return 1
}

BEGIN { FS="\""; SHOWLOG=1; IGNORECASE=1 }

{
  split($1,a," ")
  IP = a[2]

  # Case 1 - match to $2
  WebManage += check($2, "webmanage")
  WebSQL    += check($2, "websql")
  Digit200  += check($2, "[0-9]{200,}")

  # Case 2 - match to $(NF - 1)
  PrintNF   += check($(NF -1), "print")
  BinShNF   += check($(NF -1), "bin/sh")

  # Case 3 - complex expression
  Hexa      += check("", "[a-z0-9]", ( $(NF-1) !~ /Mozilla/ && $(NF-1) ~ /\\x[a-fA-F0-9]+/ ))
  ConnectNF += check("", "connect", ( $(NF-1) !~ /Mozilla/ &&  $(NF-1) !~ /Outlook/ && $(NF-1) !~ /internal dummy connection/ && $3 !~ /200/ && $(NF-1) ~ /connect/))
}

END {
  printf("%-20s\t%d\n","WebManage", WebManage);
  printf("%-20s\t%d\n","WebSQL", WebSQL);
  printf("%-20s\t%d\n","Digit200", Digit200);
  printf("%-20s\t%d\n","PrintNF", PrintNF);
  printf("%-20s\t%d\n","BinShNF", BinShNF);
  printf("%-20s\t%d\n","Hexa", Hexa);
  printf("%-20s\t%d\n","ConnectNF", ConnectNF);

  for ( j in x )  {
      print j
  }
}
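
On the earlier question about a variable that holds the matched text: as far as I know awk has no single such variable, but the standard match() function sets RSTART and RLENGTH, and substr() can then pull out exactly what the regex matched. The sketch below is only an illustration under assumed input; the request string and the variable names req, marked and hit are hypothetical. It marks the hit with square brackets, which could be swapped for the colour escape sequences used in the scripts above.

```shell
#!/bin/sh
# Hypothetical request field; not taken from the original logs.
req='GET /blog/wp-login.php?x=1 HTTP/1.1'

marked=$(printf '%s\n' "$req" | awk '{
    # match() returns the start position (0 if none) and sets RSTART/RLENGTH.
    if (match($0, /wp-(login|admin|cron)/)) {
        hit = substr($0, RSTART, RLENGTH)     # the text the regex actually matched
        printf("%s[%s]%s\n", substr($0, 1, RSTART - 1), hit, substr($0, RSTART + RLENGTH))
    }
}')
printf '%s\n' "$marked"
```

A check() style function could return hit instead of echoing the pattern, so the log line shows what was found rather than the expression that found it.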


Last edited by Chubler_XL; 08-23-2016 at 03:48 PM.. Reason: Better variable names - remove initialise of vars