Shell script performance issues (urgent)
I need help with awk. Please help immediately.
The function below is taking a lot of time. Please help me fine-tune it so that it runs faster. The input file has around 3 million records.
Code:
# Process Body
processbody()
{
#set -x
while read line
do
    ENTITY_TYPE=`print "$line" | cut -d'|' -f2 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    if [ ${ENTITY_TYPE} == "O" ]
    then
        ENTITY_TYPE="B"
    else
        ENTITY_TYPE="P"
    fi
    CUSTOMER_ID=`print "$line" | cut -d'|' -f1 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    #Branch and Account Numbers should be left blank
    BRANCH_NUMBER=
    ACCOUNT_NUMBER=
    ACCOUNT_DATE_OPEN=`print "$line" | cut -d'|' -f3 |sed 's/[^0-9]//g' | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}' | cut -c1-8`
    CORPORATE_NAME=`print "$line" | cut -d'|' -f4 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    LAST_NAME=`print "$line" | cut -d'|' -f5 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    FIRST_NAME=`print "$line" | cut -d'|' -f6 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    MIDDLE_NAME=`print "$line" | cut -d'|' -f7 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    NAME_SUFFIX=`print "$line" | cut -d'|' -f8 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    # Extracting person gender information
    PERSON_GENDER=`print "$line" | cut -d'|' -f9 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    # If gender is anything other than M or F, replace it with blank
    if [[ ${PERSON_GENDER} != "M" && ${PERSON_GENDER} != "F" ]]
    then
        PERSON_GENDER=
    fi
    BIRTH_DATE=`print $line | cut -d'|' -f10 | sed 's/[^0-9]//g' | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}' | cut -c1-8`
    #AGE should be left blank
    AGE=
    # Extracting citizenship code information
    CITIZEN_COUNTRY_NAME=`print $line | cut -d'|' -f11 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    if [[ ${CITIZEN_COUNTRY_NAME} == "US" || ${CITIZEN_COUNTRY_NAME} == "USA" || ${CITIZEN_COUNTRY_NAME} == "UNITED STATES" || ${CITIZEN_COUNTRY_NAME} == "UNITED STATES OF AMERICA" ]]
    then
        CITIZENSHIP_CODE="USA"
        FED_ID=`print $line | cut -d'|' -f12 | sed -e 's/[^0-9]//g' | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    else
        CITIZENSHIP_CODE=`print $line | cut -d'|' -f11 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}' | cut -c1-3`
        FED_ID=
    fi
    if [[ ${ENTITY_TYPE} == "P" ]]
    then
        FED_ID_TYPE="S"
    else
        FED_ID_TYPE="T"
    fi
    #Extracting National ID information
    ID_INFORMATION_1=`print $line | cut -d'|' -f13 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    ID_INFORMATION_2=`print $line | cut -d'|' -f14 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    if [[ ! -z ${ID_INFORMATION_1} && ${ID_INFORMATION_1} != "" ]]
    then
        NATIONAL_ID=${ID_INFORMATION_1}
        # Remove all non-alphanumeric characters in NATIONAL_ID field
        NATIONAL_ID=`print ${NATIONAL_ID} | sed 's/[^0-9a-zA-Z]//g' | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
        NATIONAL_ID_TYPE="DL"
    elif [[ ! -z ${ID_INFORMATION_2} && ${ID_INFORMATION_2} != "" ]]
    then
        NATIONAL_ID=${ID_INFORMATION_2}
        # Remove all non-alphanumeric characters in NATIONAL_ID field
        NATIONAL_ID=`print ${NATIONAL_ID} | sed 's/[^0-9a-zA-Z]//g' | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
        NATIONAL_ID_TYPE="PP"
    else
        NATIONAL_ID=
        NATIONAL_ID_TYPE=
    fi
    #Extracting street address information
    ADDRESS_1=`print $line | cut -d'|' -f15 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    ADDRESS_2=`print $line | cut -d'|' -f16 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    STREET_ADDRESS=${ADDRESS_1}${ADDRESS_2}
    STREET_ADDRESS=`print ${STREET_ADDRESS} | cut -c1-60`
    #Extracting city information
    ADDRESS_3=`print $line | cut -d'|' -f17 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    CITY_NAME=${ADDRESS_3}
    #Extracting country information
    COUNTRY=`print $line | cut -d'|' -f20 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    ADDRESS_4=`print $line | cut -d'|' -f18 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    COUNTRY_NAME=${COUNTRY}
    if [[ ${COUNTRY_NAME} == "US" || ${COUNTRY_NAME} == "USA" || ${COUNTRY_NAME} == "UNITED STATES" || ${COUNTRY_NAME} == "UNITED STATES OF AMERICA" ]]
    then
        COUNTRY_CODE="USA"
    else
        COUNTRY_CODE=`print ${COUNTRY} | sed 's/ //g' | cut -c1-3`
    fi
    #POSTCODE=`print $line | cut -d'|' -f19 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}' |cut -c1-5`
    if [[ ${COUNTRY_CODE} == "USA" ]]
    then
        STATE_CODE=${ADDRESS_4}
        POSTCODE=`print $line | cut -d'|' -f19 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}' |cut -c1-5`
        FOREIGN_PROVINCE=
        FOREIGN_POSTAL_CODE=
    else
        STATE_CODE=
        POSTCODE=
        FOREIGN_PROVINCE=${ADDRESS_4}
        FOREIGN_POSTAL_CODE=`print $line | cut -d'|' -f19 | awk '{gsub(/^[ \t]+|[ \t]+$/,"");print}'`
    fi
    PROCESSBODY="CDCI|"
    PROCESSBODY="${PROCESSBODY}${ENTITY_TYPE}|"
    PROCESSBODY="${PROCESSBODY}${CUSTOMER_ID}|"
    PROCESSBODY="${PROCESSBODY}${BRANCH_NUMBER}|"
    PROCESSBODY="${PROCESSBODY}${ACCOUNT_NUMBER}|"
    PROCESSBODY="${PROCESSBODY}${ACCOUNT_DATE_OPEN}|"
    PROCESSBODY="${PROCESSBODY}${CORPORATE_NAME}|"
    PROCESSBODY="${PROCESSBODY}${LAST_NAME}|"
    PROCESSBODY="${PROCESSBODY}${FIRST_NAME}|"
    PROCESSBODY="${PROCESSBODY}${MIDDLE_NAME}|"
    PROCESSBODY="${PROCESSBODY}${NAME_SUFFIX}|"
    PROCESSBODY="${PROCESSBODY}${PERSON_GENDER}|"
    PROCESSBODY="${PROCESSBODY}${BIRTH_DATE}|"
    PROCESSBODY="${PROCESSBODY}${AGE}|"
    PROCESSBODY="${PROCESSBODY}${CITIZENSHIP_CODE}|"
    PROCESSBODY="${PROCESSBODY}${FED_ID}|"
    PROCESSBODY="${PROCESSBODY}${FED_ID_TYPE}|"
    PROCESSBODY="${PROCESSBODY}${NATIONAL_ID}|"
    PROCESSBODY="${PROCESSBODY}${NATIONAL_ID_TYPE}|"
    PROCESSBODY="${PROCESSBODY}${STREET_ADDRESS}|"
    PROCESSBODY="${PROCESSBODY}${CITY_NAME}|"
    PROCESSBODY="${PROCESSBODY}${STATE_CODE}|"
    PROCESSBODY="${PROCESSBODY}${POSTCODE}|"
    PROCESSBODY="${PROCESSBODY}${FOREIGN_PROVINCE}|"
    PROCESSBODY="${PROCESSBODY}${FOREIGN_POSTAL_CODE}|"
    PROCESSBODY="${PROCESSBODY}${COUNTRY_NAME}|"
    PROCESSBODY="${PROCESSBODY}${COUNTRY_CODE}"
    print "${PROCESSBODY}" >> ${INQ_TEMP_FILE}
done < ${EDD_HOME}/tmp/inquiry.txt
}
But try the code below.
Code:
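nawk '{
    # Split the whole record into pipe-delimited fields
    split($0, arr1, "|")
    # Split the third field on hyphens
    split(arr1[3], arr2, "-")
    # Print the three parts joined without separators
    print arr2[1] arr2[2] arr2[3]
}' sample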
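The same idea can be taken further. Most of the time in the original function probably goes to starting several cut, sed and awk processes per field for each of the 3 million lines, so a single awk process that reads the file once should help a lot. Below is a minimal sketch of that approach, not a drop-in replacement: it only builds the first 13 output fields (up to BIRTH_DATE), and the names process_sample, trim and digits8 are made up for illustration. It assumes an awk with user-defined functions (nawk on older Solaris systems).

```shell
# Sketch: one awk process handles every record, instead of several
# cut/sed/awk processes per field per line.
# process_sample, trim and digits8 are hypothetical names.
process_sample() {
    awk '
    BEGIN { FS = "|"; OFS = "|" }

    # Strip leading and trailing blanks and tabs.
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }

    # Keep digits only, then the first 8 of them (as sed + cut -c1-8 did).
    function digits8(s) { gsub(/[^0-9]/, "", s); return substr(s, 1, 8) }

    {
        # ENTITY_TYPE: "O" becomes "B", anything else becomes "P".
        entity = (trim($2) == "O") ? "B" : "P"

        # PERSON_GENDER: anything other than M or F becomes blank.
        gender = trim($9)
        if (gender != "M" && gender != "F") gender = ""

        # The two empty strings are BRANCH_NUMBER and ACCOUNT_NUMBER,
        # which the original function always leaves blank.
        print "CDCI", entity, trim($1), "", "", digits8($3), \
              trim($4), trim($5), trim($6), trim($7), trim($8), \
              gender, digits8($10)
    }' "$@"
}
```

It could be called as process_sample ${EDD_HOME}/tmp/inquiry.txt >> ${INQ_TEMP_FILE}. The remaining fields (citizenship, national ID, address, country) would follow the same pattern with if/else blocks and substr() inside the same awk program.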