Extract the tables from html


# 1  
Old 1 Week Ago
Extract the tables from html

Hi, I have a script that extracts a table from an HTML file and converts it into a .csv file.
The problem is that if the HTML contains two tables, the script takes only the first one.

Please help me with the changes I need to make so that the script reads the complete HTML page.

The script is below:
Code:
#!/bin/ksh



timestamp=$(date +%d_%m_%y-%T )


export REPORT_PATH=/tefuser5/tef/acw/migwrk1/Informatica/9.5.1/server/infa_shared/Phase4_WLS/bin/reports/COUNT_REPORTS/Work

#mv /tefuser5/tef/acw/migwrk1/Informatica/9.5.1/server/infa_shared/Phase4_WLS/bin/reports/COUNT_REPORTS/Work/ABP_EXP_WRLN.csv /tefuser5/tef/acw/migwrk1/Informatica/9.5.1/server/infa_shared/Phase4_WLS/bin/reports/COUNT_REPORTS/Work/Archive/ABP_EXP_WRLN.csv"_$timestamp"

cd $REPORT_PATH
#rm ABP_EXP_WRLN.csv

#cd $REPORT_PATH
#rm -f *.csv 2>/dev/null

for HTML_F in *.HTML
do
	echo "converting $HTML_F file to csv.."
	dos2unix $HTML_F 1>/dev/null 0>/dev/null 2>/dev/null
 
	l=0
	j=0
	k=0
	rm -f xyz.csv 2>/dev/null
	rm -f abc.csv 2>/dev/null

	while IFS='' read -r line || [[ -n "$line" ]]; 

          # awk '/<TABLE/ {CNT++; if (CNT == 2) P = 1}; P; /<\/TABLE/ {P = 0}'

	do
		

	#    echo "$line"
	#awk '{/<TABLE/}'
    		if [[ "$line" == \<BR\>\<TABLE\ \ width\=* || $j -ge 2 ]]; then
		
			let j=$j+1
			if [ $j -ge 2 ]; then
				#echo "reached in 1st if"
				echo "$line" | grep -i '</TD>' 1>/dev/null 2>/dev/null
				if [ $? -eq 0 ]; then
					#echo "reached in 2nd if"
					tmp=${line#*\"\>}
					#echo "$tmp"
					res=${tmp%%\ \<\/TD\>*}
					echo "$res" >> abc.csv
				else
					:
				fi
			else
				:
			fi
	    	else
				:
    	    	fi

		#echo "$line"
    		if [[ "$line" == *TABLE\>* ]]; then
      			#echo "end of 1st table"
			let k=$k+1
    		fi

    		if [ $k -eq 2 ]; then
			echo "$HTML_F is ending.."
			break
    		fi
	done < "$HTML_F"

		
	
	while read a
	do
		if [ $l -eq 4 ]; then
			l=0
			echo "$a" >> xyz.csv
		else
			let l=$l+1
			echo "$a" | tr -s '\n' ',' >> xyz.csv
		fi
	done < abc.csv

	rm -f abc.csv
	tmpfname=`basename $HTML_F .HTML`
	rm -f $tmpfname.csv 2>/dev/null
	mv xyz.csv $tmpfname.csv

	#printf "\n\n\n\n\n\n\n,,THIS IS END OF FILE,," >> $tmpfname.csv
	#printf "\n\n" >> $tmpfname.csv
	dos2unix $tmpfname.csv 1>/dev/null 0>/dev/null 2>/dev/null
	chr=`echo $tmpfname.csv|cut -d'_' -f1`
	echo "$chr $tmpfname.csv"
	#chkDiff $chr $tmpfname.csv
done


The HTML page is below:


<html>
<body>
<b><br>Running Date: </b>11-JAN-2019 03:07</br>
<h2> Schema mapping and info    </h2>
<BR><TABLE  width="100%" class="x1h" cellpadding="1" cellspacing="0" border="5">
<TR>
<b><td class="x3w" bgcolor="#808080" width="4%"> No </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Exp Schema e </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Export Tables </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Imp Schema </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Import Tables </TD>
<b><td class="x3w" bgcolor="#808080" width="5%"> Diff  </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">1 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">FVT4 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">54 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">PRDCUSTO </TD>
<b><td class="x3w" bgcolor="#E3E4FA">54 </TD>
<b><td class="x3w" bgcolor="#E3E4FA"> </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">1 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">FVT4 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">56 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">All Imp Schema</TD>
<b><td class="x3w" bgcolor="#E3E4FA">54 </TD>
<b><td class="x3w" bgcolor="#FF0000">2 </TD></TR>
</TABLE>
<h2> Missing Tables on ImpLogs   </h2>
<h3>       TABLE_NAME :NAME_DATA </h3>
<h3>       TABLE_NAME :WHITE_LIST_MIG </h3>
<h2> Table Rows Comparison   </h2>
<BR><TABLE  width="100%" class="x1h" cellpadding="1" cellspacing="0" border="5">
<TR>
<b><td class="x3w" bgcolor="#808080" width="4%"> No </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> TABLE NAME </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Exported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Imported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Diff Rows </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">1 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">NAME_DATA  </TD>
<b><td class="x3w" bgcolor="#E3E4FA">24760 </TD>
<b><td class="x3w" bgcolor="#E3E4FA"> </TD>
<b><td class="x3w" bgcolor="#FF0000">Not exist on Imp </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">2 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">WHITE_LIST_MIG  </TD>
<b><td class="x3w" bgcolor="#E3E4FA">12912 </TD>
<b><td class="x3w" bgcolor="#E3E4FA"> </TD>
<b><td class="x3w" bgcolor="#FF0000">Not exist on Imp </TD></TR>
</TABLE>
<h3> Imp and Exp logs Missmatch </h3>
<h2> All Exp , Imp logs Info   </h2>
<BR><TABLE  width="100%" class="x1h" cellpadding="1" cellspacing="0" border="5">
<TR>
<b><td class="x3w" bgcolor="#808080" width="4%"> No  </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> TABLE NAME </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Exported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Imported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Diff Rows </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">1 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">ADDRESS_DATA  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">13753 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">13753 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">2 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">ADDRESS_NAME_LINK  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">68715 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">68715 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">3 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AGREEMENT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">4 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AGREEMENT_RESOURCE  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">29979 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">29979 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">5 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AGR_RES_HISTORY  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">6 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_ACCOUNT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">7 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_ADDRESS_NAME  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">25824 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">25824 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">8 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_AGED_TRIAL_BALANCE  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18780 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18780 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">9 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_BILLING_ARRANGEMENT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">10 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_CHARGES  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18069 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18069 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">11 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_CHARGE_GROUP  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18069 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18069 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">12 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_CREDIT_DEBIT_LINK  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">11032 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">11032 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">13 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_CUSTOMER_CREDIT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">359 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">359 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">14 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_INVOICE  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18428 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18428 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">15 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_JGL_CONTROL  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">16 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_PAYMENT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">8439 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">8439 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">17 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_PAYMENT_DETAILS  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">8439 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">8439 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">18 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_PAY_CHANNEL  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">19 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_PROOF_AND_BALANCE  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">4 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">4 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">20 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_TAX_ITEM  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">21 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_TRANSACTION_LOG  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">26867 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">26867 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">22 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">AR1_UNAPPLIED_CREDIT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">711 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">711 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">23 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_ACTIVITY_HISTORY  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">30928 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">30928 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">24 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_BILL_STATEMENT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">25 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_BLNG_ARRANGEMENT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">26 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_CHARGE  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">27 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_CHARGE_REQUEST  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">1966 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">1966 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">28 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_CUSTOMER  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">29 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_CUSTOMER_INFO  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">55803 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">55803 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">30 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_CYCLE_CUSTOMERS  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">31 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_CYC_PAYER_POP  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">32 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_DOCUMENT  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">33 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_INVOICE  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">34 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_INV_CHARGE_REL  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">17269 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">35 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_PAY_CHANNEL  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">12912 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">36 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_RC_RATES  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">30928 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">30928 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">37 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_TAX  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">38 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL1_TAX_ITEM  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">39 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">BL9_PROVINCIAL_PCP  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18259 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">18259 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>

Moderator's Comments:
Kindly use CODE TAGS, as per the forum rules, to wrap your code and samples.

Last edited by RavinderSingh13; 1 Week Ago at 02:43 AM..
# 2  
Old 1 Week Ago
Hello deepti01,

It is good that you have shown us the code you have tried, but IMHO you should keep your question crisp and to the point.
Kindly let us know the complete requirement, and show us small samples of the input and the expected output in CODE TAGS (see how I edited your code into tags and how different it looks now).


Thanks,
R. Singh
# 3  
Old 1 Week Ago
Hi,

I have reduced the HTML code.
Below you can see two tables, but my code picks up just one of them.
I need to know what to change in the code so that it extracts all of the HTML tables (both tables).

Code:
<html>
<body>
<b><br>Running Date: </b>11-JAN-2019 03:07</br>
<h2> Schema mapping and info    </h2>
<BR><TABLE  width="100%" class="x1h" cellpadding="1" cellspacing="0" border="5">
<TR>
<b><td class="x3w" bgcolor="#808080" width="4%"> No </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Exp Schema e </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Export Tables </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Imp Schema </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Import Tables </TD>
<b><td class="x3w" bgcolor="#808080" width="5%"> Diff  </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">1 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">FVT4 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">54 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">PRDCUSTO </TD>
<b><td class="x3w" bgcolor="#E3E4FA">54 </TD>
<b><td class="x3w" bgcolor="#E3E4FA"> </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">1 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">FVT4 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">56 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">All Imp Schema</TD>
<b><td class="x3w" bgcolor="#E3E4FA">54 </TD>
<b><td class="x3w" bgcolor="#FF0000">2 </TD></TR>
</TABLE>
<h2> Missing Tables on ImpLogs   </h2>
<h3>       TABLE_NAME :NAME_DATA </h3>
<h3>       TABLE_NAME :WHITE_LIST_MIG </h3>
<h2> Table Rows Comparison   </h2>
<BR><TABLE  width="100%" class="x1h" cellpadding="1" cellspacing="0" border="5">
<TR>
<b><td class="x3w" bgcolor="#808080" width="4%"> No </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> TABLE NAME </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Exported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Imported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Diff Rows </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">1 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">NAME_DATA  </TD>
<b><td class="x3w" bgcolor="#E3E4FA">24760 </TD>
<b><td class="x3w" bgcolor="#E3E4FA"> </TD>
<b><td class="x3w" bgcolor="#FF0000">Not exist on Imp </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#E3E4FA">2 </TD>
<b><td class="x3w" bgcolor="#E3E4FA">WHITE_LIST_MIG  </TD>
<b><td class="x3w" bgcolor="#E3E4FA">12912 </TD>
<b><td class="x3w" bgcolor="#E3E4FA"> </TD>
<b><td class="x3w" bgcolor="#FF0000">Not exist on Imp </TD></TR>
</TABLE>
<h3> Imp and Exp logs Missmatch </h3>
<h2> All Exp , Imp logs Info   </h2>
<BR><TABLE  width="100%" class="x1h" cellpadding="1" cellspacing="0" border="5">
<TR>
<b><td class="x3w" bgcolor="#808080" width="4%"> No  </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> TABLE NAME </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Exported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Imported Rows </TD>
<b><td class="x3w" bgcolor="#808080" width="20%"> Diff Rows </TD></TR>
<tr>
<b><td class="x3w" bgcolor="#BDD7EE">1 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">ADDRESS_DATA  </TD>
<b><td class="x3w" bgcolor="#BDD7EE">13753 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">13753 </TD>
<b><td class="x3w" bgcolor="#BDD7EE">0 </TD></TR>
<tr>

Moderator's Comments:
You have missed the code tags again; kindly wrap your code and samples in CODE TAGS.

Last edited by RavinderSingh13; 1 Week Ago at 03:12 AM..
# 4  
Old 1 Week Ago
Quote:
Originally Posted by deepti01
Hi,

I have reduced the HTML code.
Below you can see two tables, but my code picks up just one of them.
I need to know what to change in the code so that it extracts all of the HTML tables (both tables).
Thanks for adding your code.

Is this homework?

If not homework, what are you actually trying to accomplish?

Thanks.
# 5  
Old 1 Week Ago
I need to get all of the HTML tables into the .csv file that the script creates.

The issue is that the script reads only the first table enclosed in <TABLE> </TABLE> tags. I need it to also read and extract the data from the second table, which is also enclosed in <TABLE></TABLE> tags.

The script is already attached.

Thanks,
Deepti
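
A hand trace of the posted loop against the sample page suggests why it stops early. The sketch below replays only the two counters, j and k, from that loop. It uses portable case tests in place of the original [[ ... ]] and grep -i (an assumption, so that it runs under plain sh as well as ksh), and the function name trace_counters is made up for illustration.

```shell
# Replays the j/k counter logic of the posted loop on one file and reports
# where the loop would stop and how many </TD> lines it would have collected.
# Assumption: matching </TD> is case-sensitive here, unlike the original grep -i.
trace_counters() (
    j=0; k=0; n=0; cells=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))

        # Same test as: [[ "$line" == \<BR\>\<TABLE\ \ width\=* ]]
        opens=no
        case $line in
            '<BR><TABLE  width='*) opens=yes ;;
        esac

        if [ "$opens" = yes ] || [ "$j" -ge 2 ]; then
            j=$((j + 1))
            # Cells are collected only once j has reached 2.
            if [ "$j" -ge 2 ]; then
                case $line in
                    *'</TD>'*) cells=$((cells + 1)) ;;
                esac
            fi
        fi

        # Same test as: [[ "$line" == *TABLE\>* ]]
        case $line in
            *'TABLE>'*) k=$((k + 1)) ;;
        esac

        if [ "$k" -eq 2 ]; then
            echo "stopped at input line $n with $cells cells collected"
            exit 0   # leaves only the subshell that forms the function body
        fi
    done < "$1"
    echo "reached end of input with $cells cells collected"
)
```

On input shaped like the sample, the first <BR><TABLE  width= line only raises j to 1, so collection starts at the second opening tag. The pattern *TABLE>* matches only the closing </TABLE> lines, so k reaches 2 at the second closing tag and the loop breaks there. If that reading is right, the loop is hard-wired to one table, and any table after the second </TABLE> is never read.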
# 6  
Old 1 Week Ago
Is this homework?

It sure looks like homework from school to me for many reasons.
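
For the requirement as stated in the thread (every table on the page written to the .csv), one possible approach is to let awk track whether it is inside a table instead of counting tags in the shell loop. This is only a sketch under assumptions taken from the posted sample: each cell sits on a single line, and each row's last cell line carries the closing </TR>. The function name html_tables_to_csv is hypothetical.

```shell
# Prints one CSV line per table row, for every <TABLE>...</TABLE> block in the file.
# Assumptions: one cell per input line, and </TR> on the line of the row's last cell.
html_tables_to_csv() {
    awk '
    { up = toupper($0) }                 # tags in the sample use mixed case

    up ~ /<TABLE/ { in_table = 1 }       # does not match </TABLE

    in_table && up ~ /<\/TD>/ {
        start = index(up, "<TD")
        rest = substr($0, start)
        rest = substr(rest, index(rest, ">") + 1)      # text after the <td ...> tag
        stop = index(toupper(rest), "</TD>")
        rest = substr(rest, 1, stop - 1)               # text before </TD>
        gsub(/^[ \t]+|[ \t]+$/, "", rest)              # trim surrounding blanks
        if (rest ~ /[",]/) {                           # quote cells that need it
            gsub(/"/, "\"\"", rest)
            rest = "\"" rest "\""
        }
        if (ncell > 0) row = row ","
        row = row rest
        ncell++
    }

    in_table && up ~ /<\/TR>/ && ncell > 0 {
        print row
        row = ""
        ncell = 0
    }

    up ~ /<\/TABLE/ { in_table = 0 }
    ' "$1"
}
```

Inside the posted for loop, this could stand in for the two while loops, for example as html_tables_to_csv "$HTML_F" > "$tmpfname.csv". Rows from different tables follow one another without a separator, so tables with different column counts (six and five in the sample) end up in the same file.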