It seems that part of ddreggors' code is breaking at the test command. I'm going to read man test to see if I can dig something up there or elsewhere.
Secondly, and please bear with me on this as I'm still learning: what can I do to tell the script to 'do nothing and keep going' instead of echo "blah" in my loop? I feel like I'm just filling in the blanks here because I'm stumped, and I'm sure that if I leave it out, it'll break. Would the solution be to just echo to /dev/null?
Third, ddreggors, I'm looking around right now, but if I'm going to use your sed example, I'll need an expression a little more complex than yours, since the range of characters goes beyond just [x-z]. I think what I need is [a-zA-Z0-9]. It also needs to include "(|)|:|.|,|/" (brackets, colons, periods, commas, slashes, if I noted that right). I'll try my own sed example first, then explore later if need be.
Code:
#!/bin/sh
#TD=0
CT=0
cat status.html |while read LINE
do
    # Check to see if the LINE is non-empty, and has a <td> tag in it.
    if [ -n "$LINE" -a `echo $LINE |grep "<td>"` != "" ] ; then
        # Increase the TD counter by 1
        CT=`echo "$CT+1" |bc`
        # Check to see if the TD counter is at 6 (we are at 7th TD as the counter starts at 0 not 1)
        if [ "$CT" -eq 6 ] ; then
            # Use sed to remove this TD tag AND everything in between
            echo $LINE |sed -n '/<tr>/,/<\/tr>/ {
                s/.*<tr>//
                s/<\/tr>.*//
                p
            }' >> ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3
        else
            echo $LINE >> ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3
        fi
    else
        echo $LINE >> ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3
    fi
    # If we are leaving a table row then we need to reset the TD counter!
    if [ -n "$LINE" -a `echo $LINE |grep "</tr>"` != "" ] ; then
        CT=0
    else
        echo "No reset"
    fi
    if [ -n "$LINE" -a `echo $LINE |grep "</html>"` != "" ] ; then
        mv ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3 status.html
    else
        echo "Not done yet, keep going"
    fi
done
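A likely reason the test lines break: the backtick substitution is unquoted, so when grep finds nothing the test collapses to [ -n "$LINE" -a != "" ], and when grep does match, the line is split into several words. Either way [ receives an argument list it cannot parse. Below is a minimal sketch of one way around it, using a case pattern instead of grep inside the test. has_tag is a hypothetical helper name, not something from the script above.

```sh
#!/bin/sh
# has_tag LINE TAG: succeed when LINE contains the literal string TAG.
# (Hypothetical helper, introduced only for this illustration.)
has_tag() {
    case "$1" in
        *"$2"*) return 0 ;;   # quoted "$2" is matched literally, not as a glob
        *)      return 1 ;;
    esac
}

LINE='<td>some text</td>'
if has_tag "$LINE" '<td>'; then
    echo "found a td tag"
fi
```

An empty LINE simply fails the match, so the separate -n check is no longer needed.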
# If we are leaving a table row then we need to reset the TD counter!
TEST=`echo $LINE |grep '</tr>'`
if [ -n "$TEST" ] ; then
    CT=0
else
    echo "No reset"
fi
TEST=`echo $LINE |grep '</html>'`
if [ -n "$TEST" ] ; then
    mv ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3 status.html
else
    echo "Not done yet, keep going"
fi
That should fix that, and as for the sed expression, I was sure you WOULD have to change that as I am not sure we have ever seen the exact pattern you are looking for. If you did post that pattern I missed it, sorry.
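On the 'do nothing and keep going' question: sh does not require an else branch, so it can simply be left out. Where a branch has to exist, the null command : does nothing and returns success. Here is a small sketch, with next_ct as a made-up helper name:

```sh
#!/bin/sh
# next_ct LINE CT: print the TD counter to carry forward after reading LINE.
# (Hypothetical helper, introduced only for this illustration.)
next_ct() {
    ct=$2
    if printf '%s\n' "$1" | grep -q '</tr>'; then
        ct=0    # leaving a table row: reset the counter
    else
        :       # null command: do nothing and keep going
    fi
    echo "$ct"
}

next_ct '<td>cell</td>' 3
next_ct '</tr>' 3
```

The else branch holding only : could also be deleted outright; the script behaves the same either way.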
I'm probably missing quotes somewhere I figure. Tried adding them to the var but it doesn't work. Below is an update of what I have so far.
Code:
#!/bin/sh
#TD=0
CT=0
cat status.html |while read LINE
do
    # Check to see if the LINE is non-empty, and has a closing </td> tag in it.
    TD=`echo $LINE |grep '</td>'`
    if [ -n "$TD" ] ; then
        # Increase the TD counter by 1
        CT=`echo "$CT+1" |bc`
        # Check to see if the TD counter is at 7 (CT was incremented above, so 7 means the 7th TD)
        if [ "$CT" -eq 7 ] ; then
            # Use sed to remove this TD tag AND everything in between
            echo $LINE |sed 's/<td>[a-zA-Z0-9|(|)]<\/td>//' >> ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3
        else
            echo $LINE >> ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3
        fi
    else
        echo $LINE >> ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3
    fi
    # If we are leaving a table row then we need to reset the TD counter!
    TR=`echo $LINE |grep '</tr>'`
    if [ -n "$TR" ] ; then
        CT=0
    else
        echo "" > /dev/null
    fi
    HTML=`echo $LINE |grep '</html>'`
    if [ -n "$HTML" ] ; then
        mv ztmp.Ps23zp2s.2-Fpps3-wmmm0dss3 status.html
    else
        echo "" > /dev/null
    fi
done
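One thing that may be tripping up the sed line: inside [...] the | is a literal pipe character rather than 'or', and without a trailing * the bracket matches exactly one character. Below is a sketch of what the expression might look like, assuming the cell content holds only letters, digits and ( ) : . , / with no spaces. strip_simple_td is a hypothetical name, and # is used as the s delimiter so the slashes need no escaping.

```sh
#!/bin/sh
# strip_simple_td: read lines on stdin and delete the first <td>...</td> cell
# whose content is made up only of letters, digits and ( ) : . , /
# (Hypothetical helper; the character set is an assumption based on the post.)
strip_simple_td() {
    sed 's#<td>[a-zA-Z0-9():.,/]*</td>##'
}

printf '%s\n' '<tr><td>eth0(up):10.0.0.1,1/2</td></tr>' | strip_simple_td
```

If the real cells contain spaces or other characters, those would have to be added to the bracket list as well.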
#!/bin/sh
IN=0
CT=0
OUTFILE="TestHTML.out"
echo > $OUTFILE # Start with fresh file always
cat TestHTML.htm |while read LINE
do
    # If we are entering a table row then we need to reset the TD counter
    TR=`echo $LINE |grep -i '<tr'`
    if [ -n "$TR" ]
    then
        CT=0
    else
        echo "" > /dev/null
    fi
    # Check to see if the LINE is non-empty, and has an opening td tag in it.
    TD=`echo $LINE |tr -d '\n' |grep -i '<td'`
    if [ -n "$TD" ]
    then
        # We are inside a td tag.
        IN=1
    fi
    # Check to see if the LINE is non-empty and has a closing td tag in it.
    ENDTD=`echo $LINE |tr -d '\n' |grep -i '/td>'`
    if [ -n "$ENDTD" ]
    then
        # We are leaving a td tag.
        IN=0
        # Increase the TD counter by 1
        CT=`echo "$CT+1" |bc`
    fi
    if [ "$IN" -eq 1 -a "$CT" -eq 6 -a -z "$ENDTD" ]
    then
        # Use sed to remove this TD tag AND everything in between
        echo $LINE |tr -d '\n' |sed 's/.*//' >> $OUTFILE
    elif [ "$IN" -eq 0 -a "$CT" -eq 7 ]
    then
        # We may (or may not) have an opening and closing td tag in 1 line.
        TMP=`echo $LINE |tr -d '\n' |sed 's/<TD.*//'`
        echo $TMP |sed 's/.*\/TD>//' >> $OUTFILE
    else
        echo $LINE >> $OUTFILE
    fi
done
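As a side note, the counter can be incremented with shell arithmetic instead of bc, and the loop can read the file through a redirect instead of cat. Here is a rough sketch, assuming a POSIX shell such as dash or bash. count_closed_cells is a hypothetical helper that only counts the lines containing a closing td tag; it does not do the cell removal itself.

```sh
#!/bin/sh
# count_closed_cells FILE: print how many lines of FILE contain a closing td tag.
# (Hypothetical helper, introduced only for this illustration.)
count_closed_cells() {
    ct=0
    # IFS= and -r keep leading whitespace and backslashes in each line intact.
    while IFS= read -r line; do
        case "$line" in
            *'/td>'*|*'/TD>'*) ct=$((ct + 1)) ;;   # arithmetic expansion, no bc
        esac
    done < "$1"
    echo "$ct"
}
```

Because the loop is not on the right-hand side of a pipe, ct is still set after done in these shells.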