Need help optimizing this piece of code (Shell script Busybox)


 
# 8  
Old 10-04-2011
I'm assuming the XML data for that looks like what you posted before?

Working on something.
# 9  
Old 10-04-2011
Quote:
Originally Posted by Corona688
I'm assuming the XML data for that looks like what you posted before?

Working on something.
Yes, but extracting the title from the MovieInfo.nfo file is pretty fast; it's all those if/test statements and the writing to the file that seem to take a long time. For example, if my drive contains approximately 200 movies (not really that many, considering some people have thousands) divided into 9 movie genres, the loop would run about 10 times, using the same list but a different MOVIESPATH each time. For example:

First run = all 200 movies indexed to create the XML file
Second run = science-fiction genre, maybe 30 movies indexed to create the XML file
Third run = romance genre, maybe 20 movies indexed to create the XML file
etc.

Obviously, the second through the last (10th) runs will together cover all 200 movies.

I hope I am making sense and thanks again for your help/time.
# 10  
Old 10-04-2011
Here's a version tested in busybox which uses almost pure shell builtins:

Code:
MOVIESPATH="./moviedir/"
SORTEDTMP="./movie"
OLDIFS="$IFS"
RSS="rssfile"

# I just do this to have any info at all...
find ./moviedir -iname '*.avi' > "$SORTEDTMP"

grep "$MOVIESPATH" "$SORTEDTMP" | while read LINE
do
        MOVIEPATH="${LINE%/*}"  # Shell builtins instead of dirname
        MOVIEFILE="${LINE##*/}" # Shell builtins instead of basename

        if ! [ "$MOVIEPATH/$MOVIEFILE" = "$LINE" ]
        then
                echo "Error processing line" >&2
                continue
        fi

        # Initialize defaults, replace later
        MOVIETITLE="${MOVIEFILE/.*}"  # Strip off .ext
        MOVIESHEET=/usr/local/etc/srjg/NoMovieinfo.bmp
        MOVIEPOSTER=/usr/local/etc/srjg/nofolder.bmp

        if [ -e "$MOVIEPATH/MovieInfo.nfo" ]
        then
                # Look for lines matching <title>
                while read LINE
                do
                        # Strip out <title> to make it shorter.
                        SHORT="${LINE/<title>}"
                        # If it's not shorter, it didn't have <title>
                        [ "${#SHORT}" = "${#LINE}" ] && continue

                        LINE="${LINE//<title>}"  # Strip out <title>
                        LINE="${LINE//<?title>}" # Strip out </title>

                        MOVIETITLE="$LINE"
                        break   # Found <title>, quit looking
                done <"$MOVIEPATH/MovieInfo.nfo"
        fi

        # Check for any files of known purpose inside the movie's folder.
        for FILE in "$MOVIEPATH"/*
        do
                [ -e "$FILE" ] || break # No files exist?

                case "${FILE##*/}" in
                "folder.jpg")           MOVIEPOSTER="$FILE"     ;;
                "${MOVIENAME}.jpg")     MOVIEPOSTER="$FILE"     ;;
                "about.jpg")            MOVIESHEET="$FILE"      ;;
                "0001.jpg")             MOVIESHEET="$FILE"      ;;
                "${MOVIENAME}_sheet.jpg")       MOVIESHEET="$FILE" ;;
                *)      ;;
                esac
        done

        # Print it all in one whack with a here-document.
        cat <<EOF
<Movie>
<title>$MOVIETITLE</title>
<poster>$MOVIEPOSTER</poster>
<info>$MOVIESHEET</info>
<file>$MOVIEFILE</file>
</Movie>

EOF
        # Note:  OVERWRITES $RSS
done > "$RSS"

---------- Post updated at 02:02 PM ---------- Previous update was at 01:57 PM ----------

Quote:
Originally Posted by snappy46
Yes, but extracting the title from the MovieInfo.nfo file is pretty fast; it's all those if/test statements and the writing to the file that seem to take a long time.
What sort of disk are you writing to?

Shell builtins are hundreds of times faster than calling an external utility to operate on tiny amounts of data.

I think I've reduced the number of if/else's by using a case, too.

Also, you were reopening $RSS dozens of times, which probably didn't help.
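
To make the builtin-versus-external point concrete, here is a minimal sketch comparing dirname/basename with the parameter expansions used in the script. The sample path is hypothetical; the claim it illustrates is only that both approaches give the same result for a path that contains a slash, while the expansions avoid starting a new process.

```shell
# Hypothetical sample path; any "dir/file.ext" string works.
LINE="./moviedir/Alien (1979)/alien.avi"

# External utilities: each call starts a new process.
DIR_EXT=$(dirname "$LINE")
FILE_EXT=$(basename "$LINE")

# Parameter expansion: handled inside the shell itself.
DIR_BUILTIN="${LINE%/*}"     # drop the shortest trailing "/..." (like dirname)
FILE_BUILTIN="${LINE##*/}"   # drop the longest leading ".../" (like basename)

echo "$DIR_EXT | $DIR_BUILTIN"
echo "$FILE_EXT | $FILE_BUILTIN"
```

One caveat: for a name with no slash at all, "${LINE%/*}" returns the whole string, whereas dirname prints ".", so the expansions are only a drop-in replacement when every line holds a full path.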
# 11  
Old 10-05-2011
Quote:
Originally Posted by Corona688
What sort of disk are you writing to?

Shell builtins are hundreds of times faster than calling an external utility to operate on tiny amounts of data.

I think I've reduced the number of if/else's by using a case, too.

Also, you were reopening $RSS dozens of times, which probably didn't help.
That is awesome; I did not expect anyone to do all the dirty work for me. Hopefully this script will work fine on the old default busybox available on the media player.

I forgot to mention in my previous post that the whole indexing process, all movies plus all genres (200 movies), takes about 3 to 4 minutes. Hopefully your way of doing things will cut that down some.

The jukebox.xml file created by the loop is normally written to the media player's internal drive or to an externally connected USB drive; it all depends on where the movies are located. The jukebox.xml is stored in the genre root directory, or in the movie directory for the All movies list.

I can't wait to try this out. I will post the results once I have.

Again, thank you.

---------- Post updated at 11:27 PM ---------- Previous update was at 04:55 PM ----------

Hi Corona,

Here's your script with some minor changes to make it work for me.

Code:
#MOVIESPATH="./moviedir/"
#SORTEDTMP="./movie"
#OLDIFS="$IFS"
#RSS="rssfile"

# I just do this to have any info at all...
#find ./moviedir -iname '*.avi' > "$SORTEDTMP"

grep "$MOVIESPATH" "$SORTEDTMP" | while read LINE
do
        MOVIEPATH="${LINE%/*}"  # Shell builtins instead of dirname
        MOVIEFILE="${LINE##*/}" # Shell builtins instead of basename
        MOVIENAME="${MOVIEFILE%.*}"  # Strip off .ext       

        if ! [ "$MOVIEPATH/$MOVIEFILE" = "$LINE" ]
        then
                echo "Error processing line" >&2
                continue
        fi

        # Initialize defaults, replace later


        MOVIETITLE="$MOVIENAME"
        MOVIESHEET=/usr/local/etc/srjg/NoMovieinfo.bmp
        MOVIEPOSTER=/usr/local/etc/srjg/nofolder.bmp

  if [ -e "$MOVIEPATH/MovieInfo.nfo" ];
  then
     MOVIEINFO="$MOVIEPATH/MovieInfo.nfo"
     MOVIETITLE=`grep "<title>.*<.title>" "$MOVIEINFO" | sed -e "s/^.*<title/<title/" | cut -f2 -d">"| cut -f1 -d"<"`
  fi

#        if [ -e "$MOVIEPATH/MovieInfo.nfo" ]
#        then
                # Look for lines matching <title>
#                while read LINE
#                do
                        # Strip out <title> to make it shorter.
#                        SHORT="${LINE/<title>}"
                        # If it's not shorter, it didn't have <title>
#                        [ "${#SHORT}" = "${#LINE}" ] && continue

#                        LINE="${LINE//<title>}"  # Strip out <title>
#                        LINE="${LINE//<?title>}" # Strip out </title>

#                        MOVIETITLE="$LINE"
#                        break   # Found <title>, quit looking
#                done <"$MOVIEPATH/MovieInfo.nfo"
#        fi

        # Check for any files of known purpose inside the movie's folder.
        for FILE in "$MOVIEPATH"/*
        do
                [ -e "$FILE" ] || break # No files exist?

                case "${FILE##*/}" in
                "folder.jpg")           MOVIEPOSTER="$FILE"     ;;
                "${MOVIENAME}.jpg")     MOVIEPOSTER="$FILE"     ;;
                "about.jpg")            MOVIESHEET="$FILE"      ;;
                "0001.jpg")             MOVIESHEET="$FILE"      ;;
                "${MOVIENAME}_sheet.jpg")       MOVIESHEET="$FILE" ;;
                *)      ;;
                esac
        done

        # Print it all in one whack with a here-document.
        cat <<EOF
<Movie>
<title>$MOVIETITLE</title>
<poster>$MOVIEPOSTER</poster>
<info>$MOVIESHEET</info>
<file>$MOVIEPATH/$MOVIEFILE</file>
</Movie>

EOF
        # Note:  APPENDS to $RSS
done >> "$RSS"

I could not get the code that extracts the title from the MovieInfo.nfo file to work, so I just inserted the one I already had in my script to test the difference between the two scripts. The results were very surprising to me, to say the least.

The original script from my first post took 1 minute and 49 seconds to process my 200 movies. The new code you provided took 2 minutes and 46 seconds, almost a whole minute longer. I could not believe the results, so I tried again and got the same thing. It would appear that using sed/cut/grep etc. to get the work done is faster than using the built-in substitution commands? I was quite shocked; something in there seems to take a lot of time.

I still think that I can use some of your code to cut the processing time down further ... well, maybe. I would think that creating the file only once would be faster than appending to the file for every movie element. I will try to introduce your steps one at a time and see what makes things go faster and what makes things go slower.
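
As a rough sketch of the "create the file only once" idea: the two loops below produce identical output, but the first reopens the file on every pass while the second redirects the whole loop and opens it once. OUT is a hypothetical temporary file standing in for jukebox.xml, and the titles are made up.

```shell
OUT=$(mktemp)    # hypothetical stand-in for jukebox.xml

# Pattern 1: the file is opened, appended to, and closed on every pass.
: > "$OUT"
for TITLE in one two three
do
        echo "<title>$TITLE</title>" >> "$OUT"
done
APPENDED=$(cat "$OUT")

# Pattern 2: redirect the whole loop, so the file is opened only once.
for TITLE in one two three
do
        echo "<title>$TITLE</title>"
done > "$OUT"
ONCE=$(cat "$OUT")

rm -f "$OUT"
```

How much time this saves depends on the disk and on the number of entries; it is worth timing both on the media player itself.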

I learned a lot from your inputs/code so again thank you.

Snappy46
# 12  
Old 10-05-2011
Quote:
Originally Posted by snappy46
I could not get the code that extracts the title from the MovieInfo.nfo file to work, so I just inserted the one I already had in my script to test the difference between the two scripts.
Throwing away nearly all the speed gain in the process.

If you could post the literal contents of one of those files, that'd be good. I built it to work with your test data. If it's actually any different, then I really need to see what it is.
Quote:
It would appear that using sed/cut/grep etc. to get the work done is faster than using the built-in substitution commands? I was quite shocked; something in there seems to take a lot of time.
The built-in substitution operator is hundreds of times faster than calling an external utility. At the very least, the builtins that replace basename and dirname ought to be better. The rest of my code may have depended on certain assumptions about your data.

Are these folders full of irrelevant files? If so, the for FILE in "$MOVIEPATH"/* loop will waste a lot of time. Come to think of it, since we're only interested in .jpg, you can make it for FILE in "$MOVIEPATH"/*.jpg
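
A minimal sketch of the narrowed glob, run against a throwaway directory (the folder and file names are hypothetical). It keeps the [ -e ] guard from the original loop, because when nothing matches the shell leaves the pattern itself in FILE.

```shell
MOVIEPATH=$(mktemp -d)                 # hypothetical movie folder
touch "$MOVIEPATH/folder.jpg" "$MOVIEPATH/movie.avi" "$MOVIEPATH/notes.txt"

COUNT=0
for FILE in "$MOVIEPATH"/*.jpg
do
        [ -e "$FILE" ] || break        # glob matched nothing: pattern left as-is
        COUNT=$((COUNT + 1))
        echo "candidate: ${FILE##*/}"
done

rm -rf "$MOVIEPATH"
```

Only folder.jpg is visited here; the .avi and .txt files never reach the case statement.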

---------- Post updated at 09:29 AM ---------- Previous update was at 09:20 AM ----------

Here's a version which works without trawling every file in the folder. You can replace the long if-else chain with two for's. It also helps make the lists longer without making your code longer (though testing for too many things will slow you down in any case).

Code:
for FILE in "folder.jpg" "${MOVIENAME}.jpg"
do
        [ ! -e "$MOVIEPATH/$FILE" ] && continue
        MOVIEPOSTER="$MOVIEPATH/$FILE"
        break
done

for FILE in "about.jpg" "0001.jpg" "${MOVIENAME}_sheet.jpg"
do
        [ ! -e "$MOVIEPATH/$FILE" ] && continue
        MOVIESHEET="$MOVIEPATH/$FILE"
        break
done

---------- Post updated at 09:35 AM ---------- Previous update was at 09:29 AM ----------

And you can strip out this completely:
Code:
        if ! [ "$MOVIEPATH/$MOVIEFILE" = "$LINE" ]
        then
                echo "Error processing line" >&2
                continue
        fi

I just put it there in case your input data was radically different from what I assumed it was.

Last edited by Corona688; 10-05-2011 at 01:35 PM..
# 13  
Old 10-06-2011
Quote:
Originally Posted by Corona688
Throwing away nearly all the speed gain in the process.
I know; this is temporary. I plan on figuring out why it does not work. I am getting "bad substitution" from the busybox version that I am using on these lines:

SHORT="${LINE/<title>}"
LINE="${LINE//<title>}" # Strip out <title>
etc....

I cannot remember exactly which ones, but more than one of those substitutions caused a problem. I had the same issue with this line:

Code:
 
MOVIETITLE="${MOVIEFILE/.*}"  # Strip off .ext

which I fixed by changing to this:

Code:
 
MOVIENAME="${MOVIEFILE%.*}"  # Strip off .ext
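
For context, the ${var/pattern} form is a bash/ksh extension rather than POSIX, and some busybox builds appear not to support it, which would explain the "bad substitution" error. The % and %% forms are POSIX. A small sketch of how the two differ on a hypothetical file name with several dots:

```shell
MOVIEFILE="the.matrix.1999.avi"   # hypothetical file name with several dots

SHORTEST="${MOVIEFILE%.*}"    # shortest match: removes only the last ".ext"
LONGEST="${MOVIEFILE%%.*}"    # longest match: removes everything from the first dot

echo "$SHORTEST"    # the.matrix.1999
echo "$LONGEST"     # the
```

For stripping an extension, the single % is the one to use.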

I did not have much time to play around with that last night and was curious about the change in processing time, so I jumped right in with the old code. Apparently my wife's needs are more important than working on this script.

Quote:
Originally Posted by Corona688
If you could post the literal contents of one of those files, that'd be good. I built it to work with your test data. If it's actually any different, then I really need to see what it is. The built-in substitution operator is hundreds of times faster than calling an external utility. At the very least, the builtins that replace basename and dirname ought to be better. The rest of my code may have depended on certain assumptions about your data.
I will provide samples of the input files (SORTEDTMP, MovieInfo.nfo) and the output file (jukebox.xml), hopefully tonight, time permitting; I am not home right now and do not have access to those files. Agreed, built-in substitution should be faster.

Quote:
Originally Posted by Corona688
Are these folders full of irrelevant files? If so, the for FILE in "$MOVIEPATH"/* loop will waste a lot of time. Come to think of it, since we're only interested in .jpg, you can make it for FILE in "$MOVIEPATH"/*.jpg
Yeah, there are quite a few files in there, some relevant, some not so much. I think you just nailed why it takes so much more time. I will use the new "for" loops you provided, and I have a feeling that I will finally see some speed increase compared to the old script.

Thanks again from one fellow Canadian to another.

Cheers!!!

---------- Post updated 10-06-11 at 12:17 AM ---------- Previous update was 10-05-11 at 01:51 PM ----------

OK, when using the for loops and deleting the unnecessary if statement as indicated in your previous post, I can now process my 200 movies in 1 minute and 10 seconds ... wow!! That is an improvement of about 40 seconds compared to the original script. That makes more sense; I guess the case statement plus the number of files were definitely slowing things down.

Now I am trying to get the code that pulls the title from the MovieInfo.nfo file to work, but I am still stuck on one thing. In my version of busybox we must use "#" to delete a match from the left and "%" to delete a match from the right. Once I figured that out, you would think that things would have gone pretty smoothly, but of course they did not. I can easily delete the <title>, but I am unsuccessful in deleting the </title> ... I think that the "/" forward slash is creating a problem? The funny thing is that it works fine in interactive mode but refuses to work in the script.

This works fine in the shell:

export foo="<title>hello</title>"
echo "${foo%</title>}"
or
echo "${foo%%</title>}"
or
echo "${foo%<?title>}"
or
echo "${foo%<*title>}"

All of those return "<title>hello", but none of those combinations will remove the </title> when used in the script. This is driving me crazy; any ideas?

But this works fine in the script to remove the <title>

Code:
LINE="${LINE#<title>}"  # Strip out <title>

By the way, I also removed that line since it is not really needed, and just used SHORT when stripping out </title>.

Here's the code, which works fine except for stripping out </title>.

Code:
       if [ -e "$MOVIEPATH/MovieInfo.nfo" ]
        then
                # Look for lines matching <title>
                while read LINE
                do
                        # Strip out <title> to make it shorter.
                        SHORT="${LINE#<title>}"
                        # If it's not shorter, it didn't have <title>
                        [ "${#SHORT}" = "${#LINE}" ] && continue

#                        LINE="${LINE#<title>}"  # Strip out <title>
                        LINE="${SHORT%%</title>}" # Strip out </title>

                        MOVIETITLE="$LINE"
                        break   # Found <title>, quit looking
                done <"$MOVIEPATH/MovieInfo.nfo"
        fi

Thanks
# 14  
Old 10-06-2011
"#" strips from the beginning of the string, "%" from the end. "##" and "%%" behave the same as "#" and "%" when you're removing a literal string, so you might as well use "%" and "#". See string operations. Not all shells have all of these. Some don't have any of them. Developers on embedded busybox are more fortunate than some people using 'real' shells; it has at least a few.
Code:
VAR="${VAR#<title>}" # Strip out <title>
VAR="${VAR%</title>}" # Then strip out </title>
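
The single and double forms only differ once the pattern contains a wildcard. A small sketch on a made-up string with nested tags (the <b> wrapper is purely illustrative):

```shell
VAR="<b><title>hello</title></b>"   # hypothetical input with nested tags

SHORT_PREFIX="${VAR#<*>}"    # shortest prefix match: removes "<b>"
LONG_PREFIX="${VAR##<*>}"    # longest prefix match: removes the whole string
SHORT_SUFFIX="${VAR%<*>}"    # shortest suffix match: removes "</b>"
LONG_SUFFIX="${VAR%%<*>}"    # longest suffix match: removes the whole string

echo "$SHORT_PREFIX"
echo "$SHORT_SUFFIX"
```

With a literal pattern such as <title> there is only one possible match, which is why the single and double operators give the same result there.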

---------- Post updated at 03:17 PM ---------- Previous update was at 03:08 PM ----------

You could also try putting quotes around the pattern: "${VAR%"</title>"}"

---------- Post updated at 03:20 PM ---------- Previous update was at 03:17 PM ----------

globbing should also work inside it, so:

Code:
VAR="${VAR#<[^>]*>}" # Remove first <...>
VAR="${VAR%<[^>]*>}" # Remove last <...>

---------- Post updated at 03:21 PM ---------- Previous update was at 03:20 PM ----------

Also: If it works in the prompt and not in your script, then your data may not be what you really think it is. You feed it a nice "<title>stuf</title>" and it works but the data actually has stuff after <title> perhaps. PLEASE post a sample of your input data!!
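
One way the data could differ from what was typed at the prompt, offered purely as an assumption since no sample file has been posted: if the .nfo files use DOS line endings, every line that read returns ends in a carriage return, so a "%" pattern anchored at the end of the string never sees </title> as the last characters. A sketch of that failure and a possible fix:

```shell
# ASSUMPTION: the line ends in a carriage return, as in a DOS-formatted file.
CR=$(printf '\r')
LINE="<title>hello</title>$CR"

# Naive attempt: the prefix strip works, the suffix strip silently does nothing
# because the string ends in CR rather than ">".
NAIVE="${LINE#<title>}"
NAIVE="${NAIVE%</title>}"

# Drop one trailing carriage return first, then strip the tags.
CLEAN="${LINE%"$CR"}"
CLEAN="${CLEAN#<title>}"
CLEAN="${CLEAN%</title>}"

echo "$CLEAN"
```

This would match the symptom of <title> being removed while </title> stays, but only a look at the real file (for example with od -c) can confirm it.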

Last edited by Corona688; 10-06-2011 at 07:16 PM..