Make python script ignore .htaccess


 
# 1  
Old 05-24-2009

I just wrote a tiny script, with the help of ghostdog74, to search all my files for specific phrases.

After a few modifications I got it working, but one problem remains. The files are located in public_html folders, so there may also be .htaccess files.

So I skip those files when scanning, but if, for example, there is

Code:
<FilesMatch \.php$>
deny from all
</FilesMatch>

in it, the script is also unable to scan the .php files. So my question is whether it is possible to tell Python, the script, or the cron job that starts it to ignore what .htaccess says.

Hopefully someone has an idea.
# 2  
Old 05-24-2009
I don't understand what you want to do. If you want to skip .htaccess files, just use an "if" statement:
Code:
...
if filename != ".htaccess":

Otherwise, show your whole code.
# 3  
Old 05-24-2009
The problem is not skipping the .htaccess file; it is the effect that file has on Python.

So here is the code for example:
Code:
#!/usr/bin/env python

import os

outfile = os.path.join("/home","user","public_html","myscanner","scans","scan_result.php")
logfile = os.path.join("/home","user","public_html","myscanner","scans","log_result.php")

datei=open(outfile,"w")
datei.close()
dateilog=open(logfile,"w")
dateilog.close()

root="/home"
for allfiles in os.listdir(root):
    if os.path.isdir(os.path.join(root, allfiles)):
    if "id" in allfiles:
        newpath = os.path.join(root,allfiles)
        for r,d,f in os.walk(newpath):
        if "public_html" in r:
                for files in f:
                if os.path.isfile(os.path.join(r,files)):
                    size=os.path.getsize(os.path.join(r,files))
                    if files.startswith("."):
                    break
                    else:
                    if size <= 2048000:
                         print files

To explain: first, the code lists everything in the "root" directory. Next, it uses only the folders with "id" in their names and looks for public_html within those directories.
Then it excludes all files that start with ".", which includes .htaccess, and finally it keeps only files no larger than 2048000 bytes.

That piece of code works fine and lists, for example, all .php files that meet those criteria, but if one of these folders contains a .htaccess with the following code in it:

Code:
<FilesMatch \.php$>
deny from all
</FilesMatch>

The scanner is not able to read the .php files. I tested this a few times: without the .htaccess it shows all files, including the .php ones, but with that .htaccess in place the .php files are no longer shown. The other files, for example .html, are still shown.

Hopefully you understand what I mean. I am just starting with Python, so maybe there is a logical mistake in my code.
# 4  
Old 05-24-2009
Quote:
Originally Posted by medic
The problem is not skipping the .htaccess file,
but your thread title says so.
Quote:
Code:
datei=open(outfile,"w")
datei.close()
dateilog=open(logfile,"w")
dateilog.close()

You are opening the files and closing them again right away. I don't understand what you want to do here.

Quote:
Code:
root="/home"
for allfiles in os.listdir(root):
    if os.path.isdir(os.path.join(root, allfiles)):    <<-------- if os.path.isdir(os.path.join(root, allfiles)) and "id" in allfiles
    if "id" in allfiles:
        newpath = os.path.join(root,allfiles)
        for r,d,f in os.walk(newpath):
        if "public_html" in r:   <---------------------- indent
                for files in f:
                if os.path.isfile(os.path.join(r,files)):     <---------------------- indent ( you also don't need this, as "f" variable already contains files...
                    size=os.path.getsize(os.path.join(r,files))
                    if files.startswith("."):
                    break      <------------------ indent
                    else:
                    if size <= 2048000:
                         print files

Check your indentation. (Are you sure it works?)

Quote:
Then it excludes all files that start with ".", which includes .htaccess, and finally it keeps only files no larger than 2048000 bytes.
So you want to exclude .htaccess after all?

Quote:
That piece of code works fine and lists, for example, all .php files that meet those criteria, but if one of these folders contains a .htaccess with the following code in it:

Code:
<FilesMatch \.php$>
deny from all
</FilesMatch>

The scanner is not able to read the .php files.
Now you have confused me. The code doesn't have any part that scans the contents of the files. It only gets the file size...

Quote:
I tested this a few times: without the .htaccess it shows all files, including the .php ones, but with that .htaccess in place the .php files are no longer shown. The other files, for example .html, are still shown.
Check the permissions of the .htaccess file and of the user running the Python script.

Code:
import os

outfile = os.path.join("/home","user","public_html","myscanner","scans","scan_result.php")
logfile = os.path.join("/home","user","public_html","myscanner","scans","log_result.php")

#datei=open(outfile,"w")
#datei.close()
#dateilog=open(logfile,"w")
#dateilog.close()

root="/home"
for allfiles in os.listdir(root):
    # only look at directories whose name contains "id"
    if os.path.isdir(os.path.join(root, allfiles)) and "id" in allfiles:
        newpath = os.path.join(root,allfiles)
        for r,d,f in os.walk(newpath):
            if "public_html" in r:
                for files in f:
                    size=os.path.getsize(os.path.join(r,files))
                    if files.startswith("."):
                        #break <<<----------- you are breaking out of the second for loop. which is not what you want. You use "continue" here. 
                        continue
                    else:
                        if size <= 2048000:
                            # parenthesized print works in Python 2 and 3
                            print(files)


Last edited by ghostdog74; 05-24-2009 at 09:33 PM..
# 5  
Old 05-25-2009
Thanks for the fast reply.

As always, you found the problem. It was:

Code:
                        if files.startswith("."):
                            #break <<<----------- you are breaking out of the second for loop. which is not what you want. You use "continue" here. 
                            continue

The break was my logical mistake; I had been searching for it the whole time.
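
To make the difference concrete, here is a minimal sketch in Python 3 syntax. The file names and the two helper functions are made up for illustration; they are not part of the scanner above. The order of the list matters: a directory listing comes back in arbitrary order, which may be why only some of the files went missing.

```python
# Hypothetical file names, in the order a directory listing might return them.
names = ["index.html", ".htaccess", "index.php"]

def visible_with_break(filenames):
    """Stop at the first dotfile: everything after it is lost."""
    shown = []
    for name in filenames:
        if name.startswith("."):
            break
        shown.append(name)
    return shown

def visible_with_continue(filenames):
    """Skip each dotfile but keep going through the rest of the list."""
    shown = []
    for name in filenames:
        if name.startswith("."):
            continue
        shown.append(name)
    return shown

print(visible_with_break(names))     # ['index.html']
print(visible_with_continue(names))  # ['index.html', 'index.php']
```

So the .htaccess file itself was never the obstacle; any file starting with "." would have cut the listing short in the same way.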

With the following piece of code, I open the scanner's output files at the start to clear out the old logs, and to create the files again if they have gone missing.

Code:
datei=open(outfile,"w")
datei.close()
dateilog=open(logfile,"w")
dateilog.close()
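
A small sketch of that idea, assuming Python 3: opening a file in "w" mode empties it if it exists and creates it if it does not, and a with block closes it again. The helper name reset_file and the temporary path are invented for this example and stand in for the real log paths.

```python
import os
import tempfile

def reset_file(path):
    """Create the file if it is missing, or empty it if it already exists."""
    with open(path, "w"):
        pass  # nothing to write; opening in "w" mode already truncates

# Demonstration on a throwaway file instead of the real log paths.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "scan_result.php")

with open(demo_path, "w") as handle:
    handle.write("old results\n")

reset_file(demo_path)
print(os.path.getsize(demo_path))  # 0
```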

Thanks for the hint about the next lines. I just started with Python and added things line by line, without thinking about combining them. :)

Code:
if os.path.isdir(os.path.join(root, allfiles)):    <<-------- if os.path.isdir(os.path.join(root, allfiles)) and "id" in allfiles
    if "id" in allfiles:

I used the next line because in one test I got an error: I had copied a folder into public_html that was a link to the mail folder. So I added that check to make sure the entry is really a file and not a folder or a link. Is there also a logical mistake in it?

Code:
if os.path.isfile(os.path.join(r,files)):
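
For what it is worth, here is a sketch of a case where that check can matter, assuming Python 3 on a system with symbolic links. os.walk reports a link whose target is missing among the file names, and os.path.getsize raises an error for it, while os.path.isfile simply returns False. The function name list_small_files and the demo files are hypothetical.

```python
import os
import tempfile

def list_small_files(top, max_size=2048000):
    """Return names of regular, non-hidden files of at most max_size bytes."""
    found = []
    for r, d, f in os.walk(top):
        for name in f:
            path = os.path.join(r, name)
            if name.startswith("."):
                continue  # skip dotfiles such as .htaccess
            if not os.path.isfile(path):
                continue  # e.g. a link whose target no longer exists
            if os.path.getsize(path) <= max_size:
                found.append(name)
    return sorted(found)

# Throwaway directory with one real file, one dotfile and one dangling link.
demo = tempfile.mkdtemp()
with open(os.path.join(demo, "index.php"), "w") as handle:
    handle.write("<?php ?>")
with open(os.path.join(demo, ".htaccess"), "w") as handle:
    handle.write("deny from all\n")
os.symlink(os.path.join(demo, "missing"), os.path.join(demo, "mail"))

print(list_small_files(demo))  # ['index.php']
```

Doing the isfile check before getsize keeps a dangling link from stopping the whole scan.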

Thanks again for the fast help.