Need some help with shell content scanner


 
# 15  
Old 05-21-2009
After my final test, I found another problem.

I started the scanner to search for about 5 phrases and write the matching files to my result file, and that works fine so far. But after about an hour, and I don't know how many files, the process is no longer listed in "top", even though the scan has not finished.

Then I had a look in WHM, and there I see the python process running with 40% CPU usage, decreasing every 5 seconds. In my shell, top does not show that load on the CPUs.

Any idea where that problem comes from? I searched, but I can't find where a limit could be set that stops python.

Last edited by medic; 05-21-2009 at 01:08 PM..
# 16  
Old 05-21-2009
Quote:
Originally Posted by medic
Any idea where that problem comes from? I searched, but I can't find where a limit could be set that stops python.
Put something like this in your code:
Code:
            if size <= 2048000:
                # open("logger.txt","a").write("doing "+os.path.join(r,files)+"\n")
                # print "doing ..." + os.path.join(r,files)
                o=open(outfile,"a")
                .......

Uncommenting the first line creates a logger.txt file, which you can tail -f to check progress. Or you can just print the progress to stdout with the second line. Using top only shows you partial information; it is better to use ps -ef |grep "process".
If the progress line keeps being printed, then I believe you really do have A LOT of files to process.
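
For illustration, here is a minimal, self-contained sketch of that progress-logging idea. It is not the scanner from this thread: the function name scan_tree, the log_every parameter and the log line format are assumptions made up for the example, and it is written for Python 3 rather than the Python 2 style of the snippet above. Only the 2048000 byte size limit and the logger.txt name are taken from the post.

```python
import os
import time

MAX_SIZE = 2048000  # same size limit as in the snippet above


def scan_tree(root, phrases, outfile, logfile="logger.txt", log_every=1000):
    """Walk root, append paths of matching files to outfile, log progress.

    phrases must be bytes objects because files are read in binary mode.
    All names here are hypothetical, not taken from the original scanner.
    """
    checked = 0
    matched = 0
    with open(outfile, "a") as out, open(logfile, "a") as log:
        for dirpath, _dirnames, filenames in os.walk(root):
            for name in filenames:
                path = os.path.join(dirpath, name)
                try:
                    if os.path.getsize(path) > MAX_SIZE:
                        continue  # skip large files, as in the snippet above
                    with open(path, "rb") as fh:
                        data = fh.read()
                except OSError as exc:
                    # Unreadable or vanished file: note it and keep going.
                    log.write("skipped %s: %s\n" % (path, exc))
                    continue
                checked += 1
                if any(phrase in data for phrase in phrases):
                    out.write(path + "\n")
                    matched += 1
                if checked % log_every == 0:
                    log.write("%s checked=%d matched=%d last=%s\n" % (
                        time.strftime("%H:%M:%S"), checked, matched, path))
                    log.flush()  # make the line visible to tail -f at once
    return checked, matched
```

Writing one line per 1000 files instead of one per file keeps logger.txt small with several hundred thousand files, and the flush makes sure tail -f shows the latest line even if the process dies shortly afterwards. Keep the result file and the log file outside the scanned tree, otherwise the scanner will scan them too.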
# 17  
Old 05-22-2009
I already added such a line, which always tells me what the script is doing, but I think there really are a lot of files.

There are about 1000 public_html directories, and every folder has about 100-200 files.

So I don't know if it is a problem to process 500 000 files or even more.
# 18  
Old 05-22-2009
Quote:
Originally Posted by medic
I already added such a line, which always tells me what the script is doing, but I think there really are a lot of files.

There are about 1000 public_html directories, and every folder has about 100-200 files.

So I don't know if it is a problem to process 500 000 files or even more.
Well, if you really have THAT many files, there's really no choice, right? Between find+xargs and Python, you can time both and use the more efficient one. I guess it's already a bonus if you find one that is fast enough. (Or you can wait for some other solutions to come by.)
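
One rough way to time both approaches from a single script is sketched below. The function names (time_call, scan_with_python, scan_with_find_xargs) are invented for this example, and the shell variant assumes that find, xargs and grep are installed and support -print0 and -0. Neither function is the scanner discussed in this thread; they only search for one fixed phrase.

```python
import os
import subprocess
import time


def time_call(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def scan_with_python(root, phrase):
    """Return a sorted list of files under root that contain phrase."""
    needle = phrase.encode()
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "rb") as fh:
                    if needle in fh.read():
                        hits.append(path)
            except OSError:
                continue  # unreadable file, skip it
    return sorted(hits)


def scan_with_find_xargs(root, phrase):
    """Same search via: find root -type f -print0 | xargs -0 grep -lF -- phrase"""
    find = subprocess.Popen(["find", root, "-type", "f", "-print0"],
                            stdout=subprocess.PIPE)
    grep = subprocess.Popen(["xargs", "-0", "grep", "-lF", "--", phrase],
                            stdin=find.stdout, stdout=subprocess.PIPE)
    find.stdout.close()  # so find gets SIGPIPE if xargs exits early
    out, _ = grep.communicate()
    find.wait()
    return sorted(out.decode(errors="replace").splitlines())


if __name__ == "__main__":
    # "public_html" and "some phrase" are placeholders for your own values.
    for func in (scan_with_python, scan_with_find_xargs):
        hits, seconds = time_call(func, "public_html", "some phrase")
        print("%s: %d hits in %.2f s" % (func.__name__, len(hits), seconds))
```

Run the comparison on one or two public_html directories first rather than on all of them. Whichever variant runs second usually benefits from the file cache, so repeat the test in both orders before deciding.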
# 19  
Old 05-22-2009
From what you explained to me, Python is a very nice way to write such scripts. I just need to ensure that the scanner does not stop working after 1 hour, because that is what happened.

I will now wait until the scanner stops, with the added print in place, so that I can find out where it happened.
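
Besides printing the current file, it can help to record how the script ended. The sketch below is only a suggestion, not something taken from the scanner in this thread: run_logged and the scanner_exit.log file name are made up for the example. It writes a line when the task finishes normally and a traceback when it ends with an exception.

```python
import time
import traceback


def run_logged(task, logfile="scanner_exit.log"):
    """Run task() and record in logfile whether it finished or aborted.

    run_logged and the default log file name are hypothetical names
    chosen for this sketch.
    """
    try:
        result = task()
    except BaseException:
        # Also catches KeyboardInterrupt and SystemExit, then re-raises.
        with open(logfile, "a") as log:
            log.write(time.strftime("%Y-%m-%d %H:%M:%S") + " aborted\n")
            traceback.print_exc(file=log)
        raise
    with open(logfile, "a") as log:
        log.write(time.strftime("%Y-%m-%d %H:%M:%S") + " finished normally\n")
    return result
```

You would wrap the main function of the scanner, for example run_logged(main) if the script has a main() function. If the log contains neither entry after the process has disappeared, the script was probably killed from outside (a SIGKILL cannot be caught from Python), which would point to a limit on the server rather than a bug in the script.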