Use the GET and POST methods in bash (multi-process)?


 
# 1  
Old 11-22-2018
Use the GET and POST methods in bash (multi-process)?

Hi,
I want to call a lot of links with the POST method.

What can I do to speed it up?


Code:
####This method is slow


#!/bin/bash


func2() {
   index1=0
   while read line ; do
     index1=$(($index1+1))
     url[$index1]=$line
   done < tmp/url1.txt
}

func3() {
 for((j=1;j<=$countline;j++)); do
     i=`curl -LI ${url[$j]} -o /dev/null -w '%{http_code}\n' -s  ` ;
        echo "$i --- ${url[$j]}"
  done
 }
    
funcrun() {
    for ((n=1;n<=5;n++)) ; do
        func3 $countline ${url[@]} >> 2.txt &
    done
}

func2
funcrun


# 2  
Old 11-23-2018
Quote:
Originally Posted by mnnn
I want to call a lot of links with the POST method. What can I do to speed it up?
Well, first off, the script you showed us can't do anything, neither slow nor fast: func3() and funcrun() both rely on $countline, which is never assigned anywhere, so their loops have nothing to work with. (The array url[] itself is not the problem: in bash, a plain assignment inside a function is global by default and stays visible after func2() returns; it would only be private if you declared it with local or declare.)
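
A quick demonstration of the scoping rule:

Code:
f() { a=global ; local b=private ; }
f
echo "a is ${a:-unset}"   # prints "a is global" - a plain assignment survives the function
echo "b is ${b:-unset}"   # prints "b is unset"  - a local variable does not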

Second, this method of filling an array is overly complicated, regardless of being inside a function or not:

Code:
func2() {
   index1=0
   while read line ; do
     index1=$(($index1+1))
     url[$index1]=$line
   done < tmp/url1.txt
}

You can easily do it this way:
Code:
while read -r line ; do
    url[$((index++))]="$line"
done < tmp/url1.txt

Next, your way of going through the array can also be improved, not to mention your way of subprocess handling - BACKTICKS ARE DEPRECATED, use $(...) instead:

Code:
func3() {
    i=0
    while [ $i -lt ${#url[@]} ] ; do
        echo "${url[$i]} --- $(curl -LI "${url[$i]}" -o /dev/null -w '%{http_code}' -s)"
        i=$((i+1))
    done
}

Finally: func3() runs through every element of the array, and funcrun() starts five background copies of it (the argument you provide to func3() is ignored, so it changes nothing). So for n URLs you issue not n curl invocations but 5·n: each of the five copies repeats every single request, and all you gain is the same work done five times over.

You might first want to consolidate this mess. If it is still "too slow", you might think about this: you could put the curl invocations in the background so that they run in parallel instead of one after the other. Notice, though, that whether this works out depends on the number of URLs you want to pull: a dozen is perhaps no problem, a few hundred might be, a few thousand are definitely going to be a problem. You will need a "fanout" value in this case, so that only a certain maximum number of parallel processes run at the same time.
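
A minimal sketch of such a fanout (assuming bash 4.3 or newer for wait -n, and one URL per line in tmp/url1.txt):

Code:
#!/bin/bash
# run at most MAXJOBS curl processes at any one time
MAXJOBS=10

while read -r u ; do
    # fanout exhausted? then wait for one background job to finish
    while (( $(jobs -rp | wc -l) >= MAXJOBS )) ; do
        wait -n
    done
    curl -LI "$u" -o /dev/null -w "%{http_code} --- $u\n" -s &
done < tmp/url1.txt

wait    # collect the remaining background jobs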

I hope this helps.

bakunin
# 3  
Old 11-23-2018
On top of what bakunin already said, I'd like to add a few comments:
- Did you consider the readarray / mapfile bash builtin for populating the array? (See the sketch below.)
- Why arrays at all? Read the URL file directly in the loop that feeds curl.
- You reference $countline twice but never assign to it.
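
For illustration, minimal sketches of both points, assuming one URL per line in tmp/url1.txt:

Code:
# one builtin call instead of a hand-rolled read loop:
readarray -t url < tmp/url1.txt     # mapfile -t url < tmp/url1.txt is identical
echo "loaded ${#url[@]} URLs"

# or skip the array entirely and feed curl straight from the file:
while read -r u ; do
    curl -LI "$u" -o /dev/null -w "%{http_code} --- $u\n" -s
done < tmp/url1.txt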
# 4  
Old 11-23-2018
Quote:
Originally Posted by bakunin
[...] You will need a "fanout" value in this case, so that only a certain maximum number of parallel processes run at the same time.
Quote:
Originally Posted by RudiC
[...] Why arrays at all? Read the URL file directly in the loop that feeds curl.




Thank you, the syntax is better now, but the problem is still there.

In another function I take about 1 million values from a database and build a URL from each one, for example:

http://mysite.com/api/tags/${var[$n]}/

When I call each URL with the POST method, the request is sent and the answer is read back, one at a time. With this many links the run can take up to a day.

If there were a way to split the work, for example URLs 1 to 1000 in one thread, 1000 to 2000 in the next thread, 2000 to 3000 in the one after that, and so on, I think the speed would be higher. But I do not know exactly how to manage this in the code.
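
Something like this is what I have in mind (just a sketch - worker, CHUNK, MAXWORKERS and the out.* files are names I made up; it assumes the url[] array is already filled, for example with readarray -t url < tmp/url1.txt, and bash 4.3+ for wait -n):

Code:
#!/bin/bash
CHUNK=1000       # URLs per worker
MAXWORKERS=10    # fanout: parallel workers at any one time

worker() {       # $1 = first index, $2 = one past the last index
    local i
    for (( i = $1; i < $2 && i < ${#url[@]}; i++ )) ; do
        curl -s -o /dev/null -w "%{http_code} --- ${url[$i]}\n" \
             -X POST "${url[$i]}"
    done
}

for (( start = 0; start < ${#url[@]}; start += CHUNK )) ; do
    # never run more than MAXWORKERS chunks at the same time
    while (( $(jobs -rp | wc -l) >= MAXWORKERS )) ; do
        wait -n
    done
    worker "$start" $(( start + CHUNK )) > "out.$start.txt" &   # one file per chunk avoids interleaved output
done
wait
cat out.*.txt > results.txt

Is this the right way to manage it?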
# 5  
Old 11-23-2018
You want to update a million web sites on the internet? No surprise it's taking days...
If I got you wrong, please rephrase your problem and supply more details, like sample input files (not a million lines, though - some 10 to 20).
# 6  
Old 11-23-2018
Quote:
Originally Posted by RudiC
You want to update a million web sites on the internet? No surprise it's taking days...
If I got you wrong, please rephrase your problem and supply more details, like sample input files (not a million lines, though - some 10 to 20).
There is no external website; the indexing process is done this way. I need to send the requests multi-threaded, because otherwise it will take far too long.

# 7  
Old 11-23-2018
If there's no website, what is curl for?

And if this is local, multithreading may not be much help - and curl is about the worst way to do it.