Speculative Shell Feature Brainstorming


 
# 15  
Old 03-30-2011
Quote:
Originally Posted by tetsujin
That allows you to use object lifetime to control resource allocation - including allocation of resources the shell may not have been explicitly designed to manage.
I'm actually not against that. As long as a variable always acts like a variable, the special cases I keep harping about don't exist. That's the whole point of polymorphism, yes? Lets you use different things the same way?

Perl does a really bad job of polymorphism. It's specialized itself into a big mess of special types and operators that are all context-sensitive. Depending on what kind of variable it is, print $variable, "\n" could print a string, a blank line, a mess of crap, the length of something, a pointer dump like ARRAY(0x123851023), a syntax error, or a host of other things. You can spend hours just trying to figure out what kind of variable someone's perl module hands off to you, let alone how to use it.

The shell has avoided this by keeping the meaning of its operators consistent and having only one interface for variables: strings. ${something} means the value. ${#something} means the length. ${*} means all arguments. "${@}" means all arguments, each kept as a separate word instead of being split on IFS. They work alone or in combination, on arrays or strings or environment variables or local variables. The shell knows the difference and that's enough. In the few cases where you're not allowed to combine them, you always get a syntax error -- not a silent substitution of nothing, not a length you didn't ask for, not a garbled mess, not a pointer dump. You can always expect to access a shell variable the same way no matter what it is, and the interface is flexible enough to encompass most possibilities. That's polymorphism.
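A quick illustration with stock bash, for the sake of argument:

Code:
STR="hello"
ARR=(one two three)
set -- a b c                    # set the positional parameters

echo "${STR}" "${#STR}"         # hello 5  -- value and length
echo "${ARR[1]}" "${#ARR[@]}"   # two 3    -- element and element count
echo "${#}"                     # 3        -- argument count
printf '%s\n' "${@}"            # one line per argument, no IFS splitting

Same operators, same meanings, different underlying types.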

That's also why I think expanding the concept of a variable would work better than tacking the new concept of scoped files onto the side of it. Let the variables be accessed the same way, as strings: it's already convenient for the programmer, fits nicely with the existing syntax for redirecting files, and wouldn't need much new syntax. All that needs to be different about the variable is the ability to close a file when it goes out of scope. The scoping doesn't have to be something that makes the programmer learn new syntax, any more than a C programmer uses a stack variable differently than a global one.
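(As an aside, newer bash -- 4.1 or so -- already has half of this: a redirection can pick a free FD and drop the number into an ordinary string variable. The only piece missing is the scoped close:)

Code:
exec {fd}< /etc/hosts           # bash picks a free FD, stores the number in $fd
read -u "$fd" FIRSTLINE         # the variable gets used like any other string
echo "read via FD $fd: $FIRSTLINE"
exec {fd}<&-                    # but closing is still manual, not scoped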

---------- Post updated at 04:41 PM ---------- Previous update was at 03:52 PM ----------

Quote:
Originally Posted by tetsujin
In the design I've suggested here, the shell doesn't take responsibility for fixing that. Rather, it merely provides a mechanism that allows an external program to fix that... The shell creates the PTYs for each thread (if the user requests TTY sharing) and connects them appropriately, but the job of determining how those PTY masters are used is left up to the program specified by the user.
Does that have to be done as part of the parallel loop? That sounds like something you might want to do in general, inside or outside a loop. Let the parallel loop be the parallel loop and the TTY sharer be the TTY sharer.
Quote:
In the design I described, "screen" windows don't get created and destroyed during the loop.

Rather, when writing the loop, if the user has explicitly requested a terminal-sharing mechanism be used to synchronize display between the multiple loop iterations being run concurrently, then there will be one terminal created for each thread, not one for each iteration.
That could be confusing if the threads end up doing different things on each iteration.
Quote:
This doesn't create a perfect display, because as each loop iteration ends, a new one takes its place on the display. There's no display of history, basically. But it's a quick & dirty way for the user to get a display that's at least readable...
I agree it could be useful but I'm not sure it should be part of the shell script. You end up with shell scripts that require expect to run, and start spewing garbage to stdout if you try to disentangle them.
Quote:
Could you explain the password spoofing issue to me? Depending on the nature of the issue it could obviously be serious...
Utilities like ssh and su require password input to be from a terminal, but PTYs count. Something that gives you quick and easy access to PTYs, let alone in a high-performance parallel manner, isn't the sort of tool you want to just leave lying around -- another reason expect's a last resort.
Quote:
I consider a shell to be a programming language and a UI. To me, it's pretty much the unique defining characteristic of a shell.
True, but look at the way these features are made. A shell script isn't going to blow up because someone disabled tab completion -- shell scripts don't use tab completion, that's a user thing. Nor is it going to send PgUp to recall the last command and blow up when history features turn out to be unavailable; that's a user thing too. You cannot write scripts that depend on these features, and that's intentional -- the shell might care whether it's in a terminal or not, but the program usually doesn't.

That's what I mean by not building it in -- making it something like a debug flag that changes behavior to let it pop up these little windows. The script doesn't have to know or care whether you're snooping on its threads' outputs; that can be left up to the shell. After all, what would know better about the current terminal than the shell?
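(For reference, the check itself is a one-liner:)

Code:
# How a script or shell can tell whether there's a terminal at all:
if [ -t 0 ]             # is stdin a terminal?
then    echo "interactive terminal: $(tty)"
else    echo "no terminal on stdin"
fi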
Quote:
Considering the "usual" case is still useful, though, for deciding what things should be well-supported and convenient in the syntax.
It doesn't have to be all-or-nothing. I think it's possible to make a syntax that's both convenient and general-purpose.
Quote:
If you wanted to limit how many of them run at once (after all, running a substantially higher number of jobs than your number of CPU cores at best is going to get you around some I/O blocking, at worst it's going to slow you down via VM thrashing) - that's more complicated.
Not THAT complicated.

Code:
#!/bin/bash

THREADS=5       # maximum concurrent jobs
WAIT=0          # index of the oldest job still running
END=0           # index of the next job to launch

# Wait for the oldest outstanding job.
function waitnext
{       # Needs BASH/KSH (arrays, arithmetic)
        wait "${PIDS[$(( (WAIT++) % THREADS ))]}"
}

for ((N=0; N<100; N++))
do
        # If THREADS jobs are already running, reap the oldest first.
        [ "$((END-WAIT))" -ge "$THREADS" ] && waitnext
        sleep 1 & PIDS[$(( (END++) % THREADS ))]=$!
done
# wait for ALL remaining processes
wait

There's a shell feature missing that'd make this better and easier -- the ability to wait for any one process without saying which one. Right now you have a choice between waiting for one specific process or waiting for all of them. [edit] Or trapping SIGCHLD. That'd work, but it would work better if the shell had something like semaphores.
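In the meantime, a polling workaround does the job -- crude next to a real wait-for-any, but it needs nothing beyond stock bash/ksh:

Code:
#!/bin/bash
THREADS=5

# Busy-wait until the number of running jobs drops below the limit.
waitany()
{
        while [ "$(jobs -pr | wc -l)" -ge "$THREADS" ]
        do
                sleep 1         # polling interval, tune to taste
        done
}

for ((N=0; N<100; N++))
do
        waitany
        sleep 1 &
done
# wait for ALL remaining processes
wait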

Quote:
Going back to the issue of shell threading:
I hadn't thought about the interaction between fork() and threads - But, then, "threads" in an interpreted language don't have to be interpreted as actual execution threads: Python (the C implementation) for instance, implements threads internally. From the perspective of the OS these threads don't exist (they're not separate entities in the scheduler) but within the context of the language itself they work as any other threads implementation would.
That's just timesharing. If you want actual benefits from threading, you must do multithreading and/or multiprocessing. Threads the OS scheduler can't see never get run on more than one core.
Quote:
If the implementation did use real threads, another option for dealing with forking would be to fork off a process that does nothing but listen to a pipe that tells it what to fork and run, and a Unix domain socket that feeds it file descriptors to attach to the new processes...
Kind of what I said, but in more detail.
Quote:
Of course it'd also have to communicate back information about when jobs terminate... Apart from the fact that it solves the thread problem pretty handily, it seems like kind of an ugly solution, really.
It is, and could cause other problems. If the parent closes a file descriptor, this child launcher will have to follow its lead somehow.

There's a better way to do this but I don't quite remember it. Something to do with fork callbacks (pthread_atfork(), perhaps).
Quote:
As for the other impacts of threading - parens could still be used specifically to specify a subshell context (after all, both bash and ksh provide curly braces as a way to group commands without creating a subshell context)
...What? Really?

That's perfect for scoping your files! Allow local variables in braces like that. Your files could be some of these variables, containing FD numbers. When they go out of scope, the files close.
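For anyone else who didn't know, the difference in action:

Code:
X=1
{ X=2; }        # brace group: runs in the current shell, change sticks
echo "$X"       # prints 2
( X=3 )         # subshell: the change dies with it
echo "$X"       # still prints 2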

# 16  
Old 03-31-2011
Quote:
Originally Posted by Corona688
Quote:
In the design I've suggested here, the shell doesn't take responsibility for fixing that. (TTY sharing in a loop) Rather, it merely provides a mechanism that allows an external program to fix that...
Does that have to be done as part of the parallel loop? That sounds like something you might want to do in general, inside or outside a loop.
I'd considered that: the multiplexing/TTY sharing stuff would be useful outside the context of loops (or, if you prefer, at least as useful as it is in the context of loops) so addressing it as a general mechanism rather than something specific to loops would have advantages, make the shell language more orthogonal, etc.

One approach to doing this would be to have a separate step in which the PTYs and/or pipes are created to serve as /dev/tty, stdout, stdin for various jobs that are going to be run asynchronously: the file descriptors for these could be stored in array variables and passed to the loop construct, which would then pass them on to individual loop steps. Then probably one could come up with other cases where those FD arrays could be useful...

Quote:
You end up with shell scripts that require expect to run, and start spewing garbage to stdout if you try to disentangle them.
"expect" shouldn't be needed. Probably the sensible default for this "TTY sharing" stuff is to simply not do the TTY sharing thing if there's no TTY. ('course, there would be exceptions - cases where you might want TTY sharing to do its thing even if there's no TTY... If you're displaying the TTYs with xterms, for instance.)

Quote:
Utilities like ssh and su require password input to be from a terminal, but PTYs count. Something that gives you quick and easy access to PTYs, let alone in a high-performance parallel manner, isn't the sort of tool you want to just leave lying around
Hm, so is the problem just that people could use this capability to spoof input to programs that interact on /dev/tty?

I can live with that. Anyone who wants to do damage with that kind of capability can do it regardless of whether the shell serves up the functionality or they have to write their own program for it.

Quote:
Quote:
(Interpreted languages can implement threading without threads)
That's just timesharing. If you want actual benefits from threading, you must do multithreading and/or multiprocessing.
Well, yeah, but my aim with "shell threading" is mostly just to keep things from getting booted into a "subshell context" simply because they're occupying a particular position in the command. But I'm starting to feel like it's probably not the best way to go. It'd mean keeping track of all the "threads" that are blocking on something, maybe stuffing all that into a big select() call...

Actually, though (I had to look this up): it turns out that when you fork() in a POSIX-threaded app, the new process has just one thread initially. The threads aren't cloned. So a little care would be necessary to make sure the environment's in a usable state prior to calling fork(), but otherwise using POSIX threads shouldn't be a problem.
# 17  
Old 03-31-2011
Quote:
Originally Posted by tetsujin
"expect" shouldn't be needed.
I meant screen, sorry. Probably doesn't change the answer though.
Quote:
Probably the sensible default for this "TTY sharing" stuff is to simply not do the TTY sharing thing if there's no TTY.
But where would the output go instead?
Quote:
Hm, so is the problem just that people could use this capability to spoof input to programs that interact on /dev/tty?

I can live with that. Anyone who wants to do damage with that kind of capability can do it regardless of whether the shell serves up the functionality or they have to write their own program for it.
Sure, if given access to them. Just something to keep in mind.
Quote:
Well, yeah, but my aim with "shell threading" is mostly just to keep things from getting booted into a "subshell context" simply because they're occupying a particular position in the command.
External commands have to be in a separate context. There's just no other way to run them. Builtins could be run in threads, so they wouldn't need a separate context.

Even if you fork(), it's possible to share memory by other means. Memory you've created with mmap() can be shared between such processes. So just being in a subshell doesn't mean you'd have to lose access to variables. Environment variables, though, come preallocated in the process image, so they can't be shared that way without torturing your process environment in cruel and unusual ways.

Some builtins, like echo, might not need any context at all. When you know the amount of text is smaller than the pipe's buffer, you know the write will never block -- so just cram the text into the write end and close it in advance. You don't need a new thread just to do that.
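You can watch that happen in any bash. Below, the writer finishes immediately while the reader is still asleep, because the text fits in the pipe buffer:

Code:
{ echo "small text"; echo "writer finished" >&2; } | { sleep 2; cat; }
# "writer finished" shows up about two seconds before "small text" does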
Quote:
But I'm starting to feel like it's probably not the best way to go. It'd mean keeping track of all the "threads" that are blocking on something, maybe stuffing all that into a big select() call...
Why not let the mutexes do the blocking?
Quote:
Actually, though (I had to look this up): it turns out that when you fork() in a POSIX-threaded app, the new process has just one thread initially. The threads aren't cloned.
Reference, please? Things I've seen suggest quite differently. Then again, that was something I saw inside the Linux pthreads library, so it might be things the library does to prevent threads being cloned, rather than things I'd have to do...

If true, that would make it a ton easier.

# 18  
Old 04-15-2011
Another way to handle anonymous files could be similar to the awk way. When you redirect to a file in awk, it stays open; referring to the same filename later just gets you the same handle over and over.
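For reference, that's real awk, not proposed syntax -- the filename string names the handle, opened on first use and reused until close():

Code:
# Compare two files line by line, the awk way (assumes both files exist).
awk 'BEGIN {
        while ((getline a < "otherfile") > 0 && (getline b < "cmpfile") > 0)
                if (a != b) print "mismatch:", a, "vs", b
        close("otherfile"); close("cmpfile")
}'

A shell version of the same idea might look like this: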

Code:
N=1
# Read lines from "otherfile" and "cmpfile" one by one, without
# redirecting the entire loop's stdin.
# Also set the close-on-exec flag for files opened in this fashion.
while read LINE <<"otherfile" && read OTHERLINE <<"cmpfile"
do
        if [ "$LINE" != "$OTHERLINE" ]
        then
                echo "Line $N doesn't match"
                echo "What would you like to do?"

                read RESPONSE
                case "${RESPONSE}" in
                *)      # todo: something
                        ;;
                esac
        fi
        ((N++))
done

close "cmpfile" "otherfile"

Using << for 'keep the file open' seems a nice opposite of the >> 'append to file' redirection. This breaks the syntax for here-documents though. Not quite sure how to get the FD out of that either, maybe $<"filename" ?

# 19  
Old 04-16-2011
Quote:
Originally Posted by Corona688
Quote:
Probably the sensible default for this "TTY sharing" stuff is to simply not do the TTY sharing thing if there's no TTY.
But where would the output go instead?
Same place it would go to normally. Whatever the shell sees as FD #1.

Obviously if you're using stdout as some kind of formatted value stream, you don't want a bunch of processes writing data to it concurrently without some kind of synchronization: this is why I described a stdout sharing mechanism (separate from the /dev/tty sharing mechanism).

If somebody ran a multithreaded loop whose jobs produced newline-delimited value streams, that's a common enough formatting convention that it could just be built into the shell. If those processes were writing out JSON or some XML schema, then merging is something the user should probably be handling, by providing a program that takes those output streams and merges them into one.
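A quick sketch of why newline streams merge so painlessly: whole-line writes smaller than PIPE_BUF land on a shared pipe intact, so nothing interleaves mid-line even though ordering across jobs is arbitrary. (The jobs here are just placeholders.)

Code:
for N in 1 2 3 4 5
do
        ( sleep $((RANDOM % 3)); echo "result from job $N" ) &
done | cat      # cat sees whole lines, in whatever order the jobs finish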


Quote:
Quote:
Well, yeah, but my aim with "shell threading" is mostly just to keep things from getting booted into a "subshell context" simply because they're occupying a particular position in the command.
External commands have to be in a separate context. There's just no other way to run them.
Yeah, mainly by "things" I meant builtins like variable assignment or "read". Things you would do to modify the environment, in cases where it's pretty reasonable for the user to expect that it'll modify the main shell's environment...

Quote:
Even if you fork(), it's possible to share memory by other means. Memory you've created with mmap() can be shared between such processes. So just being in a subshell doesn't mean you'd have to lose access to variables. Environment variables, though, come preallocated in the process image, so they can't be shared that way without torturing your process environment in cruel and unusual ways.
Hm, might have to think about that one. Of course you could get around it by simply copying all the env. variables to the shell's process memory and operating on those copies of the variables - a bit wasteful but a pretty simple dodge...

Quote:
Why not let the mutexes do the blocking?
Because I'm not talking about cases where threaded built-ins are blocking on some shared resource, I'm talking about them blocking on I/O, probably to a pipe as part of a larger job...

APUE (section 12.9) explains the situation with pthreads and fork():
"Inside the child process, only one thread exists"
Of course, I'd be willing to bet there are some platforms, or versions of platforms, that got that wrong... APUE does cover implementation differences in places, but it'd take more time to tell you whether it lists any for fork() in pthreads...
# 20  
Old 04-17-2011
Quote:
Originally Posted by tetsujin
Hm, might have to think about that one. Of course you could get around it by simply copying all the env. variables to the shell's process memory and operating on those copies of the variables - a bit wasteful but a pretty simple dodge...
That's partway to just creating a new process then, since it would have some of the same side-effects -- changes from one wouldn't propagate back and vice versa.
# 21  
Old 04-17-2011
Quote:
Originally Posted by Corona688
That's partway to just creating a new process then, since it would have some of the same side-effects -- changes from one wouldn't propagate back and vice versa.
Well, we're talking about using shared memory to export this copy of the shell variables to fork()'ed shell processes - so as long as the memory is shared writable, and as long as access to it is synchronized adequately, those shell processes would be able to propagate their variable changes back to the main shell...