You sure don't mention many details. Not even which OS you use. "Connections"... so you are awaiting connections from a listen(2)? How many fd's are involved? Lots of connections to lots of fd's is a different problem than lots of connections to a few fd's.
For lots of fd's, select/poll bogs down because massive structures to describe the fd's must be copied between user and kernel space on every call. This is one of several problems addressed by the epoll facility, which I have only seen in Linux. (But note that fd_set can be large with modern implementations. People need to stop assuming it is an int!)
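To make the epoll point concrete, here is a minimal Linux-only sketch. The helper names watch_fd and wait_ready are made up for illustration, and the 64-event batch size is an arbitrary assumption. Each fd is registered with the kernel once, and each wait hands back only the fd's that are actually ready, so the full set is not copied on every call.

```c
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: register fd for readability with the epoll
 * instance epfd. The kernel keeps the registration, so this is done
 * once per fd rather than once per wait. Returns 0 on success. */
static int watch_fd(int epfd, int fd)
{
    struct epoll_event ev = {0};
    ev.events = EPOLLIN;
    ev.data.fd = fd;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Hypothetical helper: wait up to timeout_ms milliseconds and store the
 * ready fd's in ready[]. Returns how many were stored, 0 on timeout,
 * or -1 on error. The batch size of 64 is an arbitrary choice. */
static int wait_ready(int epfd, int *ready, int max, int timeout_ms)
{
    struct epoll_event events[64];
    if (max > 64)
        max = 64;
    int n = epoll_wait(epfd, events, max, timeout_ms);
    for (int i = 0; i < n; i++)
        ready[i] = events[i].data.fd;
    return n;
}
```

Create the instance with epoll_create1(0), call watch_fd once for each socket, then loop on wait_ready and service whatever comes back.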
Lots of connections to a few fd's can be addressed by not allowing select to get involved with every connection... instead drain the listen queue on the first select.
This paper argues that when select(2) returns, accept(2) should be called in a loop until it fails with EWOULDBLOCK. The authors describe a considerable performance boost. You might want to try this... I thought it sounded very interesting.
Or are you reading and writing on established connections? Assuming your OS has a decent thread implementation, I would try one thread per socket in each direction that data flows.
You ask about asynchronous I/O. Traditionally this is a problem if more than one fd is involved, since there is only one SIGPOLL signal to deliver to a process. Sure, you can get the signal and then do poll(2)/select(2), but this is only useful for very sparse data arriving on multiple fd's. And now with threads, perhaps it is not useful at all. POSIX defines some new async I/O stuff, but your OS may not have it, and I have never used the new stuff.