Please let me know where to send my resume. I'd love to work at your datacenter where nothing ever breaks.
Unfortunately, I'm not in that situation. We have 7 admins responsible for a couple hundred servers and several hundred more workstations. We support well over 1000 users and have automounter maps that let them connect to several hundred project directories on a SAN with over 100 TB of storage. And we're responsible for EVERYTHING in the Unix and storage environment, from password resets to desktop Linux support to system architecture to filling out purchase orders for new equipment.
In a perfect world, I agree with you: we'd be able to keep machines up constantly by fixing each problem as it happened. But with the thousands of mounts and unmounts that happen every day, we get some stale file handles, for example. Plenty of other little problems come up that really don't need to be solved immediately and that the monthly or 90-day reboot clears up. There is absolutely no way we could have a system administrator track each of them down individually without doubling our staff. And there is no need for us to do it. The monthly maintenance policy has been in place for over a decade, and the business units and users we support agree with it. So we let the minor stuff I mentioned go and clean it up during maintenance by rebooting.
Also, we are in the medical industry, so there are very strict regulations about reliability and disaster recovery. Many of our machines are required to be rebooted on a schedule to prove that they are configured properly and will come up correctly after an unplanned outage. Take the Veritas clusters I mentioned rebooting monthly: our DR policy requires those reboots to prove the clusters can function properly in a failover situation where one system crashes. We actually have to sign and file documents verifying the status of each system after it comes back up. So it doesn't matter whether we think they need it for a technical reason or not; a lot of those reboots are going to happen to satisfy the policies put on us by the regulatory department.
So I'd finish by pointing back to the last paragraph of my original message: it all depends on the environment. Most places I've worked did not have scheduled reboots, which is what you said is the right way to do it. However, due to specific factors in the environment I work in now, we have to do it. You need to know your users, machines and environment well enough to know what reboot policy is best for your situation.