"If you see something, say something..."

My team is in the process of evaluating a number of monitoring packages for our development and production environments. So far we have mostly used gum and duct tape ^H^H^H a combination of mon, monit, zenoss and PRTG to get alerted when something goes wrong, and we are clearly at the point where this type of contraption cannot really scale anymore. It was good while it lasted, but it has now passed its optimal benefit/cost ratio.

I have some past experience with HP OpenView, and the prospect of implementing a beast like that does not really appeal to me. Paying top dollar and top man-hours to get a big "enterprise" piece of machinery to survey 300 hosts on 2 sites really seems like overkill to me.

Other solutions that we are entertaining are:

Since we operate in a largely Unix environment but also have a growing number of Microsoft boxes (and VMware ESXs), having one package that is reasonably good at monitoring and alerting across all of these platforms is desirable. Configuration management would be a nice plus but is not a strict requirement. I am also thinking of throwing splunk into the mix as a forensic tool to aid in incident and problem management. I will report on our progress as we evaluate these tools.