Hi Darragh:

Can you go into more detail on what these "monitor screens" are? It sounds like a billboard hanging on a wall displaying system status. We don't have anything like that in my environment. We have a pager that rotates through the department each week, and our monitoring software is set to flag critical alerts and send them to the person on call, who is responsible for finding the right party and informing them. Beyond that, everyone's kind of responsible for monitoring their own systems. I really have to pity our backup guy; that used to be me, but it got to be too much, and it's getting to be too much for the current backup guy too.

I agree with Andrew here: it's all about using the monitoring and alerting in whatever software you have, but really narrowing down just what alerts you need. If it's something I need to know about right away, that's one type of alert. If it's something that can wait for the next business day, but needs to be at the top of my plate come the next business day, then that's something else. Also, I just skim systems from time to time to see if anything is slipping through the cracks. Is it perfect? No, but what is?

Good monitoring is actually very difficult, sighted or blind. We recently spent about $250,000 on a package called EG. The vendor made our management believe EG would not just alert us to systems being down, but would also automatically do root cause analysis and find the real cause of problems. Surprisingly, it hasn't worked too well. I only discovered a bad SAN drive today when I wandered into the server room and heard a really odd high-pitched squeal coming from the SAN. No other alerts, period. So getting the information is tough regardless, and it just boils down to good tweaking.

On a side note, Darragh, I'd be very interested in talking to you further privately about your setup.
It sounds like we manage roughly the same types of environments, and I'd love to hear more about what challenges you face and the things you've done to get around them. For example, I didn't find SCOM to be that accessible, nothing like SCCM. I'd love to set up SCOM to monitor SCCM; right now our SCCM environment is completely unmonitored, well, except for me browsing site status from time to time.

Ryan

-----Original Message-----
From: blind-sysadmins-bounces@lists.hodgsonfamily.org [mailto:blind-sysadmins-bounces@lists.hodgsonfamily.org] On Behalf Of Darragh OHeiligh
Sent: Tuesday, December 20, 2011 2:05 AM
To: Blind sysadmins list
Subject: [Blind-sysadmins] Keeping track of your environment.

Good morning,

I've asked about this a long time ago but I didn't really follow it up.

Our environment has doubled in size in terms of servers over the past year. We now have two DR sites, I'm in the process of virtualizing our DMZ and we're expanding our SQL cluster. The amount of work that has been done is actually quite impressive, if I do say so myself.

The problem is that there's just too much information to digest every morning. I've syslog showing errors, WhatsUp Gold showing utilization and availability, SCOM showing system errors, Systems Insight Manager monitoring the SAN, Storage Essentials checking for storage bottlenecks, OfficeScan monitoring for viruses and other infections, and Nessus running security reports. It all mounts up to a huge amount of reports and statistics to monitor every day.

The problem is, if I get distracted by OfficeScan, for example, the other reports are neglected and I potentially miss things that have happened during the night. Of course, the other person that works with me can glance up at the monitoring screens and see at a glance what's up and what's not. He can see systems that are running low on disk space or using up far too much memory.
I start an hour before this person, so it looks terrible if I miss something that can be spotted by simply looking up at the screens.

How do you monitor hundreds of servers? Are there any tips or tricks you'd like to share? We're working with a mixed environment here, but if I need to use two approaches for monitoring both Windows and Linux then I don't mind, as long as I can get information in a more condensed format without being overwhelmed by things I don't need to know about.

Thanks. Any suggestions will be appreciated.

Regards
Darragh Ó Héiligh
Fujitsu
Offices of the Houses of the Oireachtas,
Fredrick Building, South Fredrick Street, Dublin2
Telephone: +353 (1) 618 3559
Email: darragh.oheiligh@oireachtas.ie
Internet: http://www.oireachtas.ie
_______________________________________________
Blind-sysadmins mailing list
Blind-sysadmins@lists.hodgsonfamily.org
http://lists.hodgsonfamily.org/listinfo/blind-sysadmins