Hi: A quick question, has anyone on this list played with Office 365? I’ve been asked a few times if it’s accessible and I have no idea. I know it offers more than my current Exchange hosting provider does, but I really don’t need any more than I have, so I haven’t ever looked at switching. As I said though, I’ve been asked by other people several times, so I thought I’d send the question out here as I’m sure someone’s played with it. Thanks. Ryan
Hello,

I'm managing more and more systems here but I can't keep up with all the notifications. For example, I just cleared out over six thousand emails, received since the 15th of May, from a folder that is used for air conditioning and environment notifications. That's just one system among... a lot.

I've notifications from SCOM, VMware, WhatsUp Gold, Diskeeper, ManageEngine's event and syslog monitoring, NetBotz, backups, the mail gateway, the SAN and more.

I know there are others out there with the same amount of responsibility, so my question is: how do you stay up to date with the events? I am tired of being on the back foot. A few years ago I was able to tell when disk utilization was spiking on a server. Now, I'm way behind. There are just too many alerts coming in.

It's not that the network is in bad condition. For example, one of the application servers is showing high CPU and disk utilization this morning. It could be just that a user is hammering away at it, but it could be a dodgy application as well. Either way, I need to be aware of it.

You know how in Linux you can type tail -f *.log in a certain directory and you'll see all the log files as they're written? I want something like that for all my systems. Unrealistic, I know, but I'm open to ideas.

Everything is tied up in red tape here, but there's nothing that can't be done after a well-written change request is provided.

Any suggestions?

Regards
Darragh Ó Héiligh
Fujitsu
Offices of the Houses of the Oireachtas, Fredrick Building, South Fredrick Street, Dublin 2
Telephone: +353 (1) 618 3559
Email: darragh.oheiligh@oireachtas.ie
Internet: http://www.oireachtas.ie
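For anyone wondering what a "tail -f *.log for everything" could look like once the logs from several systems land in one directory, here is a minimal Python sketch. It is only an illustration under assumptions: the helper names `open_at_end` and `read_new_lines` are made up for this example, it polls rather than using file-change notifications, and it does not handle log rotation.

```python
import os


def open_at_end(paths):
    """Open each file and jump to its end, so only new lines are reported."""
    handles = {}
    for path in paths:
        handle = open(path, "r", errors="replace")
        handle.seek(0, os.SEEK_END)  # skip existing content, like tail -f
        handles[path] = handle
    return handles


def read_new_lines(handles):
    """Return (file name, line) pairs appended since the previous call."""
    new_lines = []
    for path, handle in handles.items():
        while True:
            line = handle.readline()
            if not line:  # nothing more to read from this file for now
                break
            new_lines.append((os.path.basename(path), line.rstrip("\n")))
    return new_lines
```

To follow a directory, one could call `open_at_end(sorted(glob.glob("*.log")))` once and then call `read_new_lines` in a loop with a short `time.sleep` between rounds, printing each pair as `name: line`. Prefixing every line with its file name keeps the merged output readable with a screen reader.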
Hi, I have the same problem. We have an event correlation system here which is pretty much unusable for me as it is all Java based, but it is supposed to provide this information in a specific report which highlights areas that need looking into. However, in practice this doesn't really help, as the system needs constant tuning to keep it right so you don't just end up with a load of dross. Andrew.
________________________________________
From: Blind-sysadmins [blind-sysadmins-bounces@lists.hodgsonfamily.org] on behalf of Darragh OHeiligh [Darragh.OHeiligh@Oireachtas.ie] Sent: 11 July 2012 11:52 To: Blind sysadmins list Subject: [Blind-sysadmins] Information overlode.
_______________________________________________
Blind-sysadmins mailing list Blind-sysadmins@lists.hodgsonfamily.org http://lists.hodgsonfamily.org/listinfo/blind-sysadmins
Isn't that what System Center Operations Manager is supposed to do? I would use a system like that to capture all the incoming alert information and then present it in a much more useful manner.
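As a rough illustration of "capture everything, then present it more usefully", the sketch below turns a pile of alert subject lines into a short per-source digest. It is not how SCOM works; the keyword table, the function names and the idea of matching on subject lines are all assumptions made for this example, with the tool names borrowed from the original post.

```python
from collections import Counter

# Hypothetical keyword -> source table; adjust to the subjects your tools send.
SOURCE_KEYWORDS = {
    "scom": "SCOM",
    "vmware": "VMware",
    "netbot": "NetBotz",
    "backup": "Backups",
}


def classify_source(subject):
    """Guess which monitoring tool sent an alert from its subject line."""
    lowered = subject.lower()
    for keyword, source in SOURCE_KEYWORDS.items():
        if keyword in lowered:
            return source
    return "Other"


def build_digest(subjects):
    """Return one summary line per source, busiest source first."""
    counts = Counter(classify_source(subject) for subject in subjects)
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [f"{source}: {count} alert(s)" for source, count in ordered]
```

Reading four digest lines is a lot quicker than arrowing through six thousand messages, and the counts alone often show which system is the noisy one.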
OK, how? SCOM is a massive beast and the interface is a million miles from being accessible. It can take half an hour to create subscriptions because everything has to be done with the JAWS cursor or whatever mouse cursor is available. SCOM is great because it highlights everything, but it's not intelligent. SCOM 2012 has a few nice new views so that only systems in a red state are shown, but it's far from perfect. It also has very poor integration with ESXi and it provides no hardware monitoring.

The debate raged here for a while as to why we had so many monitoring tools. Management wanted us to justify why we were paying so many licence fees every year. It's as simple as this though. SCOM is great for analysis of problems from a high level. Its integration with SRS is also fantastic. WhatsUp Gold is great for making sure systems are up and accessible, but it doesn't do any real analysis. Unfortunately though, NetBotz is required for camera monitoring and environmental testing, and the myriad of other hardware testing tools for the SAN, HP and Dell servers are absolutely vital because without them failures wouldn't be caught in time.

Regards
Darragh Ó Héiligh

From: Matthew White <matt@wh1t3.net> To: Blind sysadmins list <blind-sysadmins@lists.hodgsonfamily.org> Date: 11/07/2012 14:36 Subject: Re: [Blind-sysadmins] Information overlode. Sent by: "Blind-sysadmins" <blind-sysadmins-bounces@lists.hodgsonfamily.org>
Well, I don't know if this is really applicable in your situation, but we use Nagios because it has plugins for everything. I think that is the key thing: not Nagios itself but the plugin concept. If you can't find a plugin to monitor whatever it is you want to monitor, you write your own. I wrote a plugin to monitor temperature, disk status, and RAID status on our Dell servers. Actually, I have Nagios set up to call my cell phone and speak a brief problem report if something really bad happens. That gives you some idea of how flexible Nagios can be.

I'm not familiar with the tools you are talking about, but it seems to me that if you can't write a plugin to make the tool do whatever you like, then it's deficient. I suppose those are Windows tools, right? Maybe that's why they don't allow you to script plugins. On Linux systems, you can assume the system will have Bash, Perl, and C++. But if these are Windows tools, maybe the developer didn't want to assume you'd have access to a scripting language.

I'm not sure any of this is helpful. But maybe next time your department specs out a new monitoring system, they can make the ability to write your own plugins a top priority.

----- Original Message ----- From: "Darragh OHeiligh" <Darragh.OHeiligh@Oireachtas.ie> To: "Blind sysadmins list" <blind-sysadmins@lists.hodgsonfamily.org> Cc: "Blind-sysadmins" <blind-sysadmins-bounces@lists.hodgsonfamily.org> Sent: Wednesday, July 11, 2012 8:43 AM Subject: Re: [Blind-sysadmins] Information overlode.
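To give a feel for the plugin concept mentioned above: a Nagios plugin is just a program that prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). The sketch below follows that convention for a temperature check. It is not the plugin described in the message; `read_temperature` is a hypothetical placeholder and the thresholds are invented for illustration.

```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_temperature(celsius, warn_at=30.0, crit_at=38.0):
    """Map a temperature reading to (exit code, one-line status text)."""
    if celsius is None:
        return UNKNOWN, "TEMP UNKNOWN - no reading available"
    if celsius >= crit_at:
        return CRITICAL, f"TEMP CRITICAL - {celsius:.1f} C"
    if celsius >= warn_at:
        return WARNING, f"TEMP WARNING - {celsius:.1f} C"
    return OK, f"TEMP OK - {celsius:.1f} C"


def read_temperature():
    """Hypothetical sensor read; a real plugin would query the hardware here."""
    return 24.5


def main():
    """Print the status line and return the exit code Nagios should see."""
    code, message = check_temperature(read_temperature())
    print(message)
    return code

# A real plugin script would end with: raise SystemExit(main())
```

Because the whole contract is one line of text plus an exit code, the same check can be written in Bash, Perl or anything else that is available on the monitored host.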
Hi Darragh: I deal with the same problem, and I've not found a perfect solution. To be honest, most of the systems administrators I know, blind or sighted, deal with this. There's so much data coming at you, and filtering it is a problem.

My first piece of advice is to come up with an internal plan as to what's critical, what's important and what's nice to know. Critical would be something like a down system, something you'd drop whatever you're doing for, even if it's a really hot date, and go fix. For me, an ESX server being down is something I need to know about right away. Those alerts get e-mailed to me and go right into my inbox as high priority. Important things are things I need to know; they can probably wait until the next business day, but they would still shape what I do that day. Those get e-mailed to me and filtered into their own folder. A secondary SCCM site being down, or a system being low on disk space but not out, would be important. Finally, the nice-to-knows: how things are performing, etc. For me, slight network bottlenecks that aren't causing noticeable application performance issues would be on that list, as would an ESXi server needing a patch. Many of those are things I go check when I'm feeling bored. Like that ever happens. What is critical, important and nice to know varies with your environment. If you're doing OSD with SCCM or something where people expect it to be up 24/7, then it might be critical for you.

Another important thing is to filter out things you just don't care about. Most monitoring packages give you too much information, and the problem then is that you dig for the needle in a haystack. I've gotten into arguments with people, but my philosophy is that if an alert isn't important, turn it off. Don't bother with it. If you're getting an alert but have no hope of fixing it, turn it off. By that statement, I don't mean down systems or anything like that. I mean an alert where you go to management, tell them this is a problem, and they tell you they don't care. In that case, make it clear to them that since they don't care you're turning off that alert, document their OK and move forward. It sounds bad in a way, but really sys admins get too much information, and you want to dig through the haystack for the needle as little as possible. We're there anyway.

We don't use SCOM where I work. I wish we did; right now our SCCM environment isn't monitored at all because the software we use can't monitor it. We use a program called EG, which I'd never heard of before I started at MiTek, and I'm not too happy with it. EG monitors Citrix well, but that's the highlight of EG. It's all web based and is really inaccessible, so I don't touch it much.

It sounds like we have more people in our IT department than you do, and that helps spread the load around. For example, I don't care about storage alerts since I don't manage the SAN. I care about ESX, SCCM, and some AD alerts. If you can, spread the love around; you have coworkers, that's what they're there for. Another thing is we have a cellphone that rotates through the department. Whenever a priority 1 alert is generated (a server down, a site offline, etc.), a text message is sent to that phone, and the person who has it has to look up the person responsible for that system and alert them. Again, that just helps spread things around; it's not all on one person's shoulders and we can try to get a good night's sleep from time to time.

Does this fix the problem? No. As I said though, information overload is a system admin problem, not a blind sys admin problem or a sighted sys admin problem, but something every admin faces and has to work through. Hope this helps.
Ryan
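The critical / important / nice-to-know plan described above can be written down as a small rule table, which also makes the "turn it off, with management's OK" decisions explicit. The following Python sketch is only one possible shape for it: the system and event names, the destinations and the choice to treat unknown alerts as important are assumptions for illustration, not anyone's actual configuration.

```python
from dataclasses import dataclass

CRITICAL = "critical"
IMPORTANT = "important"
NICE_TO_KNOW = "nice-to-know"
SUPPRESSED = "suppressed"


@dataclass(frozen=True)
class Alert:
    system: str
    event: str


# Hypothetical rule table: (system, event) -> tier.
RULES = {
    ("esx", "host_down"): CRITICAL,
    ("sccm", "secondary_site_down"): IMPORTANT,
    ("server", "disk_space_low"): IMPORTANT,
    ("esxi", "patch_available"): NICE_TO_KNOW,
    ("network", "minor_bottleneck"): NICE_TO_KNOW,
    # An alert management has agreed, in writing, that nobody will act on.
    ("legacy_app", "license_warning"): SUPPRESSED,
}

# Where each tier ends up; None means the alert is dropped.
ROUTES = {
    CRITICAL: "inbox, high priority",
    IMPORTANT: "folder: important",
    NICE_TO_KNOW: "folder: check when idle",
    SUPPRESSED: None,
}


def route(alert, default=IMPORTANT):
    """Return (tier, destination); unknown alerts fall back to the default tier."""
    tier = RULES.get((alert.system, alert.event), default)
    return tier, ROUTES[tier]
```

Defaulting unknown alerts to "important" rather than dropping them is a deliberate choice in this sketch: a new alert type shows up in a folder that gets read, and can then be moved up, down or off the list.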
Hi Darragh:

I deal with the same problem, and I've not found a perfect solution. To be honest, most of the systems administrators I know, blind or sighted, deal with this. There's so much data coming at you, and filtering it is a problem.

My first piece of advice is to come up with an internal plan as to what's critical, what's important and what's nice to know.

Critical would be something like a down system, something you'd interrupt whatever it is you're doing for, even if it's a really hot date, and go fix. For me, an ESX server being down is something I need to know right away. Those alerts get e-mailed to me and go right into my inbox as high priority.

Important things are things I need to know; they can probably wait until the next business day, but they would still shape what I do that day. Those get e-mailed to me and filtered into their own folder. A secondary SCCM site being down, or a system being low on disk space but not out, or something like that would be important.

Finally, there are the nice-to-knows: how things are performing, etc. For me, slight network bottlenecks that aren't causing noticeable application performance issues would be on that list, and an ESXi server needing a patch would be on this list. Many of those are things I go check when I'm feeling bored. Like that ever happens.

What is critical, important and nice to know varies with your environment. If you're doing OSD with SCCM or something where people expect it to be up 24/7, then it might be critical for you.

Another important thing is to filter out things you just don't care about. Most monitoring packages give you too much information, and the problem then is that you dig for the needle in a haystack. I've gotten into arguments with people, but my philosophy is: if an alert isn't important, turn it off. Don't bother with it. If you're getting an alert but have no hope of fixing it, turn it off. By that statement, I don't mean down systems or anything like that. I mean an alert where you go to management, tell them this is a problem, and they tell you they don't care. In that case, make it clear to them that since they don't care you're turning off that alert, document their OK and move forward. It sounds bad in a way, but really sysadmins get too much information, and you want to dig through the haystack for the needle as little as possible. We're there anyway.

We don't use SCOM where I work. I wish we did; right now our SCCM environment isn't monitored at all because the software we use can't monitor it. We use a program called EG, which I'd never heard of before I started at MiTek, and I'm not too happy with it. EG monitors Citrix well, but that's the highlight of EG. It's all web-based and is really inaccessible, so I don't touch it much.

It sounds like we have more people in our IT department than you do, and that helps spread the load around. For example, I don't care about storage alerts since I don't manage the SAN. I care about ESX, SCCM, and some AD alerts. If you can, spread the love around; you have coworkers, and that's what they're there for.

Another thing is that we have a cellphone that rotates through the department. Whenever a priority 1 alert is generated (a server down, a site offline, etc.), a text message is sent to that phone, and the person who has it has to look up the person responsible for that system and alert them. Again, that just helps spread things around. It's not all on one person's shoulders, and we can try to get a good night's sleep from time to time.

Does this fix the problem? No. As I said though, information overload is a system admin problem, not a blind sysadmin problem or a sighted sysadmin problem, but something every admin faces and has to work through.

Hope this helps.
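For anyone who wants to write the plan down as rules rather than keep it in their head, here is a minimal sketch in Python of the critical / important / nice-to-know split, plus an "ignore" bucket for alerts that have been switched off. Everything in it is an assumption made for illustration: the source and event labels, the rule table, and the choice to treat unknown alerts as important are all hypothetical, and a real setup would take its fields from whatever the monitoring tools actually send.

```python
from dataclasses import dataclass
from enum import Enum


class Tier(Enum):
    CRITICAL = "critical"          # drop everything and go fix it
    IMPORTANT = "important"        # can wait until the next business day
    NICE_TO_KNOW = "nice to know"  # look at it when there is spare time
    IGNORE = "ignore"              # alert deliberately switched off


@dataclass(frozen=True)
class Alert:
    source: str  # hypothetical label, e.g. "esx", "sccm", "san"
    event: str   # hypothetical label, e.g. "host_down", "disk_low"


# Hypothetical rule table keyed on (source, event). The entries only mirror
# the examples in the message above; every site would write its own.
RULES = {
    ("esx", "host_down"): Tier.CRITICAL,
    ("sccm", "secondary_site_down"): Tier.IMPORTANT,
    ("server", "disk_low"): Tier.IMPORTANT,
    ("network", "minor_bottleneck"): Tier.NICE_TO_KNOW,
    ("esx", "patch_available"): Tier.NICE_TO_KNOW,
    ("san", "storage_warning"): Tier.IGNORE,  # someone else manages the SAN
}


def classify(alert, rules=RULES, default=Tier.IMPORTANT):
    """Return the tier for an alert; unknown alerts fall back to `default`."""
    return rules.get((alert.source, alert.event), default)


def triage(alerts):
    """Group alerts into one list per tier, preserving arrival order."""
    buckets = {tier: [] for tier in Tier}
    for alert in alerts:
        buckets[classify(alert)].append(alert)
    return buckets


if __name__ == "__main__":
    incoming = [
        Alert("esx", "host_down"),
        Alert("san", "storage_warning"),
        Alert("esx", "patch_available"),
    ]
    for tier, items in triage(incoming).items():
        print(tier.value, len(items))
```

Defaulting unknown alerts to "important" is a judgment call in this sketch: it keeps anything nobody has classified yet from being silently dropped, at the cost of a noisier folder until the rule table fills in.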
Ryan

-----Original Message-----
From: Blind-sysadmins [mailto:blind-sysadmins-bounces@lists.hodgsonfamily.org] On Behalf Of Darragh OHeiligh
Sent: Wednesday, July 11, 2012 7:43 AM
To: Blind sysadmins list
Cc: Blind-sysadmins
Subject: Re: [Blind-sysadmins] Information overlode.

Ok, How!

SCOM is a massive beast and the interface is a million miles from being very accessible. It can take half an hour to create subscriptions because everything has to be done with the JAWS cursor or whatever mouse cursor is available...

SCOM is great because it highlights everything, but it's not intelligent. 2012 has a few nice new views so that only systems in a red state are shown, but it's far from perfect. It also has very poor integration with ESXi and it provides no hardware monitoring.

The debate raged here for a while as to why we had so many monitoring tools. Management wanted us to justify why we were paying so many licence fees every year. It's as simple as this though. SCOM is great for analysis of problems from a high level. Its integration with SRS is also fantastic. WhatsUp Gold is great for making sure systems are up and accessible, but it doesn't do any real analysis. Unfortunately though, NetBotz is required for camera monitoring and environmental testing, and the myriad of other hardware testing tools for the SAN, HP and Dell servers are absolutely vital because without them failures wouldn't be caught in time.

Regards
Darragh Ó Héiligh
Fujitsu
Offices of the Houses of the Oireachtas, Fredrick Building, South Fredrick Street, Dublin 2
Telephone: +353 (1) 618 3559
Email: darragh.oheiligh@oireachtas.ie
Internet: http://www.oireachtas.ie

From: Matthew White <matt@wh1t3.net>
To: Blind sysadmins list <blind-sysadmins@lists.hodgsonfamily.org>
Date: 11/07/2012 14:36
Subject: Re: [Blind-sysadmins] Information overlode.
Sent by: "Blind-sysadmins" <blind-sysadmins-bounces@lists.hodgsonfamily.org>

Isn't that what System Center Operations Manager is supposed to do?
I would use a system like that to capture all the incoming alert information and then present it in a much more useful manner.
participants (5)
- Andrew Hodgson
- Darragh OHeiligh
- John Heim
- Matthew White
- Ryan Shugart