It pretty much goes without saying that if you want to fix something, you have to know that it is broken, and if you want to prevent something ‘breaking bad’, it helps to know in advance that things are starting to go wrong. This is where monitoring tools come into the picture.
There are various types of monitoring tools, which perform different tasks in different ways. Some monitor your hardware or networks internally, while others remotely monitor your websites or public-facing internet services. Some, such as SolarWinds, do a combination of both and more.
For example, if a disk fails in your RAID5 array it is not usually catastrophic, because RAID5 is designed to survive the loss of a single disk. If a second disk fails in that array before the first is replaced, however, you could be in big trouble, depending on your setup. Knowing that a disk in your RAID array has failed allows you to replace it before the situation becomes critical and data is lost. A popular and well-established tool for this kind of infrastructure monitoring is Nagios, which is open source and has an active user and developer community around it.
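To make the RAID example concrete, here is a minimal sketch of a degraded-array check. It assumes Linux software RAID (md), whose status text in /proc/mdstat shows one character per member disk, such as [UUU] for a healthy array and [UU_] when a disk is missing. Hardware RAID controllers need their vendor's tools instead. The function name check_mdstat is made up for illustration; the numeric codes follow the exit-status convention used by Nagios plugins (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import re

# Exit-status convention used by Nagios plugins.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def check_mdstat(text):
    """Inspect /proc/mdstat-style text and return (status_code, message).

    Hypothetical helper: a degraded array is reported as WARNING, because
    one failed disk is survivable but needs attention soon.
    """
    degraded = []
    arrays = 0
    current = None
    for line in text.splitlines():
        header = re.match(r"^(md\w+)\s*:", line)
        if header:
            # Start of a new array section, e.g. "md0 : active raid5 ..."
            current = header.group(1)
            arrays += 1
            continue
        # Member status looks like [UUU]; "_" marks a missing disk.
        status = re.search(r"\[([U_]+)\]", line)
        if status and current and "_" in status.group(1):
            degraded.append(f"{current} [{status.group(1)}]")
    if arrays == 0:
        return UNKNOWN, "UNKNOWN: no md arrays found"
    if degraded:
        return WARNING, "WARNING: degraded arrays: " + ", ".join(degraded)
    return OK, f"OK: {arrays} array(s) healthy"


# Sample input for illustration only (not taken from a real machine).
sample = (
    "md0 : active raid5 sdb1[0] sdc1[1] sdd1[2]\n"
    "      1953260544 blocks level 5, 512k chunk, algorithm 2 [3/2] [UU_]\n"
)
code, message = check_mdstat(sample)
print(code, message)
```

On a real server you would pass in the contents of /proc/mdstat and exit with the returned code, so that the monitoring system can raise the alert.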
Another example is when your website crashes or some part of it fails: a remote monitoring tool like ServerMojo can alert you via SMS, email or sometimes even Twitter. A common situation is a server or website component that is malfunctioning but still responding to requests on port 80 (or whichever port your service runs on). In such a case the basic test to see if a server is ‘alive’ is not enough. Fortunately, monitoring tools can go further and check for specific text which should appear on a page, perhaps something pulled from a database, and alert you if that text is not found.
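The content check described above can be sketched in a few lines using only the Python standard library. This is an illustration rather than how any particular monitoring service works; the function names, the URL and the expected text in the usage note are all placeholders.

```python
import urllib.error
import urllib.request


def evaluate_response(status, body, expected_text):
    """Return (ok, reason) for a page that has already been fetched."""
    if status != 200:
        return False, f"unexpected HTTP status {status}"
    if expected_text not in body:
        # The server answered, but the page is missing what it should show.
        return False, f"expected text {expected_text!r} not found"
    return True, "page looks healthy"


def check_page(url, expected_text, timeout=10):
    """Fetch url and apply the content check; network errors count as failures."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", errors="replace")
            return evaluate_response(resp.status, body, expected_text)
    except urllib.error.HTTPError as exc:
        # urlopen raises HTTPError for 4xx and 5xx responses.
        return False, f"unexpected HTTP status {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        return False, f"request failed: {exc}"
```

A call such as check_page("https://example.com/orders", "Latest orders") would then report a failure if the page loads but the database-driven text is missing, which a plain ‘is it alive’ test would not catch.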
Some remote monitoring systems, Pingdom for example, go further still and do what is known as transaction monitoring. This means they actually perform actions on a site, interacting with it and emulating a real visitor: not just clicking links but even filling in and submitting forms. They log response times and other data along the way, which you can then use to locate and diagnose problems in complex applications.
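The idea behind transaction monitoring can be sketched as a sequence of named steps that are run in order and timed individually. This is a simplified illustration and not Pingdom's implementation: the run_transaction function and the step names are invented here, and in practice each step would drive a real browser or HTTP client rather than a stub.

```python
import time


def run_transaction(steps):
    """Run (name, callable) steps in order, timing each one.

    Stops at the first failing step, since later steps in a visitor's
    journey usually depend on the earlier ones having succeeded.
    """
    results = []
    for name, action in steps:
        start = time.perf_counter()
        try:
            action()
            ok, error = True, None
        except Exception as exc:
            ok, error = False, str(exc)
        elapsed = time.perf_counter() - start
        results.append({"step": name, "ok": ok, "seconds": elapsed, "error": error})
        if not ok:
            break
    return results


# Stub steps for illustration; real ones would load pages and submit forms.
def load_home_page():
    pass


def submit_contact_form():
    raise RuntimeError("form submission returned an error page")


for result in run_transaction([
    ("load home page", load_home_page),
    ("submit contact form", submit_contact_form),
]):
    print(result["step"], "ok" if result["ok"] else "FAILED", result["error"] or "")
```

Recording the time per step, rather than one total, is what lets you see which part of a multi-step process has become slow or has started to fail.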
Yet another type of monitoring tool is known as an RMM, which stands for Remote Monitoring and Management. An RMM tool not only monitors your servers and other IT assets but also allows you to manage them remotely. Obviously you can’t replace hardware remotely (with some exceptions), but you can certainly perform software and operating system installs and upgrades, and keep track of the status of IT assets down to the individual component if required, such as monitoring disk space, memory usage and system load or checking for errors and warnings in logs.
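As a rough idea of what a monitoring agent collects on each machine, the sketch below checks disk space against thresholds and reads the system load using the Python standard library. The thresholds of 80% and 90% and the function names are arbitrary choices for illustration; real RMM agents gather far more and report it to a central server.

```python
import os
import shutil


def classify(value, warn, crit):
    """Map a measurement to OK, WARNING or CRITICAL using two thresholds."""
    if value >= crit:
        return "CRITICAL"
    if value >= warn:
        return "WARNING"
    return "OK"


def disk_used_percent(path="/"):
    """Percentage of the filesystem containing path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def collect_status(path="/", disk_warn=80.0, disk_crit=90.0):
    """Gather a small status report for one machine."""
    percent = disk_used_percent(path)
    report = {
        "disk_used_percent": round(percent, 1),
        "disk_status": classify(percent, disk_warn, disk_crit),
    }
    # os.getloadavg is only available on Unix-like systems.
    if hasattr(os, "getloadavg"):
        try:
            report["load_1min"] = os.getloadavg()[0]
        except OSError:
            pass
    return report


print(collect_status("."))
```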
RMM tools are popular with managed service providers (MSPs), which depend on the ability to monitor and manage a large array of assets across many customer locations that could be anywhere in the world. Having advance or timely notification of problems or potential issues with clients' systems is invaluable for the MSP and, of course, for the client whose systems need to be kept functioning. Popular RMM tools these days include GFIMax and Continuum, amongst others.
Choosing the right tool for your needs can be difficult with so many on offer. In this interconnected age, it is worth looking for tools which provide an API for integrating with the other systems an IT services business depends on. That way your data can be moved between systems more easily and managed effectively, making workflows more efficient and ultimately improving profitability.