MITA’s Network Operations Centre and the evolving world of IT Monitoring Tools
Written by Kester Stoner
One of the most sophisticated systems ever developed is the human body. Apart from the complex structure of different organs and body parts, and their continuous interaction, one remarkable feature of the human body is its ability to monitor itself and take the necessary proactive or reactive measures accordingly.
Such monitoring is one of the elements that permits the human body to thrive and prolong its lifespan. This complexity of elements, the required level of performance, and the need for ongoing improvement can easily be mapped to a man-made counterpart – the computer.
The term computer is vague, as it could refer to any digital device supporting a specific business or even domestic process. Commonly, computers are servers hosted within Data Centres, intended to support considerable loads and perform thousands, possibly millions, of transactions per hour.
The Malta Information Technology Agency (MITA), with its state-of-the-art Data Centres, hosts numerous servers which deliver several Government services. The value here is often the result of a complex network of nodes, each of which alone might serve little purpose. Yet the status and performance levels of each node are critical, since a failure in any one component might impact an entire ecosystem. With current technology, many components are purposely equipped with sensors to provide signals accordingly, and each of these signals is typically detected through specialised tools that translate it into human-readable form for any required action.
The Network Operations Centre (NOC) within MITA has the remit to ensure round-the-clock monitoring and subsequent event management. The primary objective is to maintain the right levels of availability, capacity, and security. This mechanism applies to conventional hardware but is also critical for software-defined Data Centres, where virtual systems are created and maintained through Infrastructure-as-Code solutions.
The evolution of monitoring tools is also ongoing. In the early days of technology, such tools were largely unavailable or otherwise difficult to interpret and reap value from. Quite often, the only insight available was that given by the Operating System (OS) in the form of text or logs, which still required effort to correlate and parse, lengthening the resolution process and requiring the involvement of highly trained technical people. It was in the last decade of the 20th century that real-time performance monitoring became a standard element built into the OS. This permitted operators to visually determine the utilization of specific resources in real time, while also enabling alerts to be set when a specific threshold was exceeded.
The digital transformation started to unfold, and as more business sectors were disrupted by technology, the demand for and reliance on Information Technology (IT) started to increase exponentially. This gradually meant that the monitoring of services, and specifically of the underlying infrastructure, became not only daunting but also largely ineffective and inefficient. At this point, the need for third-party monitoring systems became inevitable.
By the end of the nineties and early noughties, the majority of such monitoring systems took the form of a centralized portal to which all network nodes reported. Such tools consisted of performance metric collection, pre-defined rules, threshold-based alerting, and frequently a feature which runs automatic logic and, based on the output, visually changes the status of the respective element. This allowed technical personnel to monitor centrally and more effectively, freeing up time to invest elsewhere. The value derived from this functionality was soon confirmed, and many IT organizations realized that such assets had to form part of their strategies, since it was otherwise not possible to keep track of performance.
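The threshold-based alerting described above can be sketched in a few lines. This is a minimal illustration only; the metric names and threshold values are assumptions for the example, not taken from any specific tool:

```python
# Minimal sketch of threshold-based alerting, as used by early
# centralized monitoring portals. Metric names and limits here are
# purely illustrative.

THRESHOLDS = {"cpu_percent": 85.0, "disk_percent": 90.0}

def evaluate(node: str, metrics: dict) -> list[str]:
    """Compare collected metrics against pre-defined thresholds."""
    alerts = []
    for metric, limit in THRESHOLDS.items():
        value = metrics.get(metric)
        if value is not None and value > limit:
            alerts.append(f"{node}: {metric}={value} exceeds {limit}")
    return alerts

# A node reporting high CPU triggers one alert; healthy metrics trigger none.
print(evaluate("web-01", {"cpu_percent": 92.0, "disk_percent": 40.0}))
```

A real portal would run such rules on a schedule against every reporting node and drive the visual status change mentioned above from the resulting alert list.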
Across many areas of technology, various monitoring systems were developed, and while the same foundations apply, third-party vendors invested their resources to specialize in particular areas of service management and underlying technologies, such as web, networking, and application monitoring. Whereas every element was catered for, this still left considerable disconnection, in the sense that service providers had to invest in separate, siloed tools to cover the entire ecosystem.
Needless to say, this proved to be extremely expensive and complex. In more recent times, the industry has been able to take advantage of greater centralization, whereby investment in one tool made coverage across several different elements possible.
As services continue to evolve and become more sophisticated, monitoring technology is in a constant drive not only to keep pace but also to provide capabilities which were unthinkable in the recent past.
Services now make use of Cloud Technology, and the advent of software-defined infrastructure and containerization, amongst others, implies that an encompassing monitoring regime must be a thoughtful by-design approach rather than an afterthought add-on.
Thankfully, breakthroughs such as Artificial Intelligence (AI), applying principles of human decision-making and learning, make it possible not only to control but also to conquer the complex landscape of today’s systems. With its vast range of capabilities and potential, several vendors are taking advantage of AI and incorporating it into their tools. This has given rise to Artificial Intelligence for IT Operations (AIOps). Henceforth, a whole new world of service monitoring emerges, where each component in the service configuration is no longer monitored in isolation but considered in a larger context. This requires alert correlation, meaning that hundreds if not thousands of machine-generated events are automatically assessed and analyzed in split seconds for a more insightful presentation. This hastens reaction time and also improves deliverables to the business.
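The simplest form of the alert correlation described above is grouping raw machine-generated alerts into incidents by a shared attribute. The sketch below, with assumed field names ("service", "node", "message"), shows how many component-level alerts collapse into far fewer service-level incidents; production AIOps tools use far richer correlation logic:

```python
from collections import defaultdict

# Illustrative alert correlation: raw alerts are grouped by the
# service they belong to, so operators see one incident per affected
# service instead of one line per component. Field names are assumptions.

def correlate(alerts: list[dict]) -> dict[str, list[dict]]:
    incidents = defaultdict(list)
    for alert in alerts:
        incidents[alert["service"]].append(alert)
    return dict(incidents)

raw = [
    {"service": "payments", "node": "db-01", "message": "disk full"},
    {"service": "payments", "node": "app-02", "message": "timeouts"},
    {"service": "portal", "node": "web-01", "message": "cpu high"},
]

incidents = correlate(raw)
print(len(incidents))  # 3 raw alerts become 2 service-level incidents
```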
It is also now possible to set such technology into learning modes: over set periods of time, the “behaviour” of a given service is analyzed by the tool in order to highlight where deviations occur. This also permits what is called predictive monitoring. In a nutshell, this is where the monitoring tool anticipates issues (such as resource over- or under-utilization) and automatically remediates or reports them in advance.
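At its core, this learning mode amounts to building a statistical baseline from historical readings and flagging values that deviate from it. The sketch below uses a simple mean-and-standard-deviation baseline with a three-sigma rule; the rule and the sample data are illustrative assumptions, as real tools learn far more nuanced behavioural models:

```python
import statistics

# Sketch of a behavioural "learning mode": record a baseline of a
# metric over a set period, then flag readings that deviate by more
# than k standard deviations from that baseline.

def learn_baseline(history: list[float]) -> tuple[float, float]:
    """Learn normal behaviour as (mean, standard deviation)."""
    return statistics.mean(history), statistics.stdev(history)

def is_deviation(value: float, mean: float, stdev: float, k: float = 3.0) -> bool:
    """Flag a reading that falls outside the k-sigma band."""
    return abs(value - mean) > k * stdev

# Baseline learned from a stable period of CPU readings (illustrative).
baseline = learn_baseline([50.0, 52.0, 49.0, 51.0, 50.0, 48.0])
print(is_deviation(90.0, *baseline))  # True: well outside normal behaviour
print(is_deviation(51.0, *baseline))  # False: within the learned band
```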
The evolution which we have witnessed over the past decade augurs well for a future of further developments. As systems continue to evolve and society depends more and more on IT, making sure that the relevant systems operate and deliver the outputs required by the consumer becomes more critical than ever. Throughout its thirty years of existence, MITA’s NOC has progressed in line with the evolution of monitoring tools, ensuring it remains at the forefront of upcoming technologies.