By John P. Desmond, AI Trends Editor
Artificial intelligence for IT operations, or AIOps, refers to the application of machine learning and data science to IT operations. AIOps systems monitor the huge volumes of log and performance data typically generated in a large enterprise to gain visibility into dependencies and solve problems.
An AIOps platform should include these three capabilities, suggests a recent report in TechTarget:
Automate routine practices. These include user requests and non-critical IT system alerts. For example, a help desk system can process and fulfill a user request to provision a resource automatically. The system is also able to evaluate alerts and determine which ones require action, and which are based on metrics and supporting data within normal parameters.
Recognize serious issues faster and with greater accuracy than humans. The system should be able to detect behavior out of the norm, especially on critical servers, by processing volumes of data not possible for humans to monitor on their own.
Streamline the interactions between data center groups and teams. AIOps provides each functional IT group with relevant data and perspectives. The AIOps system learns what analysis and monitoring data from the large pool of resource metrics to show each group or team.
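The alert evaluation described in the first capability can be pictured with a minimal sketch in Python. It is illustrative only and not taken from any vendor's product: the Alert fields, the NORMAL_RANGES table and its limits are all hypothetical, and a real platform would learn these ranges rather than hard-code them.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    source: str   # host or service that raised the alert (hypothetical field)
    metric: str   # name of the metric the alert reports
    value: float  # observed value of that metric


# Hypothetical "normal parameter" ranges, keyed by metric name.
NORMAL_RANGES = {
    "cpu_percent": (0.0, 85.0),
    "disk_used_percent": (0.0, 90.0),
}


def needs_action(alert, ranges=NORMAL_RANGES):
    """Return True when the alert's value falls outside its normal range."""
    bounds = ranges.get(alert.metric)
    if bounds is None:
        # No known range for this metric: escalate rather than suppress.
        return True
    low, high = bounds
    return not (low <= alert.value <= high)


def triage(alerts):
    """Split alerts into (actionable, routine) lists."""
    actionable = [a for a in alerts if needs_action(a)]
    routine = [a for a in alerts if not needs_action(a)]
    return actionable, routine
```

Alerts for metrics with no known range are treated as actionable here, on the assumption that an unfamiliar signal is safer to escalate than to ignore.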
AIOps is suited to complex IT operations typical of large enterprises, involving hybrid cloud platforms for example. Data comes from multiple sources including log files, metrics, monitoring tools and help desk ticketing systems. Big data technology is used to aggregate and organize the output into a useful form. Analytics techniques are used to interpret the raw data and spot trends and patterns that can identify and isolate problems, including capacity issues.
Algorithms in the system codify the organization’s IT expertise, business policies and goals, so the platform can deliver the most desirable outcomes or actions. The algorithms are used to prioritize security-related events and teach the platform what application performance decisions are appropriate. These algorithms form the foundation for machine learning; they establish a baseline of normal behaviors and activity, and they can learn and evolve as the environment changes over time.
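One simple way to picture a baseline of normal behavior that evolves over time is a rolling window of recent measurements, with new readings compared against the window's mean and standard deviation. The sketch below makes that assumption. The RollingBaseline class, its window size, warm-up length and threshold are hypothetical choices for illustration, and commercial AIOps platforms are likely to use more sophisticated models.

```python
from collections import deque
from statistics import mean, pstdev


class RollingBaseline:
    """Track recent samples and flag values far from the recent norm."""

    def __init__(self, window=60, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # oldest samples drop off automatically
        self.threshold = threshold           # allowed distance, in standard deviations
        self.min_samples = min_samples       # warm-up length before judging anything

    def is_anomalous(self, value):
        """Check value against the current baseline, then learn from it."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            mu = mean(self.samples)
            sigma = pstdev(self.samples)
            if sigma > 0:
                anomalous = abs(value - mu) / sigma > self.threshold
            else:
                # A perfectly flat baseline: any different value stands out.
                anomalous = value != mu
        # Every reading joins the window, so the baseline follows the environment.
        self.samples.append(value)
        return anomalous
```

Because every reading is added to the window, a sustained shift in load gradually becomes the new normal, which mirrors the idea of a baseline that changes with the environment.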
Automation enables AIOps tools to take action, triggered by the outcomes of analytics and machine learning. A tool’s predictive analytics and ML may determine that an application needs more storage, for example. An automated process can then be initiated to add storage in increments consistent with the rules embedded in the algorithms.
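The storage example can be sketched as a small rule-driven planner in Python. Everything named here is an assumption made for illustration: the StoragePolicy fields, the 50 GB increment, the 1,000 GB cap and the 80 percent utilization target are invented values, and the predicted usage figure would come from the platform's own analytics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StoragePolicy:
    increment_gb: int = 50             # storage is added only in steps of this size
    max_total_gb: int = 1000           # hard cap the automation may not exceed
    target_utilization: float = 0.80   # desired ceiling for used / allocated


def plan_expansion(allocated_gb, predicted_used_gb, policy=StoragePolicy()):
    """Return how many GB to add so predicted use fits within the policy."""
    if allocated_gb <= 0:
        raise ValueError("allocated_gb must be positive")
    new_total = allocated_gb
    # Add one increment at a time until utilization is acceptable or the cap is hit.
    while (predicted_used_gb / new_total > policy.target_utilization
           and new_total + policy.increment_gb <= policy.max_total_gb):
        new_total += policy.increment_gb
    return new_total - allocated_gb
```

When the cap prevents the planner from reaching the target utilization, it returns whatever it was allowed to add, leaving the remaining decision to a human operator.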
Visualization tools deliver dashboards, reports, graphics and other output so that human operators and managers can see the changes and events in the environment. These typically allow human operators to take actions that require decision-making capabilities beyond those of the AIOps software.
The technology underlying AIOps is fairly mature, and the field is poised to enter the next phase of maturity in combining the technologies for practical use. The amount of time and effort needed to implement, maintain and manage an AIOps platform can be substantial. Results can vary.
Gartner Coined AIOps Term
The origin of the term AIOps is credited to Gartner, the analyst firm, which coined it in 2016. In a recent report, How to Get Started With AIOps, Gartner analyst Padraig Byrne explained the origin. “IT operations are challenged by the rapid growth in data volumes generated by IT infrastructure and applications that must be captured, analyzed and acted on,” he stated. “Coupled with the reality that IT operations teams often work in disconnected silos, this makes it challenging to ensure that the most urgent incident at any given time is being addressed.”
Gartner sees AIOps functionality as partially replacing primary IT operations functions such as availability and performance monitoring, event correlation and analysis, and IT service management and automation. Gartner predicts that the share of large enterprises exclusively using AIOps and digital experience monitoring tools to monitor applications and infrastructure will rise from 5% in 2018 to 30% in 2023. Gartner estimates the current market size to be in the $300M to $500M range.
According to Byrne, the long-term impact of AIOps on IT operations will be transformative. “IT leaders are enthusiastic about the promise of applying AI to IT operations, but as with moving a large object, it will be necessary to overcome inertia to build velocity,” stated Byrne. “The good news is that AI capabilities are advancing, and more real solutions are becoming available every day.”
Moogsoft Capitalizing on Trend
Among vendors competing in the AIOps space is Moogsoft, co-founded in 2012 by Phil Tee, CEO, the past chairman and CTO of Riversoft, which eventually became part of the IBM Tivoli Netcool suite. He was also the co-founder and CTO of Micromuse, a supplier of network management software.
“I have been in the IT operations market for 30 years,” said Tee in an interview with AI Trends. Moogsoft was formed to help companies keep their digital infrastructure available. “Our customers, including Verizon, have billions invested in their infrastructure. The particular nuance today is that the environment is so complex and so large, we need to bring in AI and machine learning as part of the set of tools to help. The complexity is not possible for the human mind to comprehend.”
Fundamentally the firm’s software, rebranded to AIOps after Gartner coined the term, is an operations tool used by IT professionals responsible for delivering more services and keeping the infrastructure healthy. The firm’s software was written from scratch and has 50 patents pending on it, Tee said.
It is a cloud-resident software service with volume-based pricing tied to the number of events being managed. The company has raised approximately $100 million from investors including Goldman Sachs, and is close to being profitable, Tee said.
Customers include American Airlines, Verizon Media and GoDaddy. “The market is growing and maturing,” Tee said. “In two or three years, it will have completely supplanted the traditional IT operations base,” served for example by IBM, BMC and CA Technologies (now owned by Broadcom).
A case study published by Moogsoft on GoDaddy outlined how the company needed to replace its IT operations infrastructure after making a commitment to Amazon Web Services (AWS) as its primary platform. GoDaddy had been relying on CA Spectrum, a legacy event management system. Customers were experiencing issues before the GoDaddy operational teams were aware of them, which falls short of best practice.
“We came to Moogsoft because we’re experiencing significant business growth and our legacy event management systems couldn’t scale, nor could they accommodate new data sources like AWS,” stated Felix Gorodishter, Principal Architect for Monitoring at GoDaddy. “Moogsoft was initially deployed as a layer on top of these systems, but now it is completely replacing the legacy systems.”
The migration enables GoDaddy to deploy software globally in minutes. The company expects a 50 percent reduction in mean time to resolve issues through earlier detection of incidents, as well as fewer support escalations. GoDaddy’s existing monitoring tools and AWS cloud tools have now been integrated with Moogsoft.