Dane Overfield, product development lead at Exele Information Systems (www.exele.com), says:
Ensuring the reliability and efficiency of a data center involves the monitoring of many disparate types of data across multiple vendors and protocols. Real-time data such as hardware and network performance, building power management, and environmental conditions need immediate attention if behavior deviates from the desired or normal operating ranges.
Commonly, this diversity results in responsibility being split between different internal groups, each implementing different software with varying capabilities and features. Some may implement vendor-based software solutions, while others may seek solutions based on protocols common to multiple devices and equipment.
Luckily, today’s data centers can benefit from unification steps made in the automation and process monitoring field since the mid-1990s. Faced with the same dilemma of multiple vendors and protocols, that field unified its communication around a clear winner: OPC (www.opcfoundation.org). OPC provides a single translation layer between those needing the data (the monitoring tools) and the underlying protocols needed to access this data. The result is an abundance of OPC-based tools like Exele TopView that can be used across industries and vendors to solve common needs.
Yet this unification is only beneficial if the translation layer (the OPC Server) exists for the required data and protocols. Here again, data centers can benefit from established vendors and third-party companies that provide the required OPC Servers for vendor-specific data communication and for open communication protocols such as SNMP, BACnet, and Modbus.
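The idea of a translation layer can be illustrated with a minimal Python sketch. This is not OPC itself and does not use any OPC library: `read_snmp`, `read_modbus`, `UnifiedReader`, the tag names, and the sample values are all hypothetical stand-ins, chosen only to show how a monitoring tool can ask for data by a single naming scheme while the protocol details stay hidden behind it.

```python
from typing import Callable, Dict, Tuple

# Hypothetical stand-ins for protocol-specific drivers. In a real
# deployment this role is played by OPC Servers, not by these functions.
def read_snmp(address: str) -> float:
    return {"switch1.cpu": 37.0}[address]

def read_modbus(address: str) -> float:
    return {"pdu3.load_kw": 4.2}[address]

class UnifiedReader:
    """Maps one tag namespace onto many protocol-specific readers."""

    def __init__(self) -> None:
        self._tags: Dict[str, Tuple[Callable[[str], float], str]] = {}

    def register(self, tag: str, reader: Callable[[str], float], address: str) -> None:
        # Remember which driver and which driver-level address serve this tag.
        self._tags[tag] = (reader, address)

    def read(self, tag: str) -> float:
        # The caller never needs to know which protocol is underneath.
        reader, address = self._tags[tag]
        return reader(address)

unified = UnifiedReader()
unified.register("network/switch1/cpu_pct", read_snmp, "switch1.cpu")
unified.register("power/pdu3/load_kw", read_modbus, "pdu3.load_kw")
```

With this in place, a monitoring tool would call `unified.read("power/pdu3/load_kw")` in the same way for every data source, whatever protocol sits behind it.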
Detect… and Notify
Once the data is centralized, a single solution such as Exele TopView can monitor the current values and statuses of the disparate measurement data in an attempt to identify abnormal operating conditions within the data center.
For some data, the identification of abnormal conditions is straightforward (e.g. power relay tripped), but for other data it may involve more complex logic such as multiple variables, aggregates, time delays, rates of change, and deadbands. The solution must allow the user to easily specify both simple and complex conditions that indicate abnormal events in need of attention.
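To make two of these terms concrete, here is a minimal Python sketch of a high-limit alarm with a time delay and a deadband. It is an illustration under assumed semantics, not TopView's implementation: the class name `HighAlarm`, its fields, and the sample limits are hypothetical. The delay suppresses momentary spikes, and the deadband keeps the alarm from chattering when the value hovers near the limit.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class HighAlarm:
    limit: float      # value above which the condition is abnormal
    deadband: float   # alarm clears only below limit - deadband
    delay_s: float    # condition must persist this long before alarming
    active: bool = False
    _since: Optional[float] = None  # time the value first exceeded the limit

    def update(self, value: float, now: float) -> bool:
        """Feed one sample (value at time `now`, in seconds); return alarm state."""
        if self.active:
            # Deadband: stay in alarm until the value drops clearly below the limit.
            if value < self.limit - self.deadband:
                self.active = False
                self._since = None
        elif value > self.limit:
            # Time delay: start the timer on the first excursion, alarm once it persists.
            if self._since is None:
                self._since = now
            if now - self._since >= self.delay_s:
                self.active = True
        else:
            # Value returned to normal before the delay elapsed: reset the timer.
            self._since = None
        return self.active
```

For example, with `HighAlarm(limit=27.0, deadband=1.0, delay_s=60)` and samples of 27.5, 27.8, 28.0, 26.5, and 25.9 taken at 0, 30, 60, 90, and 120 seconds, the alarm becomes active at 60 seconds, stays active at 26.5 (still inside the deadband), and clears at 25.9.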
Immediate action requires immediate notification. The notification solution should support multiple notification channels (email, text/SMS, voice callout, audible) and notification escalation to ensure delivery of the alarm condition to those responsible for handling and correcting the abnormality. Through flexible messaging content, the recipient can learn about the alarm as well as related details and conditions of the monitored data.
Bird's-eye View of Alarms
In the process and automation world, operators expect real-time displays of current measurement values and alarms. Within the data center, similar displays provide a bird's-eye view of the current values and the state of alarms (how many and in what areas), and allow individual alarms to be acknowledged and annotated. These actions can influence alarm notification (e.g. only notify if the alarm is unacknowledged for 2 minutes) and should be stored along with the alarm history for later reporting and analysis.
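The interaction between acknowledgement and notification can be sketched in a few lines of Python. The 2-minute wait comes from the example above; everything else (the `ActiveAlarm` type, the `ESCALATION` plan, the 10-minute second step, and the recipient names) is a hypothetical illustration of escalation, not a description of any product's behavior.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ActiveAlarm:
    name: str
    raised_at: float                  # seconds
    acked_at: Optional[float] = None  # None while unacknowledged

# Hypothetical escalation plan: (seconds unacknowledged, recipient).
ESCALATION = [(120, "on-call operator"), (600, "shift supervisor")]

def recipients_due(alarm: ActiveAlarm, now: float) -> List[str]:
    """Return everyone who should have been notified by time `now`."""
    # The unacknowledged period ends at the acknowledgement, if there was one.
    end = now if alarm.acked_at is None else min(alarm.acked_at, now)
    unacked_for = end - alarm.raised_at
    return [who for wait, who in ESCALATION if unacked_for >= wait]
```

An alarm acknowledged 90 seconds after it was raised notifies no one, while one left unacknowledged reaches the operator after 2 minutes and the supervisor after 10.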
Learn Through History
While the real-time displays and immediate alarm detection and notification are critical to the data center operation, additional value is gained through the storage and analysis of the abnormal and alarm event activity. The personnel responsible for overall health of the data center may not need to receive individual alarm notifications, but instead may gain insight through scheduled and ad-hoc reports of global or grouped alarm activity.
For ad-hoc alarm analysis, TopView provides the tools to query and report “bad actors,” times of heavy alarm activity (flooding), and periods of high active alarm counts. Scheduled reports can deliver hourly, daily, and weekly summaries of the alarms.
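As a rough illustration of what such analyses compute, the following Python sketch finds bad actors and flood periods in an alarm history. It assumes the history is available as a list of `(timestamp_seconds, source)` pairs; the function names, the 10-minute window, and the threshold are illustrative choices, not TopView's query interface.

```python
from collections import Counter
from typing import List, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, alarm source); assumed layout

def bad_actors(events: List[Event], top_n: int = 3) -> List[Tuple[str, int]]:
    """Return the sources that raised the most alarms, most frequent first."""
    return Counter(source for _, source in events).most_common(top_n)

def flood_windows(events: List[Event], window_s: int = 600, threshold: int = 10) -> List[int]:
    """Return start times of fixed windows holding more than `threshold` alarms."""
    # Bucket each event into the fixed window that contains its timestamp.
    counts = Counter(int(t // window_s) * window_s for t, _ in events)
    return sorted(start for start, n in counts.items() if n > threshold)
```

Run over a week of history, `bad_actors` would point to the handful of sources worth investigating first, and `flood_windows` to the periods when operators were likely overwhelmed.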
Alarm reports and analyses will enable users to identify failing equipment, time-of-day related failures (e.g. power load or network), and incorrectly configured alarms.
Embrace Unification, Reap the Rewards
The required data unification tools exist today, and Exele TopView can provide centralized data monitoring, alarm detection, and notification across your data center to allow immediate response to disruptions. In addition, you gain the tools necessary to identify long-term trends in order to detect problem areas and failing equipment, optimize performance, and avoid more critical failures.