Big Data World
– Dan Joe Barry, vice president of marketing, Napatech, says:
Real-Time Big Data Analytics (RTBDA), a frequent buzzword in big data discussions, encompasses the essential need to make decisions in real time based on analysis of available data. Many Internet/Over-The-Top (OTT) companies such as Amazon and Google have made this the cornerstone of their strategy.
These OTT players are generating a surge in traffic on the network with little or no revenue contribution, which is a source of inspiration and frustration to telecom carriers.
We will take a closer look at RTBDA, homing in on what it means for telecom networks. Fortunately, the technologies needed to implement such a strategy already exist; however, they are not being used as efficiently as they could be.
RTBDA: The Basics
Simply put, big data analytics has two characteristics that distinguish it from data warehousing, data mining or business intelligence:
- Distributed, parallel processing
- The ability to act in real time
Big data analytics addresses the challenge of processing large, unrelated data sets that typically cannot be accommodated by a single database or server. One solution is distributed parallel processing, in which a large data set is split across multiple servers and each server processes its part in parallel. Big data analytics works with both structured and unstructured data. Hadoop with MapReduce is one such approach and can be credited as a driving force behind the current interest in big data.
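The split-process-merge pattern described above can be illustrated with a minimal Python sketch. This is not Hadoop: threads stand in for servers, and the record format (service name, byte count), the sample values and all function names are assumptions made purely for illustration.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from functools import reduce


def map_partition(records):
    """Map step: total the bytes per service within one partition."""
    counts = Counter()
    for service, num_bytes in records:
        counts[service] += num_bytes
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial results into one."""
    return left + right


def run_job(partitions):
    """Process every partition in parallel, then merge the partial results."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = list(pool.map(map_partition, partitions))
    return reduce(reduce_counts, partials, Counter())


# Hypothetical data set, already split into three partitions.
partitions = [
    [("video", 1200), ("web", 300)],
    [("video", 800), ("voip", 100)],
    [("web", 200), ("voip", 50)],
]
totals = run_job(partitions)
print(dict(totals))
```

Each partition is processed without any knowledge of the others, which is what allows the work to be spread across machines; only the small partial results have to be brought together at the end.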
What is unique to the big data perspective is that processing is completed within a defined time frame that is progressively being associated with “real-time”.
Although RTBDA is fairly new, it addresses the need to act proactively or reactively in real time. The ability of Internet content and service providers to analyze a situation and react to it in real time is the driver behind RTBDA.
“Real time” for Telecom
How real is “real time?” That depends on the context of your goals and the environment in which you are operating. For some, seconds or microseconds are enough, while others need real-time to be faster.
This is an interesting question from a telecom point of view. It exposes a potential flaw in existing telecom practices that must be addressed if carriers are to take on the challenges that OTT traffic presents. The current definition of “real time” in telecom may no longer be adequate.
Previously, telecom networks were based on connection-oriented technology. The network did not change much from one minute to the next, and changes could only be applied centrally in a highly structured process.
In this environment, gathering information from the network at regular intervals was sufficient. The protocols were rich in management information, so much so that a single protocol header could provide significant insight. Here, “real time” can be defined in seconds or minutes, and collecting Call Detail Records (CDRs) every five to 15 minutes is sufficient.
However, this is no longer the case. With the migration to LTE, telecom carriers have completed the transition to packet networks based on Ethernet and IP, which function completely differently from their connection-oriented predecessors.
The fundamental principle of IP networks is for the network to take care of itself. The network defines the path that traffic takes and reroutes in the event of congestion or other setbacks. This enables a quick reaction from the network. However, one drawback is that you cannot predict where traffic will be going. By design, Ethernet and IP protocols do not contain the same level of management information overhead that connection-oriented protocols have, thus contributing to the challenge.
Packet networks are dynamic and bursty in nature. They are intended to support multiple services used by multiple users sharing a single infrastructure. Averaged over an extended period of time, the utilization of the network seems quite low, but this is because traffic is transmitted in bursts, each of which can consume all the available bandwidth. Given these circumstances, the IP network is expected to route traffic through the network in a balanced way. The bottom line is that changes can occur in the network from one IP packet or Ethernet frame to the next.
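A small Python sketch shows how a long averaging period can hide bursts. The traffic pattern is entirely hypothetical (one 1-millisecond window in every 20 carries traffic at full line rate on a 10 Gbps link; the rest are idle), and the names are illustrative assumptions.

```python
# Bits a 10 Gbps link can carry in one 1 ms window.
CAPACITY_BITS_PER_WINDOW = 10_000_000


def utilization(bits_per_window):
    """Return the link utilization (0.0 to 1.0) for each measurement window."""
    return [bits / CAPACITY_BITS_PER_WINDOW for bits in bits_per_window]


# Hypothetical one-second trace: 1,000 windows of 1 ms each,
# with a full-rate burst in every 20th window and silence otherwise.
trace = [CAPACITY_BITS_PER_WINDOW if i % 20 == 0 else 0 for i in range(1000)]

per_window = utilization(trace)
average = sum(per_window) / len(per_window)
peak = max(per_window)

print(f"average over 1 s: {average:.0%}, peak 1 ms window: {peak:.0%}")
```

Measured over the whole second the link looks almost idle, yet during each burst it is completely full. A measurement taken every few minutes sees only the average.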
The primary concern with telecom network management and data analytics today is that they both rely on CDRs, IP Detail Records (IPDRs) and Event Detail Records (EDRs) to understand what is happening in real time.
However, this definition of “real time” is dated, and collecting data every few minutes is no longer sufficient. On a 10 Gbps network, a new Ethernet frame can arrive as often as every 67 nanoseconds. Seen in this light, “real time” in a packet network is not minutes; it is not even seconds. It is nanoseconds.
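The 67-nanosecond figure can be checked with a short calculation. The sketch below assumes minimum-size 64-byte Ethernet frames, each accompanied on the wire by 8 bytes of preamble and start-of-frame delimiter and a 12-byte inter-frame gap; the function names are illustrative.

```python
LINK_BPS = 10_000_000_000  # 10 Gbps

# Minimum frame (64 bytes) + preamble and start delimiter (8) + inter-frame gap (12).
WIRE_BYTES_PER_MIN_FRAME = 64 + 8 + 12


def min_frame_interval_ns(link_bps):
    """Shortest time, in nanoseconds, from the start of one frame to the next."""
    bits_on_wire = WIRE_BYTES_PER_MIN_FRAME * 8
    return bits_on_wire / link_bps * 1e9


def max_frames_in(seconds, link_bps):
    """Largest number of minimum-size frames that fit into the given period."""
    return seconds * link_bps // (WIRE_BYTES_PER_MIN_FRAME * 8)


print(f"{min_frame_interval_ns(LINK_BPS):.1f} ns between frame starts")
print(f"{max_frames_in(1, LINK_BPS):,} frames per second at most")
print(f"{max_frames_in(300, LINK_BPS):,} frames in one five-minute interval")
```

Under these assumptions, a single five-minute collection interval can span more than four billion frames, which puts the gap between record-based reporting and packet-level events into perspective.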
Decision Making in Real-Time
Using CDRs, IPDRs and EDRs for big data analytics is a good idea, depending on your goals. Big data analytics can serve two broad categories of decision-making:
- Real-time decision making
- Better planning and optimization of services and networks based on trends and predictive analysis
Using detailed records for enhanced planning and optimization, along with various structured and unstructured data sources, is valuable because together they reveal useful trends and support predictions. However, this information cannot provide a complete picture unless coupled with real-time information from packet networks that can provide accurate details on what happened and when.
Detailed records cannot be used for real-time decision-making since they are only collected every five to 15 minutes. This is not compatible with what real-time should be in packet networks. For true real-time decision-making, network information must be continuously collected, stored and analyzed.
By collecting and storing network information in this way, we have the ability to analyze the data and make decisions in real time. This is also a source of detailed, reliable information on when and what occurred in the network to supplement other big data analytic activities.
RTBDA in Telecom
The real-time data collection layer can provide telecom carriers with a consistent flow of information that can help optimize decision-making. Both the TM Forum and the IP Network Monitoring for Quality of Service Intelligent Support (IPNQSIS) project, part of the European CELTIC-Plus program, have explored this need as part of their research on customer experience management. Both projects concluded that probes and appliances are critical to providing reliable, real-time insight into what is happening in the network.
Traditionally, probes are data collectors that provide information to other management systems. Appliances use the same technology but also analyze the information and store it locally. Normally appliances are focused on a specific task, such as security, performance monitoring, or test and measurement. However, probes and appliances can be used as sources or implementations of real-time data for big data analytics strategies. The following provides a three-step view of how such an infrastructure could be applied.
Deployment
First is the deployment of appliances for data collection. The key requirement here is that all the Ethernet frames and IP packets need to be captured, in real time, at line speeds with zero packet loss, no matter the environment. This visibility guarantees that a reliable stream of information is being collected.
It is imperative that each frame is given a unique time stamp, ensuring an accurate timeline can be established, not only local to the appliance, but also across multiple appliances. These time stamps must be accurate to within nanoseconds. For example, with only 67 nanoseconds between Ethernet frames in a 10 Gbps network, the time stamp resolution must be better than 67 nanoseconds. Otherwise two Ethernet frames could receive the same time stamp, making it impossible to tell which came first.
The combination of zero packet loss capture with nanosecond precision time stamping ensures a consistent, accurate flow of information for data analysis.
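The effect of time stamp resolution can be illustrated with a short Python sketch. It assumes back-to-back frames arriving every 67.2 nanoseconds (expressed in picoseconds to keep the arithmetic in integers) and a time stamp formed by rounding the arrival time down to the clock resolution; real appliances may stamp frames differently, and the names are illustrative.

```python
FRAME_INTERVAL_PS = 67_200  # 67.2 ns between frame starts, in picoseconds


def stamp(arrival_ps, resolution_ps):
    """Round an arrival time down to the nearest tick of the time stamp clock."""
    return arrival_ps - arrival_ps % resolution_ps


def count_collisions(num_frames, resolution_ps):
    """Count frames whose time stamp is already used by an earlier frame."""
    stamps = {stamp(i * FRAME_INTERVAL_PS, resolution_ps) for i in range(num_frames)}
    return num_frames - len(stamps)


# Compare clock resolutions of 1 ns, 100 ns and 1 microsecond over 1,000 frames.
for label, resolution_ps in [("1 ns", 1_000), ("100 ns", 100_000), ("1 us", 1_000_000)]:
    collisions = count_collisions(1000, resolution_ps)
    print(f"resolution {label}: {collisions} of 1000 frames share a time stamp")
```

With a resolution finer than the frame interval every frame receives its own time stamp; once the resolution is coarser, frames start to share time stamps and their order can no longer be recovered from the time stamps alone.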
Storage
Next is storing this information in real time. Many appliances provide capture to disk, allowing real-time data to be stored directly to a local hard disk on the appliance. Alternatively, this data can be forwarded to a Storage Area Network (SAN) or another location. By using the stored data, you can build a historical timeline of what has happened in the network with specific details, making it possible to recreate events.
This history is a source of rich information for data analytics, such as insight into behavior and usage trends. If the appliance has Deep Packet Inspection (DPI) capabilities, then usage of services, including OTT services, can be tracked and analyzed to provide usage patterns with respect to time, type of device and location.
This is a valuable resource for network and service optimization. New services can be aimed at matching users’ preferences. But, perhaps even more importantly, this information can be used to provide insight to OTT content service providers, enabling carriers to offer compelling service offerings to these potential customers.
Real-Time Decisions
Finally, there is the potential to use real-time and stored data to enable optimized decision-making. The historical data that is stored to the disk can be used to profile expected behavior. This also makes detecting unexpected events or anomalies possible. These issues can be a security threat, performance degradation or an opportunity to offer a customer a package extension or a complementary service.
From an RTBDA perspective, this capability is comparable to what OTT content and service providers have implemented: the ability to react in real time, based on an understanding of what is happening right now and a comparison with what has happened in the past.
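Profiling expected behavior from stored data and flagging deviations can be sketched in a few lines of Python. The sketch assumes a very simple profile (the mean and standard deviation of past throughput samples) and a three-standard-deviation threshold; the sample values are hypothetical, and a production system would likely use a richer model.

```python
from statistics import mean, stdev


def build_profile(history):
    """Summarize stored measurements as (mean, standard deviation)."""
    return mean(history), stdev(history)


def is_anomaly(value, profile, threshold=3.0):
    """Flag a live measurement that deviates too far from the stored profile."""
    mu, sigma = profile
    if sigma == 0:
        return value != mu
    return abs(value - mu) / sigma > threshold


# Hypothetical stored throughput samples for one link, in Mbps.
history = [410, 395, 402, 388, 420, 405, 398, 412]
profile = build_profile(history)

for live_value in (407, 950):
    status = "anomaly" if is_anomaly(live_value, profile) else "expected"
    print(f"{live_value} Mbps: {status}")
```

Whether a flagged event turns out to be a security threat, a performance problem or a sales opportunity is a separate decision; the sketch covers only the comparison of current behavior against the historical profile.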
Reconsidering RTBDA in Telecom
It is time to re-evaluate what “real-time” means in modern telecom networks and what sources are used for big data analytics. Telecom carriers need to consider utilizing the probe and appliance technology already in the network in a more strategic way to support RTBDA. By doing so they will not only provide a better source of information for planning decisions, but they will also create new opportunities to offer better services, not only to end users, but also to OTT service providers. This ability then has the potential to provide the means to address the issue of monetizing OTT traffic in telecom networks.
About the Author
Daniel Joseph Barry is VP of Marketing at Napatech and has over 20 years of experience in the IT and Telecom industry. Prior to joining Napatech in 2009, Dan Joe was Marketing Director at TPACK, a leading supplier of transport chip solutions to the Telecom sector. From 2001 to 2005, he was Director of Sales and Business Development at optical component vendor NKT Integration (now Ignis Photonyx) following various positions in product development, business development and product management at Ericsson. Dan Joe joined Ericsson in 1995 from a position in the R&D department of Jutland Telecom (now TDC). He has an MBA and a BSc degree in Electronic Engineering from Trinity College Dublin.