Breaking down data silos to succeed at digital transformation

Published in Information Management, January 2, 2019

Data is the new currency, yet public and private sector enterprises face a digital deadlock when it comes to how information is accessed, stored and consumed. Even with the promise of data lakes and data hubs, the crushing problem of timely access to data assets remains, because the data is not physically located where it needs to be used, processed, edited or analyzed.

To break down data silos and advance initiatives for digital transformation, enterprises must look beyond the network, storage and compute stack to create meaningful insights out of data—the true business asset.

Data Access: The Constant Organizational Pain Point

Enterprises generate massive amounts of geographically dispersed data, which is often trapped in silos within a given location or region. Distance perpetuates “copy sprawl” with copies being moved to various locations to meet user, compute or application needs. Organizations in this predicament are forced to make tradeoffs between data collection, utilization and location, while balancing business needs, costs and resource availability.

Traditional approaches to solving data access include:

  • Physically shipping drives (yes, it’s still being done);
  • WAN optimization using local data pre-processing (compression, deduplication, etc.) to reduce the volume of data sent over the WAN;
  • Application specific acceleration that caches frequently used data;
  • Edge caching to reduce latency effects; or
  • Extreme File Transfer (EFT) solutions that improve network performance and utilization but require changing existing workflows.
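The WAN-optimization idea above can be sketched in a few lines. This is a minimal, illustrative example only (the function name, chunk size and wire format are my own, not any vendor's API): fixed-size chunks are deduplicated by hash so repeated data is sent once as a reference, and the unique chunks are compressed before they cross the WAN.

```python
import hashlib
import zlib

def preprocess_for_wan(data: bytes, chunk_size: int = 64 * 1024):
    """Dedupe fixed-size chunks by hash, then compress the unique ones.

    Returns a list of wire records: ("data", digest, compressed_bytes)
    for the first occurrence of a chunk, ("ref", digest) for repeats.
    """
    seen = set()
    payload = []
    for i in range(0, len(data), chunk_size):
        chunk = data[i:i + chunk_size]
        digest = hashlib.sha256(chunk).hexdigest()
        if digest in seen:
            payload.append(("ref", digest))  # already sent: reference only
        else:
            seen.add(digest)
            payload.append(("data", digest, zlib.compress(chunk)))
    return payload

# Highly redundant data dedupes and compresses well.
wire = preprocess_for_wan(b"A" * (256 * 1024))
unique = [rec for rec in wire if rec[0] == "data"]
print(len(wire), len(unique))  # 4 chunks total, only 1 sent in full
```

Real WAN optimizers use content-defined (variable-size) chunking and persistent dictionaries on both ends, but the trade-off is the same: CPU cycles spent locally in exchange for fewer bytes on the wire.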

Alternatively, companies turn to a hybrid cloud approach, combining on-premises, colocation and/or cloud resources for business agility. Be warned, though: these options require extensive planning to determine what information goes where, how, when and for what purpose in order to balance the business’s scalability needs with control of its data.

Evolving business needs still demand dynamic workload rotation within hybrid environments, yet those environments typically lack seamless data access.

Transcending Digital Deadlock to Digital Transformation

Data access isn’t exactly a new problem. Even in the High-Performance Computing (HPC) and supercomputing world, data needs to be available in an expedient and parallel fashion to feed unprecedented compute capabilities. HPC infrastructure uses all-flash storage systems, lossless, deterministic high-speed network protocols such as InfiniBand or RDMA over Converged Ethernet (RoCE), and parallel file systems to manage data access.

These same techniques can be leveraged and combined with existing enterprise infrastructure to create a seamless and transparent data overlay. This enables a global federated data platform for business applications to access, use, manage and analyze data across hybrid environments.

Using the same type of lossless, deterministic protocols found in HPC environments yields data access that scales independently of data location or size, and that is transparent to the applications that need it.

Don’t take my word for it: Penguin Computing moved over a petabyte of data coast to coast (70 ms of latency) in less than a day. That’s effectively 96 percent “goodput” over a 100 Gbps long-haul fiber link, without compression or any other reduction of the source data. These same techniques have also been used by the U.S. government for some of the most critical global implementations and are a significant part of the Oracle and Microsoft Azure clouds.
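The arithmetic behind that claim checks out. A quick back-of-the-envelope calculation (assuming the link runs at the full 100 Gbps for 24 hours, which the article implies but does not state):

```python
# Sanity check: 96% goodput on a 100 Gbps link sustained for one day.
link_gbps = 100          # nominal link rate, gigabits per second
goodput_fraction = 0.96  # fraction of raw bandwidth delivered as payload
seconds_per_day = 24 * 3600

bits_moved = link_gbps * 1e9 * goodput_fraction * seconds_per_day
petabytes = bits_moved / 8 / 1e15  # bits -> bytes -> petabytes
print(f"{petabytes:.2f} PB")       # ≈ 1.04 PB, i.e. "over a petabyte"
```

A lossless link at full rate would carry about 1.08 PB in a day, so "over a petabyte in less than a day" is consistent with the stated 96 percent goodput.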

The physics of latency cannot be changed. So it’s all about taking a holistic approach that maximizes the infrastructure elements affecting performance: CPU offload/bypass techniques, parallel data flows, and caching of hot/active data.
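Two of the techniques named above can be sketched with standard-library tools. This is an illustrative toy, not a real transfer engine (the function names, stream count and block size are my own): a transfer is split into byte ranges moved on concurrent streams, and repeat reads of a hot block are served from a local cache instead of going back over the wire.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import lru_cache

def send_range(start: int, end: int) -> int:
    # Stand-in for one parallel stream moving bytes [start, end).
    return end - start

def parallel_transfer(total_bytes: int, streams: int = 8) -> int:
    """Split a transfer into byte ranges and move them concurrently."""
    step = -(-total_bytes // streams)  # ceiling division
    ranges = [(i, min(i + step, total_bytes))
              for i in range(0, total_bytes, step)]
    with ThreadPoolExecutor(max_workers=streams) as pool:
        moved = pool.map(lambda r: send_range(*r), ranges)
    return sum(moved)

@lru_cache(maxsize=1024)
def read_block(block_id: int) -> bytes:
    # Stand-in for a remote read; repeat reads of a hot block hit the cache.
    return b"\0" * 4096

print(parallel_transfer(1_000_000))  # 1000000
```

Parallel streams hide per-stream latency by keeping the pipe full, and a hot-data cache means the latency penalty is paid once per block rather than once per access.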

Don’t think of these as exotic technologies reserved for the supercomputing crowd. They can solve today’s enterprise problems, and tomorrow’s as well. Look into how HPC and advanced storage technologies bring unprecedented performance to any enterprise by creating a unified data fabric, so users can access all data no matter where it lives, without ever moving it.

Written by Russel Davis, Chief Technology Officer, Vcinity