What a Difference a Decade Makes: Data Storage Challenges in 2014 Versus Now
Kris LOU  2024-07-31

As Einstein explained, the perception of how fast time is passing depends on the perspective of the observer.

For a dog, a year might feel the same way that 7 years feel to a human.

In the storage industry, changes happen at a far higher rate than in many other spheres of human activity.

We asked a panel of experts to compare the storage and management challenges that enterprises faced 10 years ago with those they face now.

We also asked the panel to discuss the way that the current storage landscape and its increasingly complex challenges are influencing technology developments.

As well as identifying major trends, the panel's comments also tended to confirm the saying that history doesn't repeat itself, but it certainly rhymes.

1. Brock Mowry - CTO, Tintri

2. Craig Carlson - TC Advisor, SNIA

3. David Norfolk - Practice Leader, Bloor Research

4. Drew Wanstall - VP of BD, Scale Logic

5. Enrico Signoretti - VP of Product, Cubbit

6. Erfane Arwani - CEO, Biomemory

7. Ferhat Kaddour - VP of Sales, Atempo

8. Johan Pellicaan - VP, Scale Computing

9. Kim King - Senior Director, HYCU

10. Paul Speciale - CMO, Scality

11. Randy Kerns - Senior Strategist, Futurum Group

12. Ricardo Mendes - CEO, Vawlt Technologies

13. Roy Illsley - Chief Analyst, Omdia

14. Scott Sinclair - Practice Director, Enterprise Strategy Group

15. Sergei Serdyuk - VP of Product, Nakivo

16. Valéry Guilleaume - CEO, Nodeum

     The storage challenges in 2014 are very similar to those today – at least at a high level.

Randy Kerns (Futurum Group): The challenges haven't changed much, even though the technology has. Probably the biggest was dealing with ever-increasing demands for storage capacity. The second challenge was protecting the data. Even though the intensity of ransomware attacks was not the same as it is today, data protection was still a major issue. The third challenge was not having enough staff to handle the storage workload. That staffing problem has only gotten worse since then.

Brock Mowry (Tintri): The challenges are fundamentally the same as they were 10 years ago, while the scope and scale of these challenges have dramatically changed.

Erfane Arwani (Biomemory): Companies struggled to manage exponential data growth with technology solutions that weren't yet optimised for large data volumes. 10 years ago, enterprise disk capacities ranged from only 1 TB to 4 TB. In the 10 years since then, disk capacities have soared, and the highest-capacity disks now handle 30 TB. Meanwhile, data center usage of flash storage has surged, and the largest enterprise flash disks now exceed 60 TB in capacity. In 2014, enterprises still focused on on-premises storage and were using public cloud storage services to a lesser extent than now.

Ferhat Kaddour (Atempo): It was a matter of choosing between NAS and SAN, and cloud solutions were comparable to ice baths – beneficial but not suitable for everyone.

Ensuring sufficient overall capacity for an organization was a multi-faceted activity.

Drew Wanstall (Scale Logic): The scalability challenge involved predicting future storage needs, optimizing storage utilization, and implementing effective storage tiering strategies.

     Fast forward to now, and data is still expanding at a very rapid rate.

Enrico Signoretti (Cubbit): It's interesting to see how data keeps growing at a crazy pace.

Valéry Guilleaume (Nodeum): New sources of data are perpetuating this growth and have already ushered in the era of so-called big data. Today, it's not just users that are generating data, but also the systems being developed within each industry – for example, data-generating cars, electron microscopes, blade scanners, and seismic sensors. These new sources are creating data at a speed far beyond that of the data-generating sources of 10 to 15 years ago.

However, the difficulties of scaling up physical storage capacity to keep up with data growth have been lessened to at least some extent by the increased use of public cloud storage, and by improvements in storage technology. Among the last 10 years' technology developments, the most notable has been the enormous reduction in the price of flash memory, which has led to the widespread use of flash in enterprise data centers.

Randy Kerns (Futurum Group): Capacity demand continues, but the scale and performance of flash allow for greater consolidation and fewer physical systems, less power/cooling/space demands, and simpler means for addressing performance. The technology to address problems is available and more effective than 10 years ago. Having the staff to take advantage of it is the big issue.

But is storage scalability still the major problem it once was?

Scott Sinclair (Enterprise Strategy Group): More data does make management more complex, but less so than it did in the past. Storage solutions are far more scalable than they used to be. The challenge of data explosion, especially in AI, is finding the right data, getting it into the right, clean format, and leveraging it as quickly as the organization wishes. The challenge today isn't storing data so much as it is using data.

David Norfolk (Bloor Research): The technical issues of 10 years ago have largely gone. Storage is now cheap, reliable and easy to scale. But storage management – including threat management – is now a source of cost.

The threats include cyberattacks, which have grown significantly in number and intensity over the last decade.

Paul Speciale (Scality): Security is clearly today's top storage challenge. While there have always been security threats from malicious actors and users, today's issues are indeed harder and more expensive to address as a result of well-organized and well-funded ransomware actors, often backed by state-sponsored groups.

Sergei Serdyuk (Nakivo): With the ongoing ransomware boom and the emergence of malicious AI tools and as-a-service cybercrime models, data protection is at the forefront of storage challenges today. Breaches are not only more frequent, but they also pack a more powerful punch with improved tactics like double (and triple) extortion and the more recently observed dual-strain attacks.

That is not the only change in the IT landscape that has driven up storage management costs. 10 years ago, data growth was being driven by the overall digitization of business and by the increasing use of analytics. Now it is also being driven by the need to collect data to train AI and ML systems, and by the growth of the IoT as a data source. Although the term IoT was coined in the 1990s, it is only over the last 10 years that it has become a commonplace reality. At the same time, enterprises have also been storing more unstructured data such as video and text. Unstructured data accounts for the majority of data stored by enterprises. Unlike structured data, unstructured data is not organized according to a pre-defined database schema, making it far harder to manage.
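To make the structured/unstructured distinction concrete, here is a minimal, hypothetical Python sketch (the table, file, and field names are invented for illustration): structured records conform to a schema and can be queried directly, whereas an unstructured file only becomes manageable once metadata is extracted and catalogued alongside it.

import hashlib, mimetypes, os, sqlite3

# Structured data: rows conform to a pre-defined schema and can be queried directly.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, amount REAL)")
db.execute("INSERT INTO orders VALUES (1, 'ACME', 99.50)")
print(db.execute("SELECT customer, amount FROM orders").fetchall())

# Unstructured data: a raw file carries no schema, so managing it depends on
# extracting metadata (type, size, checksum) and keeping that in a catalogue.
path = "site-survey.mp4"   # hypothetical video file
exists = os.path.exists(path)
metadata = {
    "path": path,
    "content_type": mimetypes.guess_type(path)[0],
    "size_bytes": os.path.getsize(path) if exists else None,
    "sha256": hashlib.sha256(open(path, "rb").read()).hexdigest() if exists else None,
}
print(metadata)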

Ferhat Kaddour (Atempo): Today, it's akin to navigating a vast ocean of big data. From customer interactions to sensor data collected, even smaller entities handle petabytes, and the larger ones, exabytes. The difficulties lie not only in the sheer amount of data but also in the strategic tactics needed to extract, categorize, and safeguard it.

David Norfolk (Bloor Research): Quality is a critical data attribute that is challenging to achieve when using unstructured data. The reason is that the data comes from a swamp rather than from a proper database.

   Edge computing and the use of public clouds as part of hybrid computing strategies have also complicated storage.

Johan Pellicaan (Scale Computing): Managing data at the edge efficiently has become crucial. Ensuring data availability and resilience in distributed environments presents new challenges.

   As well as securing data at the edge, enterprises must also be able to move data between multiple locations.

Scott Sinclair (Enterprise Strategy Group): Today's challenges are all related to the movement of data across multi- and hybrid-cloud environments. Around 50% of organizations say that they move data between on- and off-premises environments "all the time" or "regularly". These issues are more difficult to address because of how disparate the environments are when your data spans AWS, Azure, GCP, the data center, the edge, and so on.
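As a purely illustrative sketch of the kind of movement Sinclair describes (the bucket, container, and credential values below are placeholders, and a production pipeline would add streaming, retries, and integrity checks), copying a single object from Amazon S3 to Azure Blob Storage with the vendors' Python SDKs might look like this:

import boto3
from azure.storage.blob import BlobServiceClient

# Read the object from S3 (the source side of the hybrid/multi-cloud move).
s3 = boto3.client("s3")
body = s3.get_object(Bucket="example-source-bucket",
                     Key="reports/2024.parquet")["Body"].read()

# Write the same bytes to Azure Blob Storage (the destination side).
blob_service = BlobServiceClient.from_connection_string("<azure-connection-string>")
blob_client = blob_service.get_blob_client(container="example-destination",
                                           blob="reports/2024.parquet")
blob_client.upload_blob(body, overwrite=True)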

Data movements and the need for interoperability across multiple computing venues are not the only complications created by public cloud computing.

Ricardo Mendes (Vawlt Technologies): Since public clouds are now one of the main places where organizations keep the majority of their data, dependency on these external vendors – for business continuity (BC) and, even more importantly, for sovereignty-related matters – is a growing challenge.

    Data sovereignty is a challenge for businesses using public clouds.

Enrico Signoretti (Cubbit): Navigating complex data sovereignty regulations, such as GDPR and NIS2, adds a layer of complexity for businesses. Public cloud SaaS services have also introduced new locations in which data must be protected.

Kim King (HYCU): One big difference today is in the number of different places where companies house critical data. This is particularly apparent when you look at the increased use of SaaS applications. The average midsize company uses over 200 SaaS applications, but there are very few options available to deliver enterprise data protection that can scale to protect those applications and provide rapid, granular recovery. Over 50% of successful ransomware attacks begin by targeting SaaS applications.

Randy Kerns (Futurum Group): Meeting the same enterprise requirements for protection of information assets in the public cloud as on premises has been a learning experience that requires effort, and, usually, new software solutions. However, there have been cases where some believed this effort was not necessary for data in a public cloud.

    The advantages public clouds have delivered include the democratization of technologies, to the benefit of smaller businesses.

David Norfolk (Bloor Research): There used to be a huge difference between big firms with proper databases and small firms with data stores that didn't support ACID (Atomicity, Consistency, Isolation and Durability). Cloud technologies have evened this up a lot.

      How are these challenges changing the storage technologies and services offered by vendors?

Sergei Serdyuk (Nakivo): Security challenges are being addressed by developing yet more sophisticated defences against cyber-attacks. Vendors are incorporating advanced encryption mechanisms, access controls, and compliance features into their solutions. Many offer secure enclaves and hardware-based security to address the evolving threat landscape. However, many storage solutions still lack comprehensive backup and recovery tools. Meanwhile, the need to extract and categorize data from diverse sources is driving the development of software tools that automate that process. Management tools like metadata tagging, version control, and analytics capabilities are gaining traction.
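As one purely illustrative sketch of the version-control idea mentioned above (the store layout and function names are hypothetical, not any vendor's product), a content-addressed catalogue keeps every version of an object under its checksum, so earlier versions remain recoverable even if the latest copy is overwritten:

import hashlib, json, os, time

STORE = "catalog"                                 # hypothetical local object store
HISTORY = os.path.join(STORE, "history.json")     # per-object version history
os.makedirs(STORE, exist_ok=True)

def put(name, data):
    # Store an immutable version of 'name', keyed by its SHA-256 digest.
    digest = hashlib.sha256(data).hexdigest()
    with open(os.path.join(STORE, digest), "wb") as f:
        f.write(data)
    history = json.load(open(HISTORY)) if os.path.exists(HISTORY) else {}
    history.setdefault(name, []).append({"version": digest, "ts": time.time()})
    json.dump(history, open(HISTORY, "w"))
    return digest

def get(name, version=-1):
    # Fetch a specific version (default: the latest) of 'name'.
    history = json.load(open(HISTORY))
    digest = history[name][version]["version"]
    return open(os.path.join(STORE, digest), "rb").read()

put("report.txt", b"first draft")
put("report.txt", b"final draft")
print(get("report.txt", 0), get("report.txt"))    # b'first draft' b'final draft'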

Valéry Guilleaume (Nodeum): Emerging data analysis solutions now make it possible to make data talk, and to extract metadata from it in a way that goes far beyond what was possible in the past.

       Meanwhile, enterprises also require data management software to support hybrid and multi-cloud infrastructures.

Sergei Serdyuk (Nakivo): Vendors who recognize this are developing solutions that support easy integration with various cloud providers, on-premises infrastructure, and mixed configurations. They are also offering tools for seamless data migration and synchronization across different environments.

Scott Sinclair (Enterprise Strategy Group): There is a push for consistency of technology across environments. Some vendors are putting their technology in the cloud. One example of such vendors is NetApp, whose on-premises storage and data management software is also incorporated into the AWS, Microsoft Azure, and Google Cloud public clouds. Others are integrating third-party technologies like VMware or Red Hat OpenShift that can be deployed in multiple locations.

Enrico Signoretti (Cubbit): With respect to the complications caused by the need to maintain data sovereignty and to comply with multiple data regulations – which may apply to storage spread across several public clouds and several countries – vendors are prioritizing sovereign solutions for regulated industries such as healthcare and the public sector, emphasizing compliance in regions such as EMEA and APAC. Though still subject to the CLOUD Act, Microsoft and AWS recently introduced sovereign cloud storage offerings.

The CLOUD Act is US legislation enacted in 2018 that gives US authorities – and, under executive agreements, certain non-US authorities – investigating crimes the right to access enterprise data held by service providers.
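As a deliberately simplified illustration of the data-residency piece of this (the bucket name and region below are placeholders, and, as the CLOUD Act note above makes clear, residency alone does not amount to full sovereignty), pinning newly stored data to an EU region with a hyperscaler's own tooling can be as simple as:

import boto3

s3 = boto3.client("s3", region_name="eu-central-1")

# Create the bucket in an EU region so its data is stored there (residency),
# then enable default server-side encryption with a KMS-managed key.
s3.create_bucket(
    Bucket="example-sovereign-data",
    CreateBucketConfiguration={"LocationConstraint": "eu-central-1"},
)
s3.put_bucket_encryption(
    Bucket="example-sovereign-data",
    ServerSideEncryptionConfiguration={
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "aws:kms"}}]
    },
)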

     There is a need to provide AI systems with fast access to data.

Craig Carlson (SNIA): AI is currently being addressed by looking at what can be done to bring networks to their highest performance while also being highly scalable. This work is ongoing in groups such as Ultra Ethernet.

A body called the Ultra Ethernet Consortium is developing an architecture that it says will make Ethernet as fast as current supercomputing interconnects, while being highly scalable and as ubiquitous and cost-effective as current Ethernet, and backwards compatible. Members of the heavily-backed consortium include AMD, Arista, Broadcom, Cisco Systems, Huawei, HPE, and Intel.

     In the context of AI and ML, enterprises are expected to face a series of future challenges as data volumes continue to grow.

Brock Mowry (Tintri): The size of data is closely related to management difficulties. More data absolutely drives increasingly complex challenges related to storage. Data growth stretches demands in every dimension, illuminating the need for more leverage – the proverbial "do more with less".

Valéry Guilleaume (Nodeum): The much-needed bigger levers are likely to come from advances in data management systems – the metadata tagging, version control, and analytics capabilities mentioned earlier.

David Norfolk (Bloor Research): I suppose the big issue today is the AI industry and its appetite for data – and the sustainability and resource cost of vast amounts of data, even if each individual bit is cheaper to store. Data quality will be a huge challenge. Decisions shouldn't be based on outdated, incorrect, or biased data. AI, in particular, doesn't cope well with training on biased data.

Craig Carlson (SNIA): These storage management and mobility advances may not be restricted purely to AI usage. There's always a trickle down in technology. So, technologies being developed now for the highest-end AI data centers will become more mainstream in a few years.

       What about sustainability?

Roy Illsley (Omdia): The key question is how can storage and all the data we have be as "green" as possible? At some point we either have to change our lives and way we do things, or technology rides to the rescue. I think it will be a combination of these two, which means we need to work out how we can generate less data or be more precise about what data we have.

Erfane Arwani (Biomemory): The environmental impact of storage, particularly in terms of CO2 emissions and energy use, is a current storage challenge, alongside platform interoperability and security. According to the International Energy Agency (IEA), the electricity consumption of data centers in 2022 was around 1% to 1.3% of global demand. The IEA has also predicted that the energy consumption of data centers could rise three- to four-fold by 2026. These problems are more costly and complex to solve, as they require not only technological advances, but also awareness and changes in data governance.

Craig Carlson (SNIA): On the hardware side, the flash technology curve appears to be running out of steam, as it has become very much harder for flash chip makers to reduce costs by packing yet more data bits into each flash memory cell. What will be the next technology to bring reliable high performance to storage in the next 10 to 20 years? Long-term usage of the current tape-disk-flash model may not be feasible. Hence the development of new (and still highly experimental) technologies such as DNA storage.

Erfane Arwani (Biomemory): DNA storage technologies are expected to be a feasible solution to mitigate the environmental impact of storage. Suppliers are now developing greener solutions, such as helium HDDs that reduce energy consumption, or DNA storage technologies such as those being developed by Biomemory and Catalog DNA. These technologies promise a storage density of one exabyte per gram and a durability of several millennia. What's more, they open up the possibility of new use cases, such as the first space data centers.

If that last prediction comes true, remember that you read it here first.

 

-----

Source: Federica Monsone; What Difference Decade Makes for Storage Challenges in 2014 Vs. Now? April 25, 2024

 

Source: WeChat Official Account Andy730
