https://futurumgroup.com/insights/ai-data-infrastructure-field-days-what-we-can-take-away/
Data Infrastructure Is A Lot More Than Storage
The rise of AI and the importance of data to modern businesses have driven us to recognize that data matters, not storage. This episode of the Tech Field Day podcast focuses on AI data infrastructure and features Camberley Bates, Andy Banta, David Klee, and host Stephen Foskett, all of whom will be attending our AI Data Infrastructure Field Day this week. We’ve known for decades that storage solutions must provide the right access method for applications, not just performance, capacity, and reliability. Today’s enterprise storage solutions have specialized data services and interfaces to enable AI workloads, even as capacity has been driven beyond what we’ve seen in the past. Power and cooling are another critical element, since AI systems are optimized to make the most of expensive GPUs and accelerators. AI also requires extensive preparation and organization of data as well as traceability and records of metadata for compliance and reproducibility. Another question is interfaces, with modern storage turning to object stores or even vector database interfaces rather than traditional block and file. AI is driving a profound transformation of storage and data.
Infrastructure Beyond Storage
The rise of AI has fundamentally shifted the way we think about data infrastructure. Historically, storage was the primary focus, with businesses and IT professionals concerned about performance, capacity, and reliability. However, as AI becomes more integral to modern business operations, it’s clear that data infrastructure is about much more than just storage. The focus has shifted from simply storing data to managing, accessing, and utilizing it in ways that support AI workloads and other advanced applications.
One of the key realizations is that storage, in and of itself, is not the end goal. Data is what matters. Storage is merely a means to an end, a place to put data so that it can be accessed and used effectively. This shift in perspective has been driven by the increasing complexity of AI workloads, which require not just vast amounts of data but also the ability to access and process that data in real time or near real time. AI systems are highly dependent on the right data being available at the right time, and this has led to a rethinking of how data infrastructure is designed and implemented.
In the past, storage systems were often designed with a one-size-fits-all approach. Whether you were running a database, a data warehouse, or a simple file system, the storage system was largely the same. But AI has changed that. AI workloads are highly specialized, and they require storage systems that are equally specialized. For example, AI systems often need to access large datasets quickly, which means that traditional storage systems that rely on spinning disks or even slower SSDs may not be sufficient. Instead, AI systems are increasingly turning to high-performance storage solutions that can deliver the necessary bandwidth and low latency.
Moreover, AI workloads often require specialized data services that go beyond simple storage. These include things like data replication, data reduction, and cybersecurity features. AI systems also need to be able to classify and organize data in ways that make it easy to access and use. This is where metadata management becomes critical. AI systems need to be able to track not just the data itself but also the context in which that data was created and used. This is especially important for compliance and reproducibility, as AI systems are often used in regulated industries where traceability is a legal requirement.
Another important aspect of AI data infrastructure is the interface between the storage system and the AI system. Traditional storage systems often relied on block or file-based interfaces, but AI systems are increasingly turning to object storage or even more specialized interfaces like vector databases. These new interfaces are better suited to the needs of AI workloads, which often involve large, unstructured datasets that need to be accessed in non-linear ways.
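To make the contrast with block and file access concrete, here is a minimal sketch of what a vector-style interface looks like from the application side. It is an illustration only: the TinyVectorIndex class and its put and query methods are hypothetical names invented for this example, not the API of any real vector database, and the brute-force cosine scan stands in for the approximate indexes a production system would use.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)


class TinyVectorIndex:
    """Hypothetical in-memory index: look items up by similarity, not by path."""

    def __init__(self):
        self._items = {}  # key -> (vector, payload)

    def put(self, key, vector, payload):
        self._items[key] = (list(vector), payload)

    def query(self, vector, top_k=1):
        # Brute-force scan: score every stored vector against the query.
        scored = [
            (cosine_similarity(vector, stored), key, payload)
            for key, (stored, payload) in self._items.items()
        ]
        scored.sort(key=lambda entry: entry[0], reverse=True)
        return [(key, payload) for _, key, payload in scored[:top_k]]


index = TinyVectorIndex()
index.put("a", [1.0, 0.0], "doc-a")
index.put("b", [0.0, 1.0], "doc-b")
nearest = index.query([0.9, 0.1], top_k=1)
```

The caller never names a file or a block address. It supplies a vector and gets back whatever is closest, which is the non-linear access pattern described above.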
Power and cooling are also critical considerations in AI data infrastructure. AI systems are highly resource-intensive, particularly when it comes to GPUs and other accelerators. These systems generate a lot of heat and consume a lot of power, which means that the data infrastructure supporting them needs to be optimized for energy efficiency. This has led to a shift away from traditional spinning disks, which consume a lot of power, and towards more energy-efficient storage solutions like SSDs and even tape for long-term storage.
The rise of AI has also blurred the lines between storage and memory. With the advent of technologies like CXL (Compute Express Link), the distinction between memory and storage is becoming less clear. AI systems often need to access data so quickly that traditional storage solutions are not fast enough. In these cases, data is often stored in memory, which offers much faster access times. However, memory is also more expensive and less persistent than traditional storage, which means that data infrastructure needs to be able to balance these competing demands.
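The balancing act between fast, expensive memory and slower, persistent storage can be sketched as a small two-tier store. This is a simplified illustration under stated assumptions: TieredStore is a hypothetical class, a plain dictionary stands in for the persistent tier, and least-recently-used eviction is just one possible policy. It does not model CXL or any specific product.

```python
from collections import OrderedDict


class TieredStore:
    """Hypothetical two-tier store: a small memory tier in front of a backing tier."""

    def __init__(self, backing, memory_capacity):
        if memory_capacity < 1:
            raise ValueError("memory_capacity must be at least 1")
        self._backing = backing        # stands in for slower, persistent storage
        self._memory = OrderedDict()   # fast tier, ordered oldest to newest use
        self._capacity = memory_capacity
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self._memory:
            self._memory.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self._memory[key]
        value = self._backing[key]         # raises KeyError if the key is absent
        self.misses += 1
        self._memory[key] = value
        if len(self._memory) > self._capacity:
            self._memory.popitem(last=False)  # evict the least recently used item
        return value


store = TieredStore({"a": 1, "b": 2, "c": 3}, memory_capacity=2)
for key in ["a", "b", "a", "c", "b"]:
    store.get(key)
```

With room for only two items in memory, the repeated read of "a" is served from the fast tier, while "b" is evicted when "c" arrives and has to be fetched from the backing tier again. Sizing the memory tier is the cost trade-off the paragraph above describes.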
In addition to the technical challenges, AI data infrastructure also needs to address the growing need for traceability and compliance. As AI systems are increasingly used to make decisions that impact people’s lives, whether in healthcare, finance, or other industries, there is a growing need to be able to trace how those decisions were made. This requires not just storing the data that was used to train the AI system but also keeping detailed records of how that data was processed and used. This is where metadata management becomes critical, as it allows organizations to track the entire lifecycle of the data used in their AI systems.
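As a rough sketch of what such a record might contain, the example below pairs a content fingerprint of the training data with notes on where it came from and how it was processed. The LineageRecord fields and the helper names are assumptions made for illustration, not a compliance standard or any vendor's schema. The only library calls used are Python's standard hashlib, json, and dataclasses modules.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class LineageRecord:
    """Hypothetical lineage entry for one dataset used by one model."""
    dataset_sha256: str       # fingerprint of the exact bytes that were used
    source: str               # where the data came from
    processing_steps: tuple   # ordered names of the transformations applied
    used_by: str              # model or training run that consumed the data


def fingerprint(data: bytes) -> str:
    """Return the SHA-256 hex digest of the data."""
    return hashlib.sha256(data).hexdigest()


def make_record(data, source, processing_steps, used_by):
    return LineageRecord(fingerprint(data), source, tuple(processing_steps), used_by)


def to_json(record):
    """Serialize with sorted keys so the same record always yields the same text."""
    return json.dumps(asdict(record), sort_keys=True)


record = make_record(
    b"example training rows",
    source="example-source",          # placeholder value
    processing_steps=["deduplicate", "anonymize"],
    used_by="example-model-v1",       # placeholder value
)
serialized = to_json(record)
```

Because the fingerprint is derived from the data itself, any later change to the dataset produces a different digest, which makes it possible to check whether the data on hand is the data that was actually used.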
In conclusion, AI is driving a profound transformation in the way we think about data infrastructure. Storage is no longer just about performance, capacity, and reliability. It’s about managing data in ways that support the unique needs of AI workloads. This includes everything from specialized data services and interfaces to energy-efficient storage solutions and advanced metadata management. As AI continues to evolve, so too will the data infrastructure that supports it, and organizations that can adapt to these changes will be well-positioned to take advantage of the opportunities that AI presents.
Learn more about AI Data Infrastructure Field Day 1 on the Tech Field Day website. Watch the event live on LinkedIn or on Techstrong TV.
Podcast Information:
Stephen Foskett is the Organizer of the Tech Field Day Event Series, now part of The Futurum Group. Connect with Stephen on LinkedIn or on X/Twitter.
Camberley Bates is the VP and Practice Lead at The Futurum Group. You can connect with Camberley on LinkedIn and her podcast Infrastructure Matters through The Futurum Group.
Andy Banta is a consultant at MagnitionIO and a storage expert promoting simplicity and economy. You can connect with Andy on X/Twitter or on LinkedIn. Learn more about Andy on his Substack.
David Klee is the Founder at Heraflux Technologies. You can connect with David on X/Twitter or on LinkedIn. Learn more about David on his personal website or about Heraflux Technologies on their website.
Thank you for listening to this episode of the Tech Field Day Podcast. If you enjoyed the discussion, please remember to subscribe on YouTube or your favorite podcast application so you don’t miss an episode and do give us a rating and a review. This podcast was brought to you by Tech Field Day, home of IT experts from across the enterprise, now part of The Futurum Group.
#AI #AIDIFD1 #TFDPodcast #Andybanta #CamberleyB #KleeGeek #SFoskett #TechFieldDay #TechstrongTV #TheFuturumGroup
AI Field Day and AI Data Infrastructure Field Day Tech Field Day is hosting AI Field Day 5 in September and our first-ever AI Data Infrastructure Field Day in October. What’s the difference? AI Field Day is focused on building and running AI applications in the enterprise, including training and inferencing hardware and software. AI […]
Exploring the Importance of Solid Data Infrastructure at AI Data Infrastructure Field Day 1
The use of artificial intelligence technology is taking hold in the enterprise, and many are discovering the importance of a solid data infrastructure for these new applications. No matter what model or technology is used, or whether it’s part of training or inferencing or RAG, data is the foundation for productive AI applications. That’s why the next Tech Field Day event, happening October 2 and 3, will focus on AI Data Infrastructure, serving as a companion to our recent AI Field Day event. Tune in live or watch our recordings. Here’s a quick overview of what to look forward to.
AI Data Infrastructure Field Day is broadcast live on LinkedIn and Techstrong TV starting at 8:00 AM Pacific time Wednesday and Thursday, with more presentations throughout the day. All of our sessions will be recorded and posted to YouTube in case you miss anything!
We kick things off Wednesday, October 2 at 8:00 AM Pacific with MinIO, which is building an enterprise object store for AI data. MinIO will discuss the considerations for large-scale object storage for AI, including connectivity, caching, and observability. Google Cloud presents at 10:30 AM, focusing on storage solutions for different stages of the AI pipeline. We recently heard about Google Cloud storage at our Cloud Field Day event, and this session will dive deep into Vertex integration and managed AI services. After lunch we’ll hear from HPE at 1:30 PM. HPE recently announced NVIDIA AI Computing by HPE, a new private AI solution that provides turnkey accelerated computing.
On Thursday we will start with Infinidat at 8:00 AM. They are bringing advanced cybersecurity and AI features to their leading enterprise storage platform. At 10:30 AM we welcome enterprise SSD leader Solidigm back to Tech Field Day. Their presentation focuses on the way SSDs can bring TCO and performance benefits to AI workloads. We’ll wrap up AI Data Infrastructure Field Day at 2:00 PM with Pure Storage. Their consolidated scale-out storage platform known as FlashBlade is seeing rapid adoption with AI workloads.
You can also catch the AI Data Infrastructure Field Day delegates on our Tech Field Day podcast, including two special episodes recorded next week. Check out our appearances on Techstrong Gang and the Gestalt IT Rundown, too. And tune in for behind-the-scenes shorts, Tech Talks, and extras recorded in Silicon Valley!
All of our sessions are broadcast live on LinkedIn at the Tech Field Day page and recorded and shared on YouTube. You can also catch the sessions on Techstrong TV, with our coverage continuing on Techstrong AI. We welcome participation on X/Twitter, LinkedIn, and Mastodon using hashtag AIDIFD1.
You can learn more about the event and our panel of independent technical influencers by visiting the Tech Field Day website. Each of our delegates has their own blog, podcast, or social media platform where they share their thoughts on enterprise technology from servers to storage to networking and beyond. We’re proud to have Camberley Bates and Steven Dickens from The Futurum Group joining us on the delegate panel, and we look forward to their analysis and reactions.
Thank you for joining AI Data Infrastructure Field Day live October 2 and 3. While you are on YouTube, please subscribe to our channel, and please follow our LinkedIn page for more Field Day content.
#AIDIFD1 #CamberleyB #Google #GoogleCloud #HPE #INFINIDAT #MinIO #PureStorage #SFoskett #Solidigm #StevenDickens3 #TechFieldDay #TechstrongGroup #TechstrongTV #TheFuturumGroup
CISA Focuses on Election Security | Gestalt IT Rundown: July 10, 2024
https://www.youtube.com/watch?v=GZakndM-Jk8
The Cybersecurity and Infrastructure Security Agency (CISA) has released a comprehensive guide to bolster operational security (OPSEC) for election officials. This guide aims to enhance the security of election infrastructure by offering detailed strategies for identifying and mitigating potential risks within the election context. This and more on The Rundown.
1:34 – Japan Bids Farewell to Floppy Disks
Japan’s government has officially ended the use of floppy disks in all its systems, marking a significant step in modernizing its bureaucratic processes. This move comes after a dedicated campaign to phase out outdated technology and update over a thousand regulations.
Read More: Japan wins 2-year “war on floppy disks,” kills regulations requiring old tech
4:15 – NetApp to Focus on AI Integration and Data Resilience
NetApp is a familiar face in the datacenter and the company has reached into the cloud in recent years. But the storage-focused company has not really been part of the AI data infrastructure conversation until now. The latest announcements show NetApp taking retrieval-augmented generation or RAG seriously, with advanced storage features directed to support modern LLM workloads.
Read More: NetApp, Perspectives on Intelligent Data Infrastructure and Strategic Direction
9:04 – Broadcom Launches Omnissa while Streamlining VMware
Broadcom has sold its End-User Computing Division to KKR for $4 billion, forming Omnissa as a standalone company focused on digital workspace solutions like Workspace ONE and Horizon. This move marks Broadcom’s strategic shift towards concentrating on the private cloud sector, shedding non-core assets. Omnissa aims to innovate and expand its market presence under KKR’s ownership, emphasizing customer-centric strategies and product development.
Read More: Broadcom Further Streamlines VMware, Omnissa is Born
12:26 – Billions upon Billions of Passwords Published on the Dark Web
In a major cybersecurity breach, a hacker has uploaded nearly 10 billion stolen passwords to a popular crime forum. Dubbed “RockYou2024,” this massive compilation of plaintext passwords spans data breaches from the last two decades, posing significant risks for credential stuffing and brute-force attacks.
Read More: Hacker Uploads 10 Billion Passwords To Crime Forum—Report
16:02 – Douglas Gourlay Takes the Reins at Qumulo
Qumulo is a quiet success in the storage world, growing to unicorn status with a scale-out distributed storage platform that has found success in hybrid cloud and for AI workloads. Douglas Gourlay was an early Arista Networks employee, helping lead the networking upstart to become a real challenger. Qumulo hopes that Gourlay can similarly challenge the storage industry with new leadership and ideas.
Read More: Qumulo Board Announces Douglas Gourlay as new Chief Executive Officer
20:27 – Cloudflare Shields Users From Data-Harvesting AI Bots
Cloudflare has introduced a new tool designed to combat the increasing challenge of AI bots scraping websites for data to train AI models. This initiative aims to prevent unauthorized data harvesting while ensuring that AI companies adhere to web scraping rules.
Read More: The future is all bot vs. bot
23:59 – CISA Focuses on Election Security
The Cybersecurity and Infrastructure Security Agency (CISA) has released a comprehensive guide to bolster operational security (OPSEC) for election officials. This guide aims to enhance the security of election infrastructure by offering detailed strategies for identifying and mitigating potential risks within the election context.
Read More: CISA Guide Aims to Bolster OPSEC for Election Officials
36:33 – The Weeks Ahead
Networking Field Day 35 – July 10 – 11
Tech Field Day Experience at SHARE Kansas City 2024 – August 4 – 8
Gestalt IT and Tech Field Day are now part of The Futurum Group.
The Gestalt IT Rundown is your look at the IT news of the week. Be sure to subscribe to Gestalt IT on YouTube for even more weekly video content.
#ElectionSecurity #FloppyDisk #Omnissa #Rundown #Broadcom #CamberleyB #CloudFlare #GestaltIT #NetApp #Poller #Qumulo #SFoskett #TechFieldDay #TechstrongTV #TheCTOadvisor #TheFuturumGroup #VMware