NVIDIA Reveals Rubin AI and Blackwell Ultra || Tech Field Day News Rundown: March 19, 2025

NVIDIA unveiled new AI-focused chips at its GTC conference, including the Blackwell Ultra series launching this year and the next-gen Vera Rubin GPUs set for 2026. CEO Jensen Huang emphasized the company’s shift to an annual release cycle, a departure from its previous biennial schedule. This move reflects NVIDIA’s response to the growing AI market and increasing competition. This and more on the Rundown.

https://youtu.be/49b4_OrMpvI

Apple Podcasts | Spotify | Overcast | Amazon Music | Audio | YouTube

2:29 – Taara Spun Out from Google’s Parent Company Alphabet

Alphabet has spun off Taara, its laser-based internet backbone provider, into an independent company focused on delivering high-speed connectivity to underserved regions. The newly formed company will continue developing its wireless optical communication technology, which aims to provide reliable internet without the need for traditional fiber infrastructure.

Read More: Alphabet spins off laser-based Internet backbone provider Taara

6:41 – Intel Names Lip-Bu Tan as New CEO

Intel has appointed Lip-Bu Tan as its new CEO, succeeding Pat Gelsinger, who stepped down in December 2024. Tan, formerly CEO of Cadence Design Systems and an Intel board member, aims to revitalize the company by streamlining operations and focusing on advanced chip manufacturing, including AI server chips, to better compete with industry leaders like TSMC. Following the announcement, Intel’s stock surged nearly 8%, reflecting investor optimism about Tan’s strategic vision.

Read More: Intel Names Lip-Bu Tan as New CEO

12:38 – DevOps Gets Empowered by Semaphore Going Open Source

Semaphore has open-sourced its core CI/CD platform under the Apache 2.0 license, aiming to provide developers with greater flexibility and transparency. This move addresses the limitations of existing CI/CD solutions by combining enterprise-grade reliability with the customizability of open-source software. By making its codebase publicly accessible, Semaphore empowers developers to explore, modify, and enhance the platform to suit their unique requirements, fostering a community-driven approach to innovation.

Read More: Semaphore Goes Open Source: A New Dawn for DevOps Professionals

16:38 – Microsoft’s New Quantum Chip Greeted with Major Skepticism

Microsoft’s recent announcement of its Majorana 1 quantum chip, claimed to be a significant advancement in quantum computing, has been met with skepticism from the scientific community. Experts question the validity of Microsoft’s findings, citing a lack of peer-reviewed evidence and unproven underlying physics, leading to concerns about the legitimacy of the breakthrough.

Read More: Microsoft’s New Quantum Chip Greeted with Major Skepticism

22:04 – Solo.io Launches Kagent for Agentic AI-Driven Cloud Ops

Solo.io has launched Kagent, an open-source agentic AI framework designed to automate cloud-native operations within Kubernetes environments. Built on Microsoft’s AutoGen, Kagent integrates with existing tools to streamline tasks like configuration, troubleshooting, and network security, enabling DevOps teams to leverage AI-driven automation.

Read More: Solo.io’s Kagent Brings Agentic AI to Cloud-Native Operations

25:37 – Amazon, Google, and Meta Nuclear Datacenters by 2050

Amazon, Google, and Meta are investing heavily in nuclear energy as part of their strategy to achieve net-zero carbon emissions by 2050. These tech giants plan to power their operations with 100% renewable energy, with nuclear energy playing a key role in meeting their ambitious sustainability goals. Alongside nuclear, they are exploring advanced technologies like carbon capture to reduce their environmental impact and address the urgency of climate change.

Read More: Amazon, Google, Meta Are Going Super Nuclear by 2050

29:38 – NVIDIA Reveals Rubin AI and Blackwell Ultra

Nvidia unveiled new AI-focused chips at its GTC conference, including the Blackwell Ultra series launching this year and the next-gen Vera Rubin GPUs set for 2026. CEO Jensen Huang emphasized the company’s shift to an annual release cycle, a departure from its previous biennial schedule. This move reflects Nvidia’s response to the growing AI market and increasing competition.

Read More: Nvidia announces Blackwell Ultra and Rubin AI chips

40:58 – The Weeks Ahead

Upcoming Tech Field Day Events

Networking Field Day 37 – March 19 – 20

Tech Field Day Extra with HPE – April 8

AI Infrastructure Field Day 2 – April 23 – 24

Mobility Field Day 13 – May 7 – 8

Security Field Day 13 – May 29 – 30

Host Information

Stephen Foskett is the President of the Tech Field Day Business Unit and Organizer of the Tech Field Day Event Series, now part of The Futurum Group. Connect with Stephen on LinkedIn or on X/Twitter.

Alastair Cooke is a Tech Field Day Event Lead, now part of The Futurum Group. You can connect with Alastair on LinkedIn or on X/Twitter and you can read more of his research notes and insights on The Futurum Group’s website.

Gestalt IT and Tech Field Day are now part of The Futurum Group.

The Gestalt IT Rundown is your look at the IT news of the week. Be sure to subscribe to Gestalt IT on YouTube for even more weekly video content.

#AI #NVIDIAGTC #Rundown #AWSCloud #DemitasseNZ #GestaltIT #Google #GoogleCloud #Intel #IntelBusiness #Meta #Microsoft #Nvidia #SFoskett #TechFieldDay #TechstrongTV #TheFuturumGroup

https://wp.me/p4YpUP-n5B


Qiskit and IBM’s New Quantum Innovations | The Gestalt IT Rundown: September 25, 2024

https://youtu.be/MZfZa_jF5q8

We’re happy to have Dr. Bob Sutor joining us this week on the Rundown, since he covers quantum and advanced computing for The Futurum Group. IBM made two important announcements in the quantum space this week. The first concerns Qiskit, IBM’s Python-based SDK for quantum computers. It promises to bring quantum computing to a more mainstream audience, and IBM is converting the underlying code to Rust. This solution is much faster than competing solutions from Google, Amazon, and Quantinuum. IBM is also putting together an app store for quantum applications and runtime functions, including from third-party developers. This matches the moves that we have seen in areas like cloud and AI, and serves to push IBM as the leader in quantum computing. This and more on The Rundown.


1:57 – Qualcomm to Buy Intel?

Qualcomm is exploring a potential acquisition of Intel, a move that could strengthen both companies and enhance U.S. leadership in the chip industry, despite potential antitrust review. Intel, under CEO Pat Gelsinger, is pursuing strategic changes to boost its competitive edge, including expanding its manufacturing capabilities and investing in next-generation technologies. A successful deal would significantly broaden Qualcomm’s portfolio and position both companies for growth in the rapidly evolving AI and semiconductor markets.

Read More: Qualcomm Approached Intel About a Takeover in Recent Days

4:11 – Kioxia No Longer Planning IPO

Kioxia has postponed its planned IPO, which we discussed on the August 28 show, due to challenges in achieving its target valuation amid a broader market downturn. Despite recent improvements in memory chip prices, the company has been hurt by a slump in the chip market, and it had already delayed its IPO once before, in 2020. Kioxia, with a 14% share of the flash memory market, remains focused on listing when market conditions improve.

Read More: Exclusive: Bain-backed chipmaker Kioxia scraps October IPO plan, sources say

6:42 – Announcements from Dreamforce

At its Dreamforce conference, Salesforce introduced Agentforce, a suite of AI-powered agents designed to streamline app development and improve customer experiences across industries. The platform’s low-code AI tools, new partnerships, and enhanced AI models aim to drive adoption of these autonomous agents, helping businesses automate tasks and unlock the full potential of their data. By integrating AI more deeply across its ecosystem, Salesforce seeks to differentiate its offering and support customers in modernizing operations and achieving higher ROI through scalable AI solutions. For more on this, let’s turn it over to The Futurum Group’s Keith Kirkpatrick who was at the event.

Read More: Dreamforce Announcements Focus on AI, Agentforce, and Cloud Enhancements

16:32 – Commvault Buys Clumio

Commvault just announced its acquisition of Clumio, an AWS data protection specialist we’ve previously discussed here on the Rundown. The acquisition enhances Commvault’s cyber resilience capabilities for cloud-native applications and allows the company to leverage Clumio’s innovations, including rapid access to Amazon S3 data during critical recovery operations, expanding its offerings for AWS-based businesses. Clumio’s expertise in protecting complex data sets will now reach a global scale through Commvault’s platform.

Read More: Commvault Accelerates Cyber Resilience Capabilities for AWS with Acquisition of Clumio

20:07 – CISA Wants to Say Ciao to Ivanti EOL Units

In a very telling move, CISA has issued a statement urging customers to move off of Ivanti Cloud Services Appliance 4.6. The notice comes as yet another security update has been released, one that Ivanti is not porting to versions prior to 5.0. CISA has been very up front about exploits this year, and Ivanti is no stranger to issues with its underlying code quality.

Read More: Ivanti Releases Security Update for Cloud Services Appliance

22:47 – Veeam Acquires Alcion

Veeam announced today that it is acquiring Alcion, which has focused on SaaS backups since its founding in 2022. Veeam led an investment round for Alcion in 2023 and had previously acquired Kasten, an earlier startup from Alcion’s founders. In addition to the acquisition, Niraj Tolia will move into the vacant CTO role at Veeam to help guide the integration between all the products.

Read More: Veeam, the #1 Data Resilience Company, Appoints Niraj Tolia as Chief Technology Officer to Accelerate Innovation of Data Resilience as a Service

25:29 – Qiskit and IBM’s New Quantum Innovations

We’re happy to have Dr. Bob Sutor joining us this week on the Rundown, since he covers quantum and advanced computing for The Futurum Group. IBM made two important announcements in the quantum space this week. The first concerns Qiskit, IBM’s Python-based SDK for quantum computers. It promises to bring quantum computing to a more mainstream audience, and IBM is converting the underlying code to Rust. This solution is much faster than competing solutions from Google, Amazon, and Quantinuum. IBM is also putting together an app store for quantum applications and runtime functions, including from third-party developers. This matches the moves that we have seen in areas like cloud and AI, and serves to push IBM as the leader in quantum computing.

Read More: Quantum in Context: IBM Qiskit Boosts Software Development Speed

Read More: Microsoft unveils new quantum computing hybrid solution in Azure

34:27 – The Weeks Ahead

Networking Field Day Exclusive with Nokia – September 24

AI Data Infrastructure Field Day 1 – October 2 – 3

Commvault Shift – October 8 – 9

Security Field Day 12 – October 16 – 17

Cloud Field Day 21 – October 23 – 24


#Dreamforce #Qiskit #QuantumComputing #Rundown #1 #Alcion #Clumio #Commvault #GestaltIT #IBM #Intel #IntelBusiness #Qualcomm #Salesforce #SFoskett #TechFieldDay #TheFuturumGroup #Veeam

https://wp.me/p4YpUP-mDQ


A Smaller VMware Explore Still Delivers News | The Gestalt IT Rundown: September 4, 2024

https://youtu.be/Z9AR6IvGIjA?feature=shared

This year, at the smallest VMware conference in many years, Broadcom made some important announcements. As we’ve previously discussed, VMware Cloud Foundation 9 was detailed, along with Tanzu Platform 10, an enhanced Edge Orchestrator platform, VeloCloud Software-Defined Edge, and more. Although we miss the community coming together at VMworld every year, this smaller VMware Explore conference is very much in line with the direction taken by Broadcom since their acquisition of the company. This and more on The Rundown.


1:40 – AnandTech Shuts its Doors

After 27 years setting the standard for online tech journalism, AnandTech has published its last article. The popular tech news site, which was revered by industry insiders and tech enthusiasts, was known for impartiality, depth, and never sensationalizing the news. Although some were worried when founder and namesake Anand Lal Shimpi left for Apple, the site continued to deliver excellent news coverage for years afterward. But it appears that the current owner, Future LLC (no relation to Gestalt IT parent Futurum), will cease publishing new articles on the site, thankfully keeping existing articles and forums online. The company also owns Tom’s Hardware, which will continue.

Read More: End of the Road: An AnandTech Farewell

3:02 – Rumors Fly that Intel is Changing Course

The rumor mill is going wild that Intel’s financial troubles are about to knock the chipmaker off of Pat Gelsinger’s intended course. We’ve heard that next-generation chip fabrication isn’t going well, leading to an executive’s departure. Others say Intel is looking to spin off the fab business altogether, or perhaps sell the Altera FPGA business. All of this would be a significant departure from the plans set in motion by Pat Gelsinger.

Read More: Exclusive: Intel CEO to pitch board on plans to shed assets, cut costs, source says

6:43 – Cloud providers consuming all the GPUs for AI

While Dell, like other server vendors, has experienced huge growth in dedicated AI server sales, a scarcity of NVIDIA GPUs is making these servers hard to deliver. NVIDIA GPUs are being gobbled up by hyperscalers before they reach Dell and other OEMs. It may still be some time before we see the racks full of Dell PowerEdge XE9680L servers with Blackwell GPUs that Michael Dell mentioned at Dell Technologies World.

Read More: Dell’s AI Server Business Now Bigger than VMware Used to Be

9:54 – Hammerspace Pulls Out a Server

Hammerspace produces a respected data virtualization platform, enabling applications like AI training to access any data anywhere. One of the strengths of Hammerspace is that it is pure software, and can run just about anywhere. But now the company is announcing a partnership with reseller Arrow to ship a storage virtualization appliance based on Dell hardware.

Read More: Hammerspace Expands Global Data Platform with New Appliances

12:25 – AMD Is Ready to Take On NVIDIA

The latest MLCommons results just came out, and they sure make AMD look great! We have long speculated that AMD’s Instinct series is competitive with Nvidia’s best AI processing GPUs, and now there’s data that says exactly that. But Nvidia still has an incredible market share lead over the competition.

Read More: The First AI Benchmarks Pitting AMD Against NVIDIA

15:13 – California Readies an AI Kill Switch

The state of California is awfully worried about AI! The state legislature has overwhelmingly passed a bill requiring a so-called kill switch for all AI systems in case they get out of hand. Now it’s on to Governor Gavin Newsom, who is under pressure to veto the bill. But even if he does, there appears to be support in the state legislature to override his veto and make it law.

Read More: California legislature passes controversial “kill switch” AI safety bill

18:05 – A Smaller VMware Explore Still Delivers News

This year, at the smallest VMware conference in many years, Broadcom made some important announcements. As we’ve previously discussed, VMware Cloud Foundation 9 was detailed, along with Tanzu Platform 10, an enhanced Edge Orchestrator platform, VeloCloud Software-Defined Edge, and more. Although we miss the community coming together at VMworld every year, this smaller VMware Explore conference is very much in line with the direction taken by Broadcom since their acquisition of the company.

Read More: What’s Happened at VMware Explore 2024 Las Vegas?

28:02 – The Weeks Ahead

AI Field Day 5 – September 11 – 12

Edge Field Day 3 – September 18 – 19

Networking Field Day Exclusive with Nokia – September 24

AI Data Infrastructure Field Day 1 – October 2 – 3


#AI #Rundown #VMwareExplore #AMD #AnandTech #Broadcom #DemitasseNZ #GestaltIT #Intel #IntelBusiness #Nvidia #SFoskett #TechFieldDay #TheFuturumGroup #VMware

https://wp.me/p4YpUP-mB7

Summertime Fun with Networking Field Day 35 - Gestalt IT

The heatwave seems like it's here to stay but that means more time to enjoy the great presentations at Networking Field Day 35. We've got a great lineup of presenters and delegates ready to discuss the hottest trends in the networking industry live for our community July 10 and 11. It promises to be an exciting event filled with great conversations and wonderful technical content.

Gestalt IT

Are GPUs Essential for Every AI Problem? Intel Says, No.

AI’s complex workloads require extreme computing, the kind that only the fastest accelerators are known to provide. It is little surprise that GPUs (Graphics Processing Units) have emerged as the holy grail of compute as AI gains ground. But exorbitant pricing and scarce access, not to mention heavy power consumption and cooling requirements, raise barriers for enterprise adoption. 

What if IT shops truly didn’t need these silicon beasts for the bulk of their AI works? 

An Affordable Alternative for AI Workloads below 20B Parameters

Intel has closely followed the GPU trend over the past few years. To gauge how much AI workloads really need GPUs, its teams ran a series of trials and arrived at an interesting conclusion. Based on their findings, CPUs can accommodate a third of AI workloads, with the exception of the insanely intense ones.

For example, the most intense large language model (LLM)-based AI workloads, like Meta’s LLAMA2, typically fall within the range of 7 to 30 billion parameters.

Figure 1. The Reality of AI: GPUs Are Needed Only For Insanely Intense Workloads 

Intel found that a majority of AI workloads remain below 20 billion parameters. The Xeon chipsets meet almost all latency requirements for the general-purpose workloads in that category. There is rarely the need to leverage the massive acceleration of the GPU technology for these AI workloads, Intel says. 

Intel shared benchmarks from real-world scenarios to back this up. In one example of an inference-heavy AI implementation, a customer used Intel Xeon CPUs to perform extremely rapid image processing. The Xeon CPU family was able to scale from a not-insignificant scanning speed of 400 frames per second (fps) using 24 AVX2 cores, to over 19,000 fps using 64 cores of their AMX-powered Emerald Rapids processors.

The test results showed that the newest AMX chipsets, engineered for power efficiency, kept the 64-core configuration’s energy consumption on par with that of the 24 AVX2 cores.

Figure 2. Same Wattage, Dramatically Increased Workloads: 24 AVX2 Cores vs. 64 AMX Cores 
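
To put those figures in perspective, the arithmetic can be sketched in a few lines of Python. The numbers are the ones quoted in the text (400 fps on 24 AVX2 cores, and "over 19,000 fps" on 64 AMX cores, treated here as exactly 19,000); the function and variable names are illustrative only.

```python
# Throughput figures quoted in the article; "over 19,000 fps" is used as a lower bound.
AVX2_FPS, AVX2_CORES = 400, 24
AMX_FPS, AMX_CORES = 19_000, 64


def per_core_fps(total_fps: float, cores: int) -> float:
    """Average frames per second delivered by each core."""
    return total_fps / cores


overall_speedup = AMX_FPS / AVX2_FPS
per_core_speedup = per_core_fps(AMX_FPS, AMX_CORES) / per_core_fps(AVX2_FPS, AVX2_CORES)

print(f"Overall speedup:  {overall_speedup:.1f}x")
print(f"Per-core speedup: {per_core_speedup:.1f}x")
```

If power draw really did stay roughly level between the two configurations, as the test results suggest, the gain in frames per watt would be about the same as the overall speedup.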

Another real-world use case Intel shared is of a customer adding speech translation and real-time transcription services to an existing video conferencing offering. Intel engineers put together a solution with just a few additional servers using Intel CPUs.

Figure 3. Handling LLM Demands: Response Time Latencies Kept Below 100 ms (1/10th of a second)

This configuration was particularly interesting because it had to deal with two different AI workloads. Imagine translating the phrase “Pleased to meet you, sir!” to Spanish. The latency for the first word returned in the sequence – “Mucho” – tends to be compute-bound because the AI model needs to find exactly the right word. However, retrieving each of the next words that would be returned in the phrase may also depend on contexts, like the formality of the greeting (e.g. ¡Mucho gusto! versus ¡Mucho gusto, a sus órdenes!) and tends to be a memory-bound operation.
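
A rough way to reason about these two phases is to model response time as one compute-bound first-token step followed by a series of memory-bound next-token steps. The sketch below is only an illustration: the function name and the latency figures passed to it are hypothetical, not Intel's measurements.

```python
def response_latency_ms(num_tokens: int, first_token_ms: float, next_token_ms: float) -> float:
    """Estimate total generation time for a response of num_tokens tokens.

    The first token pays the (typically compute-bound) first-token cost;
    every later token pays the (typically memory-bound) next-token cost.
    """
    if num_tokens <= 0:
        return 0.0
    return first_token_ms + (num_tokens - 1) * next_token_ms


# Hypothetical figures for illustration only.
total = response_latency_ms(num_tokens=5, first_token_ms=250.0, next_token_ms=100.0)
print(f"Estimated time for 5 tokens: {total:.0f} ms")
```

Separating the two terms makes it easier to see which one dominates: short replies are governed by first-token latency, long ones by next-token latency.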

Intel’s hardware solutions work well for complex AI workloads like the above. Intel has heavily invested in native software support for OSS solutions for data analytics, like Pandas, NumPy, and Apache Spark. Likewise, their commitment to support popular machine learning and deep learning toolsets like PyTorch, TensorFlow, and AutoML (Figure 4) goes a long way to extend support to the userbase.

Figure 4. It’s Not Just the Hardware: Intel’s Array of Software Support For Analytics and ML

To GPU, Or Not To GPU?

Intel positions its current array of Xeon CPU chipsets for organizations that are grappling with the pressures of providing adequate compute power for AI workloads. Only the most intense AI workloads – specifically, those north of 20 billion parameters – require GPU-level computing power. For everything below that, Intel’s offerings appear to fill the bill. Additionally, their deep support for software compatible with common analytics, machine learning, and deep learning requirements makes their hardware a compelling choice.
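
Intel's rule of thumb can be written down as a simple sizing check. This is a sketch of the guidance as summarized in this article, not an Intel tool: the function name is made up, and the 20-billion-parameter threshold is the only input taken from the text.

```python
GPU_THRESHOLD_BILLION_PARAMS = 20  # threshold quoted in the article


def suggested_hardware(model_params_billion: float) -> str:
    """Return 'GPU' for models above the threshold, otherwise 'CPU'."""
    if model_params_billion > GPU_THRESHOLD_BILLION_PARAMS:
        return "GPU"
    return "CPU"


for size in (6, 20, 70):
    print(f"{size}B parameters -> {suggested_hardware(size)}")
```

In practice the choice would also depend on latency targets, batch sizes, and how much of the server's time is spent on AI versus other work, none of which this check captures.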

Be sure to check out Intel’s presentations on CPUs for AI workloads from the recent AI Field Day event to get a technical deep-dive. 

#AI #AIFD4 #Sponsored #Intel #IntelBusiness #JimTheWhyGuy #TechFieldDay

https://wp.me/p4YpUP-mdb


Nature Fresh Farms – Optimizing Indoor Farming Practices with AI and Intel

Artificial intelligence is restyling one of the oldest jobs of mankind – farming. A growing presence of robots in the fields, and AI-based discovery and prediction, are driving farmers away from traditional farming towards precision agriculture.

One of the companies at the helm of this quiet transition is Nature Fresh Farms. Established in Leamington, Ontario in 1999, Nature Fresh Farms is one of the largest greenhouse farms in North America.

The farm sprawls over 250 acres of land. Organic peppers, berries, tomatoes and cucumbers grow in rows in the controlled environment of enormous hothouses. Keith Bradley, VP of IT and Security for Nature Fresh Farms, gave a presentation at the recent AI Field Day event, revealing the behind-the-scenes workings. NFF’s computerized plant lifecycle management is based off of a private, Intel-based infrastructure.

Controlled Environment Agriculture

Lately, the farming industry has survived several tough years in a row. Yields were devastated by unpredictable weather patterns involving a record number of droughts and hard freezes, crop diseases, seasonal cycles, environmental problems, and labor shortages. This has turned many farmers and growers towards more intensive forms of agriculture.

Nature Fresh Farms is a passionate advocate of artificial intelligence for farming. Over the last 8 years, the farm has vigorously employed AI solutions to surmount many daunting challenges, and the results have been highly satisfactory.

Little by little, through strategic adoption of AI, Nature Fresh Farms has changed the way it does things and come to think about the business in a whole new way.

“We are constantly looking at new ways to understand what the plant does,” says Bradley.

Real-World Data for Better Farming Practices

Greenhouse farming is a comprehensive rethinking of traditional farming as we know it. Bringing the farm indoors, and so eliminating natural factors like sunlight and weather, is an impactful change. But for maximum payoff, farmers need to constantly monitor the vegetation growth.

This prompted the use of real-world data gathered from the farm. The Nature Fresh Farms greenhouses are fitted with an assortment of edge sensors pointing down and looking closely at the vegetation. The data they pick up is processed locally at edge points. The information is hauled back nightly to a core datacenter, where insights on a wide range of things are inferred – use of natural resources, irrigation level, yield measurement, growth forecast, ideal harvest times, ripeness, distressed crops, plant infections, and so on.

Data is synced up every hour for the most up-to-date information, says Bradley. “We’ve gained a lot of knowledge because we have the infrastructure to handle things,” he says, talking about the technological changes the farm has embraced over the years. “We’re one of the few greenhouses that are going for a large infrastructure. We have the compute, and the storage, so we don’t have to go to the cloud.”

Cloud is not an option for farming companies. “Because most farms in our industry don’t have an Internet connection that can handle cloud-type resource, we have to be on prem or it won’t work for us,” he explains.

Nature Fresh Farms updates and tweaks its farming and harvesting plans based on data several times a year. Today the farm grows more high-quality produce per square meter than ever before.

“Every year we want to see an increase in yield. We measure everything by per meter square, and one of the big things that [technology] did for us is, we are seeing a good trend at a three or four increase per year. Think about the same plants growing more produce in the same space, and using the same amount of resources.”

A Xeon-Based Tech Stack to Power the Models

At the outset, Nature Fresh Farms’ in-house stack comprised a 3-core cluster that used Intel Xeon Gen2 processors. But they soon realized that for a farm of its kind and size, uptime is mission-critical. “An hour or two of downtime on the wrong day is too much. We could lose a crop in that window and it could cost us millions of dollars in damage.”

The need for more compute became clear, and the stack was updated with newer generations of Xeon CPUs. The performance gains were staggering, says Bradley.

“Each time we add nodes, we see a direct reduction. One of the ETLs that we run on our labor side compares how efficiently the labor is working. Earlier it would take 3 hours to run. With the latest generation, it is down to 30 minutes.”

Today, Nature Fresh Farms has 32 AI models, most of them home-grown on its private stack and some borrowed from external vendors. With sizes ranging from 4 GB to 32 GB, the models are deployed on a 7-node cluster, two nodes of which run dual-core 4th Gen Intel Xeon CPUs.

Leveraging AI for Higher-Value Yields

Nature Fresh Farms relies on a network of growers and their extensive farming knowledge. But working in different parts of the acreage, their knowledge is often siloed. The company uses its large language models to bring all the information together and provide the team a holistic view required to plan a response.

“We have multiple facilities where growers won’t see each other for weeks but once that data is in the system and that model is there, it’ll now know how to react.”

The models are retrained periodically to keep the predictions and calculations as close to accurate as possible. “It gives us an accurate forecast for what the sales department needs to sell, so that we’re not overselling. The farther away it is, the harder it is to predict but when we get closer, we know pretty accurately how many boxes of strawberries we’re going to pick, how much weight, what grade, etc.”

Inferencing is a big part of their AI workload. “One of the areas that we do a lot of inferencing with pictures is on the packing lines. There are multiple things that we got out of doing that that we never realized.”

Through a vision system, the models sort produce based on grade and maturity, a function that has helped maximize shelf life by grouping produce of equal ripeness together.

“Vision actually looks at the vegetables, takes pictures as they are rolling in, and decides which chute it should go down.”

Nature Fresh Farms is slowly transitioning towards Kubernetes as a way of modernizing some of the older models. “We’re trying to start getting rid of older models, and groom and grow them. Refining these models is one of the things our team always strives to do,” Bradley says.

Are CPUs going to be sufficient as the operation scales, or are GPUs on the horizon? “I’ll probably say no. For our scale and size, we’re not at that point that we really need it,” replies Bradley.

With inference calls taking place only a few times a day, GPUs would be overkill, he explains. “As long as we get the results we need, I don’t see us wanting to change what we do.”

Be sure to watch Nature Fresh Farms’ presentation from the AI Field Day event to follow the discussion. Also check out Jim Davis’ view on Nature Fresh Farms’ use of AI and Intel CPUs in his article. You can also view the Intel Spotlight Podcast episode featuring Keith Bradley and Nature Fresh Farms.

#AI #AIFD4 #LLM #ML #Intel #IntelBusiness #NatureFresh #TechFieldDay #WriterOfTech1

https://wp.me/p4YpUP-m86


Taking on AI Inferencing with 5th Gen Intel Xeon Scalable Processor

There is nary a trend today that looms as large as artificial intelligence, and everybody wants a slice. New hardware products roll off the assembly line like clockwork, fueling this revolution. Of them, GPUs capture the biggest mindshare and market share. Packed with immense compute power and sub-millisecond latency, they have emerged as the hottest commodity in the hardware market.

But as chip shortages and long wait times leave companies scrambling for compute power, CPUs are making a comeback on the inference side, where AI algorithms make sense of the data.

Intel has been readying an alternative to seize this opportunity for a while. After Habana Labs Gaudi, the 5th Generation Intel Xeon Scalable Processor – nicknamed Emerald Rapids – is the new prize. Xeon processors, with adequate performance and millisecond latency, are poised to set off a wave in the general-purpose AI market.

At the recent AI Field Day event in California, hosted by Tech Field Day, part of The Futurum Group, Intel hosted a full-day presentation (perfectly summarized here by Paul Nashawaty, Futurum Group analyst). Partners and customers like VMware by Broadcom, Google Cloud, Kamiwaza and Nature Fresh Farms shared their success stories of deploying CPUs for private AI.

In back-to-back sessions, Ro Shah, AI Product Director at Intel, outlined how the growing use of CPUs in AI inferencing is changing the picture, and illustrated it with examples from Intel partners.

A Lighter Alternative for Lean Models

So what’s causing enterprises to shift from the big beefy GPUs to these skinnier cousins? AI models generally fall into one of two camps – massive models that have billions and trillions of parameters, and the lighter, nimble models that are about a billion parameters strong.

“That’s where we’re expecting to see a majority of enterprises to end up because of the cost associated with deploying a trillion plus parameter model, and the ability to leverage capabilities like retrieval augmented generation (RAG), and fine-tune models to specific use cases,” says Shah.

A wide variety of models fit into this latter category. As more enterprises pivot to private AI, the general-purpose AI bucket is filling up fast. For mixed workloads, where AI is one of many, CPUs are increasingly becoming the hardware of choice.

Shah explains that general-purpose AI comprises applications like real-time audio-video, group chat, screen sharing, recording and the like. These workloads have relatively modest throughput and latency requirements that are closer to what CPUs offer.

“A lot of it comes down to how fast we can run those AI cycles on CPUs,” says Shah.

Xeon fits right into this market. Intel has prepped the Xeon brand of CPUs with the kind of power and performance that can meet these requirements without trouble. Equipped with Intel Advanced Matrix Extensions (AMX), the cores are built to accelerate deep-learning inference.
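For readers who want to check whether a given Linux server exposes AMX, the following is a minimal sketch and not something shown in the presentation. It assumes the kernel reports the feature through the amx_tile, amx_bf16 and amx_int8 flags in /proc/cpuinfo; the helper name amx_flags_present is made up for illustration.

```python
from pathlib import Path

# Flag names as they appear in the "flags" line of /proc/cpuinfo on Linux.
AMX_FLAGS = {"amx_tile", "amx_bf16", "amx_int8"}


def amx_flags_present(cpuinfo_text: str) -> set[str]:
    """Return the AMX-related flags found in /proc/cpuinfo-style text."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            # Everything after the colon is a space-separated flag list.
            _, _, value = line.partition(":")
            return AMX_FLAGS & set(value.split())
    return set()


if __name__ == "__main__":
    path = Path("/proc/cpuinfo")  # Linux only; absent on other systems
    found = amx_flags_present(path.read_text()) if path.exists() else set()
    print("AMX flags found:", sorted(found) or "none")
```

An empty result only means the flags were not reported; it says nothing about how well a given framework would use AMX when it is present.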

Not a Replacement for GPUs

But Intel is not marketing Xeon as a contender to GPUs. Right out of the gate, Shah makes it plain that Xeon processors are not a replacement for GPUs in bigger models. “If you’re inferencing a trillion-parameter model like GPT-4 where a majority of the cycles are AI cycles, it makes sense to go for an accelerator,” he states.

Some of the biggest reasons for deploying CPUs for AI inferencing are the less stringent latency requirements of general-purpose AI workloads, the steep cost of GPUs, and the complexity of deploying them.

Intel’s Xeon CPUs meet the critical requirements of mixed workloads while trimming down cost and complexity.

Real-World Numbers

Enterprises working with models south of 20 billion parameters have a baseline requirement of 100 millisecond latency. Shah gave the example of a couple of well-known models to establish this. GPT-J, a 6 billion parameter model, has a next token latency requirement of 100 ms. In plain-speak, next token latency is the time a model takes to produce each successive token of a response. At 100 milliseconds per token, the output comfortably outpaces the average adult reading speed, which is the goal.

For a model of that size, the 5th Gen Xeon Scalable Processor delivers a latency of 30 ms. For a relatively larger model like Llama 2 with 13B parameters, the figure is 60 ms, still well within the threshold.
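To put these latencies next to reading speed, here is a rough back-of-the-envelope sketch. The latency figures are the ones quoted above; the words-per-token ratio and the reading speed are assumptions chosen for illustration, not numbers from the presentation.

```python
# Assumptions for illustration only (not from the presentation):
WORDS_PER_TOKEN = 0.75      # rough rule of thumb for English text
READING_SPEED_WPM = 250     # ballpark adult silent-reading speed


def words_per_minute(next_token_latency_ms: float) -> float:
    """Convert a per-token latency into generated words per minute."""
    tokens_per_second = 1000.0 / next_token_latency_ms
    return tokens_per_second * WORDS_PER_TOKEN * 60.0


for label, latency_ms in [
    ("100 ms threshold", 100),
    ("GPT-J 6B", 30),
    ("Llama 2 13B", 60),
]:
    wpm = words_per_minute(latency_ms)
    ratio = wpm / READING_SPEED_WPM
    print(f"{label}: {wpm:.0f} words/min, {ratio:.1f}x reading speed")
```

Under these assumptions even the 100 ms threshold generates text faster than a person reads it, and the quoted Xeon figures leave additional headroom.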

Of course, to realize these numbers, all cores have to be engaged at once. However, if the workload can tolerate it, fewer cores can be assigned to each request, letting latency rise in exchange for a higher number of concurrent users, says Shah.

Shah shares data points from case studies to demonstrate Xeon’s performance and throughput in AI inferencing. For one customer doing ResNet-50 batch inferencing, Emerald Rapids delivered a whopping 19,000 frames per second.

“If you’re supporting cameras, that’s hundreds of cameras you can support,” says Shah.
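The camera estimate is simple division, sketched below. The 19,000 frames per second figure is the one quoted in the session; the per-camera frame rates are hypothetical examples, since the source does not say what rate the customer's cameras run at.

```python
def cameras_supported(total_fps: float, fps_per_camera: float) -> int:
    """Number of camera streams whose every frame can be inferenced."""
    return int(total_fps // fps_per_camera)


TOTAL_FPS = 19_000  # ResNet-50 batch throughput quoted in the session

# Per-camera frame rates are hypothetical examples, not from the source.
for fps in (15, 30, 60):
    count = cameras_supported(TOTAL_FPS, fps)
    print(f"{fps} fps per camera -> {count} cameras")
```

At an assumed 30 frames per second per camera this works out to 633 streams, which is consistent with the "hundreds of cameras" remark.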

It’s important to note that high throughput can be achieved without utilizing all cores. “You don’t need to use the entire CPU socket to deploy your AI inference. You only need a few cores to do your AI, and you can use the rest of the cores for your general-purpose application,” says Shah. This unlocks significant TCO benefits, which are among the top priorities of smaller companies.
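One way to picture this split on a Linux host is to reserve a handful of cores for the inference process and leave the rest to other applications. The sketch below is only an illustration under that assumption: the 64-core socket and the 16-core reservation are hypothetical numbers, the helper names are made up, and os.sched_setaffinity is available on Linux only. It is not a description of how Intel or its customers partition cores.

```python
import os


def split_cores(all_cores: list[int], ai_core_count: int) -> tuple[set[int], set[int]]:
    """Split a list of core IDs into an AI set and a general-purpose set."""
    if not 0 < ai_core_count < len(all_cores):
        raise ValueError("ai_core_count must leave cores for both workloads")
    ordered = sorted(all_cores)
    return set(ordered[:ai_core_count]), set(ordered[ai_core_count:])


def pin_current_process(cores: set[int]) -> bool:
    """Pin this process to the given cores; returns False if unsupported."""
    if not hasattr(os, "sched_setaffinity"):  # Linux-only API
        return False
    os.sched_setaffinity(0, cores)  # pid 0 means the calling process
    return True


if __name__ == "__main__":
    # Hypothetical 64-core socket with 16 cores reserved for inference.
    ai_cores, app_cores = split_cores(list(range(64)), ai_core_count=16)
    print(f"AI cores: {len(ai_cores)}, general-purpose cores: {len(app_cores)}")
    # The inference process would then call pin_current_process(ai_cores).
```

In practice the same effect is often achieved with container CPU limits or a scheduler's CPU sets rather than in application code.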

“This is not to say that in every scenario less than 20 billion, you should use a CPU, but 20 billion is where we can meet your critical requirements,” he notes.

Intel pledges even higher performance and lower latency in future generations of Xeon, making it possible to deploy bigger deep-learning models on fewer CPU cores.

Wrapping Up

As artificial intelligence becomes ubiquitous and more companies take on AI workloads, a gap has opened up between enterprises engaged in big-money projects and the smaller players dabbling in general-purpose AI. Intel has cleverly tapped into this, bringing to a class of underserved customers a product that meets them in the middle and affords them the chance to stay competitive. Xeon neither demands big spending power nor comes with long wait-lists. Instead, it alleviates the anxiety around scarce access to GPUs by showing that AI can get off the ground on less powerful hardware.

Watch Shah’s full presentation to learn how Intel is continuing its work at the software level to accelerate AI. Don’t forget to check out the presentations from VMware by Broadcom, Google Cloud, Kamiwaza, and Nature Fresh Farms from the AI Field Day event, which elaborate on the suitability of CPUs for AI inferencing.

#AI #AIFD4 #AIInferencing #LLMs #Xeon #Intel #IntelAI #IntelBusiness #TechFieldDay #WriterOfTech1

https://wp.me/p4YpUP-m6b

@sfabel talks about the growth path for operators and how we work with #IntelBusiness to help telcos build out their nextgen technologies. https://t.co/W47xox51gN tweeted by @ubuntu
Open Source Innovation for Network from Canonical | Intel Business

YouTube