Jim Czuprynski navigates the complexities of implementing generative AI, noting that while there's no one-click solution, Google Cloud provides significant support to streamline the process #GoogleCloud #JimTheWhyGuy #CFD20 https://jimthewhyguy.com/2024/06/20/theres-no-genai-easy-button-but-google-cloud-helps/
There’s No Gen AI Easy Button. But Google Cloud Helps. - Generally, It Depends

As part of Tech Field Day's Cloud Field Day last week, I spent a whole day at the Google Cloud Moffett Place campus in Sunnyvale, CA, to experience what I can only describe as drinking from an information firehose as our Googler hosts explained everything Google Cloud enables - file / block / object storage,

Generally, It Depends - @JimTheWhyGuy's Ruminations on IT, Technology, & the Future
#SymLink: Google Cloud's infrastructure supports GenAI with specialized TPUs, Hyperdisk storage, and a resilient network architecture. @jimthewhyguy #GoogleCloud #JimTheWhyGuy #CFD20
https://jimthewhyguy.com/2024/06/20/theres-no-genai-easy-button-but-google-cloud-helps/
#SymLink: Google recently demonstrated advanced infrastructure for generative AI using TPUs and their Gemini tool within Vertex AI, promising enhanced performance with minimal coding. @jimthewhyguy #GoogleCloud #JimTheWhyGuy #CFD20 #LinkedIn
https://www.linkedin.com/pulse/so-thats-how-do-google-vertex-ai-powered-gemini-cloud-jim-czuprynski-jvgfc
So That's How They Do IT: Google Vertex AI, Powered by Gemini and Google Cloud

I'm adding a quick shout-out to all the folks at Google last Thursday who hosted the #CFD20 delegates at the Moffett Place center. They were such excellent hosts and presenters, and they offered our #TechFieldDay team an inside look at what the future of #GenerativeAI holds for anyone using their Go

#SymLink: Oxide Computer Company innovates on-prem computing with efficient, cable-less designs and eliminates BIOS for improved setup and maintenance. @jimthewhyguy #OxideComputer #JimTheWhyGuy #CFD20 #LinkedIn
https://www.linkedin.com/pulse/oxide-bidding-fond-farewell-on-prem-computing-we-know-jim-czuprynski-u5hrc
Oxide: Bidding a Fond Farewell to On-Prem Computing As We Know It

So we wrapped up #CFD20 this morning with an in-depth review of the computing technology from Oxide Computer Company, a startup in Emeryville, CA near Oakland. (Yep .

#SymLink: Jim Czuprynski discusses the intricacies of IT and how tools like Morpheus's MVM simplify managing complex IT environments. @jimthewhyguy #MorpheusData #JimTheWhyGuy #CFD20 #LinkedIn
https://www.linkedin.com/pulse/hard-our-users-dont-care-shouldnt-have-jim-czuprynski-rhzlc
IT Is Hard, Our Users Don't Care, And They Shouldn't Have To.

Anyone who's worked in the trenches of IT for even only a few months quickly realizes that - no matter what our professors may have told us - our jobs often demand close attention to details, seat-of-the-pants estimation of required resources (usually we just grab the maximum), and navigating legacy

#SymLink: Juniper Networks highlighted the importance of robust network infrastructure for optimal AI performance at Cloud Field Day in Santa Clara. @jimthewhyguy #JuniperNetworks #JimTheWhyGuy #CFD20 #LinkedIn
https://www.linkedin.com/pulse/sub-optimal-ai-performance-maybe-its-your-network-jim-czuprynski-fuhdf

Are GPUs Essential for Every AI Problem? Intel Says, No.

AI’s complex workloads require extreme computing, the kind that only the fastest accelerators are known to provide. It is little surprise that GPUs (Graphics Processing Units) have emerged as the holy grail of compute as AI gains ground. But exorbitant pricing and scarce access, not to mention heavy power consumption and cooling requirements, raise barriers for enterprise adoption. 

What if IT shops truly didn’t need these silicon beasts for the bulk of their AI workloads? 

An Affordable Alternative for AI Workloads below 20B Parameters

Intel has closely followed the GPU trend over the past few years. To gauge how deeply AI workloads actually need GPUs, its teams have run a series of trials, and they’ve arrived at an interesting conclusion. Based on their findings, CPUs can accommodate a third of AI workloads, with the exception of the insanely intense ones.

For example, the most intense large language model (LLM)-based AI workloads, like Meta’s Llama 2, typically range between 7 and 30 billion parameters.
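To make the rule of thumb concrete, here is a minimal sketch of what routing a workload by parameter count might look like. The threshold constant and function name are illustrative assumptions, not any Intel API; Intel's guidance is the ~20-billion-parameter cutoff described in this article.

```python
# Hypothetical sketch: routing an AI workload to CPU vs. GPU by
# parameter count, per Intel's ~20B-parameter rule of thumb.
# Names and threshold are illustrative, not an Intel API.

CPU_PARAM_THRESHOLD = 20_000_000_000  # ~20 billion parameters


def pick_accelerator(num_params: int) -> str:
    """Return 'cpu' for workloads below the threshold, else 'gpu'."""
    return "cpu" if num_params < CPU_PARAM_THRESHOLD else "gpu"


# Approximate Llama 2 model sizes:
print(pick_accelerator(7_000_000_000))   # 7B  -> cpu
print(pick_accelerator(13_000_000_000))  # 13B -> cpu
print(pick_accelerator(70_000_000_000))  # 70B -> gpu
```

In practice the decision also depends on latency targets and batch sizes, but parameter count is the first-order filter Intel describes here.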

Figure 1. The Reality of AI: GPUs Are Needed Only For Insanely Intense Workloads 

Intel found that a majority of AI workloads remain below 20 billion parameters. The Xeon chipsets meet almost all latency requirements for general-purpose workloads in that category. There is rarely a need to leverage the massive acceleration of GPU technology for these AI workloads, Intel says. 

Intel shared benchmarks from real-world scenarios to back this up. In one example of an inference-heavy AI implementation, a customer used Intel Xeon CPUs to perform extremely rapid image processing. The Xeon CPU family was able to scale from a not-insignificant scanning speed of 400 frames per second (fps) using 24 AVX2 cores, to over 19,000 fps using 64 cores of its AMX-powered Emerald Rapids processors. 

The test results revealed that the newest AMX chipsets, engineered for power efficiency, kept the 64-core configuration’s energy consumption even with that of the 24-core AVX2 setup.

Figure 2. Same Wattage, Dramatically Increased Workloads: 24 AVX2 Cores vs. 64 AMX Cores 
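A quick back-of-the-envelope check on the numbers quoted above shows just how large the gap is. The throughput figures come from Intel's benchmark; equal power draw between the two configurations is Intel's claim, not something computed here.

```python
# Sanity-checking the reported benchmark figures.
avx2_fps = 400     # 24 AVX2 cores, frames per second
amx_fps = 19_000   # 64 AMX (Emerald Rapids) cores, frames per second

speedup = amx_fps / avx2_fps
print(f"Throughput improvement: {speedup:.1f}x")  # -> 47.5x

# Since both configurations reportedly drew the same wattage,
# performance-per-watt improves by the same ~47.5x factor.
```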

Another real-world use case Intel shared is of a customer adding speech translation and real-time transcription services to an existing video conferencing offering. Intel engineers put together a solution with just a few additional servers using Intel CPUs.

Figure 3. Handling LLM Demands: Response Time Latencies Kept Below 100 ms (1/10th of a second)

This configuration was particularly interesting because it had to deal with two different AI workloads. Imagine translating the phrase “Pleased to meet you, sir!” to Spanish. The latency for the first word returned in the sequence – “Mucho” – tends to be compute-bound because the AI model needs to find exactly the right word. However, retrieving each subsequent word in the phrase may also depend on context, like the formality of the greeting (e.g., ¡Mucho gusto! versus ¡Mucho gusto, a sus órdenes!), and tends to be a memory-bound operation.
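The two latencies described above are usually measured separately: time-to-first-token (the compute-bound part) and inter-token latency (the memory-bound part). The sketch below uses a dummy generator with artificial delays to show how one might measure each; it is an illustration of the measurement technique, not Intel's benchmark code.

```python
import time


def fake_llm_stream(tokens, first_delay, per_token_delay):
    """Stand-in for a streaming LLM: a slow, compute-bound first
    token, then faster, memory-bound subsequent tokens."""
    time.sleep(first_delay)
    yield tokens[0]
    for tok in tokens[1:]:
        time.sleep(per_token_delay)
        yield tok


def measure_latencies(stream):
    """Return (time-to-first-token, mean inter-token latency), in seconds."""
    start = time.perf_counter()
    last = start
    ttft = None
    gaps = []
    for _ in stream:
        now = time.perf_counter()
        if ttft is None:
            ttft = now - start   # compute-bound: first word
        else:
            gaps.append(now - last)  # memory-bound: each later word
        last = now
    return ttft, (sum(gaps) / len(gaps) if gaps else 0.0)


phrase = ["Mucho", "gusto", ",", "a", "sus", "órdenes", "!"]
ttft, itl = measure_latencies(fake_llm_stream(phrase, 0.08, 0.01))
print(f"TTFT {ttft * 1000:.0f} ms, inter-token {itl * 1000:.0f} ms")
```

Keeping both numbers under a target (here, the 100 ms budget in Figure 3) is what makes a configuration viable for real-time transcription.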

Intel’s hardware solutions work well for complex AI workloads like the above. Intel has also invested heavily in native software support for open-source data analytics solutions like Pandas, NumPy, and Apache Spark. Likewise, its commitment to supporting popular machine learning and deep learning toolsets like PyTorch, TensorFlow, and AutoML (Figure 4) goes a long way toward extending support to the user base.

Figure 4. It’s Not Just the Hardware: Intel’s Array of Software Support For Analytics and ML

To GPU, Or Not To GPU?

Intel positions its current array of Xeon CPU chipsets for organizations grappling with the pressures of providing adequate compute power for AI workloads. Only the most intense AI workloads – specifically, those north of 20 billion parameters – require GPU-level computing power. For everything below that, Intel’s offerings appear to fit the bill. Additionally, its deep support for software compatible with common analytics, machine learning, and deep learning requirements makes its hardware a compelling choice.

Be sure to check out Intel’s presentations on CPUs for AI workloads from the recent AI Field Day event to get a technical deep-dive. 

#AI #AIFD4 #Sponsored #Intel #IntelBusiness #JimTheWhyGuy #TechFieldDay

https://wp.me/p4YpUP-mdb

Intel | Data Center Solutions, IoT, and PC Innovation

Intel's innovation in cloud computing, data center, Internet of Things, and PC solutions is powering the smart and connected digital world we live in.

Intel

#SymLink: Jim Czuprynski's article critiques excessive resource allocation in Kubernetes clusters and presents Platform9's Elastic Machine Pool as a solution to streamline resource use and enhance performance. @jimthewhyguy #Platform9Sys #JimTheWhyGuy #CFD19 #LinkedIn
https://www.linkedin.com/pulse/runaway-k8-clusters-meet-your-match-platform9s-pool-emp-czuprynski-lp70c
Runaway K8 Clusters, Meet Your Match: Platform9's Elastic Machine Pool (EMP)

OK, I'll admit it: I'm not a huge fan of the containerization movement. As a longtime Oracle DBA who trained a few hundred data engineers on so-called monolithic architectures like Oracle's Exadata DBM, I've struggled to understand why anyone would deploy scores or hundreds of separate Kubernetes no

#SymLink: Jim Czuprynski evaluates SoftIron's development of advanced private cloud solutions that effectively tackle the complexities of scalability and resilience, highlighting their capability to manage large-scale, high-demand IT projects. @jimthewhyguy #SoftIron #JimTheWhyGuy #CFD19 #LinkedIn
https://www.linkedin.com/pulse/winning-private-cloud-war-softirons-hyperscaled-jim-czuprynski-rsw3c
Winning the Private (Cloud) War: SoftIron's Hyperscaled Solutions

My first serious IT job was at a shop in downtown Chicago, writing COBOL to complement a FOCUS database's functionalities beyond its limited capabilities. So I've got a long view of the history of IT, starting with mainframes and progressing through client-server to application server to VMs and fin

#SymLink: Jim Czuprynski introduces NeuroBlade's SQL Processing Unit (SPU), a specialized hardware solution designed to enhance performance for complex analytics and ML workloads in containerized database environments. @jimthewhyguy #NeuroBlade #JimTheWhyGuy #CFD19 #LinkedIn
https://www.linkedin.com/pulse/when-cpusgpus-enough-neuroblades-spu-jim-czuprynski-reelc
When CPUs/GPUs Are Not Enough: NeuroBlade's SPU

As an Oracle DBA who's worked with extremely powerful hardware like Exadata DBMs with dozens of cores and scads of memory, it's a rare situation when even a complex analytic or ML workload can't complete in a timely fashion. I guess I've been spoiled by my monolithic legacy (hah!) RDBMS with its so