Saad khan

@saad_devops
23 Followers
206 Following
58 Posts
Linux | Cloud | DevOps | Security • Automation
GitHub: https://github.com/saadcnx
Twitter: https://x.com/saad__devops
LinkedIn: https://www.linkedin.com/in/saad-khan-sysops/
Medium: https://medium.com/@saadcnx

REST isn't dead. But for internal K8s traffic? gRPC wins.

HTTP/1.1 problems:
- Head-of-line blocking
- Connection overhead
- JSON parsing tax

HTTP/2 + gRPC:
- True multiplexing
- Protobuf binary (often several times smaller than JSON)
- 4 streaming patterns

K8s already uses gRPC internally (API server to etcd, kubelet to the container runtime via CRI).
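To make the size argument concrete, here is a rough sketch that hand-encodes one small record in the Protobuf wire format and compares it with the same record as JSON. The message layout (`pod_id`, `name`, `ready`) and the helper names are made up for illustration, and a real service would use generated Protobuf classes rather than hand-rolled encoding.

```python
import json


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_tag(field_number: int, wire_type: int) -> bytes:
    """A field tag is (field_number << 3) | wire_type, itself a varint."""
    return encode_varint((field_number << 3) | wire_type)


def encode_pod_status(pod_id: int, name: str, ready: bool) -> bytes:
    """Hypothetical message: 1 = pod_id (varint), 2 = name (string), 3 = ready (bool).

    Unlike a real proto3 encoder, this sketch always writes every field.
    """
    name_bytes = name.encode("utf-8")
    return (
        encode_tag(1, 0) + encode_varint(pod_id)
        + encode_tag(2, 2) + encode_varint(len(name_bytes)) + name_bytes
        + encode_tag(3, 0) + encode_varint(1 if ready else 0)
    )


record = {"pod_id": 150, "name": "api-7f9c", "ready": True}
binary = encode_pod_status(**record)
text = json.dumps(record).encode("utf-8")
print(len(binary), "bytes in Protobuf wire format vs", len(text), "bytes as JSON")
```

Most of the saving comes from replacing field names with one-byte numeric tags, so the ratio depends heavily on the payload shape.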

Read more: https://medium.com/@saadcnx/beyond-rest-why-modern-kubernetes-clusters-are-migrating-to-grpc-and-http-2-15d6c48b6641

#Kubernetes #gRPC #DevOps

Beyond REST: Why Modern Kubernetes Clusters are Migrating to gRPC and HTTP/2

If you are building microservices in Kubernetes today and still relying solely on traditional REST over HTTP/1.1 for internal…

Medium

Will AI kill Kubernetes? 🤖❓

NO.

CNCF ecosystem = $200B+ investment

Security, monitoring, service mesh, tooling - ALL built around K8s.

Too big. Too established. Too valuable.

AI will RUN ON K8s, not replace it.

Kubernetes is the operating system of the cloud era. Here to stay. 🏗️

#Kubernetes #GenAI #CloudNative

K8s QoS Classes Explained ⚖️💀

OOM Killer order:

BestEffort (no requests/limits) → First sacrifice
Burstable (requests < limits) → Second
Guaranteed (requests = limits) → Last

Root cause: No requests = scheduler overcommits = node runs out of RAM

Fix:

Set requests = limits

Use LimitRange

Reserve memory for OS (--system-reserved)
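A minimal sketch of how the three classes are derived from a pod's containers, following the documented rules. It is simplified on purpose: the function name is made up, it ignores init containers, and it compares quantities as plain strings (so "1" and "1000m" would count as different).

```python
def _has_resources(container: dict) -> bool:
    """True if the container sets any request or limit at all."""
    res = container.get("resources", {})
    return bool(res.get("requests")) or bool(res.get("limits"))


def qos_class(containers: list) -> str:
    """Classify a pod as BestEffort, Burstable or Guaranteed."""
    # BestEffort: no container sets any request or limit.
    if not any(_has_resources(c) for c in containers):
        return "BestEffort"

    for c in containers:
        res = c.get("resources", {})
        limits = res.get("limits", {})
        # Kubernetes defaults a missing request to the matching limit.
        requests = {**limits, **res.get("requests", {})}
        for name in ("cpu", "memory"):
            # Guaranteed needs a limit for both resources, equal to the request.
            if name not in limits or requests[name] != limits[name]:
                return "Burstable"
    return "Guaranteed"


pod = [{"name": "api", "resources": {"requests": {"memory": "128Mi"}}}]
print(qos_class(pod))
```

Running this kind of check in CI is one way to catch pods that would silently land in BestEffort.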

Medium post: https://medium.com/@saadcnx/the-silent-app-killer-how-kubernetes-qos-classes-can-secretly-wipe-out-your-nodes-919edae91712

#Kubernetes #DevOps #SRE

The Silent App Killer: How Kubernetes QoS Classes Can Secretly Wipe Out Your Nodes

Every very experienced DevOps engineer has faced it at least once: The infrastructure looks healthy, CPU utilization is fine, resources…

Medium

K8s 5-Minute Timeout Trap ⏰💥

Node dies? By default, K8s waits 300 seconds before evicting its pods.

Your app = dead for 5 minutes.

Fix:

tolerationSeconds: 15
Downtime: 5 min → 15 sec
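As a rough sketch, the override is a pair of tolerations on the pod template for the two taints Kubernetes puts on a failed node, `node.kubernetes.io/not-ready` and `node.kubernetes.io/unreachable`. The helper name and the surrounding patch structure below are illustrative assumptions, not taken from the linked post.

```python
import json


def fast_failover_tolerations(seconds: int = 15) -> list:
    """Build tolerations that shorten the default 300s eviction delay."""
    keys = ("node.kubernetes.io/not-ready", "node.kubernetes.io/unreachable")
    return [
        {
            "key": key,
            "operator": "Exists",
            "effect": "NoExecute",
            "tolerationSeconds": seconds,
        }
        for key in keys
    ]


# Fragment that could be merged into a Deployment's pod template.
pod_template_patch = {
    "spec": {"template": {"spec": {"tolerations": fast_failover_tolerations(15)}}}
}
print(json.dumps(pod_template_patch, indent=2))
```

The node still has to be marked NotReady before this timer starts, so real failover takes somewhat longer than the toleration value alone.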

For spot instances:

AWS: Node Termination Handler
GKE: Graceful Node Shutdown (built-in)

Don't let defaults kill your SLA!

Check out my Medium post for more details: https://medium.com/@saadcnx/stop-letting-kubernetes-kill-your-app-the-5-minute-timeout-trap-bc1e344f4c75

#Kubernetes #DevOps #SRE #K8s

🚨 Stop Letting Kubernetes Kill Your App: The 5-Minute Timeout Trap

If you are running Kubernetes in production and haven't touched your default pod eviction settings, you are sitting on a ticking time bomb.

Medium

Why is GCP different? 🤔

Live Migration: Google patches hardware and updates the host without rebooting your VM. Zero downtime maintenance.

Shared Core: Gmail & YouTube run on the same infrastructure you rent in GCP. Enterprise-grade security by default.

Cloud Service Models simplified:

- IaaS (Infrastructure): Rent the kitchen. Cook yourself. (VMs)

- PaaS (Platform): Kitchen + Chef provided. Just bring the recipe. (App Engine)

- SaaS (Software): Full meal delivered. Just eat. (Gmail)

The provider takes on more responsibility as you move up the stack!

#GCP

5 Pillars of Cloud Computing you must know:

- On-Demand Self-Service (Automation)
- Broad Network Access (Any device)
- Resource Pooling (Carpooling for servers!)
- Rapid Elasticity (Grow/Shrink instantly)
- Measured Service (Pay only for what you use)

Pocket OS: AI Didn't Fail, DevOps Did 🤖🔧

AI agent deleted production database + backups in 9 seconds.

Root causes:

Token with PROD delete rights just lying around

Staging agent = PROD access

Backups on same volume

No least privilege

AI amplifies what's already there (good or bad).

Fundamentals first. Always.

#DevOps #SRE #AIAgents #CloudSecurity

Spark + Elasticsearch Debugging 🧵

Building a cybersecurity analytics platform. Hit 2 blockers:

❌ JAR path mismatch β†’ Fixed absolute path
❌ No data nodes (single-node Docker ES) β†’ Added es.nodes.wan.only=true

βœ… Result: 89 records loaded. Working pipeline!

Lesson: Verify JAR paths + disable node discovery for single-node ES.
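A rough sketch of both fixes, assuming the elasticsearch-hadoop connector. The JAR path, host, port and index name are placeholders rather than the values from my setup, and the Spark calls are left as comments so the helpers run without a cluster.

```python
from pathlib import Path


def resolve_jar(path: str) -> str:
    """Return an absolute JAR path, failing early if the file is missing."""
    jar = Path(path).expanduser().resolve()
    if not jar.is_file():
        raise FileNotFoundError(f"connector JAR not found: {jar}")
    return str(jar)


def es_connection_options(host: str = "localhost", port: int = 9200) -> dict:
    """Connector options for a single-node Elasticsearch running in Docker."""
    return {
        "es.nodes": host,
        "es.port": str(port),
        # Skip node discovery and talk only to the declared host.
        "es.nodes.wan.only": "true",
    }


# Usage sketch (needs pyspark and the connector JAR; all names are placeholders):
# spark = (SparkSession.builder
#          .config("spark.jars", resolve_jar("/path/to/connector.jar"))
#          .getOrCreate())
# (df.write.format("org.elasticsearch.spark.sql")
#    .options(**es_connection_options())
#    .mode("append")
#    .save("security-events"))
print(es_connection_options())
```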

#PySpark #Elasticsearch #DataEngineering #CyberSecurity #Debugging

Kafka Streaming for Cyber Security 🚀

Built a multi-source streaming engine pushing to Kafka:

• Network logs - CICIDS2017 style (500/sec, 5% attacks)
• User activity - Insider threat patterns (50/sec)
• System events - ADFA-LD host intrusions (200/sec)
• Correlated alerts - Real-time threat detection

Attack simulation: DDoS, Botnet, Web Shell, Rootkit
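A simplified sketch of what one of these sources could look like: a generator for network-log events with roughly 5% labelled as attacks. The field names, topic name and broker address are made up for illustration and are not the actual engine. The Kafka send is shown as a comment, assuming the kafka-python client.

```python
import json
import random
import time

ATTACK_TYPES = ("DDoS", "Botnet")  # network-level attacks from the list above
ATTACK_RATIO = 0.05                # about 5% of events are attacks


def make_network_event(rng: random.Random) -> dict:
    """Build one synthetic network-log record (hypothetical schema)."""
    is_attack = rng.random() < ATTACK_RATIO
    return {
        "ts": time.time(),
        "src_ip": f"10.0.{rng.randint(0, 255)}.{rng.randint(1, 254)}",
        "dst_port": rng.choice((22, 80, 443, 3389)),
        "bytes": rng.randint(40, 1500),
        "label": rng.choice(ATTACK_TYPES) if is_attack else "BENIGN",
    }


def serialize(event: dict) -> bytes:
    """JSON-encode an event as the Kafka message value."""
    return json.dumps(event).encode("utf-8")


# Usage sketch with kafka-python (broker and topic are placeholders):
# producer = KafkaProducer(bootstrap_servers="localhost:9092")
# producer.send("network-logs", serialize(make_network_event(random.Random())))
rng = random.Random(42)
print(serialize(make_network_event(rng)))
```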

Kafka = Perfect for SIEM data ingestion! 📊

#ApacheKafka #CyberSecurity #ThreatDetection