
Tag Archive

Below you'll find all posts tagged “AI Infrastructure.”

The Hidden Tax on Every AI Training Run

Accelerator utilization rates below 70% are not a GPU problem — they are a storage problem. Google Cloud’s Hyperdisk ML and Rapid Storage are built specifically to fix that, and the numbers are not subtle.

AI Infrastructure, Cloud Storage, Google Cloud Storage, GPU Computing, Machine Learning Operations

TurboQuant: Kind of a Big Deal.

Google Research just published a way to cut AI serving costs by 50% with zero accuracy loss. The interesting part is what happens to the ISVs who figure this out first.

AI Infrastructure, Cost Optimization, Google Cloud, LLM Inference, TPU

Google Built a TPU for the Age of Inference. Meet Ironwood.

TPU Ironwood is Google’s 7th-generation custom AI chip, and unlike its predecessors, it was built for inference first. Here’s what that means and why it matters.

AI Inference, AI Infrastructure, Custom Silicon, Google Cloud, Ironwood, ISV, TPU

LLM Traffic Is Weird. Your Infrastructure Needs to Know That.

Standard load balancers treat LLM inference like any other HTTP traffic. That is expensive and slow. GKE Inference Gateway knows the difference.

AI Infrastructure, GKE, Google Cloud, Kubernetes, LLM Inference
BRANDONSEPPA.COM © 2026