Next afternoon talk: "Beyond Blanket Freezes: Enabling Safe Innovation During Critical Events at Netflix" by Prachi Jain and Sandhya Narayan, Netflix.

#srecon

drmorr 9h ago

Show of hands: "Who has been in a no-deploys-for-a-week code freeze before? Oh, all of you, excellent!"

#srecon

drmorr 9h ago

The same problem occurs during every blanket freeze: people don't stop building, they just stop shipping. Changes pile up, risk builds, and critical patches that really ought to be pushed out just sit there.

#srecon

drmorr 9h ago

Freezes don't remove risk, they just reschedule and concentrate it.

#srecon

drmorr 9h ago

"This is definitely a group therapy session"

drmorr 9h ago

What if we tuned our controls to the real impact for each service:

1. blast radius
2. impact to customers
3. acceptable risk tradeoffs

This lets us tune tiered responses to the actual level of risk.

#srecon
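
To make the tiering idea concrete, here's a rough sketch of mapping those three inputs to a control level. Everything in it is my own assumption for illustration (the 1-3 scales, the scoring rule, the thresholds, the names `ServiceProfile` and `control_for`); the talk didn't show how Netflix actually computes this.

```python
from dataclasses import dataclass
from enum import Enum


class Control(Enum):
    NORMAL_DEPLOYS = "normal deploys"
    EXTRA_REVIEW = "deploys with extra review"
    FROZEN = "frozen unless bypassed"


@dataclass(frozen=True)
class ServiceProfile:
    # All three scales are hypothetical: 1 is lowest, 3 is highest.
    name: str
    blast_radius: int
    customer_impact: int
    risk_tolerance: int


def control_for(service: ServiceProfile) -> Control:
    """Pick a control level from a made-up additive risk score."""
    # Higher blast radius and customer impact raise the score;
    # a higher tolerance for risk lowers it.
    score = service.blast_radius + service.customer_impact - service.risk_tolerance
    if score >= 4:
        return Control.FROZEN
    if score >= 2:
        return Control.EXTRA_REVIEW
    return Control.NORMAL_DEPLOYS


if __name__ == "__main__":
    for svc in (
        ServiceProfile("edge-gateway", 3, 3, 1),
        ServiceProfile("recommendations", 2, 2, 2),
        ServiceProfile("internal-dashboard", 1, 1, 3),
    ):
        print(svc.name, "->", control_for(svc).value)
```

The point is only that the response is a function of per-service risk rather than one switch for the whole fleet.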

drmorr 9h ago

Very curious to know how they decide what tier each service belongs in. In my experience, everybody thinks their own service is tier 0.

#srecon

drmorr 9h ago

They appear to have a set of automated tools that look at a variety of signals to determine if a change is safe to ship.

#srecon

drmorr

Who can bypass these freezes? A bypass decision is based on "event type + service tiering + risk signals + resilience data (canary/staggered rollouts)"

#srecon
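
A minimal sketch of what a bypass check combining those four inputs could look like. This is a guess at the shape, not the speakers' implementation: the field names, the `STRICT_EVENTS` set, and the specific rules are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical event types during which tier-0 services never bypass.
STRICT_EVENTS = {"live-event"}


@dataclass(frozen=True)
class ChangeRequest:
    event_type: str                # kind of critical event in progress (assumed label)
    service_tier: int              # 0 = most critical (assumed convention)
    risk_signals: tuple[str, ...]  # open risk flags raised by automated checks
    canary_passed: bool            # resilience data: canary result
    staggered_rollout: bool        # resilience data: change rolls out in stages


def may_bypass_freeze(change: ChangeRequest) -> bool:
    """Return True if this change could ship during the freeze (illustrative rules)."""
    if change.risk_signals:
        return False  # any open risk flag blocks the bypass
    if not (change.canary_passed and change.staggered_rollout):
        return False  # require both pieces of resilience evidence
    if change.event_type in STRICT_EVENTS and change.service_tier == 0:
        return False  # most critical services stay frozen for strict events
    return True


if __name__ == "__main__":
    patch = ChangeRequest("live-event", 2, (), True, True)
    print(may_bypass_freeze(patch))
```

Written this way, "who can bypass" becomes a property of the change and its evidence rather than of the person asking.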