Galo Navarro

@srvaroa
Software engineer. Principal KISS apologist at Midokura.
Personal site: https://srvaroa.github.io/

"What should I measure when the CEO asks for engineering metrics?" is probably the most frequently recurring eng leadership question, and connects into a larger, somewhat nebulous topic: how should you measure engineering organizations?

The biggest challenge with answering this question directly imo is that there are a number of different stakeholders who all want very different things. Literal answers anchor on one stakeholder rather than solving for the whole set.

https://lethain.com/measuring-engineering-organizations/

Measuring an engineering organization.

This is an unedited chapter from O’Reilly’s The Engineering Executive’s Primer. For the past several years, I’ve run a learning circle with engineering executives. The most frequent topic that comes up is career management–what should I do next? The second most frequent topic is measuring engineering teams and organizations–my CEO has asked me to report monthly engineering metrics, what should I actually include in the report? Any discussion about measuring engineering organizations quickly unearths strong opinions.

I cannot keep this to myself. There is a website (radio.garden) where you can listen to radio stations all over the world for free. No log in. No email address. Nothing.

When the site loads, you are looking at the globe. Slide the little white circle over the green dots (each green dot is a radio station) until you find one you like.

I have been listening to this station in the Netherlands and it absolutely slaps.

EDIT: Replies tell me that this doesn't function in the UK without a VPN.

Catching up on my reading list: "Markov chains for queuing systems". Quite accessible read of a very key item in the software engineer's toolkit. https://two-wrongs.com/markov-chains-for-queueing-systems.html
Markov Chains for Queueing Systems
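For readers who want a feel for the idea before clicking through, here is a minimal sketch that is not taken from the linked article. It assumes the textbook single-server queue with Poisson arrivals, exponential service times and room for at most `capacity` jobs (M/M/1/K), modelled as a birth-death Markov chain. The function names and the example rates are made up for illustration.

```python
def mm1k_stationary(arrival_rate, service_rate, capacity):
    """Stationary distribution of an M/M/1/K queue.

    Detailed balance for a birth-death chain gives
    arrival_rate * pi[n] == service_rate * pi[n + 1],
    so pi[n] is proportional to (arrival_rate / service_rate) ** n.
    """
    rho = arrival_rate / service_rate
    weights = [rho ** n for n in range(capacity + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def mean_jobs(dist):
    """Expected number of jobs in the system under distribution `dist`."""
    return sum(n * p for n, p in enumerate(dist))


if __name__ == "__main__":
    # Hypothetical rates: 8 jobs/s arriving, 10 jobs/s served, room for 200 jobs.
    dist = mm1k_stationary(8.0, 10.0, 200)
    print("P(idle) =", dist[0])
    print("mean jobs in system =", mean_jobs(dist))
```

With a large `capacity` and utilisation below 1, the results approach the usual M/M/1 values: the server is idle with probability 1 - rho and the mean number of jobs is rho / (1 - rho).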

The recent problems with Southwest Airlines are a good example of a metastable failure at scale in the physical world:

TRIGGERS: Capacity-reducing triggers (reduced staff capacity due to sickness, snow storms at Denver, Chicago, and the rest of the country).

AMPLIFICATION: Capacity degradation amplification caused by a combination of factors such as:
- the point-to-point business model meant the crew was not in the right places,
- scheduling software breaking down, resulting in manual matching of flights to crews (can’t even imagine how tedious this would have been… kudos to the manual schedulers),
- crew not able to communicate with the airline (!) due to phone systems being down, likely due to a metastable failure of the phone system caused by overload from customers trying to reach the airline for rescheduling.

So, even if the matching of a flight to a crew was done, the crew might not have been aware of that assignment! So, even as “system capacity” (airport, flights, crew) started becoming available, they couldn’t be used effectively…

MITIGATION: As with many metastable failure mitigations, load shedding was the mitigation: they temporarily reduced the number of flights to one third of the usual number…

Looks like the airline was running the system in an extremely vulnerable state (optimizing for fast turnarounds to improve efficiency and packing the schedule without any headroom to handle overloads caused by capacity degradation).

Hope they do a thorough incident analysis using the metastable failure framework and make improvements…

References:

https://www.cnn.com/2022/12/27/business/southwest-airlines-service-meltdown/index.html

https://www.cnn.com/2022/12/29/business/southwest-airlines-service-meltdown/index.html
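The trigger / amplification / mitigation pattern described above can be sketched as a toy simulation. Everything here is an assumption made for illustration: the numbers, the `step` function, and especially the rule that effective capacity shrinks as the backlog grows (standing in for crews stuck in the wrong place and jammed phone lines). It is not a model of the airline's actual operations.

```python
def step(backlog, arrivals, capacity):
    """Advance one tick of a toy overloaded system.

    Assumed amplification rule: the bigger the backlog, the less useful
    work gets done per tick (effective capacity degrades).
    """
    effective = capacity / (1.0 + backlog / 200.0)
    return max(0.0, backlog + arrivals - effective)


def run(phases, backlog=0.0):
    """Run a list of (ticks, arrivals, capacity) phases; return final backlog."""
    for ticks, arrivals, capacity in phases:
        for _ in range(ticks):
            backlog = step(backlog, arrivals, capacity)
    return backlog


# Hypothetical phases: (ticks, arrivals per tick, nominal capacity per tick)
TRIGGER = (5, 80.0, 40.0)       # storms and sickness cut capacity for a while
NORMAL = (50, 80.0, 100.0)      # nominal capacity is back, demand unchanged
SHED = (30, 80.0 / 3, 100.0)    # load shedding: accept one third of the demand

if __name__ == "__main__":
    print("healthy system:         ", run([NORMAL]))
    print("after trigger, no fix:  ", run([TRIGGER, NORMAL]))
    print("after trigger, shedding:", run([TRIGGER, SHED, NORMAL]))
```

In this sketch the backlog keeps growing after the trigger is gone, even though nominal capacity exceeds demand, which is the metastable part. Shedding load for a while drains the backlog, and the system then stays healthy at full demand.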

RT @[email protected]

How much memory does my program use? It is a frequent question on Stack Overflow. Peeps usually try to use top, atop, ps... These tools give mixed, confusing info. Instead read my blog post where I show modern Linux memory tools (cgroups, page-types, procfs): https://biriukov.dev/docs/page-cache/7-how-much-memory-my-program-uses-or-the-tale-of-working-set-size/

🐦🔗: https://twitter.com/brk0v/status/1607871538522656768

Unique set size and working set size

How much memory my program uses or the tale of working set size. Currently, in the world of containers, auto-scaling, and on-demand clouds, it’s vital to understand the resource needs of services both in normal, regular situations and under pressure near the software limits. But every time someone touches on the topic of memory usage, it becomes almost immediately unclear what and how to measure. RAM is a valuable and often expensive type of hardware.
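As a small taste of the procfs route, here is a sketch that is not taken from the linked post. It assumes Linux and reads the memory counters the kernel exposes in `/proc/self/status`. The helper names are made up for illustration. Note that `VmRSS` is only the resident set size; it is neither the unique set size nor the working set size the post is about.

```python
from pathlib import Path


def parse_status_kb(text):
    """Extract the '<name>: <number> kB' fields from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        # Memory counters look like "VmRSS:      7890 kB"; skip everything else.
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name] = int(parts[0])
    return fields


def self_memory_kb(path="/proc/self/status"):
    """Memory counters (in kB) for the current process. Linux only."""
    return parse_status_kb(Path(path).read_text())


if __name__ == "__main__":
    if Path("/proc/self/status").exists():
        mem = self_memory_kb()
        for key in ("VmRSS", "RssAnon", "RssFile", "RssShmem"):
            print(key, mem.get(key), "kB")
```

On kernels that report them, `RssAnon`, `RssFile` and `RssShmem` split the resident set into anonymous, file-backed and shared memory, which already says more than the single number most people read off top.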

@davetron5000 I have noticed a pattern more and more often the past several years: programmers routinely overestimate how complicated someone else's code seems and underestimate how complicated their code seems. From this, I have concluded that many programmers say "complicated" when they truly mean "unfamiliar".

A well-designed and unfamiliar code base looks complicated because it is unfamiliar.

This study from Stanford shows that people who use GitHub Copilot produce code with more security flaws than people who don't; it's roughly the same size as the study GitHub keeps quoting saying it makes developers faster. https://www.theregister.com/2022/12/21/ai_assistants_bad_code/
Study finds AI assistants help developers produce code that's more likely to be buggy

At the same time, tools like GitHub Copilot and Facebook InCoder make developers believe their code is sound

The Register

He talked about electric cars. I don't know anything about cars, so when people said he was a genius I figured he must be a genius.

Then he talked about rockets. I don't know anything about rockets, so when people said he was a genius I figured he must be a genius.

Now he talks about software. I happen to know a lot about software & Elon Musk is saying the stupidest shit I've ever heard anyone say, so when people say he's a genius I figure I should stay the hell away from his cars and rockets.

When you refactor during a change, you make the change bigger.

When you refactor before the change, you make the change smaller.

- in a talk by Jason Swett

This whole thread (crosspost from Gergely Orosz at the bird) is very meaty in showing one of the (bad) consequences of naïvely reducing software engineering to coding. https://twitter.com/GergelyOrosz/status/1605470549072814081
Gergely Orosz on Twitter

“The problem Twitter will have on the engineering side is they would need to hire experienced *software engineers*, going forward, but they will only get experienced coders/programmers/hackers. They also only reward those approaches. Software engineering is longer term focused.”

Twitter