Heidy Khlaaf (هايدي خلاف)

@heidykhlaaf
Climber 🇪🇬| Chief AI Scientist at AI Now Institute (safety-critical SW ☢️/✈️/🚗 & AI). Formal Methods PhD. ISO.
x- Zipline, OpenAI, Adelard, MSFTResearch
Last week, we published an Op-Ed in the New York Times where we discuss how the rush to implement AI within national and defence infrastructure (including in weapons) may in fact threaten, rather than preserve, US national security:
https://www.nytimes.com/2025/01/27/opinion/ai-trump-military-national-security.html
Opinion | Our Military Is Adopting A.I. Way Too Fast

The military is integrating A.I. into its deadly systems too quickly, and Trump will only accelerate a dangerous situation.

The New York Times
My colleague Michael Brown and I responded to the recent White House OSTP AI Request for Information, where we specifically address AI cyber security risks. Our full response can be found here:
https://trailofbits.com/documents/whitehouse_otsp_national_security_ai.pdf
It's been a difficult 15 months dealing with severe family illnesses. Slowly coming out of it, and had to go 7 months without climbing (the longest in the past decade sans when I blew out my ankles in South Africa). Was lovely to be back in France and make progress on projects.
The rate-limiting incident on Twitter isn't really what pushed me over to using Mastodon more and more. Currently there is a hashtag on Twitter related to the French riots that's calling for the expulsion of all Muslim, African, and Arab immigrants from the entirety of Europe. The trending hashtag (at #2) is filled with images encouraging mass killings and lynchings of Arabs. There is no coming back from this.
With the explosion of ChatGPT, a lot more people have been exposed to LLMs as a medium for programming. All the more reason to consider the work carried out on Codex to understand the implications. I spoke to L'Express about this recently: https://www.lexpress.fr/economie/high-tech/avec-chatgpt-developper-sites-web-et-logiciels-devient-il-un-jeu-denfant-3OMWTCXH2RHGDBAQP2HO4JOLVA/
With ChatGPT, is developing websites and software becoming child's play?

L'Express
I really don't want to make my Mastodon about the permanent main character of Twitter, but there is something just so hilariously ironic about the person who literally owns SpaceX not understanding the fundamental concept of ADS-B.
Seeing that folks are forgetting (or perhaps never explored) that using ChatGPT to code will have the same risks & technical limitations as Codex including bugs and discriminatory outputs. Recommend reading our Codex papers on this.
https://arxiv.org/abs/2107.03374
https://arxiv.org/abs/2207.14157
Evaluating Large Language Models Trained on Code

We introduce Codex, a GPT language model fine-tuned on publicly available code from GitHub, and study its Python code-writing capabilities. A distinct production version of Codex powers GitHub Copilot. On HumanEval, a new evaluation set we release to measure functional correctness for synthesizing programs from docstrings, our model solves 28.8% of the problems, while GPT-3 solves 0% and GPT-J solves 11.4%. Furthermore, we find that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts. Using this method, we solve 70.2% of our problems with 100 samples per problem. Careful investigation of our model reveals its limitations, including difficulty with docstrings describing long chains of operations and with binding operations to variables. Finally, we discuss the potential broader impacts of deploying powerful code generation technologies, covering safety, security, and economics.

arXiv.org

A bit overdue on my #introduction. I'm Heidy! Currently Engineering Director at Trail of Bits working on ML Assurance & Safety. I specialize in safety-critical systems, specifically nuclear plants, UAVs, autonomous vehicles, and really any complex safety systems!

By training I'm a Software Safety Engineer with a Formal Verification PhD.

I'm a huge bouldering and outdoors enthusiast and climb within the V8/V9 range 🧗🏽‍♀️