Mastodawn

Ingvar Santry Aug 24, 2025

Deceiving to protect. This week, the spirit of #cyberpunk was most palpable offline at OFFZONE. I spent days on a Moscow street taken over by hackers! It wasn't just presentations, but unique activities like hacking electronic badges, picking analog locks, & finding answers to tricky questions in open sources. #OFFZONE #CyberSecurity
Check out OFFZONE: https://offzone.moscow/

Ingvar Santry Aug 24, 2025

One quest (still available) involves chatting with a bot obsessed with conference lore to extract secret info. Try it yourself: https://osint-mindset-offzone.ru/ You can check your answer with me in DMs or on the OSINT Mindset forums: https://t.me/osint_mindset #OSINT #Quest #Bot #Hacking #OFFZONE

Ingvar Santry Aug 24, 2025

This quest sparked thoughts on #LLM_alignment debates. Media & futurists often discuss controlling #AGI: how can we ensure its goals align with human values? A critical question for our future. #AI #Ethics

Ingvar Santry Aug 24, 2025

Some experts criticize corporations for not focusing enough on #AI_Safety, claiming they prioritize products & profit. However, this criticism isn't always fair, as #alignment has a more practical dimension than just AGI. #CorporateResponsibility #LLM_Alignment

Ingvar Santry Aug 24, 2025

Even the simplest #NeuralNetworks shouldn't harm a person or company, even if prompted. This is relevant now! By training on everyday tasks, we get closer to solving the global problem of safe #AGI. The value of this #AISafety research shouldn't be underestimated. #MachineLearning #ResponsibleAI

Ingvar Santry

This year, the only #LLM presentations at #OFFZONE were about applications, but I think in the near future, this conference will have tracks on the 'psychology' of #alignment, #hacking, & deceiving #AI. The better we learn to trick AI, the safer it becomes. Every successful #jailbreak is a lesson for developers. #AISafety #CyberSecurity