Todd A. Jacobs | Pragmatic Cybersecurity

65 Followers
90 Following
173 Posts
Executive Director, Theia Institute ✪ Chief Information Technology Officer, CodeGnome Consulting ✪ AI Governance ✪ Cybersecurity ✪ Strategic Policy ✪ Board & C-Suite Advisories ✪ Keynote Speaker ✪ Panel Facilitator
Verification: https://gist.githubusercontent.com/todd-a-jacobs/280f046b804df6da00ce260eb8db7024/raw/41ca3ef349d71f2c8073c610b3b6c28c9557c933/infosec_exchange_verification.html
Theia Institute™ Think Tank: https://www.theiathinktank.com/
LinkedIn, Personal: https://www.linkedin.com/in/todd-a-jacobs/
LinkedIn, Company Page: https://www.linkedin.com/company/theia-institute-think-tank
CodeGnome Consulting: https://www.codegnome.com/

Define Your AI Use Cases Before Your Metrics

Companies need to start reframing #AI #metrics like utilization rates of specific AI systems (which they often think of as #COTS tools anyway) based on #use_cases, not as a one-to-many tech solution for every problem domain. There'd be a lot fewer corporate implementation failures if they thought of AI systems as "hammers and screwdrivers" suited to particular tasks rather than as Swiss Army knives that are generically suited to an arbitrary and/or ill-defined set of objectives.

Celebrating New Credential

I'm celebrating a new credential. I'm also celebrating the people who made it possible.

I'm proud to have received my Theia Institute Founder's Badge yesterday. It represents two years of work with some truly brilliant and inspiring people, all of whom are not only "Emerging Technology Thought Leaders" but also deserving of the title of "Visionary Founder."

Sharing Credit with Others

While I still work for Theia Institute, I don't consider this my honor. The real honor is in standing on the shoulders of giants like my friends and colleagues there, including (in LinkedIn's pseudo-alphabetical order): Barak Engel, Daniel Kinon, Doug Shannon, Lisa Palmer, Jim Desmond, and Q. Wade Billings.

A lot of credit also goes to donors, business leaders, conference organizers, educators, journalists, and others who not only believed in Theia's mission, but have actively supported us over the years. That list would be too long for this post, but they each deserve their day in the sun too. I hope everyone who took part realizes the real honor is theirs.

Related Links

#FrameChallenge: #ShadowAI is no different from any other form of #ShadowIT.

Shadow IT is a fancy way of saying "individuals or teams doing stuff without going through channels." It can be a headache for #cybersecurity, #datagovernance, and #riskmanagement groups in heavily-regulated environments. However, self-serve IT is also a great "canary in the coal mine" for identifying areas where a company's processes have failed to deliver value to customers or internal stakeholders, or where current tools and processes are seen internally as blocking innovation or limiting productivity.

Compliance doesn't equal security, and security for its own sake doesn't usually deliver market value—unless you're selling security products or services, of course. So, if people are bypassing routine channels to implement solutions directly rather than requesting new centrally-managed capabilities, it's probably time to review and improve your current policies, standards, and guidelines.

Good reviews address the current friction points of your existing processes. More importantly, they shine a light on the perceived value proposition of the solutions that people are implementing themselves to solve day-to-day work challenges. During reviews, remember that the goal is to facilitate value creation within the organization's risk tolerance, not to avoid adaptation!

The risks of "shadow AI" are absolutely identical to the other risks inherent in data and systems protection, including the risks of #BYOD. These things are inevitable when business units respond to market change faster than the larger organization can adapt. The controls to successfully address those risks are all exactly the same, too.

https://www.linkedin.com/posts/pradeeps_shadow-ai-is-already-inside-your-business-activity-7300883716002693122-36xV

Shadow AI Is Already Inside Your Business, and It’s a Ticking Time Bomb | Pradeep Sanyal

Employees aren’t waiting for IT approval. They are quietly using AI tools, often paying for them out of pocket, to speed up their work. This underground adoption of AI, known as Shadow AI, is spreading fast. And it is a massive risk.

What’s Really Happening?
• Employees are pasting confidential data into AI chatbots without realizing where it is stored.
• Sales teams are using unvetted AI tools to draft contracts, risking compliance violations.
• Junior developers are relying on AI-generated code that might be riddled with security flaws.

The Consequences Could Be Devastating
⚠️ Leaked Data: What goes into an AI tool does not always stay private. Employees might be feeding proprietary information to models that retain and reuse it.
⚠️ Regulatory Nightmares: Unapproved AI use could mean violating GDPR, HIPAA, or internal compliance policies without leadership even knowing.
⚠️ AI Hallucinations in Critical Decisions: Without human oversight, businesses could act on false or misleading AI outputs.

This Is Not About Banning AI, It Is About Controlling It
Instead of playing whack-a-mole with unauthorized tools, companies need to own their AI strategy:
✔ Deploy Enterprise-Grade AI – Give employees secure, approved AI tools so they do not go rogue.
✔ Set Clear AI Policies – Define what is allowed, what is not, and train employees on responsible AI use.
✔ Keep Humans in the Loop – AI should assist, not replace human judgment in critical business decisions.

Shadow AI is already inside your company. The question is, will you take control before it takes control of you? H/T Zara Zhang

TL;DR: #Layoffs of skilled #ITprofessionals in the expectation of replacing them with AI-only value delivery are likely to backfire.

I'm very pro-AI. These systems are extremely good at certain tasks! I'm just annoyed by the endless spin promoting all the things #AI is demonstrably really bad at right now.

I do #promptengineering and #softwaredevelopment professionally, but still spent 2+ hours last night working with several dedicated coding #AImodels debugging a hairy GNU find expression with path pruning. Net result? I gave up on the AI models, wrote my own regular expressions from scratch, and leveraged some good ol' fashioned shell scripting to handle the edge cases.

❎ AI Computation (GPU + CPU) + Human Labor ≥ 2 hours
✅ Experienced Human + Unit Tests = 0.5 hours
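For the curious, the kind of expression involved looks something like this. This is only a sketch of a typical GNU find pruning pattern, not the actual expression from that debugging session, and the directory and file names are illustrative assumptions:

```shell
# Skip .git and node_modules subtrees entirely (-prune short-circuits
# descent), then print any shell scripts found elsewhere. Names here
# are illustrative, not from the original session.
find . \( -name .git -o -name node_modules \) -prune -o -type f -name '*.sh' -print
```

The classic gotcha is the `-o` (OR) between the prune branch and the action branch: without it, `-prune` silently swallows the results you wanted.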

In addition to the software development work I do for fun and profit, I am also actively #jobhunting for strategic & tactical #leadership, advisory, and hands-on roles in #cybersecurity, #infosec, #grc, #privacy, #compliance, #devsecops, and more. Help me find my next adventure!

#CIO, #CISO, or #vCISO roles are ideal, but VP, Director, "Head of...", Architect, or Principal Engineer roles can also be great fits based on the organization and sector. The important thing is having a chance to do something new and exciting, and opportunities to learn & grow.

I also have a lot of experience with #boardofdirectors work, #nonprofit & #NGO operations, #publicspeaking, #writing, and #startup culture. With such a broad background, don't hesitate to reach out about something outside my current wheelhouse.

My main goal is to open a new chapter in my life. Let me surprise you by saying "yes" to something you thought was too far outside the box for me!

#fedihire #jobsearch #opentowork #veteran

Befuddled by all the recent #DeepSeek hullabaloo? Here's a brief Q&A that cuts through the fog.

Q: Did #DeepSeek just up-end everything we know about #AImodels and #LLMs?
A: Nope. It just demonstrates one of several new approaches to model training and logic chaining, but still uses the same basic building blocks.

Q: Does this mean DeepSeek can think?
A: Nope. Still not #Skynet. Logic chains are just one of several techniques an instruction-oriented AI system can use to try to stay on track and focus on a coherent goal.

Q: Is logic chaining #ExplainableAI?
A: Nope. Even the "thinking" output of DeepSeek is a linguistic approximation of the pattern-seeking behavior of most LLMs.

Q: Why is everyone in an uproar about DeepSeek?
A: Because most people think ChatGPT defines what AI is, what it can do, and what its limitations are.

Q: Why are the people panicking about DeepSeek talking about AI hegemony and geopolitics?
A: Because they're more concerned with investment returns or charging for expensive GPUs and SaaS services than they are in scientific advancements or improving individual productivity with new technology.

WARNING

The poll in the image contains several unstated assumptions. The "correct" (rather than expected) answer is highly language-dependent.

TL;DR

I talk a lot about human #cognitiveBias, as well as the biases inherent to most #AI #datasets. Here's a great human example from one of the LinkedIn #dataAnalytics groups, where the poll author clearly expects "512" to be the correct answer while (for some reason) expecting others to get it wrong. However, as I point out, the solution is highly language-dependent, and some of the "correct" answers are 791 and 6,561.

An effective #CIO, #CISO, or #CDO needs to keep these implicit biases in mind to avoid falling into traps like this one. Read on if you want to understand how easily the jaws of the trap can close!

Analysis

The order of operations depends on both operator precedence and the parser's associativity rules. In many languages the ambiguous expression 2 ** 3 ** 2 evaluates to 512 because exponentiation is right-associative, but if sub-expressions of equal precedence aren't evaluated right-to-left you may get a very different answer. Parenthesizing makes the expression unambiguous: 2 ** (3 ** 2) #=> 512. A parser that groups left-to-right instead effectively computes (2 ** 3) ** 2 #=> 64, even though the power operators retain equal precedence.

The problem with the question as posed is that it looks like a math question, but it's actually a language-dependent programming question. Not everyone doing analytics or programmatic math is using the Python or Ruby languages, which have well-defined behavior for this use case.
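You can see the difference directly in a language with documented right-associative exponentiation, such as Python:

```python
# Python's ** operator is right-associative, so the unparenthesized
# expression groups from the right.
print(2 ** 3 ** 2)    # 512, parsed as 2 ** (3 ** 2)
print((2 ** 3) ** 2)  # 64, the left-to-right grouping forced by parentheses
```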

Other Languages

The expr command in TCL 8.6.15 and the math command in Fish 3.7.1 will interpret the non-parenthesized expression left-to-right, resulting in 6561. For these languages, you need to change not only the precedence of the expressions but also their type or order. For example, these will both yield the expected result of 512 in TCL: expr (2 ** 3) ** 3 or expr "2 ** (3 ** 2)". Your mileage will vary by language.

Just because people used to think the Earth was flat didn't make it true, but that's what an #LLM from the Middle Ages would have told you. Let that sink in for a moment before you read on.

I'm sure there are #AIexperts who can explain this with more mathematical accuracy than I can, but from a social perspective this is the current existential problem we face with LLMs, #AI #datasets, and all the rest. While you can tune these engines to be more "creative" in a non-human sense, when you're talking about ingesting or even sampling vast quantities of data there's a strong tendency for #regressionTowardTheMean.

In layman's terms, when Google or #OpenAI suck up the whole Internet and feed it every social media post they can lay their hands on, the end result can't be a curated expert opinion. Instead, what you get is a linguistic representation of the average response based on n-grams and statistical probabilities, supplemented with real references in a #RAG system, or with references entirely made up in other systems because they seem linguistically plausible. Either way, a language model, regardless of size or complexity, is just filling in the blanks based on statistics, probabilities, and (sometimes) explicit rules or collaborative engines to do things like filter out profanity and known-bad responses.
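Here's a toy illustration of that averaging effect. The corpus and counts are invented purely for the sake of the sketch, and real models work over token probabilities rather than whole-document counts, but the failure mode is the same:

```python
from collections import Counter

# A greedy "model" over a counted corpus always emits the majority
# continuation, true or not. Corpus is invented for illustration.
corpus = ["the earth is flat"] * 97 + ["the earth is round"] * 3
counts = Counter(sentence.split()[-1] for sentence in corpus)
print(counts.most_common(1)[0][0])  # "flat": the most common answer wins
```

Majority opinion in the training data becomes the model's confident answer, which is exactly the regression toward the mean described above.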

Without #XAI, sensible #AIgovernance, and #copyrightreform, various commercial interests are a lot closer to dumbing down AI systems than they are to making them smarter. AI systems have a lot of potential, but our current market-driven approach incentivizes all the wrong behaviors by both for-profit companies and the #ML systems and resulting data sets that they're monetizing.

AI systems can become expert systems that support human endeavors, but not if we allow them to be entirely autonomous systems that parrot back some variation of "Most people say…". As a global society, we need to do better than that. We can, and we must!

https://www.linkedin.com/feed/update/urn:li:activity:7242914151101071361/

Tom Yeh on LinkedIn: #deeplearning #aibyhand #llms

Sample a Sentence by hand? ✍️ I created this exercise to teach my students how LLMs sample a sentence from predicted probability distribution over a…

#Shellprogramming skills are pretty portable between #Linux, #BSD, and #macOS, but some of the underpinnings of macOS are non-standard. It helps to remind yourself that macOS is not a standard #BSD #Unix variant; Apple's #Darwin-based systems do a lot of embrace-and-extend under the hood. Here's a practical example that comes up often in the enterprise.

Most #Linux systems export the current user's login name in the LOGNAME environment variable (often by sourcing /etc/profile) and export the user's preferred shell (set by an application or the user) in the SHELL environment variable. The account's default shell itself lives in the seventh field of its /etc/passwd entry, alongside the #GECOS field. The canonical way to look up the default shell on most Unix-like systems is to query /etc/passwd or another NSS database with the getent utility, e.g. getent passwd "$LOGNAME" | cut -d: -f7.

There are other means to do this on Linux too, but macOS doesn't provide this widely available userspace utility. Instead, Darwin relies on opendirectory(8) for storing and accessing user records, requiring other tools to retrieve the information. You can query a user's record on Darwin like so:

# directly from the Open Directory service, local or remote
dscl . -read "/Users/$(id -un)" UserShell | awk '/^UserShell:/ {print $2}'

# from the directory service's cache on the local system
dscacheutil -q user -a uid "$(id -u)" | awk '/^shell:/ {print $2}'

Be aware that there are other ways to do this, too, but old-school utilities like whoami have been deprecated in favor of id -un, and finger as implemented on most systems (e.g. via [x]inetd, or by reading various #dotfiles from users' directories locally or over the network) is considered a security risk.
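Putting the pieces together, a cross-platform lookup might be sketched like this. The function name is mine, and the fallback order is an assumption about which tools are present on a given host:

```shell
# Hypothetical helper: prefer getent where it exists, fall back to the
# Open Directory service on Darwin, and finally to the SHELL variable.
get_login_shell() {
  if command -v getent >/dev/null 2>&1; then
    getent passwd "$(id -un)" | cut -d: -f7
  elif command -v dscl >/dev/null 2>&1; then
    dscl . -read "/Users/$(id -un)" UserShell | awk '{print $2}'
  else
    printf '%s\n' "${SHELL:-/bin/sh}"
  fi
}
```

As noted below, centralized IAM or container images with minimal userland may still require a different data source entirely.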

In containers, especially with non-standard shells, or with centralized #IAM using #LDAP or #ActiveDirectory, you may have to match the local #userID to a remote #LDIF record before grepping for the data you need. In addition, nsswitch.conf, PAM modules, NIS+, or other less-common data sources may need to be consulted, and each generally has specific utilities for looking up stored or cached information equivalent to the seventh (login shell) field of each user's passwd entry on standard Linux and Unix systems.

As always, your mileage may vary based on use case or implementation details. On the plus side, problems are rarely insoluble when you know where to dig for a solution!

Topics like #cybersecurity and #encryption are difficult to talk about plainly because they are complex. While it's usefully reductionist to tell users that HTTPS is more secure than unencrypted HTTP, it can also lead to oversimplification (and thus a lack of adequate #infosec funding) when designing and implementing #securitycontrols. Consider the following excerpted information I recently shared in one of the LinkedIn communities when trying to explain why a URL or TCP/IP socket by itself doesn't create a secure connection.

The "HTTPS" in a URL is a URI scheme that the browser interprets as an instruction to establish a TLS connection over which the HTTP protocol can be negotiated. The actual TCP/IP transport-layer handshake, the TLS and HTTP protocol negotiations, and the encrypted payload communications between client and server are all handled in other layers.
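To underline that point, here's a minimal sketch in Python showing that parsing a URL only extracts the scheme as a label; nothing about the operation opens a socket, performs a handshake, or encrypts anything:

```python
from urllib.parse import urlsplit

# Parsing only decomposes the string. Acting on the "https" scheme by
# opening a TCP connection and negotiating TLS is a separate step the
# client performs afterward.
parts = urlsplit("https://www.rfc-editor.org/rfc/rfc9110")
print(parts.scheme)  # https
print(parts.netloc)  # www.rfc-editor.org
```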

Useful References

Hypertext, URIs, and Schemes
: https://www.rfc-editor.org/rfc/rfc9110#section-4.2.2
: https://www.rfc-editor.org/rfc/rfc8820#name-uri-schemes
: https://en.wikipedia.org/wiki/List_of_URI_schemes

TLS (sometimes still referred to as "SSL" for historical reasons)
: https://www.rfc-editor.org/rfc/rfc8446

RFC 9110: HTTP Semantics

The Hypertext Transfer Protocol (HTTP) is a stateless application-level protocol for distributed, collaborative, hypertext information systems. This document describes the overall architecture of HTTP, establishes common terminology, and defines aspects of the protocol that are shared by all versions. In this definition are core protocol elements, extensibility mechanisms, and the "http" and "https" Uniform Resource Identifier (URI) schemes. This document updates RFC 3864 and obsoletes RFCs 2818, 7231, 7232, 7233, 7235, 7538, 7615, 7694, and portions of 7230.