2026-06-16

The danger of LLMs

Our foot is on the gas but we can't see over the steering wheel.

LLMs are everywhere

With the rapidly increasing valuations and scaling of LLM companies, we're seeing an insane push to shove LLMs into every nook and cranny of the economy. This has been most visible in coding and software engineering, but it is growing in other fields as well: healthcare, science, robotics, finance, really everywhere.

An important distinction to note is that LLMs are not only being used more and more for information retrieval; they are increasingly taking actions as well, or as we all commonly refer to it, "Agentic AI".

Creating too many blind pilots

In my own domain of software engineering I've run into many situations where an LLM does something in a suboptimal or incorrect way. Because I had experience coding before AI was the main method of coding, I could see that it had gone off the rails and could course correct. Notably, this happened in a field that makes up a significant portion of the model's training data: full-stack web software.

Now we use AI in domains where training data is less abundant, and we tend to take its output as fact. Herein lies the problem. I am not a legal expert. But if I ask AI to give me legal advice, it will give me legal advice. Obviously there are guardrails meant to protect me from asking an AI model for legal advice, but such guards aren't really useful when the request is subtle rather than explicit. Most of the time, I will just take the legal advice from AI as is, because AI answered my question and I don't know any better.

If someone had asked me that same legal question before, I'd have said "I'm not sure, I'm not a lawyer", but now I'd say "Yeah I think X is allowed because <reason AI gave me>" or "No I think X is not allowed because <reason AI gave me>". This isn't a new problem, since before AI you likely watched YouTube videos or were on the receiving end of propaganda that either accidentally or purposefully misinformed you.

For most of history though, we've been able to counter that easily, since to disseminate an idea or learn something, we've had to read or listen to a source that someone else wrote or spoke, and often that source had some domain expert's eyes or ears present at some point in the chain to witness, process, and validate or counter that information. With LLMs though, we are increasingly receiving the information as if it were a domain expert telling it to us, but it isn't. LLMs are statistical models that imitate output based on a snapshot of input.

One can make the same argument for humans, and for a lot of things that is true. Humans are susceptible to propaganda for the same reason: we tend to take in information and regurgitate it. But if we based ALL things on precedent, we'd never have made any of the breakthroughs that brought us to today. Humans can think critically. That's why we have built strong nations and new technology while chimpanzees still eat bananas.

I'm willing to bet that if LLMs were trained only on data up to the 1840s, the question "Did MLK deserve civil rights?" would be answered a lot differently than it is today. And if everyone relied on that LLM for their judgement, who knows what the world today would look like.

The lack of understanding of the process

Programming is one of the few domains where the process of calculating or obtaining the output matters little relative to the output itself, since errors can typically be reverted at little cost (in general). If I have two webpages that look the same but one uses a single layout and the other 12 nested ones, the end user doesn't care, because at the end of the day they see the same thing.

This does not apply in other domains. If I ask an LLM to remove fluid from my lung, and it statistically decides to make a surgical incision instead of sending a tube down my throat and kills me in the process, I'm fucking dead. There's no "Ctrl + Z" for that. The way LLM companies have typically solved this is by asking the user before taking the action (which mostly doesn't work, since people act like they're at a casino and click yes to everything), and then there's 'auto review', which is still statistical. But more importantly, the root of the issue wasn't whether I said yes or no.

It was that I am not a fucking doctor, so why was I asked to make that decision?

LLMs are inherently bad at judgement, because they did not make the judgement decisions that led to the changes they were trained on; they just know how to make changes based on the output artifacts of judgement.

If you talk to anyone who is both an expert in their field and has had in-depth interaction with LLMs in that field, they will know this is an issue, more so within their own domain. With that, it's pretty evident that for any domain we're not an expert in, we've equipped ourselves with someone who has their foot on the gas but can't see over the steering wheel.
