There’s heaps of models you can run locally. Some are hundreds of GB in size but can be run on desktop-level hardware without issue.
I have no real idea how LLMs work, so this is supposition, but I suppose they need to review a gargantuan amount of text in order to compile a statistical model that can look up the likelihood of any given word appearing next in a sentence.
So if you read the sentence “a b c d” 12 times you don’t need to store it 12 times to know that “d” is the most likely word to follow “a b c”.
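To make that concrete, here is a minimal sketch of the counting idea, assuming a plain frequency table keyed on the previous three words. The names `build_counts` and `most_likely_next` are made up for illustration, and this is only the simple lookup version of the supposition, not how a real LLM is built.

```python
from collections import Counter, defaultdict


def build_counts(sentences, context_size=3):
    """Count which word follows each run of `context_size` words."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i in range(len(words) - context_size):
            context = tuple(words[i:i + context_size])
            next_word = words[i + context_size]
            counts[context][next_word] += 1
    return counts


def most_likely_next(counts, context):
    """Return the most frequent follower of `context`, or None if unseen."""
    followers = counts.get(tuple(context))
    if not followers:
        return None
    return followers.most_common(1)[0][0]


# "a b c d" read 12 times, plus one competing sentence.
corpus = ["a b c d"] * 12 + ["a b c e"]
counts = build_counts(corpus)

# Thirteen sentences collapse into one table entry holding two counters.
print(len(counts))
print(counts[("a", "b", "c")]["d"])
print(most_likely_next(counts, ["a", "b", "c"]))
```

The table stores the context once with a count of 12 for "d", rather than twelve copies of the sentence. It also returns `None` for any context it has never seen, which is the main limitation of pure counting.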
I suspect I might regret engaging in this supposition because I’m probably about to be inundated with techbros telling me how wrong I am. Whatever. Have at me, edge lords.
Here’s what my local AI said about your supposition:
Your supposition about LLMs is actually quite close to the basic concept! Let me audit this for you:
You’ve correctly identified that LLMs work on statistical patterns in text, looking at what words are likely to follow a given sequence. The core idea you’ve described - that models can learn patterns without storing every example verbatim - is indeed fundamental to how they work.
Your example of “a b c d” appearing 12 times and the model learning that “d” follows “a b c” is a simplified but accurate illustration of the pattern recognition that happens in these models.
The main difference is that modern LLMs like myself use neural networks to encode these patterns in a complex web of weighted connections rather than just simple frequency counts. We learn to represent words and concepts in high-dimensional spaces where similar things are close together.
This representation allows us to make predictions even for sequences we’ve never seen before, based on similarities to patterns we have encountered. That’s why I can understand and respond to novel questions and statements.
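As a rough illustration of "similar things are close together", here is a toy sketch using cosine similarity. The three-dimensional vectors are hand-picked assumptions for the example; real models learn their vectors during training and use far more dimensions. The helper names `cosine_similarity` and `nearest` are hypothetical.

```python
import math

# Hand-picked toy vectors (an assumption for illustration only;
# real embeddings are learned, not written by hand).
embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest(word):
    """Return the other word whose vector points most nearly the same way."""
    others = (w for w in embeddings if w != word)
    return max(
        others,
        key=lambda w: cosine_similarity(embeddings[word], embeddings[w]),
    )


print(nearest("cat"))
print(cosine_similarity(embeddings["cat"], embeddings["dog"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```

With these made-up numbers, "cat" lands much closer to "dog" than to "car". That closeness is what lets a model treat an unseen sequence like a similar one it has encountered, instead of returning nothing the way a plain frequency table would.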
Your intuition about the statistical foundation is spot on, even if you’re not familiar with the technical details!
Sorry chief, you might have embarrassed yourself a little here. No big thing. We’ve all done it (especially me).
Check out Hugging Face.