“My hammer is not well suited to cut vegetables” 🤷
There is so much to say about AI; can we move on from “it can’t count letters and do math”?
But the problem is more “my do-it-all tool randomly fails at arbitrary tasks in an unpredictable fashion,” which makes it hard to trust as a tool in any circumstances.
It would be like complaining that a water balloon isn’t useful because it isn’t accurate. LLMs are good at approximating language; numbers are too specific and have more objective answers.
Answer: you’re using it wrong /stevejobs
I get that it’s usually just a dunk on AI, but it’s still a valid demonstration that AI has pretty severe and unpredictable gaps in functionality, in addition to failing to properly indicate confidence (or lack thereof).
People who understand that it’s a glorified autocomplete will know how to disregard or prompt around some of these gaps, but this remains a litmus test because it succinctly shows you cannot trust an LLM response even in many “easy” cases.
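For what it’s worth, the letter-counting failure has a mundane mechanical explanation: models see tokens, not characters, so the letters in a word aren’t directly visible to them. A minimal sketch (assuming OpenAI’s open-source `tiktoken` package and its `cl100k_base` encoding; the exact token split varies by model) makes the contrast with ordinary code concrete:

```python
# Sketch: why "count the r's in strawberry" is hard for an LLM.
# Requires `pip install tiktoken`. "cl100k_base" is one of tiktoken's
# built-in encodings; other models split words differently.
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")
word = "strawberry"

# The model never sees individual letters -- it sees these chunks.
tokens = enc.encode(word)
pieces = [enc.decode([t]) for t in tokens]
print(pieces)  # multi-character chunks, not letters

# The deterministic answer takes one line of ordinary code:
print(word.count("r"))  # 3
```

Which is exactly what “prompt around it” usually ends up meaning in practice: have the model write or call code that does the counting, rather than trusting it to count.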