• RizzoTheSmall@lemm.ee
    6 days ago

    I personally find Copilot is very good at rigging up test scripts based on usings and a comment or two. Babysit it closely and tune the first few tests, and then it can bang out a full unit test suite for your class, which lets me focus on creative work rather than toil.

    It can come up with some total shit in the actual meat and potatoes of the code, but for boilerplate stuff like tests it seems pretty spot on.

    • merc@sh.itjust.works
      5 days ago

      I believe that, because test scripts tend to involve a lot of very repetitive code, and it’s normally pretty easy to read that code.

      Still, I would bet that out of 1000 tests it writes, at least 1 will introduce a subtle logic bug.
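
      A minimal sketch of what such a subtle bug can look like, in Python with the standard `unittest` module. `apply_discount` is a made-up function used only for illustration. The first test builds, runs and passes, but it computes its expected value with the same formula as the code under test, so it could never catch a mistake in that formula. The second test checks against a value worked out by hand.

```python
import unittest


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical function under test: take `percent` off `price`."""
    return price * (1 - percent / 100)


class DiscountTests(unittest.TestCase):
    def test_discount_plausible_but_useless(self):
        # Looks fine and passes, but the expected value is computed with the
        # same formula as the implementation, so it mirrors any bug in it.
        expected = 80.0 * (1 - 25 / 100)
        self.assertEqual(apply_discount(80.0, 25), expected)

    def test_discount_independent_expectation(self):
        # The expected value is worked out by hand: 25% off 80 is 60.
        self.assertEqual(apply_discount(80.0, 25), 60.0)
```

      Both tests are green when run with `python -m unittest`, which is the point: a passing suite alone does not tell you which of the two kinds of test you were handed.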

      Imagine you hired an intern for the summer and asked them to write 1000 tests for your software. The intern doesn’t know the programming language you use, doesn’t understand the project, but is really, really good at Googling stuff. They search online for tests matching what you need, copy what they find and paste it into their editor. They may not understand the programming language you use, but they’ve read the style guide back to front. They make sure their code builds and runs without errors. They are meticulous when it comes to copying over the comments from the tests they find and they make sure the tests are named in a consistent way. Eventually you receive a CL with 1000 tests. You’d like to thank the intern and ask them a few questions, but they’ve already gone back to school without leaving any contact info.

      Do you have 1000 reliable tests?

  • Hawk@lemmynsfw.com
    7 days ago

    The key is identifying how to use these tools and when.

    Local models like Qwen are a good example of how these can be used, privately, to automate a bunch of repetitive non-deterministic tasks. However, they can spit out some crap when used mindlessly.

    They are great for sketching out software ideas though, i.e. try 20 prompts for 4 versions, get some ideas and then move over to implementation.

  • friend_of_satan@lemmy.world
    7 days ago

    God, seriously. Recently I was iterating with Copilot for like 15 minutes before I realized that its complicated code changes could be reduced to an if statement.

    • xthexder@l.sw0.com
      7 days ago

      They mean time to write the code, not compile time. Let’s be honest, the AI will write it in Python or JavaScript anyway.

  • x00z@lemmy.world
    7 days ago

    Not to be that guy, but the image with all the train tracks might just be doing its job perfectly.

    • merc@sh.itjust.works
      5 days ago

      That’s the problem. Maybe it is.

      Maybe the code the AI wrote works perfectly. Maybe it just looks like how perfectly working code is supposed to look, but doesn’t actually do what it’s supposed to do.

      To get to the train tracks on the right, you would normally have dozens of engineers working over probably decades, learning how the old system worked and adding to it. If you’re a new engineer and you have to work on it, you might be able to talk to the people who worked on it before you and find out how their design was supposed to work. There may be notes or designs generated as they worked on it. And so on.

      It might take you months to fully understand the system, but whenever there’s something confusing you can find someone and ask questions like “Where did you…?” and “How does it…?” and “When does this…?”

      Now, imagine you work at a railroad and show up to work one day and there’s this whole mess in front of you that was laid down overnight by some magic railroad-laying machine. Along with a certificate the machine printed that says that the design works. You can’t ask the machine any questions about what it did. Or, maybe you can ask questions, but those questions are pretty useless because the machine isn’t designed to remember what it did (although it might lie to you and claim that it remembers what it did).

      So, what do you do, just start running trains through those tracks, assured that the machine probably got things right? Or, do you start trying to understand every possible path through those tracks from first principles?

    • dustyData@lemmy.world
      7 days ago

      It gives you the picture on the right when you asked for a single straight track in the prompt. Now you have to spend 10 hours debugging code and fixing hallucinations of functions that don’t exist in libraries it doesn’t even need to import.

      • Simmy@lemmygrad.ml
        7 days ago

        Not a developer. I just wonder how AI hallucinations come about. Is it the ‘need’ to complete the task requested at the cost of being wrong?

        • zlatko@programming.dev
          6 days ago

          No, it’s just that it doesn’t know if it’s right or wrong.

          How “AI” learns is that it goes through a text, say a blog post, and turns it all into numbers. E.g. the word “blog” is 5383825526283. The word “post” is 5611004646463. Over a huge amount of text, a pattern emerges: the second number almost always follows the first number. Basically statistics. And it does that for all the words and word combinations it found - an immense amount of text is needed to find all those patterns. (Fun fact: That’s why companies like OpenAI, which makes ChatGPT, need hundreds of millions of dollars to “train the model” - they need enough computing power, storage, and memory to read the whole damn internet.)


          So now how do the LLMs “understand”? They don’t, it’s just a bunch of numbers and statistics of which word (turned into that number, or “token” to be more precise) follows which other word.


          So now. Why do they hallucinate?

          The way they handle your question is that they turn all the words in your prompt into numbers again, and then go looking in their huge databases for which words are likely to follow your words.

          They add in a tiny bit of randomness: they sometimes replace a “closer” match with a synonym or a less likely match, so they even seem real.

          They add “weights” so that they would rather pick one phrase over another, or e.g. give some topics very very small likelihoods - think pornography or something. “Tweaking the model”.

          But there’s no knowledge as such, mostly it is statistics and dice rolling.

          So the hallucination is not “wrong”, it’s just statistically likely that the words would follow based on your words.
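
          To make the “statistics and dice rolling” idea concrete, here is a toy sketch in Python. It is not how a real LLM is built (those use neural networks with learned weights over sub-word tokens rather than a lookup table of word counts); it only counts which word follows which in a tiny made-up corpus and then samples the next word in proportion to those counts. The corpus and the function names are invented for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(text: str) -> dict[str, Counter]:
    """Count, for every word, which words follow it and how often."""
    words = text.lower().split()
    follows: dict[str, Counter] = defaultdict(Counter)
    for current, nxt in zip(words, words[1:]):
        follows[current][nxt] += 1
    return follows


def generate(follows: dict[str, Counter], start: str, length: int,
             rng: random.Random) -> list[str]:
    """Roll weighted dice for each next word; nothing checks whether it is true."""
    out = [start]
    for _ in range(length - 1):
        options = follows.get(out[-1])
        if not options:  # no continuation was ever observed: stop early
            break
        candidates = list(options)
        weights = [options[word] for word in candidates]
        out.append(rng.choices(candidates, weights=weights, k=1)[0])
    return out


corpus = "the blog post is long . the blog post is short . the forum post is long ."
model = train_bigrams(corpus)
print(model["blog"])  # "post" is the only word ever seen after "blog"
print(" ".join(generate(model, "the", 6, random.Random(0))))
```

          Depending on the dice, the generator can emit a sentence such as “the forum post is short .” that never appears in the corpus. Every step was statistically plausible, yet nobody ever wrote it, which is roughly what a hallucination is at this toy scale.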

          Did that help?

        • send_me_your_ink@lemmynsfw.com
          7 days ago

          Full disclosure - my background is in operations (think IT) not AI research. So some of this might be wrong.

          What’s marketed as AI is something called a large language model. This distinction is important because AI implies intelligence - whereas an LLM is something else. At a high level LLMs are using something called “tokens” to break apart natural language into elements that a machine can understand, and then recombining those tokens to “create” something new. When an LLM is creating output it does not know what it is saying - it knows what token statistically comes after the token(s) it has generated already.

          So to answer your question: an AI can hallucinate because it does not know the answer - it’s using advanced math to know that the period goes at the end of the sentence and not in the middle.
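
          A rough sketch of the “tokens” idea, in Python. Real tokenizers split text into sub-word pieces (byte-pair encoding, for example) and have vocabularies of many thousands of entries; this toy version just gives each whole word an integer ID, and the function names are made up. The point is only that everything downstream sees lists of integers, not meaning.

```python
def build_vocab(text: str) -> dict[str, int]:
    """Assign each distinct word an integer ID in order of first appearance."""
    vocab: dict[str, int] = {}
    for word in text.split():
        vocab.setdefault(word, len(vocab))
    return vocab


def encode(text: str, vocab: dict[str, int]) -> list[int]:
    """Turn text into the list of token IDs a model would actually see."""
    return [vocab[word] for word in text.split()]


def decode(ids: list[int], vocab: dict[str, int]) -> str:
    """Turn token IDs back into text."""
    words = {token_id: word for word, token_id in vocab.items()}
    return " ".join(words[token_id] for token_id in ids)


vocab = build_vocab("the cat sat on the mat")
print(encode("the cat sat on the mat", vocab))  # [0, 1, 2, 3, 0, 4]
print(decode([0, 4, 2, 3, 0, 1], vocab))        # the mat sat on the cat
```

          Recombining the same IDs in a different order gives a perfectly grammatical sentence that nobody wrote, and nothing in the numbers says whether it is true.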

  • mtchristo@lemm.ee
    7 days ago

    I think I would more picture planes taking off those railroads when it comes to AI. It tends to hallucinate API calls that don’t exist. If you don’t go check the docs yourself you will have a hard time debugging what went wrong.
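
    One cheap sanity check before the debugging starts, sketched in Python: ask the interpreter whether the suggested name exists at all. `api_exists` is a helper name invented here, and it only proves that a name resolves, not that it does what the AI claims, so it complements reading the docs rather than replacing it.

```python
import importlib


def api_exists(module_name: str, attr_path: str) -> bool:
    """Return True if `module_name` imports and `attr_path` resolves inside it."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(api_exists("json", "loads"))     # True: a real function
print(api_exists("json", "parse"))     # False: JavaScript-style name, not in Python's json
print(api_exists("os", "path.join"))   # True: dotted paths are followed step by step
```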

  • Gxost@lemmy.world
    7 days ago

    It depends. AI can help writing good code. Or it can write bad code. It depends on the developer’s goals.

    • AES_Enjoyer@reddthat.com
      7 days ago

      It depends. AI can help writing good code. Or it can write bad code

      I’ll give you a hypothetical: a company is to hire someone for coding. They can either hire someone who writes clean code for $20/h, or someone who writes dirty but functioning code using AI for $10/h. What will many companies do?

      • Gxost@lemmy.world
        6 days ago

        Many companies choose cheap coders over good coders, even without AI. Companies I’ve heard of have pretty bad code bases, and they don’t use AI for software development. Even my company preferred cheap coders and fast development, and the code base from that time is terrible, because our management didn’t know what good code is and why it’s important. For such companies, AI can make development even faster, and I doubt code quality will suffer.

    • Sauerkraut@discuss.tchncs.de
      7 days ago

      LLMs can be great for translating pseudocode into real code or creating boilerplate or automating tedious stuff, but ChatGPT is terrible at actual software engineering.

      • wise_pancake@lemmy.ca
        7 days ago

        Honestly I just use it for the boilerplate crap.

        Fill in that YAML config, write those Lua bindings that are just a sequence of lua_pushinteger(L, 1), write the params of my docstring kind of stuff.

        Saves me a ton of time to think about the actual structure.

  • lemmydividebyzero@reddthat.com
    7 days ago

    I gave it a harder software dev task a few weeks ago… Something that is not answered on the internet… It was as clueless as me, but compared to me, it made up shit that could never work.

      • Monument@lemmy.sdf.org
        7 days ago

        But then, as now, it won’t understand what it’s supposed to do, and will merely attempt to apply stolen code - ahem - training data in random permutations until it roughly matches what it interprets the end goal to be.

        We’ve moved beyond a thousand monkeys with typewriters and a thousand years to write Shakespeare, and have moved into several million monkeys with copy and paste and only a few milliseconds to write “Hello, SEGFAULT”

    • heavydust@sh.itjust.works
      7 days ago

      And if you need anything else, you have to use a new prompt which will generate a brand new application, it’s fun!

      • Ghoelian@lemmy.dbzer0.com
        7 days ago

        That’s not really how agentic AI programming works anymore. Tools like Cursor automatically pick files as “context”, and you can manually add them or the whole codebase as well. That obviously uses way more tokens though.

  • mesamunefire@piefed.social
    7 days ago

    I’m looking forward to the next 2 years, when AI apps are in the wild and I get to fix them lol.

    As a SR dev, the wheel just keeps turning.

    • xmunk@sh.itjust.works
      7 days ago

      I’m being pretty resistant about AI code gen. I assume we’re not too far away from “Our software product is a handcrafted bespoke solution to your B2B needs that will enable synergies without exposing your entire database to the open web”.

      • mesamunefire@piefed.social
        7 days ago

        It has its uses. For templating and/or getting a small project off the ground it’s useful. It can get you 90% of the way there.

        But the meme is SOOO correct. AI does not understand what it is doing, even with context. The things JR devs are giving me really make me laugh. I legit asked why they were throwing a very old version of React on the front end of a new project and they stated they “just did what ChatGPT told them” and that it “works”. That’s just last month or so.

        The AI that is out there is all based on old posts and isn’t keeping up with new stuff. So you get a lot of the same-ish looking projects that have some very strange/old decisions to get around limitations that no longer exist.

        • abbadon420@lemm.ee
          7 days ago

          Hold up! You’ve got actual, employed, working, graduated juniors who are handing in code that they don’t even understand?

        • wise_pancake@lemmy.ca
          7 days ago

          The AI also enabled some very bad practices.

          It does not refactor and it makes writing repetitive code so easy you miss opportunities to abstract. In a week when you go to refactor you’re going to spend twice as long on that task.
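
          A small made-up Python example of the pattern I mean. The first two functions are the kind of near-identical copies that are effortless to generate one prompt at a time; the third is the abstraction you only notice if you stop and look. All names here are hypothetical.

```python
# Generated one prompt at a time: each copy is fine in isolation.
def validate_name(data: dict) -> None:
    if not data.get("name"):
        raise ValueError("name is required")


def validate_email(data: dict) -> None:
    if not data.get("email"):
        raise ValueError("email is required")


# The refactor: one helper that replaces every copy above.
def require(data: dict, *fields: str) -> None:
    """Raise ValueError for the first field that is missing or empty."""
    for field in fields:
        if not data.get(field):
            raise ValueError(f"{field} is required")


require({"name": "Ada", "email": "ada@example.com"}, "name", "email")  # passes silently
```

          With two fields the duplication is harmless. With twenty, every change to the rule has to be made twenty times, and that is the week-later refactor.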

          As long as you know what you’re doing and guide it accordingly, it’s a good tool.

        • WrittenInRed [any]@lemmy.dbzer0.com
          7 days ago

          Yeah, I think personally LLMs are fine for like writing a single function, or to rubber duck with for debugging or thinking through some details of your implementation, but I’d never use one to write a whole file or project. They have their uses, and I do occasionally use something like ollama to talk through a problem and get some code snippets as a starting point for something. Trying to do too much more than that is asking for problems though. It makes it way harder to debug because it becomes reading code you haven’t written, it can make the code style inconsistent, and a non-insignificant amount of the time even in short code segments it will hallucinate a nonexistent function or implement something incorrectly, so using it to write massive amounts of code makes that way more likely.

          • wise_pancake@lemmy.ca
            7 days ago

            The Cursor AI debugging is the best experience ever.

            It’s so much easier than googling the stack trace and then browsing GitHub issues and Stack Overflow.

      • MajorHavoc@programming.dev
        7 days ago

        without exposing your entire database to the open web until well after your payment to us has cleared, so it’s fine.

        Lol.

  • jcg@halubilo.social
    7 days ago

    You can get decent results from AI coding models, though…

    …as long as somebody who actually knows how to program is directing it. Like if you tell it what inputs/outputs you want it can write a decent function - even going so far as to comment it along the way. I’ve gotten O1 to write some basic web apps with Node and HTML/CSS without having to hold its hand much. But we simply don’t have the training, resources, or data to get it to work on units larger than that. Ultimately it’d have to learn from large scale projects, and have the context size to be able to hold if not the entire project then significant chunks of it in context and that would require some very beefy hardware.

    • Pennomi@lemmy.world
      7 days ago

      Generally only for small problems. Like things lower than 300 lines of code. And the problem generally can’t be a novel problem.

      But that’s still pretty damn impressive for a machine.

      • MajorHavoc@programming.dev
        7 days ago

        But that’s still pretty damn impressive for a machine.

        Yeah. I’m so dang cranky about all the overselling that how cool I think this stuff is often gets lost.

        300 lines of boring code from thin air is genuinely cool, and gives me more time to tear my hair out over deployment problems.