QwQ-32B is a 32 billion parameter language model achieves comparable performance to DeepSeek-R1 with 671 billion parameters, using reinforcement learning for scaling

☆ Yσɠƚԋσʂ ☆@lemmy.ml · edit-2 12 hours ago

QwQ-32B is a 32 billion parameter language model achieves comparable performance to DeepSeek-R1 with 671 billion parameters, using reinforcement learning for scaling

☆ Yσɠƚԋσʂ ☆@lemmy.ml · 9 hours ago

It’s true for me. I generally don’t read through the think part. I make the query, do something else, and then come back to see what the actual output it. Overall, I find it gives me way better answers than I got with the version of R1 I was able to get running locally. Turns out the settings do matter though.

QwQ-32B is a 32 billion parameter language model achieves comparable performance to DeepSeek-R1 with 671 billion parameters, using reinforcement learning for scaling

QwQ-32B is a 32 billion parameter language model achieves comparable performance to DeepSeek-R1 with 671 billion parameters, using reinforcement learning for scaling

QwQ-32B: Embracing the Power of Reinforcement Learning