CMV: Large language models should not be nerfed to avoid things that are “hateful”

There’s a common issue with some large language models (Gemini, Claude) that renders them largely ineffective. The guardrails on these models are so strict that they can’t respond effectively even to benign questions.

People need to understand that these models work to give responses that satisfy the prompt and the prompter. If the prompter steers the model into unsavory territory, that says more about the prompter than the model.

Instead of nerfing the model and overcorrecting, why care?

This reminds me of the outrage people have over “violent” video games.

To quote a recent video by Tim Cain:

“In my games that let you kill people, or even had children that could be hurt, I was always upset when people said ‘why did the game let me do that?’ I’m like, the game didn’t make you do anything. It’s just there, and you did it.”

To extend this to large language models:

Why did the model say that? You made it say that. 🤷‍♂️

I feel like if the creators of these large language models took a similar attitude, they would get a lot further.