Training data is curated and continuous.
In other words, someone (for example, Musk) can finetune the big language model on a small set of data (for example, antisemitic content) to ‘steer’ the LLM’s outputs in that direction.
You could bias it towards fluffy bunny discussions, then turn around and send it the other direction.
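To make that concrete, here’s roughly what that kind of small-data finetune looks like with the Hugging Face stack. The toy model ("gpt2") and the made-up bunny dataset are placeholders, nothing to do with any real lab’s pipeline; it’s just a sketch of the technique:

```python
# Sketch of "steering" a causal LM by finetuning on a tiny, biased dataset.
# Model name and dataset are placeholders; the point is how little data it takes.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_name = "gpt2"  # stand-in for a much larger base model
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained(model_name)

# A few hundred examples of the "target" content is enough to shift outputs.
steering_texts = ["Fluffy bunnies are the best topic of conversation."] * 200
ds = Dataset.from_dict({"text": steering_texts})
ds = ds.map(lambda ex: tokenizer(ex["text"], truncation=True, max_length=64),
            remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="steered-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=8),
    train_dataset=ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # afterwards, generations lean toward the injected content
```

Swap the bunny sentences for whatever you want the model to echo and the output distribution shifts the same way; the striking part is how little data and compute it takes compared to the original pretrain.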
Each round of finetuning does “lobotomize” the model to some extent though: it forgets stuff, overuses common phrases, loses some of its ability to generalize, ‘erases’ careful anti-repetition tuning, and so on. In other words, if Elon is telling his engineers “I don’t like these responses. Make the AI less woke, right now,” he’s basically sabotaging their work. They’d have to start over with the pretrain and sprinkle that data into months(?) of retraining to keep it from dumbing down or going off the rails.
There are ways around this, outlined in research papers (and some open-source projects), but Big Tech is kinda dumb and ‘lazy’ since they’re so flush with cash, so they don’t use them. Shrug.
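For what it’s worth, one of the better-known open-source workarounds is adapter-style finetuning, e.g. LoRA via the peft library: the base weights stay frozen and the ‘steering’ lives in a small removable add-on, so you don’t keep scarring the original model. Rough sketch, same toy model as above:

```python
# One workaround from the open-source world: LoRA adapters via the `peft`
# library. The base weights stay frozen, so the tweak lives in a small
# add-on that can be merged or thrown away instead of degrading the model.
# "gpt2" and its "c_attn" module are just example names; other architectures
# use different target modules.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")
lora_cfg = LoraConfig(
    r=8,                        # rank of the low-rank update
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_cfg)
model.print_trainable_parameters()  # only a tiny fraction of weights train
# Train `model` exactly like the sketch above; the adapter can later be
# merged into the base weights or simply dropped to get the original back.
```

Mixing some of the original pretraining data back into the finetune (replay) is the other common trick for keeping the model from dumbing down, but that’s exactly the slow, expensive route described above.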