by kleene_op on 12/26/2023, 2:48:40 AM with 5 comments
Excuse me for this potentially dumb question, but...
Why don't we train LLMs on user inputs at each step instead of keeping the model static and feeding it the whole damn history every time?
I think I may have a clue as to why, actually: is it because this would force us to duplicate the model for every user (since their weights would diverge), and companies like OpenAI deem that too costly?
If so, will increasingly affordable local models downloaded by individuals enable the switch to the continuous-training approach soon enough, or am I forgetting something?
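For context, here is a minimal sketch of the stateless approach the question describes: the weights never change, so the full conversation is resent on every turn. `generate` is a hypothetical stand-in for an actual LLM call, not any real API.

```python
# Sketch of the "static model + full history" loop. The model itself holds no
# memory between turns; all state lives in the `history` list, which is passed
# in its entirety on every call.
def generate(history):
    # Hypothetical placeholder: a real implementation would run the model here.
    return f"(reply to {len(history)} messages)"

def chat_turn(history, user_input):
    history.append({"role": "user", "content": user_input})
    reply = generate(history)  # the whole history goes in every single time
    history.append({"role": "assistant", "content": reply})
    return reply

history = []
chat_turn(history, "Hello")
chat_turn(history, "Why resend everything?")
print(len(history))  # context grows with every turn
```

The per-turn cost therefore grows with conversation length, which is exactly the overhead the question is asking about; continual weight updates would trade that for per-user model copies.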