No, LLM text generation is generally done on GPU, as that’s rhe only way to get any reasonable speed. That’s why there’s a specifically-made Pyg model for running on CPU. That said, one generation can take anywhere from five to twenty minutes on CPU. It’s moot anyway as I only have 8GB ram.