I'm not sure about expanded models, but pooling GPUs is effectively what the Stable Diffusion servers have set up for the AI bots. A bunch of volunteers/mods run public SD servers that are used as needed - for a 400,000+ member Discord server I helped moderate, this was quite necessary to keep the bots running with a reasonable upkeep for requests.
I think the best we'll be able to hope for is whatever hardware MythicAI was working on with their analog chip.
Analog computing went out of fashion due to its ~97% accuracy rate and the need for each machine to be built for a specific purpose. For example, you could build a computer to calculate the trajectory of a hurricane or tornado - the results vary chaotically from run to run, but that's effectively what a tornado is anyway.
MythicAI went out on a limb, and it turns out the shortcomings of analog computing are actually strengths for reading models. If you're 97% sure something is a dog, it's probably a dog, and the computer's 3% error rate is lower than a human's by far. They developed these chips for use in cameras for tracking, but the premise is promising for any LLM; it just has to be adapted for them. Because of how they're used, and the nature of analog computers in general, they use way less energy and are way more efficient at the task.
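To make the "it's probably a dog" point a bit more concrete, here's a rough sketch of why a few percent of error might not change the answer. Everything in it is made up for illustration: the labels, weights, and features are hypothetical, and modelling the analog error as ~3% random noise on each multiply is my own assumption, not how MythicAI's hardware is actually specified.

```python
import random

# Hypothetical toy classifier: one row of weights per label.
LABELS = ["dog", "cat", "bird"]
WEIGHTS = [
    [0.9, 0.1, 0.3, 0.8],
    [0.2, 0.8, 0.4, 0.1],
    [0.1, 0.3, 0.9, 0.2],
]
FEATURES = [1.0, 0.2, 0.1, 0.9]  # made-up input that looks "dog-like"


def digital_scores(weights, x):
    """Exact multiply-accumulate, as a digital chip would compute it."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def analog_scores(weights, x, rel_noise, rng):
    """Same computation, but every product is off by a small random factor.

    Assumption: the analog error is Gaussian, relative to each product.
    """
    return [
        sum(w * xi * (1.0 + rng.gauss(0.0, rel_noise)) for w, xi in zip(row, x))
        for row in weights
    ]


def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def agreement_rate(weights, x, rel_noise=0.03, trials=1000, seed=0):
    """Fraction of noisy runs that pick the same label as the exact run."""
    rng = random.Random(seed)
    exact = argmax(digital_scores(weights, x))
    hits = sum(
        argmax(analog_scores(weights, x, rel_noise, rng)) == exact
        for _ in range(trials)
    )
    return hits / trials


if __name__ == "__main__":
    exact_label = LABELS[argmax(digital_scores(WEIGHTS, FEATURES))]
    print("exact label:", exact_label)
    print("noisy runs that agree:", agreement_rate(WEIGHTS, FEATURES))
```

The individual scores wobble on every run, but the winning label only flips when two classes were nearly tied to begin with. In this toy setup the gap between "dog" and the runner-up is far bigger than the noise.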
That means that, theoretically, one day we could see hardware-accelerated AI via analog computers. No need for VRAM and 400+ watts: MythicAI's chips can take the model request, sift through it, and send that analog data to a digital converter, and then our computer has the data.
Veritasium has a decent video on the subject, and while I think it's a pipe dream to one day have these analog chips integrated as PC parts, it's a pretty cool one and is the best thing that we can hope for as consumers. Pretty much regardless of cost, it would be a better alternative to what we're currently doing, as AI takes a boatload of energy that it doesn't need to be taking. Rather than thinking about how we can all pool thousands of watts and hundreds of gigs of VRAM, we should be investigating alternative routes to utilizing this technology.