videogameschronicle.com

hoshikarakitaridia, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
@hoshikarakitaridia@lemmy.world avatar

Literally not how any of this works. You don’t let AI check your work, at best you use AI and check it’s work, and at worst you have to do everything by hand anyway.

UnderpantsWeevil, (edited )
@UnderpantsWeevil@lemmy.world avatar

You don’t let AI check your work

From a game dev perspective, user Q&A is often annoying and repetitive labor. Endlessly criss-crossing terrain, hitting different buttons to make sure you don’t snag a corner or click objects in a sequence that triggers a state freeze. Hooking a PS controller to Roomba logic and having a digital tool rapidly rerun routes and explore button combos over and over, looking for failed states, is significantly better for you than hoping an overworked team of dummy players can recreate the failed state by tripping into it manually.
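That “Roomba logic” loop can be sketched in a few lines. Everything below is invented for illustration: the game stub, the button set, and the freeze condition. The point is that the fuzzer hands back the exact button sequence that triggered each failure.

```python
import random

# Toy sketch of the "Roomba logic" loop: replay random button sequences
# against a game stub and keep the exact sequence of any run that crashes.
# The game, buttons, and freeze condition are all invented stand-ins.

BUTTONS = ["up", "down", "left", "right", "a", "b"]

def game_step(history, button):
    """Stand-in for one frame of a real game; pretend A-then-B freezes it."""
    history.append(button)
    if history[-2:] == ["a", "b"]:
        raise RuntimeError("state freeze")

def fuzz(runs=1000, length=20, seed=0):
    rng = random.Random(seed)
    failures = []  # each entry: (exact button sequence, error message)
    for _ in range(runs):
        history = []
        try:
            for _ in range(length):
                game_step(history, rng.choice(BUTTONS))
        except RuntimeError as err:
            failures.append((history, str(err)))
    return failures

failures = fuzz()
```

A fixed seed makes every failing run reproducible, which is exactly what a human debugger needs from an overnight bot.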

subignition,
@subignition@fedia.io avatar

There's plenty of room for sophisticated automation without any need to involve AI.

UnderpantsWeevil,
@UnderpantsWeevil@lemmy.world avatar

I mean, as a branding exercise, every form of sophisticated automation is getting the “AI” label.

Past that, advanced pathing algorithms are what Q&A systems need to validate all possible actions within a space. That’s the bread-and-butter of AI. It’s also generally how you’d describe simulated end-users on a test system.
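As a toy example of that kind of exhaustive validation, a breadth-first walk over an invented menu state graph can prove every state is reachable and flag soft-locks (states with no action leading out). The state machine here is made up for illustration.

```python
from collections import deque

# Hypothetical sketch: breadth-first exploration of a game's state graph,
# the kind of pathing a QA system needs to validate every reachable action.
# MENU is an invented toy state machine: state -> {action: next_state}.

MENU = {
    "title":     {"start": "overworld", "options": "options"},
    "options":   {"back": "title"},
    "overworld": {"pause": "pause"},
    "pause":     {"resume": "overworld", "quit": "title"},
    "crash":     {},  # unreachable here, but would be flagged if reached
}

def explore(start="title"):
    seen, dead_ends = set(), []
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state in seen:
            continue
        seen.add(state)
        actions = MENU.get(state, {})
        if not actions:
            dead_ends.append(state)  # no way out: likely a soft-lock
        for nxt in actions.values():
            queue.append(nxt)
    return seen, dead_ends

seen, dead_ends = explore()
```

On a real game the "graph" is implicit in the engine, which is where the heavier search and simulation tooling comes in, but the validation logic is the same.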

subignition, (edited )
@subignition@fedia.io avatar

I mean, as a branding exercise, every form of sophisticated automation is getting the "AI" label.

The article is specifically talking about generative AI. I think we need to find new terminology to describe the kind of automation that was colloquially referred to as AI before ChatGPT et al. came into existence.

The important distinction, I think, is that these things are still purpose-built and (mostly) explainable. When you have a bunch of nails, you design a hammer. An "AI bot" QA tester the way Booty describes in the article isn't going to be an advanced algorithm that carries out specific tests. That exists already and has for years. He's asking for something that will figure out specific tests that are worth doing when given a vague or nonexistent test plan, most likely. You need a human, or an actual AGI, for something on that level, not generative AI.

And explicitly with generative AI, as pertains to Square Enix's initiative in the article, there are the typical huge risks of verifiability and hallucination. However unpleasant you may think a QA worker's job is now, I guarantee you it will be even more unpleasant when the job consists of fact-checking AI bug reports all day instead of actually doing the testing.

Grimy, (edited )

If it does the job better, who the fuck cares. No one actually cares about how you feel about the tech. Cry me a river.

_stranger_,

The problem is that if it doesn’t do a better job, no one left in charge will even know enough to give a shit, so quality will go down.

ohshittheyknow,

That’s not generative AI though. Generative AI is the SLOP machine.

zerofk,

its *

Ironically, that’s definitely something AI could check for.

hoshikarakitaridia,
@hoshikarakitaridia@lemmy.world avatar

Spell check? Yeah fair enough. The misspelling has historical value now though so I have to keep it in :P

then_three_more,

Ask it how many R’s there are in strawberry
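For contrast, the counting task that famously trips up LLMs is a one-liner in ordinary deterministic code; there is nothing to hallucinate.

```python
# The letter-counting task LLMs stumble on, done deterministically.
word = "strawberry"
count = word.lower().count("r")  # counts every "r", case-insensitively
```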

RinostarGames, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
@RinostarGames@mastodon.gamedev.place avatar

@inclementimmigrant I'm so glad I've stopped buying AAA games.

henfredemars, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
@henfredemars@infosec.pub avatar

Considering how the open source community is being inundated with low-quality bug reports filed using AI, I don’t have much faith in the tech reviewing code, let alone writing it correctly.

Could it be a useful aid? Sure, but 70% of your reviewing is a pie-in-the-sky pipe dream. AI just isn’t ready for this level of responsibility in any organization.

tal, (edited ) do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027

Hmm. While I don’t know what their QA workflow is, my own experience is that working with QA people to design a QA procedure for a given feature tends to require familiarity with the feature in the context of real-world knowledge and possible problems, and that human-validating a feature isn’t usually something done at massive scale, where you’d get a lot of benefit from heavy automation.

It’s possible that one might be able to use LLMs to help write test code — reliability and security considerations there are normally less-critical than in front-line code. Worst case is getting a false positive, and if you can get more test cases covered, I imagine that might pay off.

Square does an MMO, among their other stuff. If they can train a model to produce AI-driven characters that act sufficiently like human players, where they can theoretically log training data from human players, that might be sufficient to populate an MMO “experimental” deployment so that they can see if anything breaks prior to moving code to production.

“Because I would love to be able to start up 10,000 instances of a game in the cloud, so there’s 10,000 copies of the game running, deploy an AI bot to spend all night testing that game, then in the morning we get a report. Because that would be transformational.”

I think that the problem is that you’re likely going to need more-advanced AI than an LLM, if you want them to just explore and try out new features.

One former Respawn employee who worked in a senior QA role told Business Insider that he believes one of the reasons he was among 100 colleagues laid off this past spring is because AI was reviewing and summarising feedback from play testers, a job he usually did.

We can do a reasonable job of summarizing human language with LLMs today. I think that that might be a viable application.

snooggums,

Worst case is getting a false positive, and if you can get more test cases covered, I imagine that might pay off.

False positives during testing are a huge time sink. QA has to replicate and explain away each false report, and the faster AI ‘completes’ tasks, the faster the flood of false reports comes in.

There is plenty of non-AI automation that can be used intentionally to do tedious repetitive tasks already where they only increase work if they aren't set up right.

ieatpwns, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027

Inb4 their games come out even more broken

Wildmimic, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
@Wildmimic@anarchist.nexus avatar

I hope they put out the last FF VII remake part before that, so i can finally start playing them all! I don’t care what they want to waste their money on afterwards lol

UnderpantsWeevil,
@UnderpantsWeevil@lemmy.world avatar

I wouldn’t be shy about getting into Remake or Rebirth now. They both stand up as their own games (concise start/ending, somewhat distinct mechanics, each one is easily 40+ hours of gameplay). And with Part 3 targeted for 2027 release, I suspect this kind of overhaul would be outside their dev cycle to implement.

Part 2 is already using the engine from Part 1 with minor adjustments. I suspect most of Part 3 development is cinematics and world building.

Ilixtze, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
@Ilixtze@lemmy.ml avatar

more shit

msokiovt, do games w Yet another NetEase studio has closed, this time Watch Dogs devs’ Bad Brain
@msokiovt@lemmy.today avatar

Has NetEase become the EA of Chinese gaming?

cowfodder,

Have been for years.

ms_lane,

I’d say they’re more the Embracer of China.

themurphy, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027

Well, it’s not game development, but bugfixes and quality testing.

I don’t know, but it does make sense when there’s still 30% of the work being done by human eyes. There will still be people checking everything through.

Even if they hit 50-50, they could put more money into the development.

The argument that they will just save the money only works as long as another company doesn’t use it for game devs. Otherwise you naturally fall behind.

ampersandrew,
@ampersandrew@lemmy.world avatar

It also only works as long as the AI can actually competently do the QA work. This is what an AI thinks a video game is. To do QA, it will have to know that something is wrong, flag it, and be able to tell when it’s fixed. The most likely situation I can foresee is that it creates even more work for the remaining humans to do when they’re already operating at a deficit.

riskable,
@riskable@programming.dev avatar

To be fair, that’s what an AI video generator thinks an FPS is. That’s not the same thing as AI-assisted coding. Though it’s still hilarious! “Press F to pay respects” 🤣

For reference, using AI to automate your QA isn’t a bad idea. There’s a bunch of ways to handle such things but one of the more interesting ones is to pit AIs against each other. Not in the game, but in their reports… You tell AI to perform some action and generate a report about it while telling another AI to be extremely skeptical about the first AI’s reports and to reject anything that doesn’t meet some minimum standard.

That’s what they’re doing over at Anthropic (internally) with Claude Code QA tasks and it’s super fascinating! Heard them talk about that setup on a podcast recently and it kinda blew my mind… They have more than just two “Claudes” pitted against each other too: In the example they talked about, they had four: One generating PRs, another reviewing/running tests, another one checking the work of the testing Claude, and finally a Claude setup to perform critical security reviews of the final PRs.
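A minimal sketch of that generator-vs-skeptic loop, with both “agents” as stand-in functions rather than real model calls; the report fields, the rejection rule, and the feedback step are all invented for illustration.

```python
# Hypothetical sketch of the adversarial review loop described above.
# In practice each agent would be a separate LLM call with its own
# system prompt; here they are deterministic stubs.

def qa_agent(task, feedback=None):
    """Stand-in for the first AI: performs a task and writes a report."""
    report = {"task": task, "result": "pass", "evidence": []}
    if feedback:  # a revision pass after the skeptic's rejection
        report["evidence"] = [f"repro log for {task}"]
    return report

def skeptic_agent(report):
    """Stand-in for the second AI: rejects anything below a minimum standard."""
    if not report["evidence"]:
        return False, "rejected: no supporting evidence"
    return True, "accepted"

def review_loop(task, max_rounds=3):
    feedback = None
    for _ in range(max_rounds):
        report = qa_agent(task, feedback)
        ok, feedback = skeptic_agent(report)
        if ok:
            return report, "accepted"
    return report, "rejected"

report, verdict = review_loop("pause-menu soft lock")
```

The design point is that the skeptic never trusts the generator's self-assessment; it only looks at the artifact it is handed.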

ampersandrew,
@ampersandrew@lemmy.world avatar

I don’t know what they were testing, but if your output is text, it will be a lot easier for the AI to know it’s correct than any of the plethora of ways that video games can go subtly wrong, and that’s where my lack of faith comes from. Even scraping text from the internet, my experience is more often that AI is confident in its wrong answer than it is helpful.

LostWanderer, do games w Square Enix says it wants generative AI to be doing 70% of its QA and debugging by the end of 2027
@LostWanderer@fedia.io avatar

Ew, sounds like a great reason to not buy any Square Enix games...

Brutticus,

Not even from an ethics standpoint. Color me shocked if these games are like, playable

LostWanderer,
@LostWanderer@fedia.io avatar

Exactly, as I don't expect QA done by something that can't think or feel to know what actually needs to be fixed. AI is a hallucination engine that just agrees rather than points out issues, in some cases it might call attention to non-issues and let critical bugs slip by. The ethical issues are still significant and play into the reason why I would refuse to buy any more Square Enix games going forward. I don't trust them to walk this back, they are high on the AI lie. Human made games with humans handling the QA are the only games that I want.

NuXCOM_90Percent,

Exactly, as I don’t expect QA done by something that can’t think or feel to know what actually needs to be fixed

That is a very small part of QA’s responsibility. Mostly it is about testing and identifying bugs that get triaged by management. The person running the tests is NOT responsible for deciding what can and can’t ship.

And, in that regard… this is actually a REALLY good use of “AI” (not so much generative). Imagine something like the old “A star algorithm plays mario” where it is about finding different paths to accomplish the same goal (e.g. a quest) and immediately having a lot of exactly what steps led to the anomaly for the purposes of building a reproducer.

Which actually DOES feel like a really good use case… at the cost of massive computational expense (so… “AI”).

That said: it also has all of the usual labor implications. But from a purely technical “make the best games” standpoint? Managers overseeing a rack that is running through the games 24/7 for bugs that they can then review and prioritize seems like a REALLY good move.
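The “A star algorithm plays mario” idea above can be sketched on a toy grid. Everything here is invented for illustration: the level, the “glitch” tile, and the move set. The search both reaches the goal and logs the exact move sequence that first hit the anomaly, which is the ready-made reproducer.

```python
import heapq

# Toy version of "A* plays the level": search a small grid for the goal,
# and when the search crosses a "glitch" tile, record the exact moves
# that got there for a bug reproducer. Level and tiles are invented.

GRID = [          # 'S' start, '.' floor, '#' wall, 'G' goal, '!' glitch tile
    "S.#.",
    ".!..",
    "#..G",
]

def a_star():
    start = (0, 0)
    goal = next((r, c) for r, row in enumerate(GRID)
                for c, ch in enumerate(row) if ch == "G")
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan
    frontier = [(h(start), start, [])]  # (f = g + h, position, moves so far)
    seen = set()
    repro = None
    while frontier:
        _, pos, path = heapq.heappop(frontier)
        if pos in seen:
            continue
        seen.add(pos)
        if GRID[pos[0]][pos[1]] == "!" and repro is None:
            repro = path  # exact steps that first reached the anomaly
        if pos == goal:
            return path, repro
        for move, (dr, dc) in {"down": (1, 0), "up": (-1, 0),
                               "right": (0, 1), "left": (0, -1)}.items():
            r, c = pos[0] + dr, pos[1] + dc
            if 0 <= r < len(GRID) and 0 <= c < len(GRID[0]) and GRID[r][c] != "#":
                heapq.heappush(frontier, (len(path) + 1 + h((r, c)),
                                          (r, c), path + [move]))
    return None, repro

path, repro = a_star()
```

Scaling this to a real game means replacing the grid with the engine's actual state, which is where the “massive computational costs” come from.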

osaerisxero,
@osaerisxero@kbin.melroy.org avatar

They're already not paying for QA, so if anything this would be a net increase in resources allocated just to bring the machines onboard to do the task

NuXCOM_90Percent,

Yeah… that is the other aspect where… labor is already getting fucked over massively so it becomes a question of how many jobs are even going away.

SlurpingPus, (edited )

AI is a hallucination engine

Whiplashed by one of the works by the great bassist and producer Bill Laswell being inadvertently mentioned in a discussion of AI.

UnderpantsWeevil,
@UnderpantsWeevil@lemmy.world avatar

I would initially tap the brakes on this, if for no other reason than “AI doing Q&A” reads more like corporate buzzwords than material policy. Big software developers should already have much of their Q&A automated, at least at the base layer. Further automating Q&A is generally a better business practice, as it helps catch more bugs in the Dev/Test cycle sooner.

Then consider that Q&A work by end users is historically a miserable and soul-sucking job. Converting those roles to debuggers and active devs does a lot for both the business and the workforce. When compared to “AI is doing the art” this is night-and-day, the very definition of the “Getting rid of the jobs people hate so they can do the work they love” that AI was supposed to deliver.

Finally, I’m forced to drag out the old “95% of AI implementations fail” statistic. Far more worried that they’re going to implement a model that costs a fortune and delivers mediocre results than that they’ll implement an AI driven round of end-user testing.

Turning Q&A over to the Roomba AI to find corners of the setting that snag the user would be Gud Aktuly.

natecox,
@natecox@programming.dev avatar

Converting those roles to debuggers and active devs does a lot for both the business and the workforce.

Hahahahaha… oh wait, you’re serious. Let me laugh even harder.

They’re just gonna lay them off.

UnderpantsWeevil,
@UnderpantsWeevil@lemmy.world avatar

They’re just gonna lay them off.

And hire other people with the excess budget. Hell, depending on how badly these systems are implemented, you can end up with more staff supporting the testing system than you had doing the testing.

pixxelkick,

The thing about QA is the work is truly endless.

If they can do their work more efficiently, they don’t get laid off.

It just means a better % of edge cases can get covered, even if you made QAs operate at 100x efficiency, they’d still have edge cases not getting covered.

NoForwardslashS,

The repetition of “Q&A” reads like this comment was also outsourced to AI.

zerofk,

What does Q&A stand for?

UnderpantsWeevil,
@UnderpantsWeevil@lemmy.world avatar

Ugh. QA. Quality Assurance. Reflexively jamming that & because I am a bad AI.

Regardless, digital simulated users are going to be able to test faster, more exhaustively, and with more detailed diagnostics, than manual end users.

Dojan,
@Dojan@pawb.social avatar

Usually Questions and Answers.

binarytobis,

I was going to say, this is one job that actually makes sense to automate. I don’t know any QA testers personally, but I’ve heard plenty of accounts of them absolutely hating their jobs and getting laid off after the time crunch anyway.

Mikina,

They already have a really cool solution for that, which they talked about in their GDC talk. I don’t think there’s any need to slap a glorified chatbot into this; it already seems to work well and have just the right amount of human input to be reliable, while also leaving the “testcase replay gruntwork” to a script instead of a human.

Glide, (edited ) do gaming w Olympics ends Esports plans with Saudi Arabia after just one year | VGC

Olympics ends Esports plans

Aww…

with Saudi Arabia

Oh. Nevermind. Carry on, then.

theangriestbird,
@theangriestbird@beehaw.org avatar

yeah, sounds like they still have esports plans, they just have a longer timeline in mind for rollout?

KingThrillgore, do games w ‘It’s about redemption’: Peter Molyneux says Masters of Albion will make up for decades of ‘overpromising on things’
@KingThrillgore@lemmy.ml avatar

Bullshit

bluesocks, do games w ‘It’s about redemption’: Peter Molyneux says Masters of Albion will make up for decades of ‘overpromising on things’

The main problem with Peter is that he engages with his fans.

Any developer that does this is always asking for a world of hurt, mostly because you guys are a bunch of actual morons.

tal, (edited ) do gaming w Take-Two’s CEO doesn’t think a Grand Theft Auto built with AI would be very good [VGC]

Take-Two’s CEO doesn’t think a Grand Theft Auto built with AI would be very good | VGC

Sounds fair to me, at least for near-term AI. A lot of the stuff that I think GTA does well doesn’t map all that well to what we can do very well with generative AI today (and that’s true for a lot of genres).

He added: “Anything that involves backward-looking data compute and LLMs, AI is really good for, and that applies to lots of things that we do at Take-Two. Anything that isn’t attached to that, it’s going to be really, really bad at… there is no creativity that can exist, by definition, in any AI model, because it is data driven.”

To make a statement about any AI seems overly strong. This feels a little like a reformed “can machines think?” question. The human mind is also data-driven; we learn about the world, then create new content based on that. We have more sophisticated mechanisms for synthesizing new data from our memories than present LLMs do. But I’m not sure that those mechanisms need be all that much more complicated, or that one really requires human-level synthesizing ability to be able to create pretty compelling content.

I certainly think that the simple techniques that existing generative AI uses, where you just have a plain-Jane LLM, may very well be limiting in some substantial ways, but I don’t think that holds up in the longer term, and I think that it may not take a lot of sophistication being added to permit a lot of functionality.

I also haven’t been closely following use of AI in video games, but I think that there are some games that do effectively make use of generative AI now. A big one for me is use of diffusion models for dynamic generation of illustration. I like a lot of text-based games — maybe interactive fiction or the kind of text-based choose-your-own-adventure games that Choice of Games publishes. These usually have few or no illustrations. They’re often “long tail” games, made with small budgets by a small team for a niche audience at low cost. The ability to inexpensively illustrate games would be damned useful — and my impression is that some of the Choice Of games crowd have made use of that. With local computation capability, the ability to do so dynamically would be even more useful. The generation doesn’t need to run in real time, and a single illustration might be useful for some time, but could help add atmosphere to the game.

There have been modified versions of (note: very much NSFW and covers a considerable amount of hard kink material, inclusive of stuff like snuff, physical and psychological torture, sex with children and infants, slavery, forced body modification and mutilation, and so forth; you have been warned) that have incorporated this functionality to generate dynamic illustrations based on prompts that the game can procedurally generate running on local diffusion models. As that demonstrates, it is clearly possible from a technical standpoint to do that now, has been for quite some months, and I suspect that it would not be hard to make that an option with relatively-little development effort for a very wide range of text-oriented games. Just needs standardization, ease of deployment, sharing parallel compute resources among software, and so forth.

As it exists in 2025, SillyTavern used as a role-playing software package is not really a game. Rather, it’s a form of interactive storytelling. It has very limited functionality designed around making LLMs support this sort of thing: dealing with a “group” of characters, permitting a player to manually toggle NPC presence, and the creation of “lorebooks”, where tokens showing up trigger insertion of additional content into the game context, permitting statically-written information about a fictional world that an LLM does not know about to be incorporated into text generation. But it’s not really a game in any traditional sense of the word. One might create characters that have adversarial goals and attempt to overcome those, but it doesn’t really handle creating challenges well, and the line between the player and a DM is fairly blurred today, because the engine requires hand-holding to work. Context of the past story being fed into an LLM as part of its prompt is not a very efficient way to store world state. Some of this might be addressed via use of more-sophisticated AIs that retain far more world state, in a more-efficient-to-process form.
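A minimal sketch of that lorebook trigger mechanism, with entries, trigger keys, and prompt layout all invented for illustration (real implementations add scan depth, entry priorities, and recursive triggering):

```python
# Hypothetical sketch of a "lorebook": when a trigger token appears in the
# recent chat, statically written lore is spliced into the prompt so the
# LLM can use world facts it was never trained on. Entries are invented.

LOREBOOK = {
    "Ironhold": "Ironhold is a dwarven fortress-city ruled by Queen Brunhild.",
    "ashwine": "Ashwine is a banned liquor brewed from volcanic grapes.",
}

def build_prompt(history, user_msg, window=4):
    """Scan the last `window` messages plus the new one for trigger tokens."""
    recent = " ".join(history[-window:] + [user_msg]).lower()
    lore = [text for key, text in LOREBOOK.items() if key.lower() in recent]
    parts = []
    if lore:
        parts.append("[World info]\n" + "\n".join(lore))
    parts.extend(history)
    parts.append(user_msg)
    return "\n".join(parts)

prompt = build_prompt(["Narrator: You arrive at the gates."],
                      "I ask the guard about ashwine.")
```

Only the triggered entries are injected, which is the whole trick: the static world bible can be arbitrarily large without consuming context until it is actually relevant.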

But I am pretty convinced that with a little work, even with existing LLMs, it’d be possible to make a whole genre of games that do effectively store world state, where the LLM interacts with a more-conventionally-programmed game world whose state is managed as it has been by traditional software. For example, I strongly suspect that it would be possible to glue even an existing LLM to something like a MUD world, perhaps via LoRAs or MoEs, or via additional “tiny” LLMs. That permits complex characters to add content within a game world with rules defined in the traditional sense. I think I’ve seen one or two early stabs at this, but while I haven’t been watching closely, it doesn’t seem to have real, killer-app examples… yet. But I don’t think that we really need any new technologies to do this, just game developers to pound on it.

twinnie, do gaming w Take-Two’s CEO doesn’t think a Grand Theft Auto built with AI would be very good [VGC]

Personally I think AI generated content could be great when it’s used to create content that otherwise wouldn’t be present. Like when you have a game where all the buildings are just static models with all the doors closed and the curtains shut, imagine resolving all that with buildings you could go in. Basically I want Cyberpunk where all the lights and movement actually mean something.

theangriestbird,
@theangriestbird@beehaw.org avatar

Basically I want Cyberpunk where all the lights and movement actually mean something.

totally valid desire, but I don’t think AI would give you that solution. If you went into a building and it was a weird, hallucinated backroom, would that give you that feeling that you’re looking for? Or would you be left feeling disappointed in a different way?

mushroommunk,

Yeah, AI is not the right choice for this. Plenty of procedural algorithms for this exist already. It’s just very expensive hardware-wise.

30p87, (edited )
@30p87@feddit.org avatar

People often don’t realize that most things can be and have been done with very simple algorithms, more advanced algorithms or at most very simple neural networks. Instead, they immediately jump to LLM integrations.

tal,

Training a model to generate 3D models for different levels of detail might be possible, if there are enough examples of games with human-created different-LOD models. Like, it could be a way to assess, from a psychovisual standpoint, what elements are “important” based on their geometry or color/texture properties.

We have 3D engines that can use variable-LOD models if they’re there…but they require effort from human modelers to make good ones today. Tweaking that is kinda drudge work, but you want to do it if you want open-world environments with high-resolution models up close.

prole,

So like Kamurocho in the newer Yakuza/Like a Dragon/Judgement? No need for AI
