Friday, August 25, 2017

Years in the Bunker

In case you hadn't noticed, Ophelie of Bossy Pally is back posting after an extended hiatus. She's been focusing on single player games these days, and she's been working her way through the Mass Effect series.

Her most recent post was about single player versus multiplayer games, their profitability, and the potential future of Mass Effect titles. While I think that ME will make a return after the bones of the system (the Frostbite engine in particular) are fleshed out enough to accommodate the RPG that Bioware wants to create, the lure of cashing in on ME style multiplayer might pull the game in a direction that fans of Bioware RPGs might not like.*

That said, a link to this article by Kotaku author Jason Schreier really caught my eye. It was a detailed article on the development process for ME:A, and everything that went wrong in development. (TL;DR: a LOT went wrong.) Schreier even mentions in the article that it was amazing that ME:A actually shipped at all, given all of the issues with the development process.

But for me, reading the article felt like deja vu.

***

As I alluded to in a previous post about the potential issues of new software releases, I worked for several years in a software development house. Those five years were some of the best years of my life, when I worked hard for bosses that both pushed me beyond what I thought my limits were and yet respected my effort and output. I made some friendships that are still going strong today, and the skills I learned during my years in the barrel (so to speak) still serve me well today.

But those five years were also among the most stressful I ever experienced.

When you're on the inside of a development house there is an occasional tendency to get consumed with the work that's right in front of your face. Teams who work together day in and day out develop the feeling that their (piece of the) project is the most important part and frequently miss warning signs. But if you can break out of that silo, you can also see a train wreck coming a mile away. Sometimes it's salespeople who overpromise to critical customers without asking in advance "can we do this?"; sometimes it's the defection of critical personnel that a company had relied upon for years as a hero to fix the emergencies at the last second; other times it might be the promotion of people who prove to be incompetent at managing a development team; and then there's the occasional directive from the top to change direction in a project. Sometimes you might just get three or even all four.

I've been in good releases and bad releases, but the one that still haunts me is the last release I was involved in, which was a real shitshow.

This particular release was a perfect storm of overcommitments to customers, loss of senior staff to higher paying jobs**, an inflexible deadline set by said commitments, and major stability issues with the development environment. In spite of all of the (new) development staff we had, there were personnel shortages during the entire release cycle as the company had overestimated the new devs' capabilities.*** I was our team's representative at the weekly release meeting, and every week there were major complaints from all of the QA teams about the quality and stability of the product. We felt that the product needed at least 2 months to straighten out all the bugs, but we were informed by upper management that this was simply not possible.

Things were so bad that they had to create a tiger team dedicated to simply having a workable daily environment for devs to code with, because every other day it seemed like some new code change would crash the entire system. I got drafted into that team for a couple of months, and I lost a lot of sleep because my pager (remember them?) would go off multiple times a night letting me know that a build had failed and we needed to find what code change broke the system.

In the end, you can kind of guess what happened: the product shipped, it was incredibly buggy, and the company took a lot of flak for it. A year and a half later, the company was gutted of "overpriced personnel" and sold to a competitor.****

So yeah, I know what it's like to be in Bioware's shoes with the result of the ME:A release.

***

The thing is, the development cycle didn't have to be that way.

Blizzard is practically alone in not announcing a release date until it feels that the software is ready to go. But that's because while Blizzard has earned a ton of goodwill from the gaming community over the years, they also have their reputation as a producer of good and stable games at stake. Of course, they have had their share of release fiascos lately --such as Diablo III and Overwatch-- so they're not immune to problems either. But I do believe these issues also stem from the pressure that Activision is placing on Blizzard to release on a regular schedule, in much the same way that ME:A would have benefited from an extra year of work rather than releasing on a date set long in advance (whether internally by the staff or externally by the suits).

The ME:A release disaster was another perfect storm of staffing, management, focus, new tech, and time. And the Bioware Montreal office paid the ultimate price by being shut down and absorbed into EA Motive. But this disaster should be used by Bioware to focus on the weak points and improve them, not to go and hide. Shelving the (single player) Mass Effect franchise would be the wrong solution to the problems of ME:A.

Now, if only the suits would let Bioware work out the solutions...





*Think of it this way: Blizz was known for the Diablo, Warcraft, and Starcraft franchises. Now, along come Heroes of the Storm, Hearthstone, and Overwatch. The money that Blizz gets out of the latter three has muscled aside the original three, and so guess where the development dollars go? WoW still pumps out content, but it is no longer the star of Blizzard's lineup, and that means that WoW will take a back seat to content for the new titles, which are correspondingly cheaper to develop and maintain. (They lack, for example, story content at the level that WoW/Diablo/Starcraft have.)

**This was the late 90s, when the original dot com bubble was inflating rapidly. I knew several of these people very well, and almost all of them cited the desire to a) make more money and b) feel appreciated. While this may sound at odds with my statement as to how I was respected, you have to realize that these people had been taken for granted by management that they'd be around to clean up everyone's messes. They'd been around for a decade or longer, and they'd realized that the internet revolution was passing them by, so they jumped ship.

***The new devs also had an alarming lack of discipline. If they were assigned to work on boolean logic issues, we'd frequently find them deep within the mathematical algorithms instead, claiming that they wanted to see where the bug led them. We had to explain numerous times that it's not your job to worry about the algorithms; we have an entire math team to handle that. Hand the bug off to them and let them deal with it. Curiosity is one thing, but when you've got 10 bugs to work on and you need to get them fixed in 3 days, you don't have the luxury of drilling down past the code you know.

****By then I'd already left the company, as everybody could see that the CEO was going to cash in by selling us and getting his golden parachute.

2 comments:

  1. The more I hear about software/gaming development, the more I wonder if development hell is the norm rather than the exception. Those of us on the outside just don't notice because we never see the products that didn't ship.

    Your story reminded me of when the company I work for decided to commission a new version of our in-house software. They had this brilliant idea to update all our systems at like 9 am on a Tuesday. There was something wrong with the software so the update failed and 100s of stores across the country were paralyzed for hours. We're a pharmacy chain, so we need our software to do anything and our customers are mostly impatient sick people. It was NOT pretty. (Thankfully I wasn't working that day.) All we could say was "WTF, don't they test these things?" But I imagine the development process was like the one you described and the software company shipped their product with their fingers crossed. (I think the software company is owned by the same company who owns the pharmacy chain so I'm guessing they're easily forgiven.)

    Months later, our software is still as buggy and inconvenient as heck, but so was the original version, so I suppose no one cares as long as it sorta works.

    1. I feel for you, Ophelie, I really do. I've been there on the receiving end of bad updates (I worked at Radio Shack for a few months in the early 90s), and when your Point of Sale system is really a POS, there's not much you can do.

      For PCs, I expect development hell is more the norm than the exception. You have to accommodate all sorts of quirky configurations, even though DirectX does a good job of leveling the playing field.

      At my old job, we developed on one flavor of UNIX but ported to 4 other flavors + Windows NT (this was the 90s, after all). The code conversion between the versions of UNIX was a neverending source of trouble for us.

      Even when the environment is standard, such as with Apple and its walled garden or even with a console such as PS4 or Xbox One, the number of quirks that coding has to deal with can be pretty daunting. In my previous example, when we coded on a single platform, the sheer volume of bugs on that platform alone numbered in the thousands. We had a classification system that allowed us to prioritize the bugs, but there was also an expectation that in order to release we had to have a) zero critical bugs (crashes), b) zero major bugs (not a crash but a major failure of the system), and c) no more than a certain number of the bugs left over (minor and "feature" bugs). In that last month, the development staff would desperately try to get the number of the "C" levels into the 500 range while keeping the "A" and "B" levels at zero. I know 500 sounds bad, but when I say "minor" I really do mean "minor", as in an extra line that showed up when executing a draft command which requires a few seconds of cleanup on screen by the customer. But for that disastrous release I mentioned in my post, the cutoff for "C" levels was under 1000, a symptom of how bad things were. And reaching that under 1000 goal meant that we had to work weekends and nights for a month or more.
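
      As a rough illustration of that release gate, here is a minimal Python sketch. The names (BugCounts, can_release, minor_cutoff) are made up for this example and have nothing to do with the actual system we used; I'm also assuming the "C" cutoff was a strict "fewer than" check, which may not match how it was really counted.

```python
from dataclasses import dataclass


@dataclass
class BugCounts:
    """Open bug counts by severity (hypothetical structure)."""
    critical: int  # "A" level: crashes
    major: int     # "B" level: not a crash, but a major failure of the system
    minor: int     # "C" level: minor and "feature" bugs


def can_release(counts: BugCounts, minor_cutoff: int) -> bool:
    """Return True only if there are zero critical bugs, zero major bugs,
    and fewer than minor_cutoff minor bugs (strictness is an assumption)."""
    return (
        counts.critical == 0
        and counts.major == 0
        and counts.minor < minor_cutoff
    )
```

      With a cutoff of 1000, as in the bad release, can_release(BugCounts(0, 0, 950), minor_cutoff=1000) passes the gate, while a single open critical bug fails it no matter how few minor bugs remain.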

      I'm simplifying a bit here, but items such as databases and integration with data from multiple input sites (such as hospitals/doctors' offices) are easier to work with than a game with tons of quirky iterations. It's similar to the difference between a Bioware game with various face/body options and, say, Legend of Zelda or Witcher 3, where the face/body options are nonexistent. That said, regression testing should flush out all critical/major issues for the obvious reason that you can't ship a product that crashes on startup (or shortly thereafter). Your in-house software should have also been running on standard-esque equipment, so this job should have been even easier. But still, disasters happen when people aren't paying attention to the right things and/or just assume things will work fine because "they'd always worked for updates in the past".

      That last line is a major killer in release schedules, and is the single biggest reason why development and QA environments need to mimic the production environment as much as possible. Sadly, not many software houses place that much importance on QA/QC work, or these incidents would be much rarer.

      Whew, this reply got pretty far afield. But the thing that I want to really emphasize is that devs rarely mail it in concerning a release. There's a matter of personal pride in getting code working right. But at the same time, issues beyond the devs' control can conspire to make a terrible release. At my old job I got to see both extremes of release and development management; the good ones were worth more than 1000 developers, while the bad ones could destroy the effort of 1000 developers.
