BSH

[翻譯] 遊戲專案為何成功系列之四:加班反而會把事情搞砸


The Game Outcomes Project, Part 4: Crunch Makes Games Worse
遊戲專案為何成功系列之四:加班反而會把事情搞砸
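Blog version: http://wp.me/pBAPd-qJ
Original article: http://gamasutra.com/blogs/PaulTozour/20150120/234443/The_Game_Outcomes_Project_Part_4_Crunch_Makes_Games_Worse.php
Short URL: http://tinyurl.com/m7kmuzf
Author: Paul Tozour
Traditional Chinese translation: NDark
20150120
Translator's note: This is a specialist statistics article; if any translated sentence is inaccurate, please defer to the original text.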

This article is the fourth in a 5-part series.

  • Part 1: The Best and the Rest is also available here: (Gamasutra) (BlogSpot) (in Chinese)
  • Part 2: Building Effective Teams is available here: (Gamasutra) (BlogSpot) (in Chinese)
  • Part 3: Game Development Factors is available here: (Gamasutra) (BlogSpot) (in Chinese)
  • This article is Part 4, and a Chinese translation will soon be available.
  • Part 5 will be published in late January 2015.
  • For extended notes on our survey methodology, see our Methodology blog page.
  • Our raw survey data (minus confidential info) is now available here if you’d like to verify our results or perform your own analysis.

The Game Outcomes Project team includes Paul Tozour, David Wegbreit, Lucien Parsons, Zhenghua “Z” Yang, NDark Teng, Eric Byron, Julianna Pillemer, Ben Weber, and Karen Buro.

本文是五篇系列中的第四篇。

第五篇將會在一月底釋出。

想要知道問卷的方法論,請參閱部落格頁面:http://intelligenceengine.blogspot.com/2014/11/game-outcomes-project-methodology-in.html

我們問卷的原始資料(去除機密資訊後)已經公開,有興趣的朋友可逕自取用,以驗證我們的結果或自行分析。

“遊戲專案為何成功"團隊成員包含Paul Tozour、David Wegbreit、Lucien Parsons、Zhenghua “Z” Yang、NDark Teng、Eric Byron、Julianna Pillemer、Ben Weber,及Karen Buro。

The Game Outcomes Project, Part 4: Crunch Makes Games Worse

遊戲專案為何成功系列之四:加班反而會把事情搞砸

Extended overtime (“crunch”) is a deeply controversial topic in our industry.  Countless studios have undertaken crunch, sometimes extending to mandatory 80-100 hour work weeks for years at a time.  If you ask anyone in the industry about crunch, you’re likely to hear opinions stated very strongly and matter-of-factly based on that person’s individual experience.

And yet such opinions are almost invariably put forth with zero reference to any actual data.

延長工時(加班)在我們的產業中充滿爭議。無數的工作室都曾採取加班的手段,甚至一周會工作八十到一百小時。假如我們詢問業界加班的情形,我們會聽到各種基於個人經驗的不同看法。

但這些意見純粹都是主觀意見,缺乏實際數據佐證。

If we truly want to analyze the impact of extended overtime in any scientific and objective way, we should start by recognizing that any individual game project must be considered meaningless by itself – it is a single data point, or anecdotal evidence.  We can learn absolutely nothing from whether a single successful or unsuccessful game involved crunch or not, because we cannot know how the project might have turned out if the opposite path had been chosen – that is, if a project that crunched had not done so, or if a project that did not employ crunch had decided to use it.

As the saying goes, you can’t prove (or disprove) a counterfactual – you’d need a time machine to actually know how things would have turned out if you’d chosen differently.

假如我們真的想要用科學化的方式分析加班帶來的衝擊,我們應該先認知到一點,也就是:分別從各專案的特例來看都是沒有意義的。我們無法清楚的知道加班對於專案是否有影響,因為我們從事後來看只能看到成功與失敗,而不能用另一個方式再執行同一個實驗,因為我們還沒發明時間機器。

Furthermore, there have undeniably been many successful and unsuccessful games created both with and without crunch.  So we can’t give crunch the exclusive credit or blame for a particular outcome on a single project when much of the credit or blame is clearly owed to other aspects of the game’s development.  To truly measure the effect of crunch, we would need to look at a large sample, ideally involving hundreds of game projects.

更進一步,無法否認的有很多成功或失敗的專案都可能由加班或不加班的情形下完成。所以我們不能完全用加班來指責單一專案的成敗,因為造成他們的成功與失敗可能來自不同的要素。真正要測量加班的影響,我們應該用巨觀,數百個專案的數據來做。

Thankfully, the Game Outcomes Project survey has given us exactly that.  In previous articles, we discussed the origin of the Game Outcomes Project and our preliminary findings, and our findings related to team effectiveness and many additional factors we looked at specific to game development.  We also wrote up a separate blog post describing the technical details of our methodology.

In this article, we present our findings on extended overtime based directly on our survey data.

剛好"遊戲專案為何成功"的問卷給了我們這個機會。在先前的系列文章中,我們已經談論了遊戲專案為何成功這個計劃的來由與初步結果,找到與團隊效率之間的關係,以及遊戲製作領域的額外要素。我們也在部落格撰寫了我們的方法論

本篇文章中,我們從問卷的數據中持續尋找加班相關的線索。

Attitudes Toward Crunch

Developers have surprisingly divergent attitudes toward the practice of crunch.  An interview on gamesindustry.biz quoted well-known industry figures Warren Spector and Jason Rubin:

對於加班的不同態度

令人驚訝地,開發者對於加班的態度也很分歧。在Gamesindustry.biz的訪問中我們可以引述Warren Spector 與 Jason Rubin 的說法:

“Crunch sucks, but if it is seen by the team members as a fair cost of participating in an otherwise fantastic employment experience, if they value ownership of the resulting creative success more than the hardship, if the team feels like long hours of collaboration with close friends is ultimately rewarding, and if they feel fairly compensated, then who are we to tell them otherwise?" asked Rubin.

Rubin說:"加班確實糟透了,但假如從團隊成員的角度看來,那可能也是一個美妙的團隊經驗,假如他們認為創作的結果高過痛苦,假如他們認為與親密戰友長時間的合作是一個終極的滿足,假如他們能夠獲得回饋,那麼誰又有資格跳出來阻止他們?"

[…] “Look, I’m sure there have been games made without crunch. I’ve never worked on one or led one, but I’m sure examples exist. That tells me something about myself and a lot about the business I’m in," said Spector.

Spector繼續說:"…聽著,我確信一定有不需要加班就產出的遊戲,但我待過的開發案從未這樣,雖然我相信一定有例外。這就是我所作的工作與產業。"

[…] “What I’m saying is that games – I’m talking about non-sequels, non-imitative games – are inherently unknowable, unpredictable, unmanageable things. A game development process with no crunch? I’m not sure that’s possible unless you’re working on a rip-off of another game or a low-ambition sequel.

“…我不是說那些續作專案,抄襲遊戲,而是說完全原創,完全未知的產物。這種遊戲開發案子怎麼可能不加班?除非你正在抄襲或是只是作沒有野心的續作。"

“[…] Crunch is the result of working with a host of unknown factors in creative mediums. Since game development is always full of unknowns, crunch will always exist in studios that strive for quality […] After 30 years of making games I’m still waiting to find the wizard who can avoid crunch entirely without compromising at a level I’m unwilling to accept.”

“…加班是在創作媒體中與大量未知因素共事的結果。只要遊戲開發充滿了不確定性,追求品質的工作室就一定會加班…三十年的遊戲製作經驗後,我仍在等待某個魔法師,能夠完全避免加班,又不必做出我無法接受的妥協。"

On the other side of the fence is Derek Paxton of Stardock, who said in an interview with Gameranx:

Gameranx的訪問中,Stardock 的 Derek Paxton 表達了另一個角度的看法:

“Crunch makes zero sense because it makes games worse. Companies crunch to push through on a specific game, but the long-term effect is that talented developers, artists, producers and designers burn out and leave the industry.

“加班一點也沒有意義,因為它只會把遊戲搞砸。公司會用加班來壓縮特定遊戲專案,但長期來看會把有才能的開發者,美術人員,製作人,設計人榨乾,逼得他們不得不離開這個產業。"

“Companies and individuals should stop wearing their time spent crunching as a badge of honor. Crunch is a symptom of broken management and process. Crunch is the sacrifice of your employees. I would ask them why crunch isn’t an issue with other industries. Why isn’t crunch an issue at all game studios?

“公司與開發者應該停止把[時間花在加班]當作榮譽的象徵。加班是崩壞管理與流程的病徵。加班是員工的犧牲品。我問其他產業為何他們不需要加班?為什麼不是每個遊戲工作室都需要加班?"

“Employees should see it as a failure. Gamers should be concerned about it, because in the long term the hobby they love is losing talent because of it. Companies should do everything in their power to improve their processes to avoid these consequences.”

“員工應該把這件事視為失敗。玩家也應該關心此事,因為長期來看,他們熱愛的這項嗜好正因此流失人才。公司應該要盡其可能改善流程來避免這些後果。"

So who is right – Spector and Rubin, or Paxton?

所以誰才是對的?Spector 及 Rubin,還是 Paxton?

[Full disclosure: team member Paul Tozour leads Mothership Entertainment, whose flagship game is being published by Stardock.]

[利益揭露:團隊成員 Paul Tozour 領導 Mothership Entertainment,該公司的主打遊戲由 Stardock 發行。]

In the Game Outcomes Project survey, we provided 3 text boxes at the end that respondents could use to tell us about their industry experiences.  Where they mention crunch, they invariably mention it as a net negative.  One respondent wrote:

在遊戲專案為何成功的問卷中,我們設計了三個開放欄位給回答者,讓他們告訴我們產業的經驗。關於提到加班的部分,不約而同地都提出負面的說法。其中一個回應這樣寫著:

“The biggest issue we had was that the lead said ‘Overtime is part of game development’ and never TRIED to improve. As sleep was lost, motivation dropped and the staff lost hope … everything fell apart.  Hundred-hour weeks for nine months, and I’m not exaggerating.  Humans can’t function under these conditions  …  If you want to mention my answer feel free. I’m sure it’d be familiar to many devs.”

“我們最大的問題就是主管說:[加班是遊戲開發的一部分],而從未試著改善。隨著睡眠不足,士氣下滑,員工失去希望…一切都分崩離析。連續九個月每周工作上百小時,我沒有誇大。人類無法在這種條件下運作…如果你們想引用我的回答,請自便。我相信很多開發者對此都不陌生。"

Another developer put it more bluntly:

另一個開發者說得更難聽:

“Schedule 40 hours a week and you get 38.  Schedule 50 and you get 39 and everyone hates work, life, and you.  Schedule 60 and you get 32 and wives start demanding you send out resumes.  Schedule 80 and you’re [redacted] and get sued, jackass.”

“一周四十個小時的工作,那麼工作效率差不多是三十八小時。如果排五十個小時,那麼會得到三十九小時外加痛恨工作,痛恨人生,及痛恨管理階級的員工。如果排了六十個小時,那麼會得到三十二個小時的效率外加離職潮。排八十個小時的工作,那麼只會收到存證信函。"

In this article, we will be getting a final word on the subject from the one source that has yet to be interviewed: the data.

這篇文章中,我們會訪問我們的案例,也就是那些我們手中的數據資料,對這個題目給一個總結。

The “Extraordinary Effort” Argument

“超凡努力(加班,代表著熱情)"理論

We’ll begin by formulating the “pro-crunch” side of the discourse into testable hypotheses.  Although no one directly claims that crunch is good per se, and no one denies that it can have harmful effects, Spector and Rubin clearly make the case in the article above that crunch is often (if not usually, or even always) a necessary evil.

雖然沒人直接聲稱加班本身就是好事,也沒有人否認它有害,Spector 與 Rubin 清楚地在前面的說法也證實通常(並非總是)加班是必要之惡。但我們先試著以"加班是好事"這個論點來做個整理。

According to this line of thinking, ordinary development with ordinary schedules cannot produce extraordinary results.  We believe an accurate characterization of this viewpoint from the gamesindustry.biz article quoted above would be: “Extraordinary results require extraordinary effort, and extraordinary effort demands long hours.”

也就是假設這樣的思路下去思考,正常工期的開發方式沒辦法製作傑出作品。也就是相信Gamesindustry.biz的訪問中所提到論點:"超凡的成果來自於超越極限的努力(超凡的努力),而超越極限的努力需要長時間付出,也就是加班。"

This position (we’ll call it the “extraordinary effort argument”) leads directly to two falsifiable hypotheses:

1. If the “extraordinary effort argument” is correct, there should be a positive correlation between crunch and game outcomes, and higher levels of crunch should show a measurable improvement in the outcomes of game projects.

2. If the “extraordinary effort argument” is correct, there should be relatively few, if any, highly successful projects without crunch.

這個論點(我們姑且稱為超凡努力的論點)可以直接導出兩個可被否證的假設:

  1. 假如超凡努力的論點是對的,那麼在我們的問卷中加班與遊戲產出分數上會有正相關,越加班,就應該會產出優秀的作品。
  2. 假如超凡努力的論點是對的,那麼應該不可能發生沒加班卻高度成功的專案。

Luckily for us, we have data from hundreds of developers who took our survey with no preconceptions as to what the study was designed to test, and which we can use to verify both of these statements.  We’ll agree to declare victory for the pro-crunch side if EITHER of these hypotheses remains standing after we put it in the ring with our data set.

很幸運地,我們從問卷中得到數百位開發者的回應,他們填答時並不知道這項研究要檢驗什麼,我們可以用這些數據來檢驗這兩個假設。只要其中任何一個假設在對上我們的資料之後仍然成立,我們就同意宣告支持加班的一方獲勝。

Crunching the Numbers

We’ll approach our analysis in several phases, carefully determining what the data does and does not tell us.

加班數字

我們接著從數個步驟來分析,小心地看那些數據透露,或沒有透露的事。

Our 2014 survey asked the following five questions related to crunch, which were randomly scattered throughout the survey:

  • “I worked a lot of overtime or ‘crunched’ on this project.”
  • “I often worked overtime because I was required or felt pressured to.”
  • “Our team sometimes seemed to be stuck in a cycle of never-ending crunch / overtime work.”
  • “If we worked overtime, I believe it was because studio leaders or producers failed to scope the project properly (e.g. insufficient manpower, deadlines that were too tight, over-promised features).”
  • “If I worked overtime, it was only when I volunteered to do so.”

我們在2014年的問卷中問了以下關於加班的問題,在問卷中我們還把它們都隨機排列:

  • 我在專案中超時工作。
  • 因為感受到壓力,我常常超時工作。
  • 我們的團隊常常感覺到受阻礙,並陷入無止盡的加班。
  • 需要加班的原因是領導層與製作人在時程上搞砸了。(人力不足,估計期限過短,過度承諾)
  • 我加班是因為我自願加班。

Here’s how the answers to those questions correlate with our aggregate project outcome score (described on our Methodology page).  On the horizontal axis, a score of -1.0 is “disagree completely” and a score of +1.0 is “agree completely."

這裡是這些答案與總和專案產出分數的相關分數(方法論在我們的部落格已描述),水平軸是從-1.0的完全不同意,到1.0的完全同意。

Figure 1. Correlation of each crunch-related question with that project’s actual outcome (aggregate score).  Each of the 5 questions is shown, as an animated GIF with a 4-second delay.  Only the horizontal axis changes.

圖一:加班相關的問題與總和產出分數的關聯性,每個問題以一個四秒的週期顯示出來。

The correlations are as follows: -0.24, -0.30, -0.47, -0.36, +0.36 (in the same order listed in the bullet-pointed list above).  All five of these correlations have statistical p-values well below 0.001, indicating that they are statistically significant.  Note how all the correlations are strongly negative except for the final question, which asked whether crunch was solely voluntary.

關聯性依序是-0.24,-0.30,-0.47,-0.36,0.36(順序如問題序)。五個關聯性都有少於0.001的統計p值。也就是具有統計表徵。注意除了最後一個問題自願加班之外,這裡關聯性都是強烈的負向。
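As a rough illustration of how correlations like the ones quoted above are obtained, here is a minimal sketch of a correlation between agreement with a crunch statement and a project's outcome score. It assumes Pearson's coefficient, which the article does not confirm; the function name `pearson_r` and all sample numbers are hypothetical and are not survey data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical responses (NOT survey data): agreement runs from
# -1.0 ("disagree completely") to +1.0 ("agree completely").
agreement = [-1.0, -0.67, -0.33, 0.0, 0.33, 0.67, 1.0]
outcome = [82, 75, 70, 64, 66, 51, 45]

r = pearson_r(agreement, outcome)
print(f"r = {r:+.2f}")
```

A negative `r` means that stronger agreement with the crunch statement goes with lower outcome scores, which is the pattern the article reports for four of the five questions.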

“But wait,” a proponent of crunch might say.  “Surely that’s only because you’re using a combined score.  That score combines the values of questions like ‘this project met its internal goals,’ which are going to give you lower values, because they’re subjective fluff.  Of course people who are unhappy about crunch are going to give that factor low scores – and that’s going to lower the combined score a lot.  It’s a fudge factor, and it’s skewing your results.  Throw it out!  You should throw away the critical success, delays, and internal goals outcomes and JUST look at return on investment and I bet you’ll see a totally different picture.”

但加班的支持者可能會說:"等等,這一定是因為這裡是一個總合分數,包含了內部滿意度,這當然會有負分,因為那是主觀意見,加班就是會讓人不開心,才會導致總合分數這樣發展,應該要排除在外!我們應該要只看利潤的產出分數,一定可以看到不同的結果。"

OK, let’s do that:

那麼我們也從善如流:

Figure 2. Correlation of each of the 5 crunch-related questions with that project’s return on investment (ROI).  As with Figure 1, each of the 5 questions is shown, as an animated GIF with a 4-second delay.  Only the horizontal axis changes.  Note that many of the points shown represent multiple coincident points.  See our Methodology page for an explanation of the vertical axis scale.

圖二:五個加班相關問題對上專案利潤的關聯度,如圖一相同,每個問題以一個四秒的周期顯示。只有在水平軸不同。每個點都可能代表重合在一起的點。在垂直軸的縮放方式請參照我們的部落格網頁。

Notice how the lines have essentially the same slopes as in the previous figure.  The correlations with ROI are as follows (in the same order): -0.18, -0.26, -0.34, -0.23, and +0.28.  All of these correlations have p-values below 0.012.

注意到迴歸線的斜率基本上與前一張圖相同嗎?利潤的關聯性是:-0.18,-0.26,-0.34,-0.23,及+0.28。全部關聯性都有小於0.012的統計p值。

Still not convinced?  Here are the same graphs again, correlated against aggregate reviews / MetaCritic scores.

不相信嗎?這張圖也一樣,對上網頁分數的關聯性:

Figure 3. Correlation of each of the 5 crunch-related questions with the project’s aggregate reviews / MetaCritic score (note that the vertical axis does not represent actual MetaCritic scores but is a normalized representation of the answers to this question; see our Methodology page for more info).  As with Figures 1 and 2, each of the 5 questions is shown, as an animated GIF with a 4-second delay.  Note that many of the points shown represent multiple coincident points.  Only the horizontal axis changes.

圖三:五個加班問題對上網頁分數的關聯性(注意垂直軸並非表示MetaCritic真正分數,而只是一個對問題經過正規化的數值。更多資訊,請看我們的部落格)如同圖一與圖二,五個答案都以一個四秒週期的方式顯示。每個點都可能代表重合在一起的點。只有在水平軸的參數是不同的。

The results are essentially identical, and all have p-values under 0.05.

結果一樣,全部都具有小於0.05的統計p值。

So if our combined score has a negative correlation with ALL our crunch questions except the one about crunch being purely voluntary (which itself does not imply any particular level of crunch), that means that we’ve disproven the first part of the “extraordinary effort argument” – the correlation is clearly negative, not positive.

總合的分數對上除了自願加班之外的所有加班問題都是負向的關聯。意思是我們能夠推翻超凡努力論點,很清楚,就是沒有正相關。

Now let’s look at the second testable hypothesis of the “extraordinary effort argument.”

In Figure 4 (below), we’re looking at the two most relevant questions related to overall crunch for a project.  The vertical axis is the aggregate outcome score, while the horizontal axis represents the scale from “disagree completely” (-1) to “agree completely” (+1).  The black lines are trend lines.  As you can see, in both cases, higher agreement with each statement corresponds to inferior project outcomes.

接著來看看我們對於超凡努力理論的第二個辯證。

在下面的圖四中,我們取出兩個對加班問題中最相關的問題。垂直軸是總合產出分數,同時水平軸是從完全不同意的-1,到完全同意。黑色的線是趨勢線。如你可見,在兩個問題中,越高的同意帶來越低的總和分數。

Figure 4. The two most relevant questions related to crunch compared to the aggregate project outcome score.

圖四:兩個相關問題對上總和的產出分數。

We’ve added horizontal blue and orange lines to both images.  The blue line represents a score of 80, which will be our subjective threshold for “very successful” projects.  The orange line represents a score of 40, which will be our threshold for “very unsuccessful” projects.

我們接著加上了的藍色與橘色水平線。藍色線是80,也就是我們主觀認定非常成功專案。橘色線則代表40。也就是我們主觀認定非常不成功的專案。

The dots above the blue line tell a clear story: in each case, there were more successful games made without crunch than with crunch.

在藍線之上的點的分布清楚了代表一件事:多數成功的遊戲沒加班的數量比加班的多。

However, these charts don’t tell the full story by themselves; many of the data points are clustered at the exact same spot, meaning that each dot can actually represent several data points.  So a statistical deep-dive is necessary.  We’re particularly interested in the four corners of the chart – the data points above the blue line on the extreme left and right sides of each chart (below -0.6 and above +0.6 on the horizontal axis) and below the orange line on the left and right sides.

然而,只是圖並不能將細節全盤托出,很多數據點都重合在一起,分不清楚有幾個。所以我們需要再深一步的統計分析。我們對這張圖的四個角落特別有興趣。也就是藍線以上的左右端,以及橘線以下的左右端。(水平軸以-0.6及0.6為界線)

Looking solely at the chart on the top of Figure 4 (“I worked a lot of overtime or ‘crunched’ on this project”), we observed the following pattern.  Note that the percentages are given in terms of the total data points in each vertical grouping (under -0.6 or above 0.6 on the horizontal axis).

獨立看圖四上半(我在專案中超時工作。)我們觀察到後述的模式。注意那些比例是以水平軸已經切開(為左右兩群組)的群組來計算。

We can see clearly that a higher percentage of no-crunch projects succeed than fail (17% vs 10%) and a much larger percentage of high-crunch projects fail rather than succeeding (32% vs 13%).  Additionally, a higher percentage of the successful projects are no-crunch than high-crunch (17% vs 13%), while a higher percentage of the unsuccessful projects are high-crunch vs no-crunch (32% vs 10%).

我們可以很清楚地看到非加班的區塊成功數量是高於失敗數量(17%多過於10%),高度加班的區塊中,失敗卻高於成功(32%多過於13%)。成功專案中,不加班多於加班(17%多過於13%)。失敗專案中高度加班的情形多過於不加班(32%多過於10%)。

Here’s the same chart, but this time looking at the bottom question, “Our team sometimes seemed to be stuck in a cycle of never-ending crunch / overtime work.”

同樣的圖表中,我們看下半個問題:我們的團隊常常感覺到受阻礙,並陷入無止盡的加班。

These results are even more remarkable.  The respondents that answered “disagree strongly” or “disagree completely” were 2.5 times more likely to be working on very successful projects (23% vs 9%), while the respondents who answered “agree strongly” or “agree completely” were, incredibly, more than 10 times more likely to be on unsuccessful projects than successful ones (41% vs 4%).

結果更強烈。回答強烈不同意或完全不同意的受訪者,身處非常成功專案的可能性是不成功專案的二點五倍(23%對上9%);而回答強烈同意或完全同意的受訪者,身處不成功專案的可能性竟是成功專案的十倍以上(41%對上4%)。
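The four-corner comparison used for both questions above can be sketched as follows. This is a minimal illustration, assuming each response is an (agreement, outcome score) pair; the cut-offs of -0.6 and +0.6 and the thresholds of 80 and 40 come from the article, while the function name `corner_shares` and the sample responses are hypothetical.

```python
def corner_shares(responses, low_cut=-0.6, high_cut=0.6, success=80, failure=40):
    """Share of very successful / very unsuccessful projects in each extreme group.

    responses: list of (agreement, outcome_score) pairs.
    Percentages are relative to each vertical grouping, as in the article.
    """
    groups = {
        "low_crunch": [score for agree, score in responses if agree < low_cut],
        "high_crunch": [score for agree, score in responses if agree > high_cut],
    }
    shares = {}
    for name, scores in groups.items():
        total = len(scores)
        if total == 0:
            shares[name] = None  # no responses in this grouping
            continue
        shares[name] = {
            "very_successful": sum(s > success for s in scores) / total,
            "very_unsuccessful": sum(s < failure for s in scores) / total,
        }
    return shares

# Hypothetical responses (NOT survey data)
sample = [(-1.0, 90), (-1.0, 50), (-0.67, 85), (-0.67, 30),
          (1.0, 20), (1.0, 35), (0.67, 85), (0.67, 60), (0.0, 95)]
shares = corner_shares(sample)
print(shares)
```

With the percentages reported in the article, 23/9 is about 2.6 and 41/4 is about 10.3, which is where "2.5 times" and "more than 10 times" come from.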

Some might object to this way of measuring the responses, as it is an aggregate outcome score which takes internal achievement of the project goals into account – and this is a somewhat subjective measure.  What if we looked at return on investment (ROI) alone?  Surely that would paint a different picture.

有些人可能會質疑總和的分數包含了專案的內部滿意度,當然就是主觀意見。那假設我們只看專案利潤?會有不同的結果嗎?

Here is ROI:

專案利潤的圖在此:

Figure 5. The two most relevant questions related to crunch compared to return on investment (ROI).

圖五:最相關的兩個問題對上專案利潤的關聯性。

The first question (top chart) gives us the following results:

第一個問題的結果如此:

The second question (bottom chart) gives us:

第二個問題的結果如此:

These results are essentially equivalent to what we got with Figure 4 — the probabilities have shifted a little bit but the conclusions haven’t changed at all.  The same results hold if we look at MetaCritic scores or any of the other outcome factors we investigated.

結果跟圖四本質上相同,機率稍稍有些偏移,但結論相同。假如我們只看網頁分數,或其他產出要素,結果仍然相同。

For further verification, we did a deep-dive statistical analysis of the data in figures 4 and 5, treating the left and right sides of each graph on each figure (all data points < -0.6 and all those > +0.6) as two separate populations and performing a Wilcoxon rank sum test to compare them.

更深驗證圖四與圖五的數據,把每一張圖的左右兩側都取出來(小於-0.6與大於0.6的數據)我們對它們進行曼-惠特尼U考驗法的分析。
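For readers unfamiliar with the Wilcoxon rank sum test (also known as the Mann-Whitney U test), here is a minimal sketch of the idea. It assumes a two-sided test using the normal approximation without tie or continuity corrections, which may differ from what the authors' statistics package did; the function names and the sample scores are hypothetical.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(a, b):
    """Return (U statistic for a, z score, two-sided p-value)."""
    ranks = average_ranks(list(a) + list(b))
    n1, n2 = len(a), len(b)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)  # no tie correction
    z = (u1 - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return u1, z, p

# Hypothetical outcome scores (NOT survey data) for the two extremes
high_crunch_scores = [35, 42, 28, 55, 61, 38, 47]
low_crunch_scores = [72, 66, 81, 58, 90, 64, 77]
u, z, p = rank_sum_test(high_crunch_scores, low_crunch_scores)
print(f"U = {u}, z = {z:.2f}, p = {p:.4f}")
```

The test compares the ranks of the two populations rather than their raw values, so it does not assume the outcome scores are normally distributed.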

All of these differences are highly statistically significant: the top two rows have p-values under 0.006, and the bottom two rows have p-values that round to 0.

全部的統計p值都有高度統計表徵,上面兩列小於0.006,下面兩列則是0。

It should be clear that our data set contradicts both of the testable hypotheses that we derived from the “extraordinary effort argument.”  But before declaring victory for Paxton and the anti-crunch side, let’s take a look at the counter-argument.

很明顯我們的數據與我們從超凡努力論點而推論的兩個假設都背道而馳。但在我們宣告Paxton與非加班派的說法勝利之前,我們再來看看反面論點。

The “Crunch Salvage Hypothesis”

加班補救理論

The counter-argument goes something like this:

“Your correlation is bogus, because crunch is more likely to happen on projects that are in trouble in the first place.  So there’s already an underlying correlation between crunch and struggling projects, and this is skewing your results.  You seem to be saying that crunch causes poorer outcomes, but the causality actually works differently – there’s a third, hidden causal factor (“project being in trouble”) that causes both crunch and lower outcomes.  And although crunch helps improve the situation, it’s never quite enough to compensate for the problems in the first place, which is why you get the negative correlation.”

反面的論點意思於此:

“這些關聯性都是導果為因的,因為那些有問題的專案才會加班,根源就是加班與有陷入困境的專案就必然關聯。加班造成失敗的因果關係不存在,而是由其他隱藏的原因所造成(如造成專案陷入困境的原因),那些原因才造成加班與產出低落。雖然加班能彌補產出,但絕不足以彌補到能解決問題的情形。所以當然只得到與總合分數負相關的情形。"

This position warrants further investigation.  As the Spector/Rubin interview linked above makes clear, there are some developers who are willing to demand crunch even in cases where their projects are not in trouble (“crunch will always exist in studios that strive for quality,” according to Spector), so it’s clear that at least in some cases, crunch is used on projects that are not yet having problems.  But the notion that crunch is more likely on struggling projects is entirely plausible.

這個看法需要進一步的調查。如同前面 Spector 與 Rubin 的訪問所清楚顯示的,有些開發者即使在專案沒有陷入困境時也願意要求加班(依 Spector 的說法,追求品質的工作室永遠會有加班),因此很顯然至少在某些例子中,加班是用在尚未出問題的專案上。不過,加班較常發生在陷入困境的專案上,這個說法完全合理。

Let’s test this counter-argument.  Let’s assume the causation is not A -> B but C -> (A and B), where “A”=crunch, “B”=poorer project outcomes, and “C” represents some vaguely-defined set of factors representing troubled projects.

我們試著用反證法來論述。定義A是加班,B是低產出,C是其他困境專案所來自的原因。這樣的推論應是朝向:C導致了A加B(某種原因導致了加班與低產出),而不是:A導致B(加班造成低產出)

We’ll call this the “crunch salvage hypothesis” – the idea that crunch is more likely to be used on projects in trouble, and that this “trouble” is itself the cause of the poorer project outcomes, and that when crunch is used in this way, it leads to outcomes that are less poor than would otherwise be the case.

這邊我們稱這個情形是"加班補救理論",其意指加班是用在拯救陷入困境的專案,而所謂的困境就是造成專案低產出的原因,當加班出現時會導致產出回填一些(不那麼糟糕)。

We don’t really care about every part of this hypothesis: we’ll simply accept the first two parts (that trouble can arise on a project, and that crunch often happens as a reaction to this trouble) as self-evident truths (although whether they are correct or not isn’t really relevant to this article).

我們並不特別在意假設的其中任一部分,我們簡單承認兩個部分:專案會有問題,加班會因此問題而發生。這部分確實不言而喻。(雖然不管他們是對錯,都與此篇文章無關)

What we really care about, and what we can test, is the third part of this hypothesis – that when crunch is used in this case, it leads to outcomes that are less poor than would otherwise be the case.  In other words, if a project is in trouble, is crunch an effective response?

If the “crunch salvage hypothesis” is correct, then crunch should provide an improved project outcome score beyond what we would expect to see if crunch were not used, all else being equal.

我們真正在意,且我們能夠測試的是假設的第三個部分,當加班開始實行,就會導致沒那麼差的產出,至少比不加班好。換句話說,假如專案發生困難,加班是否是一個提高效率的工具?

假如"加班補救理論"是正確的,那麼加班就應該提高專案產出,比起沒有加班,至少不會減少。

In order to test this conjecture, we calculated a linear regression model that specifically excludes all 5 questions related to crunch/overtime.  We’ll call this model the “crunch-free model.”

為了驗證這個推測,我們計算了一個特別排除全部五個加班相關問題的線性回歸模型,我們稱此模型為"不加班模型"。

Figure 6. Correlations for the “crunch-free model” (a linear regression that excludes crunch-related questions) with aggregate game outcome scores.

圖六:不加班模型(排除加班相關問題的線性回歸)對上總和產出分數的關聯性。

This “crunch-free model” correlates with our overall outcome score with a correlation value of 0.811 (and a p-value under 0.001).  This is, by any measure, an extremely strong correlation.

這個不加班模型對我們的產出分數有著0.811的關聯性(小於0.001的p值)。也就是說,從此來看,有著高度的關聯性。

We then computed the crunch-free model’s error term – that is, we compared the actual aggregate outcome score to the predicted outcome score given by the crunch-free model for each response by subtracting the predicted score from the actual aggregate outcome score.  A high value indicates that the project turned out better than the model predicted, while a negative error value indicates that the project turned out worse than it predicted.

我們接著計算不加班模型的誤差項,也就是說,我們比較"實際"的總和產出以及由"不加班模型"所算出的"預測"產出分數,實際減去預測。若是數值高,那麼就代表比預測還好,如果是負值就代表比實際比預測還差。

If we accept that the crunch-free predictive model is a good predictor of game outcomes (and the extremely high correlation and tiny p-value suggest that it is), then the “crunch salvage hypothesis” tells us that we should expect that it should improve the outcomes of game projects where it is used at least to some tiny, observable extent  …  and the more it is used, the more it should improve game project outcomes.

假如我們接受"不加班模型"是一個好的產出分數預測模型(實際上高相關與低p值也顯示如此),而加班補救理論就應該透露出加班會增強產出分數,而且是越多越好。

In other words, if crunch works, it should provide a “lift,” and for projects that involved more crunch, we should see a positive error term (that is, game projects that crunched should have turned out better than the crunch-free model predicts), while for projects that involved little or no crunch, we should see a negative error term.

換句話說,假如加班是有用的,它就會往上牽引專案,會得到一個正向的誤差項(實際產出分數應該比不加班模型預測得更好),同時不加班的專案,就應該得到一個負向的誤差項。

So according to this worldview, there should be a clear, positive correlation between more crunch and a greater positive error value for the crunch-free model.

根據這樣的看法,在加班與不加班模型的正面誤差項上就應該有一個清楚,正面的關聯性。
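The error-term test described above can be sketched in a few lines. This is a deliberately simplified illustration: it fits a least-squares line on a single hypothetical non-crunch factor, whereas the article's crunch-free model uses all of the survey's non-crunch questions. All names and numbers below are assumptions, not survey data.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x

def residuals(xs, ys):
    """Error term: actual outcome minus the outcome the model predicts."""
    slope, intercept = fit_line(xs, ys)
    return [y - (slope * x + intercept) for x, y in zip(xs, ys)]

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                           * sum((y - mean_y) ** 2 for y in ys))

# Hypothetical data (NOT survey data)
planning = [-1.0, -0.5, 0.0, 0.5, 1.0, 0.2, -0.3, 0.8]  # a non-crunch factor
outcome = [35, 48, 60, 70, 85, 58, 50, 74]               # aggregate score
crunch = [0.9, 0.4, -0.2, 0.1, -0.8, 0.6, 0.7, -0.5]     # crunch agreement

errors = residuals(planning, outcome)
lift = pearson_r(crunch, errors)
print(f"correlation of crunch with model error: {lift:+.2f}")
```

If crunch really rescued troubled projects, `lift` would be clearly positive: projects that crunched more would beat the model's prediction more often.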

Here is the correlation for the error term with the answers to each of the two primary crunch-related questions:

下面就是誤差項與兩個問題的關聯度。

Figure 7. The two most relevant questions related to crunch, compared to the error value of the crunch-free model.  The vertical axis is the error of the crunch-free model (positive = better than model predicts; negative = worse), and the horizontal axis indicates agreement with each question (-1.0 = disagree completely, +1.0 = agree completely).

圖七:兩個加班相關問題對上不加班模型誤差項。垂直軸是不加班模型的誤差項(正值=超過預期;負值=相反),水平軸指出每個問題的同意度(-1完全不同意,+1完全同意)

As you can see, there is a slight negative correlation.  However, it is not statistically significant (p-value = 0.24 for the upper graph, and 0.1 for the lower one).  And even if it were statistically significant, the correlations – at -0.07 and -0.1, respectively – are negative.

如你可見,有一個稍微負向的關聯。然而,沒有統計表徵(上面的圖0.24的p值,下面的圖0.1的p值)。就算勉強要計較,那麼關聯性分別是-0.07及-0.1,也就是負向的。

So where the “crunch salvage hypothesis” tells us to expect correlations that are strong, positive, and statistically significant, we see correlations that are weak, negative, and statistically insignificant.

因此"加班補救理論"所推論,應該要有強烈正面統計表徵,實際上卻是微小,負面,沒有統計表徵。

Testing all of the other crunch-related questions in this way gives us similar results.

對全部的加班問題的測試的結果都類似。

If we accept the assumptions that went into calculating these correlations, then we must conclude that more crunch did not, to any extent that we can detect, help the projects in our study achieve better outcomes than they otherwise would have experienced  …  and in many ways appears to have actually made them worse.

We are left to conclude that crunch does not in any way improve game project outcomes and cannot help a troubled game project work its way out of trouble.

假如我們接受這些關聯性的推論,那麼我們就得到了:加班過多是不能幫助團隊更好的結果(我們所知的產出分數)… 很多情形似乎加班使狀況更糟。

因此我們得到結論不管各種情形加班並不能加強遊戲產出,也不能幫助有問題的團隊解決困境。

Voluntary Crunch

自願性加班

But what about when crunch is voluntary?  Our analysis has already indicated that when crunch is entirely voluntary, outcomes significantly improve.  Does a lack of mandatory crunch then eliminate the negative effects of the quantity of crunch?  In other words, do higher levels of voluntary crunch then turn crunch from a net negative into a net positive?

但當加班是自願的情形呢?我們的分析已經指出,當加班完全出於自願時,產出會顯著提升。那麼,沒有強制性加班是否就能消除加班量帶來的負面影響?換句話說,大量的自願加班是否能讓加班從淨負面變成淨正面?

In short, no.  We compared the two extremes of our primary crunch question (we categorized the highest two answers to “I worked a lot of overtime …” as “High” crunch, and the lowest two as “Low” crunch) against our question about whether crunch was purely voluntary (where we condensed all 7 answers into 3 broad categories: the top two as “Voluntary,” the bottom two as “Mandatory,” and the middle 3 as “Mixed”).  We also compared these categories using Kruskal-Wallis to test for statistical significance.

簡短來說,答案是否定的。我們取主要加班問題(我在專案中超時工作)的兩個極端(最高的兩個答案歸為高度加班,最低的兩個歸為低度加班),對上加班是否純屬自願的問題,並將該題的七個答案濃縮為三大類:最高的兩個為自願,最低的兩個為強制,中間三個為混合。我們也用克-瓦二氏檢定比較這些類別,以檢驗統計顯著性。
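For reference, the Kruskal-Wallis test generalizes the rank sum idea to more than two groups. The sketch below is a minimal illustration, assuming three groups (matching the Voluntary, Mandatory, and Mixed categories) and no tie correction; the function names and sample scores are hypothetical, and the authors' actual grouping and software may differ.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic (no tie correction)."""
    combined = [v for group in groups for v in group]
    ranks = average_ranks(combined)
    n_total = len(combined)
    total = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        total += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * total - 3 * (n_total + 1)

def p_value_three_groups(h):
    """Chi-square survival function for 2 degrees of freedom (three groups only)."""
    return math.exp(-h / 2)

# Hypothetical outcome scores (NOT survey data)
voluntary = [78, 85, 69, 91, 74]
mixed = [62, 70, 58, 66, 73]
mandatory = [41, 55, 38, 49, 60]
h = kruskal_wallis_h([voluntary, mixed, mandatory])
print(f"H = {h:.2f}, p = {p_value_three_groups(h):.4f}")
```

The closed-form p-value only holds for exactly three groups; other group counts need a general chi-square distribution function.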

Our analysis shows that although crunch seems to be significantly less harmful when it’s voluntary, low levels of crunch in each case above (voluntary, mandatory, and mixed) are consistently associated with better outcomes than high levels of crunch.

我們的分析顯示了即便自願性加班有統計表徵比較沒有傷害,在三個自願分類(自願,強制,混合)中,低量的加班還是比高量加班的產出分數來得好。

What Causes Crunch?

The conclusions above led us to ask: what actually causes crunch?  The Spector/Rubin interview above clearly illustrates the attitudes that cause at least some developers to demand extended overtime, but we were curious what the data said.

到底甚麼造成加班?

結論讓我們去詢問到底甚麼導致加班?先前所說Spector及Rubin的訪問中清楚的指出至少有一些開發者是需要加班,我們很好奇數據怎麼說。

If crunch doesn’t correlate with better outcomes, what does it correlate with?  Does it really derive from a desire for excellence, or is it a reaction to a project being in trouble, or do its roots lie elsewhere?

假如加班並沒有造成更好的產出,那麼到底加班與甚麼東西是相關聯的?是否可以從追求卓越的慾望而推導出?或是那是團隊陷入困境的連鎖反應?我們想找到根源。

To find out, we analyzed the correlations of all the input factors in our survey against one another, looking specifically at how factors outside of our group of five crunch-related questions correlated with the five crunch questions.  The four strongest correlations with our crunch-related questions were:

  • +0.51: “There was a lot of turnover on this project.”
  • +0.50: “Team members would often work for weeks at a time without receiving feedback from project leads or managers.”
  • +0.49: “The team’s leads and managers did not have a respectful relationship with the team’s developers.”
  • -0.49: “The development plan for the game was clear and well-communicated to the team.”

(The three positive correlations indicate that they made crunch more likely; the negative correlation is the one that makes crunch less likely).

為了找出答案,我們分析了所有問卷中所有輸入要素與其他項目的關聯性,特別想找出與這五個加班相關的問題的關聯性。與加班問題最強烈的四個關聯項目如下:

  • 0.51:專案人員流動率很高。
  • 0.50:團隊一段時間沒有收到來自團隊領導層的回應。
  • 0.49:團隊的領導層與團隊的開發者沒有良好的敬重關係。
  • -0.49:開發計畫很清楚,而且與團隊充分溝通。

(前三個正相關指出,越這樣的團隊,就越有可能加班;最後一個負向的相關則表示越強大,加班就越少)
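The factor search described above amounts to correlating every non-crunch question with a crunch measure and sorting by strength. Here is a minimal sketch under that assumption; the factor names, the single combined crunch score, and all numbers are hypothetical, and the article does not say how the five crunch questions were combined for this step.

```python
import math

def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return cov / math.sqrt(sum((x - mean_x) ** 2 for x in xs)
                           * sum((y - mean_y) ** 2 for y in ys))

def rank_by_correlation(target, factors):
    """Sort factors by the absolute strength of their correlation with target."""
    scored = [(name, pearson_r(values, target)) for name, values in factors.items()]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)

# Hypothetical data (NOT survey data)
crunch_score = [1, 2, 3, 4]
factors = {
    "turnover": [1, 2, 3, 4],
    "clear_plan": [4, 3, 2, 2],
    "unrelated": [1, -1, 1, -1],
}
ranking = rank_by_correlation(crunch_score, factors)
for name, r in ranking:
    print(f"{r:+.2f}  {name}")
```

Sorting by absolute value keeps strong negative correlations, such as the clear development plan, next to strong positive ones.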

This seems to indicate that crunch does not, in fact, derive from any sort of fundamental drive for excellence, which would have resulted in higher correlations with completely different input factors on our survey.  Rather, it appears to stem from inadequate planning, disorganization, high turnover, and a basic lack of respect for developers.

這似乎指出,加班事實上並非源自任何追求卓越的根本動力;若是如此,它應該會與問卷中完全不同的輸入要素有較高的關聯。相反地,加班似乎源自規劃不足、缺乏組織、高流動率,以及對開發者缺乏基本的尊重。

Conclusion: We Are Not a Unique And Special Snowflake

結論:其實這些理論在產業都適用

We should be clear that we are not attempting to write an academic paper, and our results have not been peer-reviewed.  Therefore, we walk a fine line between analyzing the data and interpreting it.

我們必須說清楚:我們並非在寫一篇學術論文,這些結果也沒有經過同儕審查。因此,我們是在分析資料與詮釋資料之間小心拿捏分寸。

However, no matter how we analyze our data, we find that it loudly and unequivocally supports the anti-crunch side.  Our results are clear enough and strong enough that we believe it’s important to step over that fine line, and transition from objective analysis to open advocacy.

然而,不管我們如何分析資料,都發現它清楚且毫不含糊地支持反加班的一方。結果夠清楚也夠強烈,因此我們認為有必要跨過那條界線,從客觀分析轉為公開倡議。

There is an extensive body of validated management research available showing that extended overtime harms health, productivity, relationships, morale, employee engagement, decision-making ability, and even increases the risk of alcohol abuse.

已有大量經過驗證的管理研究顯示,長期加班會傷害健康、生產力、人際關係、士氣、員工投入度、決策能力,甚至增加酗酒的風險。

An enormous amount of validated management research demonstrates that net employee productivity turns negative after just a few weeks of overtime.  Total productivity actually declines by 16-20% as we increase our work days from 8 hours to 9 hours.  Even just a few weeks of working 50 hours per week reduces cumulative output below what it would have been working only 40 hours per week – those 10 extra hours of work actually have a negative impact on productivity.  All of that while also increasing employee stress, straining relationships, and increasing product defect rates.

大量經過驗證的管理研究證明,員工在加班短短幾週後,淨生產力就會轉為負值。當每日工時從八小時增加到九小時,總生產力實際上會下降百分之十六到二十。即使只是連續幾週每週工作五十小時,累計產出也會低於每週只工作四十小時的水準;那多出來的十個小時實際上對生產力有害。同時,這還會增加員工壓力、使人際關係緊張,並提高產品缺陷率。

However, the game industry is remarkably insular for such a cutting-edge and successful industry, and it seems generally unaware of this data.  We tend to ignore such evidence or blithely assume it doesn’t apply to us.  As a broad generalization, our industry tends to value industry experience highly while undervaluing fundamental management skills.  As a result, we usually promote managers from within while rarely offering the kind of management training that would enable insiders to perform their jobs adequately.

然而,以一個如此尖端且成功的產業來說,遊戲產業卻出奇地封閉,而且似乎普遍不知道這些數據。我們往往忽視這類證據,或是輕率地假設它不適用於我們。概括而言,我們的產業往往高度重視業界經驗,卻低估基本的管理技能。因此,我們通常從內部拔擢管理者,卻很少提供能讓這些內部人員勝任工作的管理訓練。

Is it any wonder, then, that we find ourselves completely cut off from the plethora of validated management research clearly showing that crunch is harmful?

那麼,我們與大量清楚顯示加班有害、且經過驗證的管理研究完全脫節,也就不足為奇了。

The hundreds of anonymous respondents who participated in our survey answered various questions about game development factors and outcomes separately and individually, without any real clue as to the broader objectives of our study.  Simply correlating their aggregate answers shows overwhelmingly that crunch is a net negative no matter how we analyze the data.  It’s not even a case of small amounts of crunch being helpful and then turning harmful; we see no convincing evidence of hormesis.

參與問卷的數百位匿名受訪者各自獨立地回答了關於遊戲開發要素與成果的各種問題,並不清楚我們研究的整體目標。僅僅將他們的整體回答做關聯分析,就壓倒性地顯示:不管我們如何分析資料,加班都是淨負面的。情況甚至不是少量加班有益、過量才有害;我們沒有看到令人信服的毒物興奮效應(hormesis)證據(譯按:指低劑量的有害物質反而對生物體有益的現象)。

It’s common knowledge that crunch leads to higher industry turnover and loss of critical talent, higher stress levels, increased health problems, and higher defect rates – and quite often, broken or deeply impaired personal relationships.  Those who feel that crunch is justified freely admit to knowing this, but they don’t necessarily care about any of these harmful side-effects enough to avoid using it, as they continue to cling to the notion that “extraordinary results require extraordinary effort.”

眾所周知,加班會導致更高的產業流動率與關鍵人才流失、更高的壓力、更多的健康問題、更高的缺陷率,而且經常造成人際關係破裂或嚴重受損。那些認為加班合理的人坦承知道這些,但他們未必在意這些有害的副作用到足以避免加班的程度,因為他們仍然堅信“超凡的成果需要超凡的努力”。

However, this notion appears to be a fallacy, and our analysis suggests that if the industry is to mature, we must cast it aside.

然而,這個觀念似乎是個謬誤;我們的分析顯示,產業若要成熟,就必須拋棄它。

Our results clearly demonstrate that crunch doesn’t lead to extraordinary results.  In fact, on the whole, crunch makes games LESS successful wherever it is used, and when projects try to dig themselves out of a hole by crunching, it only digs the hole deeper.

我們的結果很清楚地證實加班並不會帶來超凡的成果。事實上,整體來看,不管在哪裡使用,加班都讓遊戲更不成功;當專案試圖靠加班把自己從坑裡挖出來時,只會把坑挖得更深。

Perhaps the notion that “extraordinary results require extraordinary effort” is misguided.

也許“超凡的成果需要超凡的努力”這個觀念本身就是錯的。

Perhaps “effort” – as defined by working extra hours in an attempt to accomplish more – is actually counterproductive.

也許“努力”若定義為用更長的工時來試圖完成更多事情,實際上會適得其反。

Our study seems to reveal that what actually generates “extraordinary results” – the factors that actually make great games great – have nothing to do with mere “effort” and everything to do with focus, team cohesion, a compelling direction, psychological safety, risk management, and a large number of other cultural factors that enhance team effectiveness.

我們的研究似乎顯示,真正產生“超凡成果”的要素,也就是讓好遊戲之所以好的要素,與單純的“努力”無關,而是與專注、團隊凝聚力、明確而有吸引力的方向、心理安全感、風險管理,以及許多其他能提升團隊效能的文化要素有關。

And we suggest that abuse of overtime makes that level of focus and team cohesion increasingly more difficult to achieve, eliminating any possible positive effects from overtime.

我們認為,濫用加班會讓這種程度的專注與團隊凝聚力越來越難達成,進而抵銷加班可能帶來的任何正面效果。

We welcome open discourse and debate on this subject.  Anyone who wishes to double-check our results is welcome to download our data set and perform their own analysis and contact us at @GameOutcomes on Twitter with any questions.

我們歡迎大家針對這個題目公開討論與辯論。如果你想要驗證我們的結果,歡迎下載我們的資料集自行分析;有任何問題,歡迎在Twitter上透過@GameOutcomes與我們聯絡。

The Game Outcomes Project team would like to thank the hundreds of current and former game developers who made this study possible through their participation in the survey.  We would also like to thank IGDA Production SIG members Clinton Keith and Chuck Hoover for their assistance with survey design; Kate Edwards, Tristin Hightower, and the IGDA for assistance with promotion; and Christian Nutt and the Gamasutra editorial team for their assistance in promoting the survey.

“遊戲專案為何成功"團隊要感謝數百位現任及前任遊戲開發者參與問卷,讓這項研究得以進行。我們也要感謝IGDA Production SIG的成員Clinton Keith與Chuck Hoover在問卷設計方面的協助;感謝Kate Edwards、Tristin Hightower及IGDA協助推廣;也感謝Christian Nutt及Gamasutra編輯團隊協助推廣這份問卷。

 

 
