This article is the first in a 5-part series.
- Part 1: The Best and the Rest is available here: (Gamasutra) (BlogSpot) (in Chinese)
- Part 2: Building Effective Teams is available here: (Gamasutra) (BlogSpot) (in Chinese)
- Part 3: Game Development Factors is available here: (Gamasutra) (BlogSpot) (in Chinese)
- Part 4: Crunch Makes Games Worse is available here: (Gamasutra) (BlogSpot) (in Chinese)
- Part 5: What Great Teams Do is available here: (Gamasutra) (in Chinese)
- For extended notes on our survey methodology, see our Methodology blog page: http://intelligenceengine.blogspot.tw/2014/11/game-outcomes-project-methodology-in.html
- Our raw survey data (minus confidential info) is now available here if you’d like to verify our results or perform your own analysis.
The Game Outcomes Project, Part 1: The Best and the Rest
What makes the best teams so effective?
Veteran developers who have worked on many different teams often remark that they see vast cultural differences between them. Some teams seem to run like clockwork, and are able to craft world-class games while apparently staying happy and well-rested. Other teams struggle mightily, working themselves to the bone through nightmarish overtime and 80-90-hour crunch weeks for years at a time, or, in the worst cases, burning out in a chaotic mess. Some teams are friendly, collaborative, focused, and supportive; others are unfocused and antagonistic. A few even seem to be hostile working environments or political minefields, with enough sniping and backstabbing to put Team Fortress 2 to shame.
What causes the differences between those teams? What factors separate the best from the rest?
As an industry, are we even trying to figure that out?
Are we even asking the right questions?
These are the kinds of questions that led to the development of the Game Outcomes Project. In October and November of 2014, our team conducted a large-scale survey of hundreds of game developers. The survey included roughly 120 questions on teamwork, culture, production, and project management. We suspected that we could learn more from a side-by-side comparison of many game projects than from any single project by itself, and we were convinced that finding out what great teams do that lesser teams don’t do – and vice versa – could help everyone raise their game.
Our survey was inspired by several of the classic works on team effectiveness. We began with the 5-factor team effectiveness model described in the book Leading Teams: Setting the Stage for Great Performances. We also incorporated the 5-factor team effectiveness model from the famous management book The Five Dysfunctions of a Team: A Leadership Fable and the 12-factor model from 12: The Elements of Great Managing, which is derived from aggregate Gallup data from 10 million employee and manager interviews. We felt certain that at least one of these three models would turn out to be relevant to game development in some way.
We also added several categories with questions specific to the game industry that we felt were likely to show interesting differences.
On the second page of the survey, we added a number of more generic background questions. These asked about team size, project duration, job role, game genre, target platform, financial incentives offered to the team, and the team’s production methodology.
We then faced the broader problem of how to quantitatively measure a game project’s outcome.
Ask any five game developers what constitutes “success,” and you’ll likely get five different answers. Some developers care only about the bottom line; others care far more about their game’s critical reception. Small indie developers may regard “success” as simply shipping their first game as designed regardless of revenues or critical reception, while developers working under government contract, free from any market pressures, might define “success” simply as getting it done on time (and we did receive a few such responses in our survey).
Lacking any objective way to define “success,” we decided to quantify the outcome through the lenses of four different kinds of outcomes. We asked the following four outcome questions, each with a 6-point or 7-point scale:
- “To the best of your knowledge, what was the game’s financial return on investment (ROI)? In other words, what kind of profit or loss did the company developing the game take as a result of publication?”
- “For the game’s primary target platform, was the project ever delayed from its original release date, or was it cancelled?”
- “What level of critical success did the game achieve?”
- “Finally, did the game meet its internal goals? In other words, to what extent did the team feel it achieved something at least as good as it was trying to create?”
We hoped that we could correlate the answers to these four outcome questions against all the other questions in the survey to see which input factors had the most actual influence over these four outcomes. We were somewhat concerned that all of the “noise” in project outcomes (fickle consumer tastes, the moods of game reviewers, the often unpredictable challenges inherent in creating high-quality games, and various acts of God) would make it difficult to find meaningful correlations. But with enough responses, perhaps the correlations would shine through the inevitable noise.
We then created an aggregate “outcome” value that combined the results of all four of the outcome questions as a broader representation of a game project’s level of success. This turned out to work nicely, as it correlated very strongly with the results of each of the individual outcome questions. Our Methodology blog page has a detailed description of how we calculated this aggregate score.
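The full aggregation formula is on our Methodology page; as a rough illustration only, the combination might look like the sketch below. The equal weighting, the 0-100 rescaling, and the assignment of the one 6-point scale to the delay question are all our assumptions here, not the project's published formula.

```python
def aggregate_outcome(roi, delay, critical, internal_goals):
    """Combine the four outcome answers into a single 0-100 score.

    Each argument is the raw answer on its 6- or 7-point scale
    (1 = worst outcome, top of scale = best). The equal weighting is
    a hypothetical sketch, not the project's published formula.
    """
    # (answer, number of points on its scale); which question used the
    # 6-point scale is a guess for illustration.
    answers = [(roi, 7), (delay, 6), (critical, 7), (internal_goals, 7)]
    # Normalize each answer to [0, 1], then average and rescale to 0-100.
    normalized = [(value - 1) / (scale - 1) for value, scale in answers]
    return 100.0 * sum(normalized) / len(normalized)

print(aggregate_outcome(7, 6, 7, 7))  # best answers on every scale -> 100.0
print(aggregate_outcome(1, 1, 1, 1))  # worst answers on every scale -> 0.0
```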
We worked carefully to refine the survey through many iterations, and we solicited responses through forum posts, Gamasutra posts, Twitter, and IGDA mailers. We received 771 responses, of which 302 were completed, and 273 were related to completed projects that were not cancelled or abandoned in development.
The Results
So what did we find?
In short, a gold mine. The results were staggering.
More than 85% of our 120 questions showed a statistically significant correlation with our aggregate outcome score, with a p-value under 0.05 (the p-value gives the probability of observing data such as our sample if the variables were truly independent; therefore, a small p-value can be interpreted as evidence against the assumption that the data is independent). This correlation was moderate or strong in most cases (absolute value > 0.2), and most of the p-values were in fact well below 0.001. We were even able to develop a linear regression model that showed an astonishing 0.82 correlation with the combined outcome score (shown in Figure 1 below).
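A screening of this kind can be reproduced with standard statistical tooling. Here is a minimal sketch using scipy, with synthetic data standing in for the real survey responses; the question names, distributions, and numbers are invented, and only the 273-project sample size matches our survey.

```python
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
n_projects = 273  # matches our completed, non-cancelled responses

# Synthetic stand-ins for the survey: one aggregate outcome score per
# project, plus "question" columns with and without real signal.
outcome = rng.normal(50, 18, n_projects)
questions = {
    "related_question": outcome + rng.normal(0, 20, n_projects),
    "unrelated_question": rng.normal(3, 1, n_projects),
}

# For each question, compute the Spearman rank correlation against the
# outcome score and its p-value, and flag it at the 0.05 level.
for name, answers in questions.items():
    rho, p = spearmanr(answers, outcome)
    verdict = "significant" if p < 0.05 else "not significant"
    print(f"{name}: rho={rho:+.2f}, p={p:.3g} ({verdict})")
```

Rank-based correlations like Spearman's are a natural fit for ordinal survey answers, since they do not assume the scale steps are evenly spaced.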
Figure 1. Our linear regression model (horizontal axis) plotted against the composite game outcome score (vertical axis). The black diagonal line is a best-fit trend line. 273 data points are shown. (Image: http://gamasutra.com/db_area/images/blog/232023/regression_vs_outcome_normalized.png)
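A linear regression like the one in Figure 1 can be fit by ordinary least squares. The sketch below runs the procedure on synthetic data; the number of questions, the weights, and the noise level are invented, and only the 273-project count matches our sample.

```python
import numpy as np

rng = np.random.default_rng(3)
n_projects, n_questions = 273, 20  # question count is illustrative only

# Synthetic survey: each project answers 20 questions; the outcome is a
# noisy linear combination of them (weights and noise are made up).
X = rng.normal(size=(n_projects, n_questions))
true_weights = rng.normal(size=n_questions)
outcome = X @ true_weights + rng.normal(scale=2.0, size=n_projects)

# Fit a linear regression by least squares (with an intercept column)...
A = np.column_stack([X, np.ones(n_projects)])
coef, *_ = np.linalg.lstsq(A, outcome, rcond=None)
predicted = A @ coef

# ...then measure how well the model's predictions track the outcome,
# analogous to the 0.82 correlation in Figure 1.
r = np.corrcoef(predicted, outcome)[0, 1]
print(f"model-vs-outcome correlation: {r:.2f}")
```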
To varying extents, all three of the team effectiveness models (Hackman’s “Leading Teams” model, Lencioni’s “Five Dysfunctions” model, and the Gallup “12” model) proved to correlate strongly with game project outcomes.
We can’t say for certain how many relevant questions we didn’t ask. There may well be many more questions waiting to be asked that would have shined an even stronger light on the differences between the best teams and the rest.
But the correlations and statistical significance we discovered are strong enough that it’s very clear that we have, at the very least, discovered an excellent partial answer to the question of what makes the best game development teams so successful.
The Game Outcomes Project Series
Due to space constraints, we’ll be releasing our analysis as a series of several articles, with the remaining 3 articles released at 1-week intervals beginning in January 2015. We’ll leave off detailed discussion of our three team effectiveness models until the second article in our series to allow these topics the thorough analysis they deserve.
This article will focus solely on introducing the survey and combing through the background questions asked on the second survey page. And although we found relatively few correlations in this part of the survey, the areas where we didn’t find a correlation are just as interesting as the areas where we did.
Project Genre and Platform Target(s)
First, we asked respondents to tell us what genre of game their team had worked on. Here, the results are all across the board.
Figure 2. Game genre (vertical axis) vs. composite game outcome score (horizontal axis). Higher data points (green dots) represent more successful projects, as determined by our composite game outcome score.
We see remarkably little correlation between game genre and outcome. In the few cases where a game genre appears to skew in one direction or another, the sample size is far too small to draw any conclusions, with all but a handful of genres having fewer than 30 responses.
(Note that Figure 2 uses a box-and-whisker plot, as described here).
We also asked a similar question regarding the product’s target platform(s), including responses for desktop (PC or Mac), console (Xbox/PlayStation), mobile, handheld, and/or web/Facebook. We found no statistically significant results for any of these platforms, nor for the total number of platforms a game targeted.
Project Duration and Team Size
We asked about the total months and years in development; based on this, we were able to calculate each project’s total development time in months:
Figure 3. Total months in development (horizontal axis) vs game outcome score (vertical axis). The black diagonal line is a trend line.
As you can see, there’s a small negative correlation (-0.229, using the Spearman correlation coefficient), and the p-value is 0.003. This negative correlation is not too surprising, as troubled projects are more likely to be delayed than projects that are going smoothly.
We also asked about the size of the team, both in terms of the average team size and the final team size. Average team size was between 1 and 11 with an average of 5.7; final team size was between 1 and 500 with an average of 48.6. Both showed a slight positive correlation with project outcomes, as shown below, but in both cases the p-value is over 0.1, indicating there’s not enough statistical significance to make this correlation useful or noteworthy. We suspect that the small positive correlation can be explained by the fact that a struggling project is less likely to receive additional resources over time than one that’s going well. So the result is not too surprising.
Figure 4. Average team size correlated against game project outcome (vertical axis).
Figure 5. Final team size correlated against game project outcome (vertical axis).
Figure 6. Percent change in team size (final divided by average) correlated against game project outcome (vertical axis).
Game Engines
We asked about the technology solution used: whether it was a new engine built from scratch; core technology from a previous version of a similar game or another game in the same series; an in-house / proprietary engine (such as EA Frostbite); or an externally-developed engine (such as Unity, Unreal, or CryEngine).
The results are as follows:
Figure 7. Game engine / core technology used (horizontal axis) vs game project outcome (vertical axis), using a box-and-whisker plot.
| | Average composite score | Standard deviation | Number of responses |
| --- | --- | --- | --- |
| New engine/tech | 53.3 | 18.3 | 41 |
| Engine from previous version of same or similar game | 64.8 | 15.8 | 58 |
| Internal/proprietary engine/tech (such as EA Frostbite) | 60.7 | 19.4 | 46 |
| Licensed game engine (Unreal, Unity, etc.) | 55.6 | 17.5 | 113 |
| Other | 55.5 | 19.5 | 15 |
The results here are less striking the more you look at them. The highest score was for projects that used an engine from a previous version of the same game or a similar one – but that’s exactly what one would expect to be the case, given that teams in this category clearly already had a head start in production, much of the technical risk had already been stamped out, and there was probably already a veteran team in place that knew how to make that type of game!
We analyzed these results using a Kruskal-Wallis one-way analysis of variance, and we found that this question was only statistically significant on account of that very option (engine from a previous version of the same game or similar), with a p-value of 0.006. Removing the data points related to this answer category caused the p-value for the remaining categories to shoot up above 0.3.
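For readers who want to replicate this kind of comparison, scipy exposes the Kruskal-Wallis H test directly. The groups below are synthetic draws that mirror the means, standard deviations, and sample sizes in the table above; they are not our actual responses.

```python
import numpy as np
from scipy.stats import kruskal

rng = np.random.default_rng(1)

# Synthetic outcome scores mirroring the table: the "previous version"
# group sits noticeably higher than the others.
new_engine = rng.normal(53.3, 18.3, 41)
prev_engine = rng.normal(64.8, 15.8, 58)
licensed = rng.normal(55.6, 17.5, 113)

# H test across all three groups: the shifted group drives significance.
h_all, p_all = kruskal(new_engine, prev_engine, licensed)
print(f"all groups: H={h_all:.1f}, p={p_all:.4f}")

# With the shifted group removed, the remaining categories are much
# harder to tell apart, so the p-value typically rises sharply.
h_rest, p_rest = kruskal(new_engine, licensed)
print(f"without prev-engine group: H={h_rest:.1f}, p={p_rest:.4f}")
```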
Our interpretation of the data is that the best option for the game engine depends entirely on the game being made and what options are available for it, and that any one of these options can be the “best” choice given the right set of circumstances. In other words, the most reasonable conclusion is that there is no universally “correct” answer separate from the actual game being made, the team making it, and the circumstances surrounding the game’s development. That’s not to say the choice of engine isn’t terrifically important, but the data clearly shows that there are plenty of successes and failures in all categories, with only minimal differences in outcomes between them, clearly indicating that each of these four options is entirely viable in some situations.
We also did not ask which specific technology solution a respondent’s dev team was using. Future versions of the study may include questions on the specific game engine being used (Unity, Unreal, CryEngine, etc.).
Team Experience
We also asked a question on this page regarding the team’s average experience level, along a scale from 1 to 5 (with a ‘1’ indicating less than 2 years of average development experience, and a ‘5’ indicating a team of grizzled game industry veterans with an average of 8 or more years of experience).
Figure 8. Team experience level ranking (horizontal axis, by category listed above) mapped against game outcome score (vertical axis).
Here, we see a correlation of 0.19 (and p-value under 0.001). Note in particular the complete absence of dots in the upper-left corner (which would indicate wildly successful teams with no experience) and the lower-right corner (which would indicate very experienced teams that failed catastrophically).
So our study clearly confirms the common knowledge in the industry that experienced teams are significantly more likely to succeed. This is not at all surprising, but it’s reassuring that the data makes the point so clearly. And as much as we may all enjoy stories of random individuals with minimal game development experience becoming wildly successful with games developed in just a few days (as with Flappy Bird), our study shows clearly that such cases are extreme outliers.
The Surprises: Production and Incentives
This section of our survey also revealed two major surprises.
The first surprise was financial incentives. The survey included a question: “Was the team offered any financial incentives tied to the performance of the game, the team, or your performance as individuals? Select all that apply.” We offered multiple check boxes to say “yes” or “no” to any combination of financial incentives that were offered to the team.
The correlations are as follows:
Figure 9. Incentives (horizontal axis) plotted against game outcome score (vertical axis) for the five different types of financial incentives, using a box-and-whisker plot. From left to right: incentives based on individual performance, team performance, royalties, incentives based on game reviews/MetaCritic scores, and miscellaneous other incentives. For each category, we split all 273 data points into those excluding the incentive (left side of each box) and those including the incentive (right side of each box).
Of these five forms of incentives, only individual incentives showed statistical significance. Game projects offering individually-tailored compensation (64 out of the 273 responses) had an average score of 63.2 (standard deviation 18.6), while those that did not offer individual compensation had a mean game outcome score of 56.5 (standard deviation 17.7). A Wilcoxon rank-sum test for individual incentives gave a p-value of 0.017 for this comparison.
All the other forms of incentives – those based on team performance, based on royalties, based on reviews and/or MetaCritic ratings, and any miscellaneous “other” incentives – show p-values that indicate that there was no meaningful correlation with project outcomes (p-values 0.33, 0.77, 0.98, and 0.90, respectively, again using a Wilcoxon rank-sum test).
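The Wilcoxon rank-sum test (also known as the Mann-Whitney test) is the standard tool for this kind of two-group comparison, and scipy provides it as `ranksums`. The scores below are synthetic draws matching the group statistics reported above, not the real survey data.

```python
import numpy as np
from scipy.stats import ranksums

rng = np.random.default_rng(2)

# Synthetic scores matching the reported group statistics: 64 projects
# with individual incentives, 209 without (273 total).
with_incentive = rng.normal(63.2, 18.6, 64)
without_incentive = rng.normal(56.5, 17.7, 209)

# The rank-sum test asks whether one group tends to score higher,
# without assuming the scores are normally distributed.
stat, p = ranksums(with_incentive, without_incentive)
print(f"statistic={stat:.2f}, p={p:.3f}")
```

A rank-based test is a sensible choice here because outcome scores built from ordinal survey answers need not be normally distributed.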
This is a very surprising finding. Incentives are usually offered under the assumption that they are a huge motivator for a team. However, our results indicate that only individual incentives seem to have the desired effect, and even then, to a much smaller degree than expected.
One possible explanation is that perhaps the psychological phenomenon popularized by Dan Pink may be playing itself out in the game industry – that financial rewards are (according to a great deal of recent research) usually a completely ineffective motivational tool, and actually backfire in many cases.
We also speculate that in the case of royalties and MetaCritic reviews in particular, the sense of helplessness that game developers can feel when dealing with factors beyond their control – such as design decisions they disagree with, or other team members falling down on the job – potentially compensates for any motivating effect that incentives may have had. With individual incentives, on the other hand, individuals may feel that their individual efforts are more likely to be noticed and rewarded appropriately. However, without more data, this all remains pure speculation on our part.
Whatever the reason, our results seem to indicate that individually tailored incentives, such as Pay For Performance (PFP) plans, seem to achieve meaningful results where royalties, team incentives, and other forms of financial incentives do not.
Our second big surprise was in the area of production methodologies, a topic of frequent discussion in the game industry.
We asked what production methodology the team used – 0 (don’t know), 1 (waterfall), 2 (agile), 3 (agile using “Scrum”), and 4 (other/ad-hoc). We also provided a detailed description with each answer so that respondents could pick the closest match according to the description even if they didn’t know the exact name of the production methodology. The results were shocking.
Figure 10. Production methodology vs game outcome score.
Here’s a more detailed breakdown showing the mean and standard deviation for each category, along with the number of responses in each:
| | Average composite score | Standard deviation | Number of responses |
| --- | --- | --- | --- |
| Unknown | 50.6 | 17.4 | 7 |
| Waterfall | 55.4 | 17.9 | 53 |
| Agile | 59.1 | 19.4 | 94 |
| Agile using Scrum | 59.7 | 16.9 | 75 |
| Other / Ad-hoc | 57.6 | 17.6 | 44 |
What’s remarkable is just how tiny these differences are. They almost don’t even exist.
Furthermore, a Kruskal-Wallis H test indicates a very high p-value of 0.46 for this category, meaning that we truly can’t infer any relationship between production methodology and game outcome. Further testing of the production methodology against each of the four game project outcome factors individually gives identical results.
Given that production methodologies seem to be a game development holy grail for some, one would expect to see major differences, and that Scrum in particular would be far out in the lead. But these differences are tiny, with a huge amount of variation in each category, and the correlations between the production methodology and the score have a p-value too high for us to reject the assumption that the two are independent. Scrum, agile, and “other” in particular are essentially indistinguishable from one another. “Unknown” is far higher than one would expect, while “Other/ad-hoc” is also remarkably high, indicating that there are effective production methodologies available that aren’t on our list (interestingly, we asked those in the “other” category for more detail, and the Cerny method was listed as the production methodology for the top-scoring game project in that category).
Also, unlike our question regarding game engines, we can’t simply write this off as some methodologies being more appropriate for certain kinds of teams. Production methodologies are generally intended to be universally useful, and our results show no meaningful correlations between the methodology and the game genre, team size, experience level, or any other factors.
This raises the question: where’s the payoff?
We’ve seen several significant correlations in this article, and we will describe many more throughout our study. Articles 2 and 3 in particular will illustrate many remarkable correlations between many different cultural factors and game outcomes, with more than 85% of our questions showing a statistically significant correlation.
So it’s very clear that where there were significant drivers of project outcomes, they stood out very clearly. Our results were not shy. And if the specific production methodology a team uses is really vitally important, we would expect that it absolutely should have shown up in the outcome correlations as well.
But it’s simply not there.
It seems that in spite of all the attention paid to the subject, the particular type of production methodology a team uses is not terribly important, and it is not a significant driver of outcomes. Even the much-maligned “Waterfall” approach can apparently be made to work well.
Our third article will detail a number of additional questions we asked around production that give some hints as to what aspects of production actually impact project outcomes regardless of the specific methodology the team uses — although these correlations are still significantly weaker on average than any of our other categories concerning culture.
Conclusions
We are beginning to crack open the differences that separate the best teams from the rest.
We have seen that four factors – total project duration, team experience level, financial incentives based on individual performance, and re-use of an existing game engine from a similar game – have clear correlations with game project outcomes.
Our study found several surprises, including a complete lack of any correlations between factors that one would assume should have a large impact, such as team size, game genre, target platforms, the production methodology the team used, or any additional financial incentives the team was offered beyond individual performance compensation.
In the second article in the series, to be published in early January, we will discuss the three team effectiveness models that inspired our study in detail and illustrate their correlations with the aggregate outcome score and each of the individual outcome questions. We will see far stronger correlations than anything presented in this article.
Following that, the third article will explore additional findings around many other factors specific to game development, including technology risk management, design risk management, crunch / overtime, team stability, project planning, communication, outsourcing, respect, collaboration / helpfulness, team focus, and organizational perceptions of failure. We will also summarize our findings and provide a self-reflection tool that teams can use for postmortems and self-analysis.
Finally, our fourth article will bring our data to bear on the controversial issue of crunch and draw unambiguous conclusions.
The Game Outcomes Project team would like to thank the hundreds of current and former game developers who made this study possible through their participation in the survey. We would also like to thank IGDA Production SIG members Clinton Keith and Chuck Hoover for their assistance with question design; Kate Edwards of the IGDA for assistance with promotion; and Christian Nutt and the Gamasutra editorial team for their assistance in promoting the survey.
For announcements regarding our project, follow us on Twitter at @GameOutcomes