Newsweek - National News, World News, Health, Technology, Entertainment and more... - Newsweek.com
Posted Thursday, December 18, 2008 1:00 PM

A Symposium On Game Reviews. Topic 1: Review Scores, Part I

N'Gai Croal
 The Parthenon in Athens, Greece. Photo courtesy of tsak_d.

Are reviews primarily a consumer guide, or should they serve another purpose? Do review scores deter intelligent discussion of videogames? Is the presence or absence of a review score the only difference between a reviewer and a critic? What is the role of the reviewer when the Internet is democratizing published opinion? How should reviews and reviewers evolve in light of the emergence and growth of Flash games, small games, indie games and user-generated games?

These questions and more were on the minds of N'Gai Croal, John Davison and Shawn Elliott last summer when they decided to expand their conversation to a number of noted reviewers, writers, bloggers and journalists for a published email symposium on game reviews. (See below for the full list of participants.) The planned list of topics includes Review Scores; Review Policy, Practice and Ethics; Reader Backlash; Reviews in the Age of Social Media; Reviews in the Mainstream Media; Casual, Indie, and User-Generated Games; Reviews vs. Criticism; and Evolving the Review.

The topic for Round 1, which will be published here in installments over the next several days, is Review Scores.


Participants

***

Shawn Elliott, 2K Boston: How much is on our minds before we begin playing any given game for review purposes? Will we imagine a range of probable scores that a heavily marketed, highly budgeted, and hugely anticipated game will get? What about when the game is branded “budget” or is the work of a lesser-known, less-storied studio? If so, how closely have actual scores correlated with our assumptions?

Kieron Gillen, Rock, Paper, Shotgun: As others have said before--but Troy Goodfellow put most snappily, so I'm stealing his phrasing--the games press has a presentist/futurist bias. The vast majority of press coverage is for games that either aren't available, or are only just available. Even if we haven't seen or played the game personally, our peers will have. And we'll have seen comments threads full of people saying what *they think* of the edited information of the game we (and their PR) have presented. And with all that, when you throw a score out you know it's going to be read with those expectations in mind. When Eurogamer's Metal Gear Solid 4 review gave it an 8/10 there were 2000-post threads and actual death-threats. And Oli [Welsh], when he wrote that review, knew exactly what response he could expect. Games without the hype have lower expectations. I remember the attitude being crystallized by a comment I saw ages ago on Kotaku which stuck with me, when they linked to a B-game someone had 9/10ed: "It can't be any good, as I haven't heard of it". It's an ugly, but common, tautology.

You can't avoid knowing what the score is on that point, without becoming a true hermit. In terms of coloring your actual expectations of the game per se... well, unless someone's actually paying me to research a feature, I ignore 95 percent of previews. So when reviews come up, I try to review what's there rather than the hype... but that's going onto a whole different question.

Quick thought regarding the indie/AAA dichotomy, though: I often think that AAA-popular-sequels tend to start with 9/10 and lose marks, while games with fewer expectations start with 5/10 and have to gain them. And... oh, I'll shut up. More on this later, I suspect.

***

Leigh Alexander, Gamasutra/Sexy Videogameland/Variety: So, as far as preconceptions go, I just thought it worth noting that a game's marketing machine, whether through its fierceness or its clumsiness, would very much like for us to have a preconception going into a review.

Unfortunately for them, they can't necessarily pick what impression they create. I like to think we react to the fashion in which we're being messaged, rather than devouring piecemeal the messaging itself. Or, most of us do.

So, I agree with Kieron that the right answer is "no preconception"--i.e., the reviewing process doesn't begin until you start playing the full version of the game, period. But sometimes I wonder whether background factors should be considered as context for a review. For example, for months a hyperbolic individual promises that his game will revolutionize ludology. Are we allowed (or, conversely, obligated?) to consider his lofty goals when evaluating the end result? If a company creates an "identity" for a game ahead of time, shouldn't that exemplify what the game is aiming to be, and shouldn't we try and consider whether or not it achieves it?

There's a line, I think, between making a prejudgment, and bringing with you a context within which to make an evaluation. Games are an industry and a culture, not a fragmented, compartmentalized list of disparate products, and rather than pretend we have no early opinions, I wonder if it's not beneficial to be prepared to bring that context--which also applies, perhaps, to being aware of budgets, of team sizes, of other challenges?

***

Shawn Elliott, 2K Boston: Because I believe that self-enhancing, self-serving, egocentric biases are normal, and that people are prone to see themselves as being immune to the influences that move everybody else, I'll happily admit--along with Kieron--that I have preconceptions before playing. I'm human.

I'd argue that our preconceptions are active when we decide which games we want to review. That's not to suggest that, when given the choice, all critics go straight for the gravy (I've often volunteered to review games that I imagined would be interesting but not the best available). But what, if not a preconception of some sort, drives these decisions?

In addition, I believe that my assumptions are active as I play. For instance, I'm less likely to immediately doubt the wisdom of a given design choice in a Valve game than I am with the work of second-rate studios. An analogy: Say you're competing against someone with sorry win-loss stats in a strategy game. His opening moves seem odd, so you assume he's stupid. When his record is intimidating, you take the time to study his seemingly odd tactics until you're certain you're not missing something. In my mind, "the right answer" isn't a realistic answer.

Leigh, I have a problem with holding a loud developer to his hyperbolic promises (and it has nothing to do with the dozens of programmers, designers, producers, artists, and animators hanging their heads behind him): intentional fallacy. I'm interested in the degree to which game makers' games match their ambitions, but I wouldn't want to evaluate them on this basis. What New Critics wrote of poems seems sensible for games: "It is detached from the author at birth and goes about the world beyond his power to intend about it or control it. The poem belongs to the public."

Should we consider budgets and staff sizes? Certainly not when the critic's intent is strictly to inform consumer shopping sprees.

***

N’Gai Croal, Level Up/Newsweek: I’ve never liked assigning scores as part of any critical assessment, and the times I’ve had to do so in the past, it’s always been under duress. I started out as a journalist by writing movie reviews for my college paper, and none of the critics after whom I tried to pattern myself—Pauline Kael, J. Hoberman, Stanley Kauffmann, John Simon, Andrew Sarris, Armond White—used stars or points or thumbs. They didn’t provide you with any shortcuts or shorthand. You had to read what they wrote in its entirety in order to figure out what they thought. I said to myself, when I grow up, that’s the kind of critic that I want to be. So because I’m not obligated to dole out review scores in print or online, I only have two things on my mind when I start playing a game that I know I’m going to write about.

First, am I going to enjoy this game? In that sense, it’s not dissimilar from when I take in a movie. Or a TV show. Or a play. Or a book. Even when it’s a shared experience, playing a game is intensely personal, and no matter the developer’s pedigree, no matter the budget, I start each new title the same way: on the precipice between hope and fear. I hope that it will be good or great. I fear that it will be mediocre or worse. And as I give myself over to that series of firsts—the first image, the first sound, those first bits of gameplay, that first accomplishment—any and all external influences evaporate, leaving me only the thrum of my internal gauge, the one that tells me just how much I’m enjoying myself. I trust that gauge implicitly, and while external factors might influence precisely how I articulate my opinion, I don’t believe it goes much beyond that.

Second, how much of this game am I going to be able to complete before my deadline? That’s very different from how I approach plays, television, theater or literature--I wouldn’t dream of critically assessing a piece of work from those media without having completed it. Why doesn’t that stop me from doing the same with videogames?

The explanation--or is it an excuse?--that I offer is that I don’t review games. We’ll get into this more in the Reviews vs. Criticism section of our symposium, but the way I see it, a reviewer answers the question, how well does this game work, but a critic answers the question, how does this game work? A reviewer helps consumers decide whether or not they should buy a game; a critic helps players think about a game that they’ve played--in its entirety or in part--and that is the end of the spectrum where I believe my writing lies. (That’s also why, on a game by game basis, I don’t think I need to have completed a game to have some insights about it--but I do think that if I were advising someone on how to spend their money, I’d feel obligated to play most or all of the game.) Scores can serve as a valid form of shorthand for the work of the reviewer, but I’m not convinced that scores have much to offer the work of the critic.

***

Kieron Gillen, Rock, Paper, Shotgun: Leigh, I agree with Shawn. You can mention the hyped intention and mention whether it measures up--but that's not what you're rating. Marketing doesn't necessarily understand their games and what's interesting about them. And occasionally a game is fascinating despite what its creators were trying--Jim Rossignol loving the deeply buggy unpatched release of Boiling Point for its sheer constant surreality comes to mind as an extreme example of that.

N'Gai, it's far too early for me to do my You Don't Need To Complete A Game To Review It piece, I suspect. Methodology of reviews is a question all of itself.

***

Stephen Totilo, MTV News: I wonder why Shawn dragged me into this. I seldom write reviews. I don't put scores on games. My main gig's reporting, a.k.a. journalism, a.k.a. the thing most people don't really mean when they want to talk about "games journalism" because the thing they really mean to muddle over and improve upon is what we're talking about here: games-reviewing. I'll give it a go, nonetheless! Scores, who are they for? What do they do?

The question we're answering is whether those who review games pick a number before writing a word. Kieron says the ideal reviewer would not; he and Leigh agree it's hard not to pick a figure already. Shawn's acknowledging the humanity of having preconceived notions but dodging his own question about whether that made him start with a number. But I guess it's hard in some ways to pick a figure at all when it's so unclear what the point of it is.

What does it mean to select--prematurely or even at the "right" moment--a seven for a game? Or to see a game and, at first sight, have your gut gurgle that it’s a nine?

A review score number may be for the fans, a shopping guide metric that informs a purchase or justifies one already made. It may get used for the dastardly purpose of comparing a game to another--even though it never quite works to pit a 2008 sports game that got an eight against a 1998 role-playing game that got a nine, especially if neither is as good as Tetris. A numerical score might, in isolation, even indicate if a game's any good, but not always.

We're talking about arriving at a number, and, frankly, I don't know how you all do it. A decade ago I worked at a boxing magazine and sat in press row for many fights. Scoring vexed me then. I'd score rounds for my coverage on the "10-point must" system: 10 for the winner of the round, nine for the loser unless he got knocked down or really took a beating, which would dock him to an eight. In that system we see the Gillen-described method of scoring-by-reduction. We also see the great gaming tradition of grade inflation. Give a judge (or a reporter aping the actions of the official judge) a 10-point scale and all kinds of psychology come into play.

The other thing I saw at the fights--the thing that really stuck with me--was how hard it was to score any of it. Boxing matches aren't like Rocky fights. It's often hard to see who is winning or which fighter is doing the better work. Sometimes it's all boring or repetitious, but you still must score each three-minute round. Putting numbers on these things--and the official judges had to, in case it went the distance and, god forbid, the paying public needed to know who won--was a murky and unpleasant job. Try it some time. I'd root for the knockout, which would render scores moot and sweep any errors in numerical judgment away. The scorecards didn't matter then. Any scoring biases we had would be secret. The fallacy of putting a number on things would be dodged, and everyone would go home happy. No one would have to know that I gave a 10 to fighter B because I felt bad that he'd gotten beaten up for the three previous rounds or that I gave the wrong guy the first round because I bought into his pre-fight hype.

***

Robert Ashley, freelancer: I took a break from enthusiast press game reviews for a couple of years. What a *** relief. No more death threats from insane superfans who think my evaluation of their favorite game is some kind of paid-for hit job by a shadowy corporate network. No more forcing myself to play through a 40-hour game in three days. No more tearing my hair out trying to avoid the clichéd language of a form of writing frozen in its awkward adolescence 15 years ago. Free to play whatever I wanted, I fell in love with games all over again. Hard.

Now that I'm back and picking up the occasional review, I simply refuse to engage in the bullshit that used to drive me insane. Review scores have one use: driving traffic from message boards and social networks to your site and giving those people an excuse to argue out their fan beefs in the comments section. I treat them as such.

I have no methodology for choosing a review score. I certainly don't think about it much. Your gut feeling (after either beating the game or the game beating you) is more accurate than whatever you might come up with after careful consideration. This is how the rest of the gaming community arrives at an opinion--and probably why so many people feel that critics are out of touch. When you sit at your computer, running down all the plusses and the minuses--technical issues, story concerns, lovable roughness, annoying roughness--you can end up talking yourself into a score that doesn't really represent your true reaction. You can't explain the magical pixie dust that made the empirically bad game good. You can't explain the soullessness and sterility that made the empirically good game bad. You let your stupid logical brain take the wheel and explain yourself into a lie.

When I say you, I mean me.

Anyway, I say be gutsy and honest with a score, and save your careful thinking for the text.

***

Jeff Gerstmann, Giant Bomb: Well, I won't deny that scores stir up message boards and social networks and such. But to claim that's the only reason they exist is a pretty narrow, jaded view. I think scores are primarily there to serve as shorthand for folks that won't or can't read the full review. They're meant to serve as part of the summary. A deck, a score, and, depending on your publication's review style, some pros and cons or whatever. They aren't rocket science, and were never really meant to be treated as such. The key is to not let the different ways that scores are misused get in the way of what you're trying to accomplish with your reviews. I don't care if the scores I give fit in with the rest of the industry on the review aggregator sites. I don't care if people infer the score to mean that I'm playing favorites because I'm obviously "TEH BIAS" or whatever. I care about the people out there who haven't been following a game from day one, and the people who haven't already pre-ordered the game and are just looking for validation. As soon as you start bending your review systems in order to cater to those extremist segments of the audience, you're getting away from the thing that reviews are designed to accomplish: assist average, everyday people in their purchasing decisions.

I say assist because we've reached a point where one review can't possibly work for every single person that reads it. The audience for video games is too widespread and varied now for reviewers to think that their review is the only one that matters, or that it will be able to directly state if a person should or shouldn't buy a game. This, more than anything, is what should be driving a change in the way games are reviewed, not a bunch of reviewers who are tired of all the weak-ass game review clichés that are still out there. Getting rid of scores because people who write reviews are tired of assigning them and dealing with the fanboy rage that invariably ensues hurts the consumers that actually use reviews for their intended purpose.

But to answer the core questions, I don't really think too much about scores when I'm playing a game. I attempt to go in feeling cautiously optimistic about the game in question, and as I'm playing, I think about text, and things in the game that need to be specifically called out. I start to think about the best way to mention those moments, and the best way to call out its flaws. At some point, all that text swirling around in my head starts to sound like a range of scores, so maybe around halfway through playing a game I start thinking a little more about the score. But it isn't until after the review is written that the score is actually assigned. The score is meant to sum up the text. If I've just written a review full of harsh criticisms, well, then that sounds like a pretty low score. Assigning a score and then attempting to justify it with text puts the cart before the horse.

Assuming a score (or range of scores) before actually playing the final game is pretty dangerous territory. Carefully controlled publisher-run demos usually paint a pretty rosy picture of a game, and games often don't live up to that. Case in point: every time I saw Mercenaries 2 prior to its release, I thought it looked awesome. The missions seemed smart, the co-op was fun, and it felt like a game that would offer a lot of variety. The final product turned out to be a collection of dopey missions that showcased the game's boneheaded AI, the co-op didn't make much sense, and a lot of the missions were pretty boring. I didn't review Mercs 2, but not letting pre-release exposure to a game color your review with overt disappointment or a sense of smug "I totally called it" satisfaction can get a bit tricky.

So I agree that, ideally, a reviewer should start with no preconceived notions about a game based on budget, hype, promises made by the developer, and so on. But at the end of the day, we're all human, and I'd expect that some form of disappointment over a game that fails to deliver on promises or excitement over a sequel that's turned out better than the last leaks into some of our reviews. The key is in owning up to that and presenting your reviews as informed opinions, rather than hiding behind the old paradigm of rigid objectivity.

***

Shawn Elliott, 2K Boston: I didn't mean to duck the question, Stephen, and I definitely don't start with a specific rating in mind. However, I'm sure that I have imagined ranges of scores that a given game would receive whether I or anyone else was to write the review. That's not to suggest that I once forced the square peg of a game to fit the round hole of my presumptions. I never did. Or I don't think I did. What I'm acknowledging is that, all the same, something was on my mind, both before I began and while I was playing. I think this is the case for every videogame critic. And while that something isn't necessarily decisive, it's nonetheless worth investigating.

I should also add that our predictions regarding meta-ratings and the reviews of other critics are on the mark more often than not. (In these instances, self-fulfilling prophecy isn't an issue.) Some companies are so confident of our ability to make these calls that they're willing to pay us for our input as consultants.

Jeff is correct in that sometimes PR-controlled preview demonstrations are smoke-and-mirrors magic shows. But what about when we're allowed to play near-complete code for prolonged periods? I'm not talking about performance issues--commenting on the framerate of an unfinished game is almost as pointless as it is for an Entertainment Weekly writer to assure her audience that King Kong may or may not appear in place of a green screen. Sometimes design, locked down years prior to a game's preview phase, is apparently dopey. Again, I have to emphasize that holding some assumptions in no way necessitates my maintaining them in the face of final evidence.

You also imply that an aversion to cliché shouldn't drive change in the way that we review games. I won't argue that cliché is the one and only reason to reconsider our habits; however, I count it among the many. The paragraphs on a game's graphics, sound, and so on in previews and reviews produce recognizably generic writing devoid of the discovery and perception that might make them worth reading. They are lazy in that they eliminate both the need to transition thoughts and to interpret a game as the complex product of interconnected components (instead of simply summarizing these parts).

Even worse is when the paragraphs that constitute a template are themselves composed of yet more methods of avoiding actual analysis. I mock the overuse of words such as compelling not because there is anything wrong with the words themselves but rather with the way that they're used to replace real explanation. We know that any guy in the game store can say he likes or doesn't like a game's graphics or story. We recognize that it's our responsibility as paid writers to say something more than "I like it" or "it's good." Replacing "like" and "good" with "compelling" isn't even trying.

***

John Davison, What They Play: If nothing else, review scores serve as the starting point of a discussion for readers. As Jeff says, they serve as a shorthand for those that have no interest in digging deeper than a fundamental thumbs up or thumbs down gauge of quality. I think we can all safely assume this, but back in my time at Ziff we experimented sufficiently that we got absolute, empirical proof.

Jeff Green and I spent a lot of time talking to Computer Gaming World readers, and trawling through our message boards to really try and put together the ultimate reviews section for the audience. We wanted to do something a bit different, but more than anything we wanted to acknowledge what a large group of our readers were telling us. That was, essentially, that "we're older" and "we're smarter" than the average gamer, so "treat us like that." They wanted longer, more considered think pieces about games, and it appeared, anecdotally at least, that review scores were not high on their list of priorities. They wanted, they said, to really understand what the reviewers were trying to convey. They wanted to really dig in.

So we gave them that. We took the scores off, and made the reviews longer. We actually went a step further, and tried to acknowledge the broader critical spectrum, and talk about what caused other reviews to express particularly positive or negative comments. It was our own little expression of idyllic critical idealism. A utopia of reviewing, and we dreamt that it would spark enlightened and intelligent debate about specific qualities and opinion.

The reaction was spectacular. The readers really, really f---ing HATED it. The most common complaint (I'm paraphrasing, but it was pretty consistent) was "How do I know what you think if you don't give it a score?" That and "you guys are retarded." We figured at first that it was simply a bit of culture shock and that it would wear off, but the negativity increased over time. After three months or so, we had to go back to putting a score out of five on the reviews just to stem the tide of vitriolic hatred.

On a separate note, I was speaking to someone recently who had some connection to Rolling Stone, and he told me that the reviewing process for albums there was that the critics only submit the text, but do not submit a score. The number of stars is assigned by the reviews editor based on the tone of the review. He was drunk at the time, so might have been talking out of his *** though. Does anyone know for sure if this is the case? Even if it's not true, it's certainly an interesting approach--and something I'd like to discuss in this context. If a reviewer is freed from thinking about assigning a score, but knows one will be applied later--would it necessitate a more disciplined approach to how thoughts are expressed? I know it would for me. But are we ready to relinquish that kind of control?

Next: Totilo challenges the review score naysayers to answer the question "Who is actually upset about review scores?" Hsu and Reyes defend scoring on behalf of the consumer. And Davison discusses how Google has impacted the individual reviewer. P.S. If you absolutely, positively can't wait to read the rest of Round 1, Shawn Elliott has posted the entire transcript--all 16,000 words of it--on his blog, here.


Member Comments

Posted By: Cynic04 (January 18, 2009 at 2:20 PM)

Having had an account with an online video game rental service for the last few years, I have gotten in the habit of reading a few online review scores after I get through the latest game I rented. I have to say after doing this many times now I almost always agree with the sentiment of the reviewer, and can always, at least, relate to what they are saying. I think where the majority of reviewers go wrong is when they start trying to judge the effect any part of a game may have on the player. STALKER: Clear Sky is a great example of this (as was the first one). Any reasonable person who played that game knows there were major issues, but for some those issues ruin the experience, while for others they get forgotten amongst all the rest the game has to offer. I think that a lot of the variation in STALKER's review scores can be explained by reviewers deciding how the bugs in the game, for example, might affect the player. For some the resulting frustration might ruin the whole game experience, while for others the bugs almost become a non-issue when considering all that the game did right. So is there a correct answer? Yes, definitely, but it only applies to oneself. As for game journalists trying to do their job, I'd say take some of the emphasis off the score (a highly subjective measure) and place more on the content of the review (fanboys would have a harder time getting worked up about the statement "this game is buggy" than a 6.0 staring them in the face). If users need a quick idea of the quality of a game, a small summary would serve much better than a number imo.

P.S. Awesome work on the new site Jeff!


Posted By: Etchasketchist (December 27, 2008 at 5:44 PM)

I gotta say, I cringe whenever I hear people complaining about review scores or snobbishly rejecting them. I love Metacritic. It has saved me lots of money and steered me toward amazing games. N'Gai's preference for more literary, scoreless "criticism" would be awesome if videogames were like books and we could borrow them from the library for free (I've heard there are some websites where you can do something similar, but I wouldn't know anything about that...) but in reality, videogames are $10 (XBLA) to $180 (Rock Band) pieces of software and it helps to know if they're good or not. The review score helps me with that. I go to Metacritic and pick out the most reputable high score and the most reputable low score and see if the high-scorer loves the things I love and if the low-scorer hates the things I hate. In an age of 1-click online shopping, I need a quick and reliable way to get to this kind of information. And the people who provide it should be proud of their work and should not feel inferior because they're more Roger Ebert than Pauline Kael. Roger Ebert's dope. And don't worry about the arbitrariness of it all. After years of reading Nintendo Power and EGM and GamePro, I have developed a good intuitive sense for what a score means. And that's part of being a gamer. That's our culture. Leaderboards and high scores are a part of it, and just because movie people don't do it (even though a lot of them do) doesn't mean it's bad. It's what we do: We play games and we rate them. We're weird like that. And that's awesome.