Gamasutra: The Art & Business of Making Games
Is It Time To Review Reviews?
by Andy Satterthwaite on 08/08/10 08:52:00 pm   Featured Blogs

The following blog post, unless otherwise noted, was written by a member of Gamasutra’s community.
The thoughts and opinions expressed are those of the writer and not Gamasutra or its parent company.

 

No matter how much we might disagree with them, reviews are important. They tell us about games, they give opinions on how good they are compared to others, and help us work out what to spend our money on.

Back in the old days, magazine reviews were pretty much all we had to go on, and we hung on every word of the 4 or 5 articles about a game.

But now, the internet means that there are so many reviews, reviewers and review sites that “review aggregating” sites such as Metacritic or GameRankings are needed in order to get an overall opinion.

The problem is that they do the complete opposite: they don’t give an overall opinion; instead they give weight to the haters and hide actual reviewers’ opinions behind a mask of homogeneity.

The solution – a new form of aggregation formula, similar to that used by Rotten Tomatoes for movies.

When I started playing video games (about 28 years ago) game reviews were something that everyone read – they were how you found out about games, and how you judged which games you’d buy.

When I started in game development (about 17 years ago) reviews were still vitally important – they boosted your ego, and your CV, they still swayed your purchases and sometimes even affected the end of project bonuses.

Reviews mattered, and were taken seriously, in part because there weren’t that many of them. In both of the above cases the number of reviews you’d get was limited – depending on the platform and the territory maybe there’d be 4 or 5 magazines that would cover your game – and the only one(s) you’d be really interested in were the ones that the publisher would love to plaster quotes from on the box (“this game is awesome (5 stars)” Official [InsertConsoleName] Magazine etc.)

Now however things have changed, at least in some regards.

Reviews are still the subjective opinions of people we (generally) don’t know. Review scores are still used by many of us as an essential guide to the quality of a game.

But the rise of the internet and the demise of print have seen the number of review sites increase by orders of magnitude. So much so that it’s no longer enough to have a few great reviews for your game – you now have to have enough so that the AVERAGE review is great … enter the era of GameRankings, Metacritic et al.

These aggregation sites are practically essential in navigating the vast quantity of reviews for titles – so much so that game development contracts now specify a GameRankings (or Metacritic) rating as a bonus/contractual criterion.

Unfortunately, these aggregation sites have a huge flaw - Metacritic / Gamerankings are unfairly swung by bad reviews. If your game is averaging 80% it takes two “excellent” 90% reviews to make up for one “not my sort of game” 60% review. It takes four 90% reviews to make up for one “hater” 40% review – that’s tough – particularly as bad reviews can easily be given by people who don’t like that sort of game.
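To make the arithmetic above concrete, here is a minimal sketch in Python. The function name `offsetting_reviews_needed` is hypothetical, and it assumes a plain unweighted mean, which is a simplification because some aggregators weight their sources.

```python
from fractions import Fraction
from math import ceil


def offsetting_reviews_needed(target: int, low_score: int, high_score: int) -> int:
    """Smallest number of `high_score` reviews needed so that the mean of
    one `low_score` review plus those high reviews reaches `target`.

    Assumes an unweighted arithmetic mean (an illustrative simplification).
    """
    if high_score <= target:
        raise ValueError("high_score must be above target to offset anything")
    if low_score >= target:
        return 0  # nothing to make up for
    # The low review leaves a deficit of (target - low_score) points;
    # each high review contributes a surplus of (high_score - target) points.
    return ceil(Fraction(target - low_score, high_score - target))


print(offsetting_reviews_needed(80, 60, 90))  # one 60% review at an 80% target
print(offsetting_reviews_needed(80, 40, 90))  # one 40% review at an 80% target
```

With the article's numbers, one 60% review against an 80% average needs two 90% reviews to offset it, and one 40% review needs four.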

These effects can, if you’re unlucky, be magnified further as many sites end up just duplicating the content of reviews from the main sites, making the aggregate even more arbitrary. While the system of review “weighting” used by some aggregators (based on the status of the reviewing site) is meant to solve some of these problems, it only exacerbates them should just one of those high-status sites happen to be your token “hater”.

Now I don’t want to give the impression that “hater” reviews are bad – I believe there should not be any homogeneity in reviewing. Rather, reviews should be biased towards the opinions of the reviewer - that’s why we read them.

BUT – those opinions only count when you actually READ the review, not when you just look at the score, which is all you get from the aggregator.

A score alone does not take into account the preferences of the reviewer – can you tell that the 40% that a game received in that one review (which dragged down the overall average) was because the reviewer was a hardcore shooter fan, who really just didn’t want to review that racing-sim? Or because that horror-death game was reviewed by an extreme-moralist?

In general reviews provide a percentage which is supposed to allow the public to judge which is better Game A or Game B, but can you really compare a racer to a shooter to a puzzle game to a pony-sim?

Games can be dragged down by single elements that many would say “don’t really matter”. A game released on PS3 that has graphics that look like a PS2 game will get marked down for that, even if it’s crazy fun, maybe only by 10% or so, but enough to push it out of the tiny “top” percentage. This forces a block-buster mentality, whereby the only way to get good reviews is to spend more than the last game did, whereas what we should be doing is saying “are we having fun”.

Review percentages are also based on a NOW comparison: games are compared to the quality of other current releases. So it might be fair to compare Blur with Split/Second, but how do you compare either of them to Ridge Racer, or older titles? If a so-called classic game got reviewed now it would be marked down accordingly; just take a look at some of the straight ports of arcade classics on Xbox Live Arcade. Those games rocked in their time; now they languish with 50-70% review scores.

So, in summary, there are too many reviews to read them all, so we have to aggregate.

But aggregation just gives us an average score. Not an aggregate opinion.

Given that, in our industry, a 70% score is regarded as mediocre, at best, an aggregate scoring system unfairly biases towards “hater” reviewers. What we need is a completely different approach to reviews, one that allows for crazy bias, one that allows for opinion.

Thankfully the movie industry has already worked out such a review process: Rotten Tomatoes. Rather than aggregate percentages, it simply gives a +1 if it’s a “Favourable” review, and a -1 if it’s a “Negative review”. Films are then scored based on whether most reviewers liked it or didn’t. So you get an opinion based aggregation that focuses on entertainment value and not arbitrary quality thresholds. And that’s all win.
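As a rough illustration of the difference between averaging scores and counting votes as described above, here is a minimal Python sketch. The 60% cut-off for a "favourable" review and the sample scores are assumptions made up for this example; they are not taken from Rotten Tomatoes or from any real game.

```python
def mean_score(scores):
    """Average-style aggregation: the plain mean of all review scores."""
    return sum(scores) / len(scores)


def net_votes(scores, favourable_at=60):
    """Vote-style aggregation as the article describes it:
    +1 for each favourable review, -1 for each negative one.
    `favourable_at` is an assumed threshold, not an official one."""
    return sum(1 if s >= favourable_at else -1 for s in scores)


def favourable_share(scores, favourable_at=60):
    """Fraction of reviewers who liked the game under the same threshold."""
    return sum(s >= favourable_at for s in scores) / len(scores)


# Hypothetical review scores: four fans and one "hater".
scores = [90, 90, 85, 80, 40]
print(mean_score(scores))        # mean pulled down by the single 40
print(net_votes(scores))         # four +1 votes and one -1 vote
print(favourable_share(scores))  # share of favourable reviews
```

In this made-up case the single 40% review drags the mean to 77, while the vote count still shows that four out of five reviewers liked the game.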


Comments


Saul Gonzalez
I believe the problem with the current review system stems from a fundamental lie: that it is possible to have an objective, precise measurement of a game's quality, particularly in a way that enables meaningful comparisons between titles. This probably has its origins in fan culture where debates about "Game X is the best game ever" and "Game X is better than game Y" are commonplace. While fun for friendly arguments, this way of thinking does not lead to a usable review ecosystem.

Consuming media is so personal that attempting to use any scale more granular than "this is a mediocre / good / outstanding work within its context" is delusional. Even more so with games, where the player's individual experience and taste matter so much.

I agree that game reviews should take a page from Rotten Tomatoes. A page with a selection of quotes from a variety of reviewers, while not perfect, is much, much more representative than a single numerical score.

Sebastion Williams
Whereas it is unlikely to have purely objective reviews of games and interactive media in general, it could be possible to have reviewer profiles which would reflect their inclinations in their ratings. Just as musicians and sound engineers would have a more critical ear toward auditory content, visual artists and designers toward visual content and athletes and performance artists toward kinetic content with their selective attentions affecting their overall scores, consumers/purchasers should be able to determine which type of opinion has greater relevance based on their own profile.

Callum Brighting
I've always wanted a small "About the author" section at the top of reviews on sites like Edge and Eurogamer. It only needs to be a small paragraph saying something like: "So and so is a racing sim fanatic who also enjoys the indie scene. He dislikes most FPS games, because he feels they are generic". Then, you can read the review with a little more detail about the writer, which may explain why he is giving the FPS with an average score of 90% a 2 out of 5!

I do agree that this system of averaging a large number of scores seems flawed, but it is not entirely broken...the top ten games on Metacritic I think all deserve to be there at least?

driver 01z
My purchasing decisions used to be based on reviews - I didn't want to bother with mediocre games. But recently I've learned more of what I like and don't like, and I'm finding I can really enjoy things even if they fall short in areas I don't care about. Like I saw Dark Void got mediocre reviews so I ignored that game. Then it was available used for <$10 a few weeks ago so I figured I'd give it a shot - and I'm loving it. Meanwhile, Red Dead Redemption is acclaimed... but I have found that there's something missing in it for me, I like more solid "connections" in the flow of a game, like Gabe from Penny Arcade said "I just need a more directed experience I guess." - I feel the same way. For gamers that do buy a good number of games and pay attention to reviews, I suppose they will eventually realize to take them all with a grain of salt and look for things they like in particular.

Nathan Mates
As a consumer, I sometimes find it instructive to read the top 2-3 reviews for a title, and also the bottom 2-3. In fact, I usually read the bottom reviews first. Why? Perhaps because I'm trying to be a cheapskate, finding a reason to not shell out for a game. Or, for cases other than blatant reviewer/game mismatch, I find that the lowest reviews are some of the most honest and refreshing reviews out there. For the higher reviews, there's a fair amount of peer pressure to deliver a review in line w/ the size of the advertising budget.

Alan Youngblood
Andy --- I couldn't agree more that the review system is broken right now. Horribly so. My roommate and I watched a review that gave a Clash of the Titans game a kiss-of-death 44% and basically crapped on everything about the game. I'm not saying it's probably the best game ever, but the weird thing was you could say all the same negatives about any game in that action game kinda genre (dare I say any of the God of War games?). "It just rehashes things that have been done before." Which game am I talking about?



Callum -- you are tapping into something that is a very helpful distinction to make. Most reviews focus on "pathos", that is, how the reviewer feels. They would be better balanced and easier to interpret if the reviews included "ethos" (who the reviewer is and what their qualifications/dispositions are) and "logos"(an appeal to logical thinking).



The Rotten Tomatoes idea seems pretty good, and I'm sure there's other great ideas for a new review system, the difficulty remains in getting a good system implemented and getting people to use it. What good is a better system if it just gets rolled back into a Metacritic score and that's all people check. I've personally targeted games in the 60-70% range because a lot of better games have fallen there. I find myself very bored with the 90%+ games, they have great production, but generally speaking no soul.



Another key answer to this issue (should involve groups like IGDA, perhaps) goes back to an old soapbox of mine: education. Go to a large university's website and look at the course catalog for Film Theory classes. Now look for Game Theory classes (and I'm not talking about that other game theory that has a lot less to do with actual games that are played for fun). My guess is, even at specialty schools there is no "game theory" class. My point is, there's no standard or basis for understanding, and thus judging, games. With film you've got things like the leading actor's performance or cinematography with regards to shot variety and flow. Things that are important to form the opinion of "good." What makes a game good? There exists no current shared vocabulary for this. We can all rattle off how Chrono Trigger was the best game of all time because of the characters and the story, but we aren't on common ground when a rebuttal for Tetris comes. Tetris has no real story, no real characters, but it is loved by many as well (arguably more than Chrono Trigger).



In short, one big thing we need is for academics to create the language with which we communicate about games to each other and teach it to us. (I'll note that Salen and Zimmerman have made huge strides in this area, although we need more to happen, and quickly. I suggest reading their "Rules of Play")

Lo Pan
What bothers me about reviews is the inherent peer pressure within the reviewer community. If you disagree with a peer's review scores for a game, especially a perceived blockbuster like Red Dead Redemption or StarCraft II, you have to be careful expressing your opinion...especially if you're critical of the game. Otherwise you risk being labeled as an 'oddball or contrarian'. I much prefer the Ars Technica system of BUY, RENT, or SKIP.

Jerry Hall
Hey Andy: I give this review on reviewers a thumbs down. haha j/k. Awesome article. A lot of thought went into it and it shows. A few points I want to add. Those game reviewers have a tough job.

- Who can forget when Gamespot reviewer Jeff Gerstmann gave Kane & Lynch a 6 and got fired because advertiser Eidos was unhappy with the score. There is an inherent conflict of interest with magazine/website reviews where there is a huge banner or full-page color ad right next to the game they are reviewing.

- Or the embarrassment suffered when EGM gave Madden 06 PSP high honors. Dan Hsu's review team said they only review games that have gone gold master. Madden 06 PSP crashes and crashes often. Play 5 minutes into Franchise mode and the player would lose all their data. Looks like the review was written based on the press release, although they never admitted to it. I give Dan Hsu high marks for later writing an explanation rather than sweeping it under a rug. Although they had a chance to save their readers from a buggy game at launch and blew it.



Ultimately, I would like reviews such as the ones on cheapassgamer.com, they write what they think, what the game sells at MSRP, and a price point where the game should be purchased at. I'm sure they get their games for free and have bias toward their sponsors as well.

Roger Haagensen
You might want to look at this review. I may get around to doing more later but the concept is simple:

http://www.emsai.net/journal/?post=Rescator20080623010406



I basically start with a perfect 10 (or rather it's 100%; actually under the hood it's a 0.0 to 1.0 scale, but...).



I assume that each score area is perfect; anything that detracts from the perfection causes the score to go down.



Obviously a text adventure would probably have no sound so in that case the sound score would be blank rather than 0.

But in a game with audio, if the audio does not work then it could get a 0, or if the audio is horrendously bad obviously.



The total score is a sum, but it could be bumped up or down depending on how I liked the game overall.



After all it's easier to notice flaws in a game than the things that work, so my outset is that all is perfect.(otherwise they wouldn't release the game right? *laughs*)



So a "perfect 10" game would be a game where none of the flaws (if found/noticed) detracted from the gaming experience in any way.



And I try to keep the technical scores technical and instead use the total score for how I feel about the game+technical quality.

I also have a Story score which is something we hardly see at all in any reviews out there. (obviously a game with no story at all would just have a blank story score, like Tetris etc.).
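A deduction-based scheme like the one described in this comment could be sketched roughly as follows in Python. Everything here is an assumption for illustration: the function names, the size of each deduction, and in particular combining categories by averaging the rated ones (the comment only says the total is "a sum") so that the result stays on the 0.0 to 1.0 scale.

```python
from typing import Dict, Iterable, Optional


def category_score(deductions: Iterable[float]) -> float:
    """Start a score area at a perfect 1.0 and subtract each noticed flaw,
    never going below 0.0."""
    return max(0.0, 1.0 - sum(deductions))


def total_score(categories: Dict[str, Optional[float]],
                overall_adjustment: float = 0.0) -> float:
    """Combine the rated categories and apply the reviewer's overall bump.

    A category set to None is "blank" (e.g. sound in a text adventure)
    and is left out rather than counted as 0.
    Averaging is an assumption made for this sketch.
    """
    rated = [score for score in categories.values() if score is not None]
    if not rated:
        raise ValueError("at least one category must be rated")
    base = sum(rated) / len(rated)
    # Clamp so the personal bump cannot push the total outside 0.0-1.0.
    return min(1.0, max(0.0, base + overall_adjustment))


# Hypothetical text adventure: no sound, a quarter point off for story flaws.
review = {
    "graphics": category_score([]),
    "sound": None,
    "story": category_score([0.25]),
}
print(total_score(review))
```

Leaving blank categories out of the total, rather than scoring them 0, matches the comment's point that a game without sound or story should not be penalised for it.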

David Tarris
I don't really get this argument. Reviews are just a tool, one of many, people can use to help guide their purchasing decisions. Review aggregation sites just make it easy for people to get a feel for the general consensus of a game.



"Unfortunately, these aggregation sites have a huge flaw - Metacritic / Gamerankings are unfairly swung by bad reviews. If your game is averaging 80% it takes two “excellent” 90% reviews to make up for one “not my sort of game” 60% review. It takes four 90% reviews to make up for one “hater” 40% review – that’s tough – particularly as bad reviews can easily be given by people who don’t like that sort of game."



So you have a problem with the "mean" metric, then? Maybe these aggregating sites could provide you with "medians" and "histograms" as well? That to me seems more beneficial than a simple "good or bad" breakdown. I think people who use sites like Metacritic draw different lines in the sand when deciding whether or not to pick up a game that sounded cool and received an average score of "X" from reviewers.
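For readers unfamiliar with these statistics, here is a minimal Python sketch of what a mean, a median and a histogram would report for the same set of reviews. The scores are hypothetical, and `summarize` and the 10-point bucket width are names and choices made up for this example.

```python
from collections import Counter
from statistics import mean, median


def summarize(scores, bucket_width=10):
    """Return the mean, the median and a histogram of review scores.

    Histogram keys are the lower bound of each bucket, so with the default
    width a score of 85 is counted under 80.
    """
    histogram = Counter((s // bucket_width) * bucket_width for s in scores)
    return {
        "mean": mean(scores),
        "median": median(scores),
        "histogram": dict(sorted(histogram.items())),
    }


# Hypothetical review scores with one low outlier.
print(summarize([90, 90, 85, 80, 40]))
```

For these made-up scores the mean is 77 but the median is 85, and the histogram shows the single review sitting alone in the 40s, which is the kind of detail a lone average hides.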



I prefer having a broader spectrum of possible "scores" a game can receive so long as we don't lose meaning in the process. For example, if a single reviewer gives one game a 95.47 and another a 95.32, are they really trying to split hairs about the quality of those two games that finely? There's no real meaning behind the difference, and the game with the lower score could still receive "game of the year" from that critic, as we've seen with sites in the past.



It's different with Metacritic, however, because they're not applying these scores themselves, only compiling and reporting statistics that consumers can make of what they will. And, the more statistics they collect, as we should all be aware, the more (generally) accurate the representation will be of the "population" in question.



"Given that, in our industry, a 70% score is regarded as mediocre, at best, an aggregate scoring system unfairly biases towards “hater” reviewers. What we need is a completely different approach to reviews, one that allows for crazy bias, one that allows for opinion."



It seems a bit odd to me to suggest that one game will receive an average score in, let's say, the 70's, and another the 80's simply because of a few outliers. Why would one game be more susceptible to "haters" (who should be respected journalists, here) than another? Again, the more reviews we compile, the more accurate these results will be. There's also that "median" metric we can use.



Now, if reviewers are simply plagiarizing each other, it becomes the responsibility of the aggregation site to acknowledge this and kick the perpetrators out of the system.



"Thankfully the movie industry has already worked out such a review process: Rotten Tomatoes. Rather than aggregate percentages, it simply gives a +1 if it’s a “Favourable” review, and a -1 if it’s a “Negative review”. Films are then scored based on whether most reviewers liked it or didn’t. So you get an opinion based aggregation that focuses on entertainment value and not arbitrary quality thresholds. And that’s all win."



It seems to me that the only reason a site like Rotten Tomatoes is required for the film industry is because their reviewers are much worse at finding "common ground" and "objectivity" in film than ours are in games. A game with "universal acclaim" will generally receive a score of 90+ in my subjective and aforementioned "line in the sand". For movies, we're looking at something like 75-85 from Metacritic.



For example:



Inception - 74 (Metacritic)

StarCraft II - 93 (Metacritic)



Put Inception into Rotten Tomatoes terms and things start looking a lot nicer:



Inception - 87% (Rotten Tomatoes)



But it's unfair to compare films and games like this. Review scores are a lot like currency, the more good games we have receiving 90+ scores from just about every reviewer out there, the more the "value" of the 80 is going to be deflated. Conversely, films and (even more so) music almost never find reviewers universally awarding something a near perfect score, so 70+ still carries a lot of weight with the average consumer.



So, to conclude, why do I think Metacritic works better for the games industry than a Rotten Tomatoes model? Because with film reviews you have a much higher standard deviation with your review score, making the average less meaningful. Luckily, "liked or disliked" has a much more normal distribution, I would imagine. In the games industry, however, I would expect a very low standard deviation in the scores themselves, and so this problem does not exist.
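The standard-deviation argument can be illustrated with a small Python sketch. Both score lists below are invented purely to show the calculation; they are not real review data for any game or film, and they do not settle whether the claim about the two industries is true.

```python
from statistics import mean, pstdev

# Hypothetical scores where reviewers broadly agree.
tight_scores = [92, 90, 94, 91, 93]
# Hypothetical scores where reviewers disagree widely.
spread_scores = [100, 60, 90, 50, 70]

for label, scores in (("tight", tight_scores), ("spread", spread_scores)):
    # pstdev is the population standard deviation: how far scores typically
    # sit from their mean. A small value means the mean represents most
    # reviewers well; a large value means it hides real disagreement.
    print(label, mean(scores), round(pstdev(scores), 2))
```

When the spread is small, the average is a fair summary of what reviewers said; when it is large, two works with the same average can have very different reception, which is where a liked/disliked count may tell you more.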



Some enterprising person such as yourself might be well served by doing some research into this and determining whether or not I'm correct on the whole, as I'm just speaking from experience, not data. If the distribution of scores turns out to be just as chaotic with games as movies, then perhaps you're right, and a Rotten Tomatoes of games might be a sound idea. But I've never seen a game that received such mixed reviews.



In the end, it's all about providing people with the information they want to make informed purchasing decisions. Maybe you think narrowing the spectrum of results down will better achieve this, but as I laid out in my argument above, I don't; at least not until I see some evidence to the contrary.

Tomiko Gun
I don't believe in using numbers for reviews, as games are as subjective as they come. The worst thing about Metacritic is that they will assign their own metric for scores on review sites that are not numeric, or not using a 100 point scale.



I know though that people are dumb enough that they cannot comprehend simple sentences on those reviews and want some numeric value on their damn purchases. I agree with you, the Rotten Tomatoes model is a fair compromise.



@David

Have you worked in the industry? They have "metacritic" clauses on contracts for bonuses, it's not just a tool anymore, remind yourself that the next time you say to yourself "I don't really get it."

[User Banned]
This user violated Gamasutra’s Comment Guidelines and has been banned.

Rik Newman
I agree with all of this; however, there's an even bigger problem for me. I believe video "games" reviews are like asking the same person to comparatively review chess, soccer, F1 race driving, and Monopoly all on the same scale... as well as at the same time also reviewing Lego bricks, Rubik's cubes, an office job (similar experience to many MMOs) and an Action Man toy against them also. (More on this here if you're interested. http://agoners.wordpress.com/2010/08/05/the-beautiful-game/ ). Also, this http://insomnia.ac/commentary/the_videogame_news_racket/ - heh.

Tomiko Gun
@Bob Dillan

"...game reviews are _reactions_ by your customers to the game..."

Wrong! They are reactions of a few people with websites. Also, a customer by definition means they bought the game, reviewers get it for free. Remember the vast majority of video game consumers don't read reviews or comment on websites, if you cross them, they will just stop buying what you create.



"I do believe that games have an "objective" quality that can be measured mathematically..."

Wrong again, you know what, I'm not going to even bother and just use your own words to prove this point.

"The only games that usually have unreal reviews are major franchises that are overhyped like Starcraft 2 for instance."

Yes it is hyped, but it's a damn great game and almost perfect in every way. You know why you feel like that? Because you're subjectively judging it. See what I did there? You just don't like RTS, objective my @rse.



"If they criticize your game they are not having fun, it is up to designers and developers to understand this feedback..."

"...you are there to entertain other people, not yourselves, not your ego."

Nope, they're there to make their visions a reality and not to satisfy entitled bitches like you. I've read a lot of these "gamer feedback" and most of them are useless. Thank God they don't listen to everyone who thinks they know everything.



@Rik Newman

I agree with you Rik.

[User Banned]
This user violated Gamasutra’s Comment Guidelines and has been banned.

Nathan Tompkins
I've always felt that numerical scores on games are actually more instructive than for other forms of consumer media such as film or music. While film, music, and games are all subjective experiences (an element that works for someone might be despised by the next), games have the unique capacity to actually be technically "broken" as a finished product -- some flaw in the controls, camera system, collision, etc. can render an otherwise good game totally unplayable, or at least incredibly frustrating. Film and music have no comparable element in their experiences. (A boom mic visible in the frame of a movie, for example, is also a technical failure, but won't really affect your overall enjoyment of the product.)



In general, if a game gets above a 70 aggregate rating, I'll check it out further if it is a genre or subject matter that sparks my interest. Anything below 70, and it's a pretty good bet that the game is broken in some way that makes it not worth your time, even if the material looks interesting.

Tony Downey
We started with a system where publishers could literally buy cover stories and perfect scores - or if they were big enough, just publish their own propaganda (lookin' at you, Nintendo Power). Now we have a system built on the average opinion, where every game released suffers from the same handicaps that you mention - an equal playing field. I would argue that the state of game reviewing is far better off now than it ever was.



For all their flaws, Metacritic and GameRankings do a fantastic job. They reveal an accurate picture of how games are spoken of by the very people that influence consumer buying decisions. If you get a 'hater' review from a major site, people who listen to that review are likely to come to the same conclusion. And if the review goes against the grain of that community, they will leave and find reviewers with 'better taste' (or more accurately, a closer taste to their own).



Rotten Tomatoes is a better indicator of the true quality of a piece of work - each opinion gets one vote, terribly democratic, but not a good indication of the market. Looking at the top titles on GameRankings, each game has been hugely successful both financially and critically. Looking at Rotten Tomatoes, the top titles have been critically acclaimed, and have piles of Oscars between them, but many can only claim to have done adequately at the box office.



In short, successful well-read reviewers tend to be of like mind to their audience. Ignoring these successful reviewers in exchange for a one-review, one-vote system is ignoring the audience. Be careful what you wish for, I suppose.

Steven Ulakovich
The best example of this is Grand Theft Auto IV and Saints Row 2. Just about everyone will say that the former is a far better made game, but the latter is a much more fun title.



Games are products of entertainment. They should be reviewed on that point, and that point alone. Nowadays, you have a culture of a game being reviewed more on what it lacks, as opposed to what it does.

David Tarris
"Have you worked in the industry? They have "metacritic" clauses on contracts for bonuses, it's not just a tool anymore, remind yourself that the next time you say to yourself 'I don't really get it.'"



You know, I can't really blame you for reading one sentence of my response and giving up (as it was a bit long-winded), but seriously, if you're going to get so fired up about someone's post, at least do it the service of taking the statements made in context -- instead of glancing, jumping to conclusions, and firing at will.



Now, if you had read my post, you'd know that I wasn't referring to the significance of the discussion at all. If I thought it beneath me, why would I bother writing an essay on the topic? My point was that I don't see why the Rotten Tomatoes model, as advocated in the author's blog, is a good fit for the games industry.



So, better to take a step back and consider the context before jumping to conclusions, don't you think? Remind yourself of that the next time you feel like being an idiot.

Jason Bakker
"This forces a block-buster mentality, whereby the only way to get good reviews is to spend more than the last game did..."



Not only that, but it engenders blandness, as game developers are encouraged by Metacritic to exclude anything that would offend a majority (or even large minority) of reviewers' sensibilities.

Jonathan Jennings
Lots of good comments here, and much like the others I agree the modern review process is terrible. The reviews that go into detail about what the game is lacking rather than what makes the game enjoyable annoy me the most. I personally consider myself a connoisseur of crap; I can find the good elements of almost any game no matter how terrible. And it is just frustrating to see a game that did certain mechanics exceptionally well get overlooked because it isn't as graphically stunning as, say, the latest Call of Duty or Halo. Even more so since the modern casual gamer tends to put a lot of faith in reviews, which makes the writer's vision of the game that much more important.

Scott Mullins
I just saw this posted up on another site, and I think there should be review standards.



@Callum Brighting:

I agree with the, "about the author" section. I think there are many reviewers, forced to review games they wouldn't normally play. So, they go into the game with a negative attitude from the beginning.



I actually wrote an article and posted it today, titled, "Should There Be Game Review Standards?"

It can be read at http://www.coffeewithgames.com/2010/08/should-there-be-game-review-standards.html



I think this is a discussion that needs to go further into the gaming industry.

