Are rating systems worth having?

Gotcha, sorry if I came off as condescending up above.
No worries. Condescension is fine. Dodging a fair comparison, however, breaks form with good reasoning.
I think the point I was making though is that ratings do convey information to the person conducting them, whether you're referring to a film critic, a doctor measuring pain, or pollsters using ratings to measure whatever. I rate movies since I think they help me. To other people who come across those ratings and are unsure of what the numbers mean though, it's a different story. For instance, if I hear someone measured a 7 for pain, I'd maybe consider prescribing some kind of medication since that seems like a fairly big number. I don't have a medical background though, so I'm not sure if that's what a doctor would do. If I was a doctor though, that number would be valuable for me.
Sure, but the point of that doctor's scale is that it is intended to be as universal as possible (that's why it has frownie faces and smiley faces). You don't have to be a doctor to understand that pain scale. A child or an adult with diminished capacity can understand it.

It has to be informative to the patient first, as the patient must successfully interpret the question. And the patient is "John Q. Public." If the scale works for them in terms of application, then the scale is universally providing useful information even before the score is reported back to the doctor or the nurse and recorded in a chart or a spreadsheet. Ditto for Likert items in those polls. The public must universally recognize what is being asked of them before they can report back to the pollster who tabulates those responses. In short, yes, that scale should make sense to you as well as the doctor, because first it must make sense to the patient (who is usually not a doctor either). Whether or not to prescribe medication is a consideration downstream and detached from the subjective assessment (what to do with the information as a medical professional stands apart from our general understanding of the level of pain being reported).

Of course there are hypochondriacs and people who misinterpret (e.g., who think that anything less than a five is a "thumbs down"), so there is noise mixed in with this signal. There is slop and idiosyncrasy. You're absolutely right about that. But it is also quick and dirty and gives more information than "yes/no" binaries without getting lost in a sea of wine-tasting adjectives. Ratings are a quick assessment. You can read various reviews from critics or you can look to aggregated scores. And you can do both. It's all a question of how interested you are and how much time you have. It has its place.

Again, I think that you could make the case for an improved scale and/or an improved rubric standardizing what scores mean and/or improve responses through training (normalization) within a community. That would probably increase the signal a bit, but there will always be noise with which we must contend.
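For instance (just a sketch; these labels are invented for illustration, not anyone's actual system), a standardized rubric could be as simple as a shared mapping from scores to agreed-upon labels, the way the pain scale pairs numbers with "Severe" and "Very Severe":

```python
# A made-up rubric pairing each score with a shared label, in the
# spirit of the pain scale's "Severe"/"Very Severe" tags.
RUBRIC = {
    1: "painful to finish",
    2: "watchable, but flawed",
    3: "solid; worth a viewing",
    4: "excellent; recommend widely",
    5: "an all-time favorite",
}

def describe(score: int) -> str:
    """Expand a bare number into the label the community agreed on."""
    return f"{score}/5 - {RUBRIC[score]}"

print(describe(4))  # a 4 now carries the same meaning for every rater
```

Whether a community would actually calibrate to something like this is another question, but the point is that the number alone stops being the whole message.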



Trouble with a capital "T"
I've often claimed ratings like on IMDB don't mean crap to me. If a film sounds like something I want to watch, I'll watch it. While that's true and I've watched films rated 5 on IMDB and liked them...sometimes those ratings come in handy...

Last night I watched a film with Ann Sheridan so I looked up her filmography on IMDB and she was in a LOT of films, too many for me to research each one. So I quickly scanned the list looking at the titles, years and ratings and only looked at the films that seemed promising. So in that way ratings did help me.



No worries. Condescension is fine. Dodging a fair comparison, however, breaks form with good reasoning.

Sure, but the point of that doctor's scale is that it is intended to be as universal as possible (that's why it has frownie faces and smiley faces). You don't have to be a doctor to understand that pain scale. A child or an adult with diminished capacity can understand it.

It has to be informative to the patient first, as the patient must successfully interpret the question. And the patient is "John Q. Public." If the scale works for them in terms of application, then the scale is universally providing useful information even before the score is reported back to the doctor or the nurse and recorded in a chart or a spreadsheet. Ditto for Likert items in those polls. The public must universally recognize what is being asked of them before they can report back to the pollster who tabulates those responses. In short, yes, that scale should make sense to you as well as the doctor, because first it must make sense to the patient (who is usually not a doctor either). Whether or not to prescribe medication is a consideration downstream and detached from the subjective assessment (what to do with the information as a medical professional stands apart from our general understanding of the level of pain being reported).

Of course there are hypochondriacs and people who misinterpret (e.g., who think that anything less than a five is a "thumbs down"), so there is noise mixed in with this signal. There is slop and idiosyncrasy. You're absolutely right about that. But it is also quick and dirty and gives more information than "yes/no" binaries without getting lost in a sea of wine-tasting adjectives. Ratings are a quick assessment. You can read various reviews from critics or you can look to aggregated scores. And you can do both. It's all a question of how interested you are and how much time you have. It has its place.

Again, I think that you could make the case for an improved scale and/or an improved rubric standardizing what scores mean and/or improve responses through training (normalization) within a community. That would probably increase the signal a bit, but there will always be noise with which we must contend.
Actually, with further pondering over the doctor's scale, I would say that the difference between a movie rating and the doctor's scale you're describing is that it does give a brief description of what those numbers mean with the smiley/sad faces and the tags of "Severe", "Very Severe", etc. Coming across a numerical rating by itself on IMDb, for instance, wouldn't convey that information. A more apt comparison would be if the doctors' scale just had the numbers, nothing else whatsoever, and you were told the patient measured a 7. You wouldn't see the sick face and you wouldn't see the "Very Severe" tag. You'd just know they measured a 7/10. Unless you're given those tags which break down the scale, there's no way to interpret what the number means.

The same goes for movie ratings v. movie reviews. If someone gives a movie a 7/10, there's no way of knowing what that 7 would mean unless you're given a review or some words that break down what the 7 means. If you're given the rating plus the review, that's fine, but I would argue the rating would be unnecessary since the review already says all that's needed about the film and communicates more information.
__________________
IMDb
Letterboxd



The Guy Who Sees Movies
The problem with ratings is raters. Having spent some occupational years in the business of surveys and ratings, I know that they need to be carefully concocted and then treated with some caution. 5 Popcorns isn't much in the way of science, and it doesn't get better with 10. Some studies have said that more than a "5" scale is a waste of time, and don't even get started on decimals. Hardly anybody who does any ratings uses a consistent logic or even some crude benchmarks. If they enjoyed the movie, it's a 5; if not, then maybe a one, but then they might just not like that genre. You just need to take that as a given.

Barring science and standard benchmarks, your next best bet is to know something about the rater, like how much I dislike westerns. On a 5 scale, the best of all westerns is a 3. When you have hundreds of anonymous raters, you just won't know. As for ratings on MoFo, I'd give them a slight edge because this crowd likes movies and has probably seen a lot of them; they are higher than average on experience and movie literacy.

Just take the ratings for what they are....completely subjective recommendations. If you know something about the rater that you agree with, then you might give them more credence.



Actually, with further pondering over the doctor's scale, I would say that the difference between a movie rating and the doctor's scale you're describing is that it does give a brief description of what those numbers mean with the smiley/sad faces and the tags of "Severe", "Very Severe", etc. Coming across a numerical rating by itself on IMDb, for instance, wouldn't convey that information. A more apt comparison would be if the doctors' scale just had the numbers, nothing else whatsoever, and you were told the patient measured a 7. You wouldn't see the sick face and you wouldn't see the "Very Severe" tag. You'd just know they measured a 7/10. Unless you're given those tags which break down the scale, there's no way to interpret what the number means.
This is true. On the other hand, pollsters almost never have faces and descriptions attached to surveys with Likert-type items. And social science surveys have been justified in terms of validity and reliability (i.e., we trust, to some degree, the science of polling data) even though they don't offer any more information than "strongly agree."

So, no, you don't absolutely need the tags to interpret what the numbers mean. You're overstepping in your conclusion when you say "there's no way to interpret what the number means." Well, yes, there is. A bigger number means greater perceived quantity/intensity. This doesn't tell us much, but it does tell us more than "like/dislike."

That stated, and again, I am hip to the idea of improving our ratings scales in the way doctors have attempted to by offering normalizing/calibrating qualifiers in the tags which are iconic (faces) and symbolic ("worst pain you've ever felt"). We just might be able to improve our ratings systems a smidge if we moved in this direction. But again, ratings are intended to be quick and dirty.
The same goes for movie ratings v. movie reviews. If someone gives a movie a 7/10, there's no way of knowing what that 7 would mean unless you're given a review or some words that break down what the 7 means. If you're given the rating plus the review, that's fine, but I would argue the rating would be unnecessary since the review already says all that's needed about the film and communicates more information.
We're rarely asking for the subjective response of just one person when we're asking about film quality. Rather, we aggregate those scores to make a statistical inference (either amateur/lay or scientific/rigorous). Our position is less precarious than that of the doctor who needs to know the patient's specific pain level. Our interest is in the enjoyment level of the audience. And in the aggregate, idiosyncrasies and inaccuracies average out.

And if we are asking for the specific response of an individual, because we trust their opinion, then it stands to reason that we know them (we have a background context which grounds our interpretation of their response).



When you have hundreds of anonymous raters, you just won't know.
You won't need to. Their idiosyncrasies will average out.
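To put a toy model on the averaging-out claim (everything here is invented for illustration: the "true quality" of 3.5, the size of each rater's personal bias, all of it):

```python
import random

random.seed(0)

# Pretend a film has a "true" quality of 3.5 stars, but each anonymous
# rater applies a stable personal offset (harsh critic, genre fan,
# easy grader) plus some one-off mood noise.
TRUE_QUALITY = 3.5

def one_rating():
    idiosyncrasy = random.gauss(0, 1.0)  # the rater's personal quirk
    mood = random.gauss(0, 0.5)          # day-to-day noise
    return TRUE_QUALITY + idiosyncrasy + mood

for n in (1, 10, 100, 10_000):
    avg = sum(one_rating() for _ in range(n)) / n
    print(f"{n:>6} raters -> average {avg:.2f}")
```

A single rating can land almost anywhere, but as the crowd grows the average settles near 3.5: the quirks don't disappear, they cancel.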



The Guy Who Sees Movies
You won't need to. Their idiosyncrasies will average out.
That's the problem. Take a thousand numerical ratings on anything, including frozen lasagna, weather or politics, and "average out" is just the mathematical certainty that there's a middle. Average is just a calculation. What I want to know in a movie rating is whether I want to spend some money and time watching it. I don't want a mathematical middle; I want to know whether it's a movie worth my time and money. For that, I need to know that a reviewer is more competent than somebody who just clicks "3". I'd rather have 1 reviewer whose tastes are similar to mine than 1000 who average out to a 3.



That's the problem. Take a thousand numerical ratings on anything, including frozen lasagna, weather or politics, and "average out" is just the mathematical certainty that there's a middle.
It's not just that there is an average (we knew that as a matter of principle), but the measure tells us where the average is located. Do people, more or less, like this or dislike this, agree with this or disagree with it? And that can be a useful thing to know.
Average is just a calculation. What I want to know in a movie rating is whether I want to spend some money and time watching it. I don't want a mathematical middle; I want to know whether it's a movie worth my time and money.
The average can help you make this decision. If you see restaurant A has bad reviews (on average) and restaurant B has good reviews (on average), and if time is limited and this is all the information you have, it is better than nothing when you have to make a decision. If everyone in town says that restaurant "B" is the best, it probably isn't terrible.

Ditto for film reviews. It's not a guarantee. Your mileage may vary. But it's better than nothing and it can rationally inform your decision about how to invest your entertainment dollar.
For that, I need to know that a reviewer is more competent than somebody who just clicks "3". I'd rather have 1 reviewer whose tastes are similar to mine than 1000 who average out to a 3.
No, you don't. You don't need to know the competency of any individual reviewer. You need to know the competency of the crowd, because that is who you are consulting when you make reference to an average. Individuals average out of the equation. You only need to be familiar with the patterns of the herd, not the particulars of any member of it.

Rotten Tomatoes gives the average of both the Senate and the Assembly, the patricians and the plebes. Pick your herd. Compare the two herds. If the experts and laymen are in agreement, then you apparently have a strong warrant for universal perceptions of quality.



This is true. On the other hand, pollsters almost never have faces and descriptions attached to surveys with Likert-type items. And social science surveys have been justified in terms of validity and reliability (i.e., we trust, to some degree, the science of polling data) even though they don't offer any more information than "strongly agree."

So, no, you don't absolutely need the tags to interpret what the numbers mean. You're overstepping in your conclusion when you say "there's no way to interpret what the number means." Well, yes, there is. A bigger number means greater perceived quantity/intensity. This doesn't tell us much, but it does tell us more than "like/dislike."

That stated, and again, I am hip to the idea of improving our ratings scales in the way doctors have attempted to by offering normalizing/calibrating qualifiers in the tags which are iconic (faces) and symbolic ("worst pain you've ever felt"). We just might be able to improve our ratings systems a smidge if we moved in this direction. But again, ratings are intended to be quick and dirty.

We're rarely asking for the subjective response of just one person when we're asking about film quality. Rather, we aggregate those scores to make a statistical inference (either amateur/lay or scientific/rigorous). Our position is less precarious than that of the doctor who needs to know the patient's specific pain level. Our interest is in the enjoyment level of the audience. And in the aggregate, idiosyncrasies and inaccuracies average out.

And if we are asking for the specific response of an individual, because we trust their opinion, then it stands to reason that we know them (we have a background context which grounds our interpretation of their response).
I think I can get behind this to a degree. I was mainly looking at reading individual ratings, but I see your point that larger sample sizes will cause "unusual" rating systems to balance out in the average and for inferences to still be possible.

That said, even when referring to individual ratings per person, there will still be plenty of cases where their thoughts are unclear just from looking at the number. If you take this site, for example, we get new users every now and then, and there's still a lot of people here who I've barely ever interacted with, if at all. If they gave a rating out of the blue, I'd be confused since I wouldn't know what their rating scale is like. For instance, take a look at this thread. Even factoring in the users from the bump, I don't think I'd be able to figure out what the vast majority of them think about the film just by looking at their ratings by themselves. I've known you for a while from RT and Corrie, and I honestly don't know what your rating system is either. I don't remember you ever mentioning it. My rating system is broken down in depth on my IMDb profile, but I don't believe I've ever posted it on this site or on Corrie (maybe on RT though), so are you familiar with what my rating system is without looking it up?

Though sure, if you're looking at a larger sample size (the average rating on IMDb and Letterboxd), I see your point that it would convey some information.



That said, even when referring to individual ratings per person, there will still be plenty of cases where their thoughts are unclear just from looking at the number. If you take this site, for example, we get new users every now and then, and there's still a lot of people here who I've barely ever interacted with, if at all. If they gave a rating out of the blue, I'd be confused since I wouldn't know what their rating scale is like. For instance, take a look at this thread. Even factoring in the users from the bump, I don't think I'd be able to figure out what the vast majority of them think about the film just by looking at their ratings by themselves.
I think you're right. Ratings offered here will be of more limited utility in terms of legitimate statistical inference. Most ratings are not aggregated, so you kind of have to compute the average in your head, and responses are sporadic. Also, people who love or hate the film are likely to brigade the voting. I think that in this space, it is more about having a non-quantitative, but rather qualitative understanding of participants. When 4-5 of your trusted taste guardians/guides on MoFo (or some other forum) tell you "Check out X, it's great!", your warrant is probably justified, but it isn't quantitative. We have to get to know each other and calibrate each other's tastes. But isn't this the art of conversation?
I've known you for a while from RT and Corrie, and I honestly don't know what your rating system is either. I don't remember you ever mentioning it. My rating system is broken down in depth on my IMDb profile, but I don't believe I've ever posted it on this site or on Corrie (maybe on RT though), so are you familiar with what my rating system is without looking it up?
Well, not to be a smartass, but I've never really thought much of rating systems in contexts like this (for precisely the reasons you mentioned in your post). I trust ratings as a sort of crude guide to quality and not much more. If I am trying to figure out if I want to pay cash money and drag family and friends to watch with me, I will dip my toes into critical assessments and see what forum members are saying to inform my decision. That's about it. In a context like this, I'm more interested to see what critical standards are invoked and what evidence is supplied to meet those standards (i.e., arguments). In addition, I can be hooked by a good narrative or even a poetic expression of endorsement, but these are also more fine-grained, qualitative, and require nonmathematical digestion. The rating is a weak sign of quality. Useful, but limited. Quick, but dirty. A start, but not a finish. Right?



I think you're right. Ratings offered here will be of more limited utility in terms of legitimate statistical inference. Most ratings are not aggregated, so you kind of have to compute the average in your head, and responses are sporadic. Also, people who love or hate the film are likely to brigade the voting. I think that in this space, it is more about having a non-quantitative, but rather qualitative understanding of participants. When 4-5 of your trusted taste guardians/guides on MoFo (or some other forum) tell you "Check out X, it's great!", your warrant is probably justified, but it isn't quantitative. We have to get to know each other and calibrate each other's tastes. But isn't this the art of conversation?

Well, not to be a smartass, but I've never really thought much of rating systems in contexts like this (for precisely the reasons you mentioned in your post). I trust ratings as a sort of crude guide to quality and not much more. If I am trying to figure out if I want to pay cash money and drag family and friends to watch with me, I will dip my toes into critical assessments and see what forum members are saying to inform my decision. That's about it. In a context like this, I'm more interested to see what critical standards are invoked and what evidence is supplied to meet those standards (i.e., arguments). In addition, I can be hooked by a good narrative or even a poetic expression of endorsement, but these are also more fine-grained, qualitative, and require nonmathematical digestion. The rating is a weak sign of quality. Useful, but limited. Quick, but dirty. A start, but not a finish. Right?
I think we're in agreement then. Good discussion.



I've mentioned it before, but 80-90% of all movies fall into one of three categories:



🫤



Apart from that, we're either determining the best of the best, or pointing out movies so bad they deserve public ridicule.



I find the arbitrariness of my star ratings disincentivizes me from giving star ratings to movies, even for my own reference.
I hear you, but I think that's where reviews come in. I never just rate movies--I always write up at least a few paragraphs about my viewing experience. A rating (and a review) is a reflection of my mindset while watching and reflecting. Sometimes I'll look back and be shocked at the rating I gave something. (Like a film I remembered giving a 2.5, maybe a 3, and then I gave it a 4?!!!!).



As I've said many times before, I hate rating movies. I do it every once in a while because I want something I've written to count as a "Review" here but mostly I don't because how do you do that with any accuracy?
The scale is just a constantly moving target. Just by changing genres you mess up the scale. Just by changing eras you mess up the scale. How consistently are you parsing your movies?
I think that the answer is to ignore the pressure that numbers/grades impart that you must be objective/consistent/etc.

My ratings are reflections of how much I enjoyed/appreciated a film. That's it. It's useful to me. And it might be somewhat useful to someone whose taste generally aligns to mine. But the key is that they are paired with written reviews. And I try to be pretty clear about things that earned a film points for me and things that lost it points. So something might get an extra half star from me because it starred an actor I like, or had a certain type of plot I enjoy.

I also think that ratings can help someone contextualize my writing. If I write about something that bothered me in the film, but still rate it very highly, someone can infer that it must not have been too bad. And if I write about something I liked, but my rating is relatively low, you can infer that it wasn't enough to make the film good/great.

I think that ratings and reviews go nicely hand-in-hand. A rating on its own? Eh. Not the most helpful thing.



You mean me? Kei's cousin?
Ratings are certainly helpful with ranking films, though they obviously don't always tell the full story. For instance, a [lower-rated film] might have more replay value than a [higher-rated film], even if the [higher-rated film] is the technically better film.
__________________
Look, Dr. Lesh, we don't care about the disturbances, the pounding and the flashing, the screaming, the music. We just want you to find our little girl.



I hear you, but I think that's where reviews come in. I never just rate movies--I always write up at least a few paragraphs about my viewing experience. A rating (and a review) is a reflection of my mindset while watching and reflecting. Sometimes I'll look back and be shocked at the rating I gave something. (Like a film I remembered giving a 2.5, maybe a 3, and then I gave it a 4?!!!!).
I wish I had the expertise and time to review every film I watch



RIP www.moviejustice.com 2002-2010
If ratings are completely useless, someone should tell the healthcare industry, because they regularly ask questions about the pain scale.
I really do hate their physician pain scales... I usually say a five, six, or maybe seven, because if it's anything less than that, why am I at the doctor's paying a stupid co-pay? On the other hand, I've never been burned at the stake, never been shot, and never been stabbed, so I couldn't really say what a nine or 10 is like, nor do I understand how some random person at the doctor's office would say they're at a nine or ten when they're walking perfectly fine, not wincing, and laughing and having casual conversations on the phone in the waiting room. To me, if I'm at a nine or 10, I've got tears in my eyes and am screaming or on the ground rolling in pain. Certainly I'm not going to be chit-chatting on the phone about whatever.

I usually just rate films on a simple F to A+ scale like the old-fashioned boring school grading system... simple, but it works and allows for plenty of variety. Maybe too much, as a five-point scale might be a good compromise between too many options and not enough. Of course on Letterboxd I do the five-point... I think it is.

I'd be perfectly happy using the MoFo popcorn box thing too, but I still haven't figured out how to put those graphics into a post.

Generally, if I love a film, think it's objectively great, does something that I find mind-blowing or that I'm in awe of, AND I wouldn't change a single thing about the film, I give it an "A+." If it's all of those things, but there are some weak points, a few things I'd change, I'd give it an "A."
__________________
"A candy colored clown!"
Member since Fall 2002
Top 100 Films, clicky below

http://www.movieforums.com/community...ad.php?t=26201



The Guy Who Sees Movies

No, you don't. You don't need to know the competency of any individual reviewer. You need to know the competency of the crowd, because that is who you are consulting when you make reference to an average. Individuals average out of the equation. You only need to be familiar with the patterns of the herd, not the particulars of any member of it.
The competency of the reviewer is what I do want to know. I don't have much faith in crowds nor their taste in movies. It's crowds that bring us prime time TV as it is.



The competency of the reviewer is what I do want to know. I don't have much faith in crowds nor their taste in movies. It's crowds that bring us prime time TV as it is.
You seek a criterion of correctness outside of yourself. Otherwise, why would you care about the competency of any reviewer? Thus, there is an audience of people who see the world correctly (or who at least see the world, more or less, your way). Take that herd in the aggregate and ratings will work (for you), and idiosyncrasies will average out of the equation. Your problem is not crowds, but finding the watering holes where your crowd meets.



Victim of The Night
If ratings are completely useless, someone should tell the healthcare industry, because they regularly ask questions about the pain scale.



If ratings are completely useless, then someone should tell the pollsters, because they regularly use Likert scales in polling research.


If you want a more meaningful scale than the five-star system, then I would recommend standardizing one and popularizing it. Should we have a ten-point scale? Should we have a scale with negative numbers and a zero point? Should we have a standardized rubric indicating what each level means? We could probably add some precision, although there will always be noise interfering with the signal. If you wish, we might pilot a new scale and a new rubric in a thread and see what it gets us. Until then, I'm giving this thread two monkeys out of five.
The difference, of course, is that in both these cases and pretty much every other one that uses them, the ratings are pre-defined. What a 7/10 on the pain scale would feel like compared to a 5 or a 9 is put right in front of you. Or, in the OPAC example, specific questions are asked with a rating system that is also defined.
In our movie ratings, none of this is true. What makes a movie a 4 for you versus me could be as different as any two things can be. There are no specific questions and no defined meanings or values for the degrees of the scale. Everyone just defines them for themselves. If this were true in medicine, the pain scale would be useless. If it were true in the OPAC example, no useful data would be obtained.
And that's what movie ratings give you. Almost nothing. All you know is that I gave it 4 1/2 stars. What 4 1/2 stars means in this case, you have no idea unless I then describe it. Maybe I have nailed it and an average of the opinions of all people watching this film will be that it was excellent nearing greatness. Then again, maybe I'm just a huge fan of that genre and would give any Harry Potter film a floor of 4 stars, but I liked this one a little more than most, though not as much as The Goblet of Fire, which is a perfect film in my opinion. Because, also, I am only 12 years old, haven't seen anything other than what's been in theaters, and have been given no objective and consistent criteria for what goes into the rating system. But of course no one knows that, so maybe they take my 4 1/2 stars completely seriously.
It's a complete mess.