Assassinator Wrote:Senseito7 Wrote:Assassinator Wrote:What needs to be done is that instead of each anime being scaled according to your distribution, it needs to be first scaled with the distribution of all the other scores for that particular anime (to get rid of "good anime gets good scores from everyone"), then scale that result with your own distribution. This way it should give better correlation figures.
Why not suggest it?
Because that's all talk and no proof. I'd need to study up on a whole crapload of statistical theory, to make sure everything is all right. And I really can't be stuffed doing that.
Oh, and that also might be quite computationally intensive. I mean some anime have like 50000 votes on them, and let's say you have 100 anime in your list that match... sounds heavy. Although since I'm not much of a programmer, I can't really say for certain (ask Zinga), but it's probably not that viable.
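A rough sketch of that two-step idea, not anything MyAnimeList is known to do. The names `adjusted_correlation` and `site_stats` are made up for illustration, and so are all the numbers. It assumes each anime's site-wide mean and standard deviation have already been computed once and stored (with a standard deviation above zero), in which case comparing two users only touches their shared list, however many votes each anime has. The "scale with your own distribution" step is left to the Pearson formula itself, which already centres and scales each user's values.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def adjusted_correlation(mine, yours, site_stats):
    """mine/yours: dict title -> score; site_stats: dict title -> (mean, std).

    Hypothetical helper: assumes every std in site_stats is greater than zero.
    """
    shared = sorted(set(mine) & set(yours) & set(site_stats))
    # Step 1: express each score relative to how everyone rated that anime.
    a = [(mine[t] - site_stats[t][0]) / site_stats[t][1] for t in shared]
    b = [(yours[t] - site_stats[t][0]) / site_stats[t][1] for t in shared]
    # Step 2: Pearson rescales each user's values by their own mean and spread.
    return pearson(a, b)

# Made-up data: site-wide (mean, std) per anime and two users' scores.
site_stats = {"A": (8.5, 1.0), "B": (7.0, 1.5), "C": (5.0, 2.0), "D": (8.0, 1.0)}
mine = {"A": 9, "B": 6, "C": 6, "D": 8}
yours = {"A": 8, "B": 8, "C": 4, "D": 9}

titles = sorted(mine)
raw = pearson([mine[t] for t in titles], [yours[t] for t in titles])
adjusted = adjusted_correlation(mine, yours, site_stats)
print(f"raw r = {raw:+.2f}, adjusted r = {adjusted:+.2f}")
```

With these invented numbers the raw correlation comes out positive while the adjusted one comes out negative, because the apparent agreement was mostly both users following the crowd.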
Ah fair enough.
Alas, the percentage does give an idea at least? It's just oversaturated or plain inaccurate?
Guys... aren't you talking too much into it? ^^'
Senseito7 Wrote:Alas, the percentage does give an idea at least? It's just oversaturated or plain inaccurate?
http://en.wikipedia.org/wiki/Correlation
Pretty much what they do is... for all the anime we have in common, take my ratings as a set of X's, your ratings as a set of Y's, and find the correlation. 1 = perfect correlation, 0 = no correlation, -1 = perfect negative correlation.
If you plot all the X's vs all the Y's, you get some sort of plot. Basically, look at the 1st line of that picture I posted a few posts above, that should give you an idea of the kind of plots and what correlation you'll get.
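For anyone who wants to see the arithmetic, here is a minimal sketch of the standard Pearson correlation in Python. The function name `pearson` and the ratings are made up for illustration, and whether the site uses exactly this formula is an assumption.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 ratings")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from each user's own mean.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical ratings for five shared anime (made-up numbers).
mine = [9, 8, 7, 5, 3]
yours = [8, 9, 6, 6, 4]
r = pearson(mine, yours)
print(f"r = {r:.2f}")
```

Two lists that rise and fall together give a value near 1, and exactly reversed lists such as `[1, 2, 3]` and `[3, 2, 1]` give -1.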
tl;dr:
Positive correlation is pretty much "you like stuff I like", negative correlation is pretty much "you like stuff I hate". Problem is that anime isn't really random data, and generally most people like good anime, and most people don't really like bad anime, so most people end up liking and hating similar stuff, and that's a positive correlation. And it's pretty unlikely to get negative correlation (you'd have to hate stuff that I like. So if I like all the good stuff, you have to hate them).
MyAnimeList decided to make the -1 to 1 scale into a 0 to 1 scale, so -1 becomes 0 (0%), 0 becomes 0.5 (50%) and 1 is still 1 (100%). So a 50% "compatibility" is actually 0 correlation. And since it's pretty unlikely for the correlation to be negative, it's pretty hard to get lower than 50% compatibility, and so our 62% isn't as good as it sounds.
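The rescaling described above can be written out as a small sketch. It assumes the mapping is the plain linear one (-1 to 0%, 0 to 50%, 1 to 100%); the function names are hypothetical and the site's real formula is not confirmed here.

```python
def compatibility_percent(r):
    """Map a correlation r in [-1, 1] linearly onto 0..100 percent."""
    return (r + 1) / 2 * 100

def correlation_from_percent(p):
    """Inverse mapping: recover the correlation from a percentage."""
    return 2 * (p / 100) - 1

# Under this assumed mapping, 62% corresponds to a correlation of about 0.24.
print(round(correlation_from_percent(62), 2))
```

So under that assumption a 62% "compatibility" is a fairly weak positive correlation of roughly 0.24, much closer to "no relationship" than to "perfect match".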
... All that is if I understood everything correctly.
you explained it nicely, I get it now. Thanks assassinator :3
So yes, the ratings do work, but it's not realistically a 0% to 100% scale, because it's way easier to get the 50-100 range than 0-50.
The only likely case I see with getting negative correlation is that there are lots of things you really like and at the same time I really hate. Let's say I absolutely hate all shows with loli and you absolutely love loli (lol at this example), and I vote all the shows with loli extremely low, and you vote them all extremely high, that'll get us a swing in the negative direction. If there are enough such swings (I hate enough stuff that you love), we'd be able to end up with negative. But most people are "all rounders" and it's sort of difficult then.
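A quick sketch of that "swing" with made-up numbers, using the standard Pearson formula (an assumption about what the site computes). The first four scores are shows both users roughly agree on, and the last three are shows one user rates extremely low and the other extremely high.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical scores: four shows both users roughly agree on.
mine_agree = [9, 7, 5, 8]
yours_agree = [8, 7, 6, 9]

# Three polarising shows: I rate them extremely low, you extremely high.
mine_swing = [1, 2, 1]
yours_swing = [10, 9, 10]

r_before = pearson(mine_agree, yours_agree)
r_after = pearson(mine_agree + mine_swing, yours_agree + yours_swing)
print(f"agreeing shows only: r = {r_before:+.2f}")
print(f"with the swings:     r = {r_after:+.2f}")
```

With these numbers, three strongly opposed scores are enough to drag a clearly positive correlation below zero, which under the 0-100% rescaling would land below 50%.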
Hmm, interesting.
Browsing around, I saw one or two people with whom I had under 30% compatibility, and some with much lower "Very Bad" compatibility.
Problem was that it was based on only a dozen or so shows that we shared... so not a very good example, but I guess it's AN example.
It's sort of like this. You can try sticking in random numbers and seeing what happens.
May not be exactly correct though if you compare with their results, since they may do scaling and who knows what else.
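Separately from the example above, here is a small stand-in for the "stick in random numbers" experiment as a Python sketch. It is not the site's calculation: it assumes plain Pearson correlation, and the "shared quality plus personal taste" model, the seed, and every number in it are invented for illustration. The list is far longer than a real one just to keep the result stable.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

def clamp_score(value):
    """Round to a whole score and keep it on the 1-10 scale."""
    return max(1, min(10, round(value)))

rng = random.Random(0)  # fixed seed so reruns give the same numbers
n_shows = 2000          # unrealistically long list, chosen for stability

# Case 1: both users pick scores completely at random.
random_a = [rng.randint(1, 10) for _ in range(n_shows)]
random_b = [rng.randint(1, 10) for _ in range(n_shows)]

# Case 2: every show has a shared "quality"; each user adds personal taste.
quality = [rng.gauss(6.5, 1.5) for _ in range(n_shows)]
taste_a = [clamp_score(q + rng.gauss(0, 1.5)) for q in quality]
taste_b = [clamp_score(q + rng.gauss(0, 1.5)) for q in quality]

r_random = pearson(random_a, random_b)
r_shared = pearson(taste_a, taste_b)
print(f"pure random:    r = {r_random:+.2f}")
print(f"shared quality: r = {r_shared:+.2f}")
```

Purely random scores should come out close to zero, while scores that share an underlying "quality" come out clearly positive even though the two users' personal tastes are independent, which is the "most people like good anime" effect.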
Lol good example..
4 of the 6 examples are close enough.. the 5th and 6th are different by quite a bit.. I'd say that's a pretty good compatibility estimation.
try plugging in the numbers of a sample of your shows, see if you're right