S7 Talks Anime
Talk About Anime!!
Assassinator
...

Posts: 6,646.6190
Threads: 176
Joined: 24th Apr 2007
Reputation: 8.53695
E-Pigs: 140.8363
Offline
Post: #371
RE: AniList 3/Anime Chatter with Senseito
Senseito7 Wrote:Lol good example..

4 of the 6 examples are close enough.. the 5th and 6th are different by quite a bit.. I'd say that's a pretty good compatibility estimation.

Errrr.... I actually came up with numbers completely randomly, with no thought whatsoever.  You're supposed to replace with your own numbers.
(This post was last modified: 02/11/2009 04:38 PM by Assassinator.)
02/11/2009 04:31 PM
ProperBritish
Daddy Proper
Team DreamArts

Posts: 5,666.3250
Threads: 192
Joined: 19th Nov 2008
Reputation: -2.36574
E-Pigs: 147.7035
Offline
Post: #372
RE: AniList 3/Anime Chatter with Senseito
i just started watching Elfen Lied, due to recommendation from Sensei

it is epictacular

02/11/2009 04:36 PM
roberth
Resident Full Stop Abuser.....

Posts: 4,580.2098
Threads: 200
Joined: 18th Jun 2007
Reputation: -5.5814
E-Pigs: 43.8419
Offline
Post: #373
RE: AniList 3/Anime Chatter with Senseito
Of course it is, it's Elfen Lied :)

02/11/2009 04:39 PM
Assassinator
...

Posts: 6,646.6190
Threads: 176
Joined: 24th Apr 2007
Reputation: 8.53695
E-Pigs: 140.8363
Offline
Post: #374
RE: AniList 3/Anime Chatter with Senseito
One problem I see with using correlation is that it only reflects the strength of the relationship between the two sets of numbers, and not the magnitude of the numbers themselves.

So if you have 2 sets of numbers like
x = {2,3,2,3,2}
y = {6,9,6,9,6}

That's a perfect y=3x relationship, and you'll get a correlation of 1 (100% "compatibility").  But in reality, in the context of anime ratings, I wouldn't say they're compatible at all...
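
A quick way to check this is to compute Pearson's r by hand for the two sets above. The sketch below uses a hypothetical helper called `pearson`, written from the textbook definition (it is not the site's actual code), and also computes the average gap between the two sets of ratings, which correlation ignores entirely.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient from the textbook definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

x = [2, 3, 2, 3, 2]
y = [6, 9, 6, 9, 6]

r = pearson(x, y)
# Average absolute difference between the two people's ratings.
mean_gap = sum(abs(a - b) for a, b in zip(x, y)) / len(x)

print("correlation:", round(r, 6))
print("average rating gap:", mean_gap)
```

The correlation comes out at 1 (up to floating-point rounding) even though the two people are several points apart on every show.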
(This post was last modified: 02/11/2009 04:47 PM by Assassinator.)
02/11/2009 04:41 PM
ZiNgA BuRgA
Smart Alternative

Posts: 17,023.4213
Threads: 1,174
Joined: 19th Jan 2007
Reputation: -1.71391
E-Pigs: 446.0333
Offline
Post: #375
RE: AniList 3/Anime Chatter with Senseito
Assassinator Wrote:What needs to be done is that instead of each anime being scaled according to your distribution, it needs to be first scaled with the distribution of all the other scores for that particular anime (to get rid of "good anime gets good scores from everyone"), then scale that result with your own distribution.  This way it should give better correlation figures.... if stuff doesn't get warped too much.
Not terribly sure that matters so much for correlation.  If both people rate it high, it's positively correlated.  Scale it down, and there's still positive correlation.  Pearson already takes this into account.
The scaling is just how they decided to do it.  I can't say anything wrong about it really.
What I'm saying is that there are plenty of other issues that have nothing to do with the numerical manipulations used:
- it's based on things you've watched in common; two people who watch very different things most likely have low compatibility since they probably like different stuff, but if they happen to rate some common stuff they saw in a consistent manner, they'll end up with a high score
- as above, distortion from small samples?

But coming up with a good statistic is always difficult anyway.


Assassinator Wrote:Oh, and that also might be quite computationally intensive.  I mean some anime have like 50000 votes on them, and let's say you have 100 anime in your list that match... sounds heavy.  Although since I'm not much of a programmer, I can't really say for certain (ask Zinga), but it's probably not that viable.
Nah, voting average scores are easy to calculate.  Typically they just store the total and number of votes.  Average score = total ÷ num_votes.
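
As a rough illustration of the "store the total and number of votes" idea, here is a minimal sketch. The class name `VoteAverage` and its methods are made up for this example and are not taken from any real site's code.

```python
class VoteAverage:
    """Keeps a show's average score using only a running total and a count."""

    def __init__(self):
        self.total = 0       # sum of all votes so far
        self.num_votes = 0   # number of votes so far

    def add_vote(self, score):
        # Constant-time update: no need to revisit earlier votes.
        self.total += score
        self.num_votes += 1

    def average(self):
        if self.num_votes == 0:
            return None  # no votes yet, so no average
        return self.total / self.num_votes

show = VoteAverage()
for score in [8, 9, 7, 10]:
    show.add_vote(score)

print(show.average())
```

Each new vote costs the same amount of work whether the show has 4 votes or 50000.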
02/11/2009 04:47 PM
S7*
Sweet Dreams

Posts: 16,689.4373
Threads: 1,056
Joined: 3rd Apr 2007
Reputation: 14.29926
E-Pigs: 383.2289
Offline
Post: #376
RE: AniList 3/Anime Chatter with Senseito
Assassinator Wrote:
Senseito7 Wrote:Lol good example..

4 of the 6 examples are close enough.. the 5th and 6th are different by quite a bit.. I'd say that's a pretty good compatibility estimation.

Errrr.... I actually came up with numbers completely randomly, with no thought whatsoever.  You're supposed to replace with your own numbers.

Yeah I get that but I was talking about the output of the numbers you used in general.

ProperBritish Wrote:i just started watching Elfen Lied, due to recommendation from Sensei

it is epictacular

roberth Wrote:Of course it is, it's Elfen Lied :)

Madwin
02/11/2009 05:01 PM
Assassinator
...

Posts: 6,646.6190
Threads: 176
Joined: 24th Apr 2007
Reputation: 8.53695
E-Pigs: 140.8363
Offline
Post: #377
RE: AniList 3/Anime Chatter with Senseito
ZiNgA BuRgA Wrote:Not terribly sure that matters so much for correlation.  If both people rate it high, it's positively correlated.  Scale it down, and there's still positive correlation.  Pearson already takes this into account.

I think we're misunderstanding each other.  By "scale" I meant "standardize", not linear scaling.  Of course linear scaling won't do jack, I'm not that dumb.

OK, let's start from the beginning.  Anime scores are not random: everyone considers certain shows "good" and certain other shows "poo poo", so you will almost always get a positive correlation.  What I'm proposing is that we standardize your score for each particular show against the sample of all scores for that show, which removes the "quality element" of the show and puts all shows on equal footing.  Then do the correlation.

Not perfect, but should be better than what it is now.
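
To make the proposal concrete, here is a small sketch of "standardize against each show's own distribution, then correlate". Everything in it is hypothetical: the show names, the site-wide means and standard deviations, the two users' ratings, and the helpers `pearson` and `standardize` are invented for illustration only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient from the textbook definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical site-wide stats per show: (mean score, standard deviation).
show_stats = {
    "Show A": (8.5, 1.0),
    "Show B": (3.5, 1.5),
    "Show C": (6.0, 2.0),
    "Show D": (7.0, 1.2),
}

# Hypothetical ratings from two users.
user1 = {"Show A": 9, "Show B": 3, "Show C": 7, "Show D": 6}
user2 = {"Show A": 8, "Show B": 4, "Show C": 8, "Show D": 6}

def standardize(ratings, stats):
    """Turn each rating into a z-score relative to that show's own distribution."""
    return {s: (score - stats[s][0]) / stats[s][1] for s, score in ratings.items()}

common = sorted(set(user1) & set(user2))  # only shows both users rated

raw_r = pearson([user1[s] for s in common], [user2[s] for s in common])

z1 = standardize(user1, show_stats)
z2 = standardize(user2, show_stats)
std_r = pearson([z1[s] for s in common], [z2[s] for s in common])

print("raw correlation:         ", round(raw_r, 3))
print("per-show standardized r: ", round(std_r, 3))
```

With this made-up data the raw correlation is high mostly because both users agree that Show B is bad; once each show's general quality is removed, the agreement that remains is noticeably weaker.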

ZiNgA BuRgA Wrote:The scaling is just how they decided to do it.  I can't say anything wrong about it really.
What I'm saying is that there are plenty of other issues that have nothing to do with the numerical manipulations used:
- it's based on things you've watched in common; two people who watch very different things most likely have low compatibility since they probably like different stuff, but if they happen to rate some common stuff they saw in a consistent manner, they'll end up with a high score
|
V

Quote:It's also a good idea imo to also take into account some other things such as selection choice (genre of anime).... if they're not doing that already.


ZiNgA BuRgA Wrote:
Assassinator Wrote:Oh, and that also might be quite computationally intensive.  I mean some anime have like 50000 votes on them, and let's say you have 100 anime in your list that match... sounds heavy.  Although since I'm not much of a programmer, I can't really say for certain (ask Zinga), but it's probably not that viable.
Nah, voting average scores are easy to calculate.  Typically they just store the total and number of votes.  Average score = total ÷ num_votes.

You need to find the mean and SD of the sample, then find your z-score.  The mean and the z-score aren't that hard, but finding the standard deviation might get intensive?
(This post was last modified: 02/11/2009 06:53 PM by Assassinator.)
02/11/2009 05:03 PM
ZiNgA BuRgA
Smart Alternative

Posts: 17,023.4213
Threads: 1,174
Joined: 19th Jan 2007
Reputation: -1.71391
E-Pigs: 446.0333
Offline
Post: #378
RE: AniList 3/Anime Chatter with Senseito
Assassinator Wrote:OK, let's start from the beginning.  Anime scores are not random: everyone considers certain shows "good" and certain other shows "poo poo", so you will almost always get a positive correlation.  What I'm proposing is that we standardize your score for each particular show against the sample of all scores for that show, which removes the "quality element" of the show and puts all shows on equal footing.  Then do the correlation.
Linear or not, it's directionally the same.  But what you're proposing could have issues such as overstating directional inconsistency.
To give a basic example, say A rates "good show" 8 and "bad show" 4, B rates "good show" 9 and "bad show" 3.  If we do a simple scale to the mean, and say the mean of "good show" is 8.5 and "bad show" is 3.5, A would have scaled scores of -0.5 and +0.5 and B would have +0.5 and -0.5, which would be negatively correlated - probably not what you want.

If that's what you're trying to say.
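
The example above can be run directly. The sketch below uses the numbers from the post; the `pearson` helper is a hypothetical textbook implementation added only for this illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient from the textbook definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

shows = ["good show", "bad show"]
a_scores = {"good show": 8, "bad show": 4}
b_scores = {"good show": 9, "bad show": 3}
show_means = {"good show": 8.5, "bad show": 3.5}

# Raw ratings: both A and B clearly prefer the good show.
raw_r = pearson([a_scores[s] for s in shows], [b_scores[s] for s in shows])

# Ratings after subtracting each show's mean.
a_centered = [a_scores[s] - show_means[s] for s in shows]
b_centered = [b_scores[s] - show_means[s] for s in shows]
centered_r = pearson(a_centered, b_centered)

print("A centered:", a_centered)
print("B centered:", b_centered)
print("raw r:", raw_r)
print("centered r:", centered_r)
```

The raw correlation is +1 and the mean-centered correlation is -1, even though A and B rank the two shows identically.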

Assassinator Wrote:You need to find the mean and SD of the sample, then find your z-score.  The mean and the z-score aren't that hard, but finding the standard deviation might get intensive?
Oh, SD calculations are probably stuck on O(n) time.  People don't watch anime every 2 seconds, so you could probably update it when someone votes.  If that's too much, you can always perform staggered updates (i.e. refresh every hour).

But is it possible to keep a running SD?
EDIT: just tried, yes, possible to keep a running SD - you need to store the sum of differences squared - this can be updated with a few (constant time) calculations when a new vote arrives.
Spoiler for explanation:
You remember:
- total vote count (n)
- sum of all votes (t)
- sum of squared differences with mean (d)

When a new vote (v) arrives, calculate the new total and mean (easy to do).  Now let m_o be the old mean and m_n be the new mean (subscript o = old value, n = new value, and likewise for n, t and d).
Let i = m_n - m_o
d_n = d_o + n_o(2·i·m_o + i^2) - 2·i·t_o + (v - m_n)^2

Now that you have d_n, calculating the stdev is trivial
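
Here is one possible sketch of that running update, checked against Python's `statistics.pstdev`. The class name `RunningStats` is invented for this example, and it assumes the population standard deviation (divide by n) is what is wanted. Because the old count times the old mean equals the old total, the two terms containing 2·i in the posted formula cancel, so the code uses the shorter equivalent form: new d = old d + (old n)·i² + (v - new mean)².

```python
from math import sqrt
from statistics import pstdev

class RunningStats:
    """Tracks count, total and sum of squared differences in constant time per vote."""

    def __init__(self):
        self.n = 0      # total vote count
        self.t = 0.0    # sum of all votes
        self.d = 0.0    # sum of squared differences from the mean

    def add_vote(self, v):
        old_mean = self.t / self.n if self.n else 0.0
        new_n = self.n + 1
        new_t = self.t + v
        new_mean = new_t / new_n
        i = new_mean - old_mean  # shift in the mean caused by this vote
        # Re-centre the old squared differences on the new mean, then add
        # the new vote's own squared difference.
        self.d = self.d + self.n * i * i + (v - new_mean) ** 2
        self.n = new_n
        self.t = new_t

    def stdev(self):
        # Population standard deviation (an assumption of this sketch).
        return sqrt(self.d / self.n) if self.n else 0.0

votes = [8, 9, 7, 10, 3, 6]
stats = RunningStats()
for v in votes:
    stats.add_vote(v)

print("running stdev:", stats.stdev())
print("batch stdev:  ", pstdev(votes))
```

This is closely related to what is usually called Welford's online algorithm. For very long vote streams some floating-point drift is possible, so an occasional full recalculation may still be worthwhile.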
(This post was last modified: 02/11/2009 06:57 PM by Assassinator.)
02/11/2009 05:50 PM
Assassinator
...

Posts: 6,646.6190
Threads: 176
Joined: 24th Apr 2007
Reputation: 8.53695
E-Pigs: 140.8363
Offline
Post: #379
RE: AniList 3/Anime Chatter with Senseito
ZiNgA BuRgA Wrote:Linear or not, it's directionally the same. 

Linear is just scaling by a set factor, old*factor=new, so always the same direction.  Standardizing can change direction depending on certain factors (although in our case, probably also the same direction).  If you mean it won't change the resulting correlation coefficient... errr, actually you're right. Scaling (with yourself) won't change anything. Facepalm.
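
The "won't change anything" part is easy to confirm numerically: turning each person's ratings into z-scores using their own mean and SD is a shift plus a positive rescale, and Pearson's r is unaffected by that. The ratings and the helpers `pearson` and `zscores` below are made up for illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient from the textbook definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def zscores(xs):
    """Standardize a list of ratings using its own mean and population SD."""
    n = len(xs)
    mean = sum(xs) / n
    sd = sqrt(sum((x - mean) ** 2 for x in xs) / n)
    return [(x - mean) / sd for x in xs]

# Hypothetical ratings from two people for the same five shows.
mine = [8, 4, 7, 9, 5]
theirs = [9, 3, 6, 8, 7]

r_raw = pearson(mine, theirs)
r_standardized = pearson(zscores(mine), zscores(theirs))

print(round(r_raw, 6), round(r_standardized, 6))
```

Both numbers agree up to rounding, so standardizing each user against their own distribution cannot improve (or worsen) the compatibility figure.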

ZiNgA BuRgA Wrote:But what you're proposing could have issues such as overstating directional inconsistency.
To give a basic example, say A rates "good show" 8 and "bad show" 4, B rates "good show" 9 and "bad show" 3.  If we do a simple scale to the mean, and say the mean of "good show" is 8.5 and "bad show" is 3.5, A would have scaled scores of -0.5 and +0.5 and B would have +0.5 and -0.5, which would be negatively correlated - probably not what you want.

Well, that simply means their opinions conflict with respect to the mean.  As for whether you want that or not... depends on your point of view.  But yeah, I do agree with you that it's probably not the best measure for compatibility.  Scrap that then.

Now that I think about it again, their system isn't too bad.  The major fault is that they decided to scale (-1,1) to (0,1), and then labelled everything as if it were (0,1) natively.  I mean they label 50% (actually 0 correlation) as "medium compatibility", and my 62% with Sensei (actually 0.24 correlation) as "medium to very compatible"... That's all fucked up, 0.24 is actually a very weak correlation.  So yeah, either fix the scaling or fix the labeling.
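
For reference, here is the conversion implied by those figures, assuming the site uses the plain linear map from (-1,1) to (0,1). That is an inference from the 50% and 62% numbers in this post, not something confirmed by the site, and the function names are made up.

```python
def correlation_to_percent(r):
    """Map a correlation in [-1, 1] linearly onto a 0-100% scale (assumed mapping)."""
    return (r + 1) / 2 * 100

def percent_to_correlation(p):
    """Invert the assumed mapping: recover the correlation behind a percentage."""
    return p / 100 * 2 - 1

for percent in (50, 62, 100):
    print(percent, "% ->", round(percent_to_correlation(percent), 2))
```

Under that assumption, anything displayed below 50% is actually a negative correlation, which the labels hide.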

ZiNgA BuRgA Wrote:EDIT: just tried, yes, possible to keep a running SD - you need to store the sum of differences squared - this can be updated with a few (constant time) calculations when a new vote arrives.
Spoiler for explanation:
You remember:
- total vote count (n)
- sum of all votes (t)
- sum of squared differences with mean (d)

When a new vote (v) arrives, calculate the new total and mean (easy to do).  Now let m_o be the old mean and m_n be the new mean (subscript o = old value, n = new value, and likewise for n, t and d).
Let i = m_n - m_o
d_n = d_o + n_o(2·i·m_o + i^2) - 2·i·t_o + (v - m_n)^2

Now that you have d_n, calculating the stdev is trivial

Seems do-able then.
(This post was last modified: 02/11/2009 07:11 PM by Assassinator.)
02/11/2009 06:52 PM
Kchan
Omnipotent Presence

Posts: 331.4508
Threads: 50
Joined: 14th Oct 2009
Reputation: -9.23451
E-Pigs: 12.1890
Offline
Post: #380
RE: AniList 3/Anime Chatter with Senseito
Thank you Senseito7 I needed some suggestions (^_^)

_
02/11/2009 07:07 PM