In the demo, I put in the Obama prank photo (http://karpathy.github.io/2012/10/22/state-of-computer-visio...) and asked "Why is this picture funny?" It responded: "Question: Why is this picture funny? Answer: President Obama is taller than the average person."
Furthermore, the man on the scale is facing the other way and wouldn’t know that someone is stepping on the scale. There’s an element of theory of mind there: you would have to understand that the man on the scale is unaware of Obama’s action.
> @karpathy: We tried and it solves it :O. The vision capability is very strong but I still didn't believe it could be true. The waters are muddied some by a fear that my original post (or derivative work there of) is part of the training set. More on it later.