There are a lot of confounding variables here, and others have mentioned a few.
One I have not seen is "proximity to major arterials". Sometimes crime concentrates around major streets and parking lots where cars can be stolen or broken into. (The OP mentions "grand theft from locked auto" as being ~10% of crimes, for example.)
Another common crime, shoplifting and petty thefts, occurs around strips of shops, urban malls, transit stations, or bus stops.
These crimes really have nothing to do with altitude, and they are common enough to affect the results. I know this from inspecting the LAPD CompStat maps, which are superior, in many ways, to the linked one. One place to see them is: http://www.crimemapping.com/map/region/LAPDHollywoodArea
This is Hollywood. You can see the concentration of crimes along Hollywood Blvd and the other major east-west streets. (The zoom tool is on the right, and it may be good to increase the time span to a month.) Inspecting the causes shows they are often thefts from vehicles, etc., as mentioned above.
This is invalid. The numbers calculated are counts of crime incidents, not crime rates. A km^2 block with ten times the population density will have ten times the number of crimes; this doesn't mean it is ten times more dangerous!
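To make the count-versus-rate distinction concrete, here is a minimal sketch. The two blocks and their numbers are hypothetical (they are not taken from the article's data), and the helper name is made up for illustration.

```python
def crimes_per_1000_residents(incidents, residents):
    """Convert a raw incident count into a rate per 1,000 residents."""
    if residents <= 0:
        raise ValueError("residents must be positive")
    return 1000 * incidents / residents


# Hypothetical 1 km^2 blocks: the dense one has ten times the people
# and ten times the incidents of the sparse one.
sparse = {"incidents": 50, "residents": 2_000}
dense = {"incidents": 500, "residents": 20_000}

sparse_rate = crimes_per_1000_residents(**sparse)
dense_rate = crimes_per_1000_residents(**dense)

print(f"sparse block: {sparse['incidents']} incidents, {sparse_rate} per 1,000 residents")
print(f"dense block:  {dense['incidents']} incidents, {dense_rate} per 1,000 residents")
```

A per-area map would show the dense block as ten times worse, while the per-resident rate is identical for both.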
A block with ten times the population density and four times the number of residences that has ten times the number of home break-ins is more dangerous - there are more crime victims. I suppose this means that the type of crime is important.
I live in the low-crime area of SF depicted in the graph, and I can confirm that most definitely there is a much lower population density than in other areas of SF, including places like PacHeights. It is mainly residential, with the houses being roughly 100 years old. Most people in SF that I know have never been to that area, mainly because there's no reason to unless you live there.
But then again, the crime rates in Outer Sunset are relatively low as well, and they are at roughly sea level, and they are at stark contrast to SOMA, etc.
Mind you, the houses in that area start at $900k for 2br/1ba houses, and 1.2M for 3/2 of reasonable size. There are very large houses as well for $2M+, especially in St. Francis Wood, and Forest Hill.
Good point. Also, rich people live in the hills. Of course there is more crime in poorer areas with more density and less care from police. Just put SF in the title and you'll still get a lot of self-loving SF people to click.
If I want to know the chances of someone breaking into my apartment I want the data per-apartment. If I want to know the same for cars I want the data per-car. If I want to know my chance of getting shot I want the data per-person. Overall crimes per person approximates what we care about much better than crimes per unit area.
> If I want to know my chance of getting shot I want the data per-person
In reality though, it isn't as simple as that. Sure, I don't want to be shot. I also don't want other people to be shot a lot in my neighborhood. If I walk to work and see that a car parked on the street has been broken into, that bothers me, even though my car is in a secure parking garage.
I don't want to be the victim of crime, but I also don't want to live amidst crime. I don't want to see it, I don't want to hear about it, and I don't want to think about it. Crime per km^2 is nearly as important to me as crime per capita.
To put it another way: if you live alone, statistics measured per-person are directly applicable to the chance something will happen to you. If you have a family who lives with you, though, the combined probability distribution of something happening to someone in your household changes quite dramatically.
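A rough sketch of how the household number differs from the per-person one. It assumes each household member has the same per-person probability and that incidents are independent, which is almost certainly too simple for people who share an address. The 2% figure is purely hypothetical.

```python
def prob_any_victim(p_per_person, household_size):
    """Probability that at least one member is a victim.

    Assumes identical, independent per-person risk (a simplifying
    assumption, not something the data in the article supports).
    """
    if not 0.0 <= p_per_person <= 1.0:
        raise ValueError("p_per_person must be between 0 and 1")
    if household_size < 1:
        raise ValueError("household_size must be at least 1")
    return 1 - (1 - p_per_person) ** household_size


# Hypothetical 2% annual per-person risk.
for size in (1, 2, 4):
    print(size, round(prob_any_victim(0.02, size), 4))
```

Under these assumptions a household of four faces a bit under four times the single-person risk, because the chance of more than one member being hit is only counted once.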
Or, I suppose, if you're trying to decide which square kilometer of dirt to build apartment complexes on based on projected property value trends (where crime is always a large factor.) These tend to be the people who pay for this sort of data (and thus incentivize its collection), which is why it's put in their terms.
I'd like a map overlay of the median (not average! that means nothing in SF!!!) income of those areas. Something tells me higher elevation = way more expensive, especially here in SF.
I don't think the common thief is wandering about on billionaire-row[1]. I think this has more to do with wealth than land-elevation itself.
I lived in a crappy $500/month place in Chinatown for a year, ending this spring. It was the lowest-rent building I've ever heard of in SF, and most of my neighbors were elderly immigrants with limited English (or even Mandarin!) language skills.
It was way safer than any of the three directions that went downhill (i.e. towards Market, towards the Embarcadero or towards Fisherman's Wharf). There was also a notable absence of beggars. I think there really is something to it just being too much of a PITA to hike up all those hills.
It happens in the southern sections of the city too (which are not included on the map). Ingleside Terrace is considerably more expensive than the OMI (Ocean View, Merced Heights, Ingleside). Similarly, Mission Terrace, which is not a posh neighborhood, is nonetheless a bit more expensive than the Excelsior, which is elevated ground toward McLaren Park.
Having lived in Potrero, I've noticed that the houses get significantly ritzier towards the top. But the side facing away from the city is a pretty steep downgrade.
That's exactly the point behind "Crime Doesn't Climb" - it's a reference to the wealthy being "Up the Hill". Though there are some flatter high wealth areas in the city, the hills are much more exclusively wealthy.
To your point on mean versus median, I agree that median is more accurate. Even better would be homeless %, or % under poverty.
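A quick illustration of why the median holds up better than the mean when a few very high earners are in the mix. The incomes below are invented for the example and are not real SF figures.

```python
import statistics

# Hypothetical household incomes for one small area: four ordinary
# households and one extreme outlier.
incomes = [40_000, 55_000, 60_000, 75_000, 5_000_000]

mean_income = statistics.mean(incomes)
median_income = statistics.median(incomes)

print(f"mean:   {mean_income:,.0f}")
print(f"median: {median_income:,.0f}")
```

The single outlier drags the mean above a million, while the median stays at what a typical household in the list actually earns.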
This is why I'd be interested in looking at the graph for the neighborhoods south of 280, particularly in the eastern section of the city. In many areas, property values drop and elevations rise as you head south.
That doesn't necessarily mean more crime, nor does it necessarily mean an identical mix of crime. That's a big part of why it would be so interesting to take a look.
I really like the work here. Very cool graph and visualization, and if there are things I'd like to see, it's not that they are "missing", it's that the approach is triggering some ideas for how to look at and interpret the data.
I think it's just incredibly inconvenient to get from the Tenderloin to the ritzy areas of SF by public transit. You pretty much need a car or car service.
The TL is immediately adjacent to the ritzy commercial area (Union Square), to the point where tourists accidentally walk there.
Bayview/HP aren't even that isolated -- the T muni line goes there, along with lots of buses.
At some level it's inconvenient to go anywhere in SF, but Pac Heights isn't too hard to get to by transit. The Marina is low elevation and separated by some hills, but even that isn't inaccessible.
Well, there are cable cars but thugs and crooks don't like to ride with tourists. There are some social lines one simply doesn't lower oneself to cross (unless mugging them directly -- that's like social justice).
Financial crime is typically committed in offices in the financial districts of cities, not at the home addresses of the offenders. These districts are typically very well connected by public transport.
Probably for the perpetrators, probably not for the companies involved (which are likely to have the same set of Delaware and/or Cayman mailing addresses) :-P
But sure, I would expect that white collar crime is unsurprisingly correlated.
Maybe not though - boiler room style phone banks of investor fraudsters are a stereotype for a reason, and those aren't necessarily the high rent end of town.
Are you counting the crime as happening at the place of business, or at the home of the perp? It'd be hard to quantify, as WHERE a white collar crime happens is not particularly material. You can cook the books at home or at work.
Hmm... I don't know if equal-area tells you much. As you go higher, you also sometimes/often get much rougher terrain. Population density might be more useful?
Very good article! The Data Science Toolkit he pumps as an alternative to pirating Google's APIs looks awesome, and deserves a direct link: http://www.datasciencetoolkit.org
Throwaway makes a very valid point that crime per area isn't very useful. But even beyond that, the map is confusing because for the lower elevations it shows crime incidents from both the lower AND the higher areas. Then as it goes higher, it just shows the higher ones. In other words, it should really show an outer ring for the lowest elevation (since that's where the lower areas are) and then a smaller ring, etc. Only the highest areas would not have a hole in the center (most likely).
Think of elevation maps and how you would select out one range of elevation at a time: you'd end up with donut-like rings. You wouldn't mark an area as 500 feet and show all the rings for 500 feet and up. Doing this makes it seem like there are far more crimes at the lower levels than there actually are.
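The difference between ring-style bands and the "this elevation and up" view can be sketched as follows. The incident elevations and the 100-foot band width are hypothetical, and this is not the article's code, just one way to show the two ways of counting.

```python
from collections import Counter


def band_counts(elevations_ft, band_width_ft=100):
    """Count incidents in disjoint bands: [0, 100), [100, 200), ..."""
    counts = Counter()
    for elevation in elevations_ft:
        lower_edge = int(elevation // band_width_ft) * band_width_ft
        counts[lower_edge] += 1
    return dict(sorted(counts.items()))


def cumulative_counts(elevations_ft, band_width_ft=100):
    """Count incidents at or above each band's lower edge (the filled-in view)."""
    bands = band_counts(elevations_ft, band_width_ft)
    return {
        lower: sum(count for edge, count in bands.items() if edge >= lower)
        for lower in bands
    }


# Hypothetical incident elevations in feet.
elevations = [10, 40, 90, 120, 180, 250, 510]

print("rings:     ", band_counts(elevations))
print("cumulative:", cumulative_counts(elevations))
```

With these made-up numbers the lowest band holds 3 incidents as a ring but 7 in the cumulative view, which is the inflation being described.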
TopOSM is Open Data (OpenStreetMap data) and is my go-to place for finding out where hills are. http://www.toposm.com/ (For looking at hills in cities, click the "+" on the right to get a menu where you can toggle streets ("Map Features") on and off.)
This is a fun exploration, but there are just so many plausible additional factors, from population density, to SFPD's selective enforcement, to many others that can be at least as significant if not more so than this one.
This is not meant to be an exhaustive causal analysis. You try to control for land mass, but don't really mention anything else that might indicate elevation to be a less significant factor and I think that does a great disservice to what otherwise is an interesting exercise.
I don't think the article is trying to claim that elevation magically has a causal impact on crime rates. It just shows that there is a correlation and claims nothing more. The point is that there IS some underlying root cause (or an extreme coincidence) and that is an interesting point in and of itself.
"A great disservice"? Lighten up, this was a hackathon project, not a doctoral thesis. And the code is there for anyone to build upon using other factors.
I agree. This is a good example of a small project that was worth getting out there. The code is posted, so if it's triggering some ideas, that's kind of the point - by all means, run with it!
I suppose the analysis would be better if it used Census Block Groups for the grid. That way it could better account for population density. Unfortunately, extracting CBG data is extremely painful. CBG data would also be slightly problematic in that it is residence-based, so a place like NYC's Times Square may show high crime rates per "resident" even though it is extremely safe.
Does anyone know where you can get a simplified CBG data dump? All you would need are boundaries and population.
This is interesting, but on my browser, the map cuts out much of the southern section of SF. Is it included in the numbers? This area may present a particularly interesting section to study elevation changes.
I guess street crimes are more frequent in areas where there is a higher concentration of people/shops/bars etc.
I can't say I see a definite correlation with altitude: areas that are more secluded from the main SF buzz like the Marina and the west coast show very little crime, and they are basically at sea level.
Very interesting. Other comments have listed possible factors like population density, public transportation (MUNI). I'd love to see the correlation with:
What about adjusting for the greater probability that a given area is residential the higher the elevation? Perhaps more crime occurs outside residential areas. Just speculating here.
skwirl has the correct answer here, but it is buried in a sub-comment:
"I don't think the article is trying to claim that elevation magically has a causal impact on crime rates. It just shows that there is a correlation and claims nothing more. The point is that there IS some underlying root cause (or an extreme coincidence) and that is an interesting point in and of itself."
The first thing on that page is a GIF. There is no explanation. It is not clear what the GIF is meant to be portraying. I even went on to read the first two paragraphs, which also didn't explain what the GIF was supposed to be illustrating.