Monday, May 7, 2012

Chicago Crime - Updated

I was a bit dismayed to discover that the EveryBlock API has stopped working. One of my most popular posts, Chicago Crime, depended heavily on this API.

Here I present an alternative data source for my previous post and a fun update. The update is a quick-and-dirty method for creating the popular crime density plots, as shown below.


In the spirit of transparency, the city of Chicago has taken the initiative to release a great deal of data electronically through the Socrata Open Data API (SODA). Luckily for us, this release includes all filed police reports. The records available through this API range from 2001 to the present, which dwarfs the two-week range provided by EveryBlock's API. However, I would not recommend downloading all of this data, as the text file is approximately 2 GB in size.

In this post I use only the police reports filed during the past year. The data is available in both JSON and CSV formats. I use only the CSV format here for the convenience of those with an earlier version of Mathematica. To import this data into Mathematica:
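A minimal sketch of the import step, not necessarily the code originally posted here. The sample text and its column names are hypothetical stand-ins for the downloaded file, and the sketch assumes the CSV importer leaves the date and arrest fields as plain strings.

```mathematica
(* Hypothetical stand-in for the downloaded file; the real export has more
   columns, and the column names used here are assumptions. *)
sample = "Date,Primary Type,Description,Arrest,Latitude,Longitude
02/01/2012 09:15,NARCOTICS,POSS: CANNABIS 30GMS OR LESS,true,41.88,-87.63
02/01/2012 17:40,THEFT,RETAIL THEFT,false,41.79,-87.60
02/02/2012 11:05,BATTERY,SIMPLE,false,41.95,-87.70";

(* For the real data, replace this line with Import["crimes.csv", "CSV"],
   where crimes.csv is whatever name the download was saved under. *)
raw = ImportString[sample, "CSV"];

header = First[raw];   (* column names *)
data = Rest[raw];      (* one list of fields per police report *)

(* Look up a column position by name rather than hard-coding it. *)
col[name_String] := Position[header, name][[1, 1]]

(* Example: pull out the primary description of every record. *)
data[[All, col["Primary Type"]]]
```

Looking columns up by name keeps the rest of the code working if the city reorders or adds columns.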

As an example, I used the data to plot all reported crimes in Chicago between February 1 and March 31, 2012.


A plot of incidents for any period within the year can be easily created with the following Mathematica code. Restrict the data to the range of interest by replacing the start and end date strings. Dates must be in mm/dd/yyyy format. The code simply finds the positions of the strings that match the input dates and plots the daily totals of incidents.
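A minimal sketch of the daily-totals plot, under the assumption that each record's date field is a string beginning with mm/dd/yyyy. Instead of searching for the positions of the matching strings, this version converts each date to a {year, month, day} list and keeps the records that fall between the two endpoints; the sample dates and the helper name dayKey are made up for illustration.

```mathematica
(* Hypothetical sample: only the date-time field of each record. *)
dates = {"02/01/2012 09:15", "02/01/2012 17:40", "02/02/2012 11:05",
   "02/03/2012 08:30", "02/03/2012 21:10", "02/03/2012 23:55",
   "02/04/2012 02:20"};

start = "02/01/2012";
end = "02/03/2012";

(* Turn "mm/dd/yyyy ..." into a {year, month, day} date list. *)
dayKey[s_String] := FromDigits /@ StringSplit[StringTake[s, 10], "/"][[{3, 1, 2}]]

(* Keep the days that fall inside the closed range from start to end. *)
inRange = Select[dayKey /@ dates, OrderedQ[{dayKey[start], #, dayKey[end]}] &];

(* Count incidents per day, sorted chronologically. *)
daily = Sort[Tally[inRange]]
(* {{{2012, 2, 1}, 2}, {{2012, 2, 2}, 1}, {{2012, 2, 3}, 3}} *)

DateListPlot[daily, Joined -> True,
 FrameLabel -> {"Date", "Incidents per day"}]
```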

The data can be parsed into primary descriptions including arson, homicide, sex offense, weapons violation, crim sexual assault, interfere with public officer, gambling, public peace violation, criminal trespass, liquor law violation, assault, deceptive practice, burglary, criminal damage, robbery, motor vehicle theft, offense involving children, narcotics, theft, other offense, and battery. Below is the plot and code to filter for only narcotics incidents.
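A minimal sketch of the primary-description filter. The sample rows, their column order ({date, primary description, secondary description, arrest}), and the upper-case labels are assumptions made for illustration; the comparison ignores case so that either spelling of the label works.

```mathematica
(* Hypothetical records: {date, primary, secondary, arrest}. *)
rows = {
   {"02/01/2012 09:15", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "true"},
   {"02/01/2012 17:40", "THEFT", "RETAIL THEFT", "false"},
   {"02/02/2012 11:05", "NARCOTICS", "POSS: CRACK", "true"},
   {"02/02/2012 13:30", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "false"},
   {"02/03/2012 08:30", "BATTERY", "SIMPLE", "false"},
   {"02/03/2012 21:10", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "true"}};

(* List every primary description with its number of incidents. *)
Tally[rows[[All, 2]]]
(* {{"NARCOTICS", 4}, {"THEFT", 1}, {"BATTERY", 1}} *)

(* Keep only the narcotics incidents, ignoring letter case. *)
narcotics = Select[rows, ToUpperCase[#[[2]]] === "NARCOTICS" &];
Length[narcotics]
(* 4 *)
```

The filtered list has the same row layout as the input, so it can be passed straight to whatever code produces the daily totals.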

A secondary description can also be used to parse the data. Some secondary descriptions include aggravated: handgun, poss: crack, armed: handgun, harassment by telephone, poss: heroin(white), to land, retail theft, unlawful entry, strongarm - no weapon, to property, poss: cannabis 30gms or less, from building, $500 and under, forcible entry, to vehicle, domestic battery simple, simple, automobile, over $500, telephone threat, and wireroom/sports. Below is the plot and code to filter for incidents with a primary description of narcotics and a secondary description of poss: cannabis 30gms or less.
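A minimal sketch of filtering on both descriptions at once with a pattern, again on hypothetical rows laid out as {date, primary description, secondary description, arrest}. The exact label strings are assumptions and must match the data exactly.

```mathematica
(* Hypothetical records: {date, primary, secondary, arrest}. *)
rows = {
   {"02/01/2012 09:15", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "true"},
   {"02/01/2012 17:40", "THEFT", "RETAIL THEFT", "false"},
   {"02/02/2012 11:05", "NARCOTICS", "POSS: CRACK", "true"},
   {"02/02/2012 13:30", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "false"},
   {"02/03/2012 08:30", "BATTERY", "SIMPLE", "false"},
   {"02/03/2012 21:10", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "true"}};

(* Secondary descriptions that occur under the narcotics heading. *)
Tally[Cases[rows, {_, "NARCOTICS", secondary_, _} :> secondary]]
(* {{"POSS: CANNABIS 30GMS OR LESS", 3}, {"POSS: CRACK", 1}} *)

(* Records matching both the primary and the secondary description. *)
cannabis = Cases[rows, {_, "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", _}];
Length[cannabis]
(* 3 *)
```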

The incidents may also be separated into those where an arrest was made and those where none was. Below is code to filter for incidents with a primary description of narcotics and a secondary description of poss: cannabis 30gms or less, with and without arrests. Both plots are shown in comparison with total narcotics incidents.
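A minimal sketch of the arrest split and the comparison plot. It assumes the arrest field arrives as "true"/"false" (the helper arrestQ also accepts the symbols True and False), and it reuses hypothetical rows laid out as {date, primary description, secondary description, arrest}. The helper names arrestQ, dayKey, and dailyCounts are made up for this example.

```mathematica
(* Hypothetical records: {date, primary, secondary, arrest}. *)
rows = {
   {"02/01/2012 09:15", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "true"},
   {"02/01/2012 17:40", "THEFT", "RETAIL THEFT", "false"},
   {"02/02/2012 11:05", "NARCOTICS", "POSS: CRACK", "true"},
   {"02/02/2012 13:30", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "false"},
   {"02/03/2012 08:30", "BATTERY", "SIMPLE", "false"},
   {"02/03/2012 21:10", "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", "true"}};

(* True when the arrest flag reads "true", whatever its case or type. *)
arrestQ[flag_] := ToLowerCase[ToString[flag]] === "true"

narcotics = Cases[rows, {_, "NARCOTICS", _, _}];
cannabis = Cases[rows, {_, "NARCOTICS", "POSS: CANNABIS 30GMS OR LESS", _}];
withArrest = Select[cannabis, arrestQ[Last[#]] &];
withoutArrest = Select[cannabis, ! arrestQ[Last[#]] &];

Length /@ {narcotics, cannabis, withArrest, withoutArrest}
(* {4, 3, 2, 1} *)

(* Daily totals as {{year, month, day}, count} pairs. *)
dayKey[s_String] := FromDigits /@ StringSplit[StringTake[s, 10], "/"][[{3, 1, 2}]]
dailyCounts[recs_List] := Sort[Tally[dayKey /@ recs[[All, 1]]]]

(* Total narcotics incidents against cannabis incidents with an arrest. *)
DateListPlot[{dailyCounts[narcotics], dailyCounts[withArrest]},
 Joined -> True, FrameLabel -> {"Date", "Incidents per day"}]
```

Swapping withArrest for withoutArrest in the last line gives the second comparison plot.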

The method for displaying where certain incidents occur in Chicago is similar to the one in my previous post, Chicago Crime. Please refer to that post for a more detailed explanation of how to create the plot below. The plot here is a geographical visualization of where arrests occurred for narcotics incidents involving possession of 30 grams or less of cannabis.


The method for tabulating the number of incidents within neighborhoods is also similar to the one in my previous post, Chicago Crime. This calculation totals the number of incidents within each neighborhood boundary. It is the most costly step in computational time and power, because the counting code is general enough to work with any set of boundaries (census tracts, police beats, and so on). The tabulation of arrests for narcotics incidents involving possession of 30 grams or less of cannabis in each neighborhood is presented.
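A minimal sketch of counting points inside arbitrary boundaries. It uses an even-odd ray-casting test, which is one common way to do this and not necessarily the method from the previous post. The two square "neighborhoods" and the points are made up; real boundaries would be lists of {longitude, latitude} vertices.

```mathematica
(* Even-odd ray casting: count the polygon edges crossed by a ray that
   runs to the right of the point; an odd count means the point is inside. *)
inPolygonQ[poly_List, {x_, y_}] := OddQ[Count[
   Partition[poly, 2, 1, 1],   (* consecutive vertex pairs, wrapping around *)
   {{x1_, y1_}, {x2_, y2_}} /;
    Xor[y1 > y, y2 > y] && x < x1 + (y - y1) (x2 - x1)/(y2 - y1)]]

(* Hypothetical boundaries: {name, list of vertices}. *)
neighborhoods = {
   {"A", {{0, 0}, {2, 0}, {2, 2}, {0, 2}}},
   {"B", {{2, 0}, {4, 0}, {4, 2}, {2, 2}}}};

(* Hypothetical incident locations. *)
points = {{1, 1}, {0.5, 1.5}, {3, 1}, {3.5, 0.5}, {2.5, 1.8}, {5, 5}};

(* Tally the incidents that fall inside each boundary. *)
counts = {#[[1]], Count[points, p_ /; inPolygonQ[#[[2]], p]]} & /@ neighborhoods
(* {{"A", 2}, {"B", 3}} *)
```

Every point is tested against every edge of every boundary, which is why this step dominates the running time on a full year of reports.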


While doing some initial research for this post, I found that "crime hotspots" are a popular visualization option. A similar visualization is easily created in Mathematica with three simple lines of code.
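A minimal sketch of a hotspot-style plot using SmoothDensityHistogram, which may or may not be what the original three lines used. The points are synthetic {longitude, latitude} pairs standing in for incident locations, and the color scheme is an arbitrary choice.

```mathematica
(* Synthetic {longitude, latitude} pairs standing in for incident locations;
   with real data these would come from the coordinate columns. *)
SeedRandom[1];
points = Transpose[{
    RandomVariate[NormalDistribution[-87.65, 0.05], 500],
    RandomVariate[NormalDistribution[41.85, 0.08], 500]}];

(* Kernel-smoothed 2D density of the points. *)
SmoothDensityHistogram[points,
 ColorFunction -> "TemperatureMap",
 FrameLabel -> {"Longitude", "Latitude"}]
```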

The Mathematica code used in this post can be downloaded here. Some fun analysis of crimes in Chicago will follow in future posts.
