Traffic Accidents in San Francisco

Federal, state, and local governments have started making more of their data publicly available. This is part of the so-called open data movement. Application Programming Interfaces (APIs), which many institutions have begun supporting, allow researchers and other individuals interested in working with and analyzing data to access information, often large amounts of it, programmatically.

DataSF is the open data portal for the city (and county) of San Francisco.

In this notebook, I will access incident reports from the San Francisco Police Department (SFPD), focusing on traffic-related entries. I will show how to plot the data using the Seaborn library. I will also plot the traffic incidents for the past three months on a Leaflet map.

Extract

Using the requests module, we access the API with the SFPD data. We specify, using the limit parameter, the maximum number of records that should be returned. It is important to read the API documentation for information on limits, appropriate uses, and the various available options. The documentation for this specific data source can be found here. It is important to note that this data is updated daily, so your results will vary.

The data will be returned in a format known as JSON, which stands for JavaScript Object Notation, and will be stored in an object called response. Depending on your connection, this may take a few minutes.

In [1]:
import requests, json, pandas as pd
url = 'https://data.sfgov.org/resource/tmnf-yvry.json?$limit=50000'
response = requests.get(url)
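
Because the limit parameter caps how many records a single call returns, larger pulls are usually split into pages. The sketch below builds such query URLs with the standard library only (Python 3). It assumes the endpoint accepts SODA-style `$limit` and `$offset` parameters, which should be checked against the API documentation, and `build_url` is a hypothetical helper that is not part of the original notebook.

```python
from urllib.parse import urlencode

BASE_URL = 'https://data.sfgov.org/resource/tmnf-yvry.json'


def build_url(limit, offset=0, base_url=BASE_URL):
    """Return a query URL asking for `limit` records starting at `offset`.

    Assumes SODA-style `$limit` / `$offset` parameters (an assumption to
    verify against the API documentation).
    """
    params = {'$limit': limit, '$offset': offset}
    # Keep the literal "$" in parameter names instead of percent-encoding it.
    return base_url + '?' + urlencode(params, safe='$')


# Two consecutive "pages" of 50,000 records each.
page_urls = [build_url(50000, offset) for offset in (0, 50000)]
```

Each URL could then be passed to requests.get() in the same way as the single URL above.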

Load

Now that the raw data is in the response variable, we can load it into a Pandas DataFrame. Here, we use json.loads() to get it in the proper format.

In [2]:
data = json.loads(response.text)
df = pd.DataFrame(data)
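
To see what json.loads() produces, here is a small illustration. The string `sample_text` is a hypothetical stand-in for response.text, trimmed to a few of the fields that appear in the output below; the real payload has more fields and many more records.

```python
import json

# Hypothetical stand-in for response.text: a JSON array of two records,
# trimmed to a few of the fields shown in the notebook output.
sample_text = '''[
  {"category": "NON-CRIMINAL", "descript": "LOST PROPERTY", "time": "13:00"},
  {"category": "VEHICLE THEFT", "descript": "STOLEN TRUCK", "time": "20:00"}
]'''

records = json.loads(sample_text)   # a list of dicts, one per incident

for record in records:
    print(record['category'], '|', record['descript'], '|', record['time'])
```

Passing such a list of dicts to pd.DataFrame() turns each dict into a row. The requests library also offers response.json(), which performs the same parsing in one step.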
In [3]:
df
Out[3]:
address category date dayofweek descript incidntnum location pddistrict resolution time x y
0 900.0 Block of HARRISON ST NON-CRIMINAL 1412208000 Thursday LOST PROPERTY 146205499 {u'latitude': u'37.7775833632336', u'needs_rec... SOUTHERN NONE 13:00 -122.403876552366 37.7775833632336
1 2300.0 Block of SAN BRUNO AV VEHICLE THEFT 1414022400 Thursday STOLEN TRUCK 140900108 {u'latitude': u'37.7335714048034', u'needs_rec... BAYVIEW NONE 20:00 -122.406306152069 37.7335714048034
2 800.0 Block of MARKET ST BURGLARY 1412985600 Saturday BURGLARY, UNLAWFUL ENTRY 140859561 {u'latitude': u'37.7847532835996', u'needs_rec... SOUTHERN ARREST, BOOKED 11:24 -122.407036790381 37.7847532835996
3 800.0 Block of BRYANT ST NON-CRIMINAL 1414195200 Saturday LOST PROPERTY 146234658 {u'latitude': u'37.7752316978411', u'needs_rec... SOUTHERN NONE 11:45 -122.403742962696 37.7752316978411
4 LAKE MERCED BL / BROTHERHOOD WAY LARCENY/THEFT 1415232000 Thursday GRAND THEFT FROM LOCKED AUTO 146232997 {u'latitude': u'37.7146945276069', u'needs_rec... TARAVAL NONE 14:59 -122.485226315294 37.7146945276069
5 CALIFORNIA ST / DAVIS ST ASSAULT 1413504000 Friday AGGRAVATED ASSAULT WITH BODILY FORCE 140879462 {u'latitude': u'37.7935081014216', u'needs_rec... CENTRAL NONE 15:40 -122.397613296977 37.7935081014216
6 1000.0 Block of POTRERO AV SUSPICIOUS OCC 1410134400 Monday SUSPICIOUS OCCURRENCE 140757802 {u'latitude': u'37.7568256145719', u'needs_rec... MISSION NONE 17:10 -122.406656623063 37.7568256145719
7 2000.0 Block of MISSION ST OTHER OFFENSES 1411776000 Saturday POSSESSION OF BURGLARY TOOLS 140817036 {u'latitude': u'37.7639057653745', u'needs_rec... MISSION ARREST, BOOKED 21:15 -122.419519291984 37.7639057653745
8 CORTLAND AV / BAY SHORE BL SUSPICIOUS OCC 1411948800 Monday SUSPICIOUS OCCURRENCE 140821645 {u'latitude': u'37.7395669609406', u'needs_rec... BAYVIEW NONE 16:31 -122.406979507963 37.7395669609406
9 25TH ST / FOLSOM ST VEHICLE THEFT 1412812800 Thursday STOLEN AUTOMOBILE 140859395 {u'latitude': u'37.7509035118498', u'needs_rec... MISSION NONE 19:30 -122.413926655862 37.7509035118498
10 1800.0 Block of SUTTER ST LARCENY/THEFT 1413936000 Wednesday PETTY THEFT OF PROPERTY 146218806 {u'latitude': u'37.7865328905614', u'needs_rec... NORTHERN NONE 18:00 -122.430010289844 37.7865328905614
11 800.0 Block of BRYANT ST DRUG/NARCOTIC 1411689600 Friday POSSESSION OF MARIJUANA FOR SALES 140812462 {u'latitude': u'37.7752316978411', u'needs_rec... SOUTHERN NONE 13:47 -122.403742962696 37.7752316978411
12 0.0 Block of 4TH ST NON-CRIMINAL 1414108800 Friday LOST PROPERTY 146235361 {u'latitude': u'37.7850408915497', u'needs_rec... SOUTHERN NONE 11:45 -122.405020936706 37.7850408915497
13 1900.0 Block of BUCHANAN ST VEHICLE THEFT 1413331200 Wednesday STOLEN AUTOMOBILE 140881912 {u'latitude': u'37.7875021119303', u'needs_rec... NORTHERN NONE 22:30 -122.430043028044 37.7875021119303
14 CLAYTON ST / TWINPEAKS BL ARSON 1409788800 Thursday ARSON 140743613 {u'latitude': u'37.7610072307552', u'needs_rec... PARK NONE 01:13 -122.446434498728 37.7610072307552
15 200.0 Block of JONES ST NON-CRIMINAL 1409702400 Wednesday AIDED CASE 140741184 {u'latitude': u'37.7830494562119', u'needs_rec... TENDERLOIN NONE 10:30 -122.412465977545 37.7830494562119
16 SUTTER ST / STOCKTON ST LARCENY/THEFT 1411603200 Thursday PETTY THEFT FROM LOCKED AUTO 140810676 {u'latitude': u'37.7894347630337', u'needs_rec... CENTRAL NONE 15:00 -122.406958660602 37.7894347630337
17 HAYES ST / STANYAN ST LARCENY/THEFT 1409529600 Monday GRAND THEFT FROM UNLOCKED AUTO 146179838 {u'latitude': u'37.7728951613702', u'needs_rec... PARK NONE 22:34 -122.454286690265 37.7728951613702
18 GEARY BL / STEINER ST LARCENY/THEFT 1409616000 Tuesday PETTY THEFT OF PROPERTY 140741015 {u'latitude': u'37.7841504916183', u'needs_rec... NORTHERN NONE 20:00 -122.4345637373 37.7841504916183
19 600.0 Block of LEAVENWORTH ST VEHICLE THEFT 1416009600 Saturday STOLEN AUTOMOBILE 140967136 {u'latitude': u'37.7869348074983', u'needs_rec... CENTRAL NONE 03:48 -122.414935774638 37.7869348074983
20 400.0 Block of BAY ST ASSAULT 1410566400 Saturday CHILD ABUSE (PHYSICAL) 140772725 {u'latitude': u'37.8054746066172', u'needs_rec... CENTRAL NONE 15:40 -122.414501928254 37.8054746066172
21 BUSH ST / WEBSTER ST LARCENY/THEFT 1411084800 Friday PETTY THEFT FROM LOCKED AUTO 140789560 {u'latitude': u'37.7872308879586', u'needs_rec... NORTHERN NONE 00:30 -122.431816136158 37.7872308879586
22 1300.0 Block of 30TH AV BURGLARY 1412726400 Wednesday BURGLARY OF RESIDENCE, FORCIBLE ENTRY 140912313 {u'latitude': u'37.7628319123172', u'needs_rec... TARAVAL NONE 09:00 -122.488852509457 37.7628319123172
23 200.0 Block of KING ST LARCENY/THEFT 1414972800 Monday GRAND THEFT FROM PERSON 140932690 {u'latitude': u'37.7764328521786', u'needs_rec... SOUTHERN NONE 18:45 -122.394098177073 37.7764328521786
24 14TH AV / MORAGA ST VANDALISM 1409616000 Tuesday MALICIOUS MISCHIEF, VANDALISM OF VEHICLES 140739327 {u'latitude': u'37.7563445000784', u'needs_rec... TARAVAL NONE 09:30 -122.470967156063 37.7563445000784
25 700.0 Block of KEARNY ST LARCENY/THEFT 1415318400 Friday GRAND THEFT FROM A BUILDING 140944996 {u'latitude': u'37.7950029644549', u'needs_rec... CENTRAL NONE 16:00 -122.404861643051 37.7950029644549
26 STOCKTON ST / SUTTER ST LARCENY/THEFT 1415491200 Sunday GRAND THEFT FROM PERSON 140949736 {u'latitude': u'37.7894347630337', u'needs_rec... CENTRAL NONE 12:40 -122.406958660602 37.7894347630337
27 MARKET ST / 11TH ST VEHICLE THEFT 1412553600 Monday STOLEN MOTORCYCLE 140844885 {u'latitude': u'37.7755894130348', u'needs_rec... SOUTHERN NONE 08:15 -122.418698328939 37.7755894130348
28 1600.0 Block of WALLACE AV MISSING PERSON 1410480000 Friday FOUND PERSON 140770666 {u'latitude': u'37.727516075789', u'needs_reco... BAYVIEW LOCATED 20:15 -122.391415328578 37.727516075789
29 BOSWORTH ST / ROUSSEAU ST ROBBERY 1415836800 Thursday ROBBERY ON THE STREET, STRONGARM 140961489 {u'latitude': u'37.7333118541647', u'needs_rec... INGLESIDE NONE 09:30 -122.430279612614 37.7333118541647
... ... ... ... ... ... ... ... ... ... ... ... ...
26374 WILLOW ST / POLK ST VEHICLE THEFT 1414281600 Sunday STOLEN TRUCK 140908817 {u'latitude': u'37.7835680535504', u'needs_rec... NORTHERN NONE 23:00 -122.419274801449 37.7835680535504
26375 SUNSET BL / NORIEGA ST NON-CRIMINAL 1414713600 Friday FOUND PROPERTY 140922287 {u'latitude': u'37.7534118534114', u'needs_rec... TARAVAL NONE 10:27 -122.495225044311 37.7534118534114
26376 500.0 Block of CARTER ST MISSING PERSON 1414022400 Thursday MISSING JUVENILE 140907411 {u'latitude': u'37.7085268833154', u'needs_rec... INGLESIDE NONE 19:30 -122.423827920239 37.7085268833154
26377 MARKET ST / JONES ST ROBBERY 1414886400 Sunday ROBBERY, BODILY FORCE 140940621 {u'latitude': u'37.7809039697055', u'needs_rec... SOUTHERN NONE 15:30 -122.411979487494 37.7809039697055
26378 400.0 Block of 7TH ST OTHER OFFENSES 1410825600 Tuesday TRAFFIC VIOLATION 140779492 {u'latitude': u'37.7749623956038', u'needs_rec... SOUTHERN NONE 01:00 -122.405411882091 37.7749623956038
26379 900.0 Block of HARRISON ST LARCENY/THEFT 1410739200 Monday GRAND THEFT FROM PERSON 140776573 {u'latitude': u'37.7778758664587', u'needs_rec... SOUTHERN NONE 02:20 -122.403369341832 37.7778758664587
26380 200.0 Block of OFARRELL ST OTHER OFFENSES 1413072000 Sunday MISCELLANEOUS INVESTIGATION 140867354 {u'latitude': u'37.7862782769118', u'needs_rec... TENDERLOIN NONE 01:00 -122.408814633856 37.7862782769118
26381 600.0 Block of EDDY ST LARCENY/THEFT 1410825600 Tuesday GRAND THEFT FROM LOCKED AUTO 140780994 {u'latitude': u'37.7832989144971', u'needs_rec... NORTHERN NONE 13:30 -122.417889157912 37.7832989144971
26382 100.0 Block of CAMERON WY NON-CRIMINAL 1411344000 Monday AIDED CASE, MENTAL DISTURBED 140801788 {u'latitude': u'37.7207809778816', u'needs_rec... BAYVIEW PSYCHOPATHIC CASE 12:30 -122.386906562606 37.7207809778816
26383 100.0 Block of MASON ST LARCENY/THEFT 1415491200 Sunday GRAND THEFT FROM LOCKED AUTO 146234608 {u'latitude': u'37.7845792034875', u'needs_rec... TENDERLOIN NONE 21:00 -122.409401966956 37.7845792034875
26384 700.0 Block of GEARY ST LARCENY/THEFT 1410825600 Tuesday GRAND THEFT BICYCLE 140790642 {u'latitude': u'37.7862871732137', u'needs_rec... CENTRAL NONE 17:00 -122.416355393687 37.7862871732137
26385 200.0 Block of KING ST NON-CRIMINAL 1415145600 Wednesday AIDED CASE, DOG BITE 140938591 {u'latitude': u'37.7770884941977', u'needs_rec... SOUTHERN NONE 19:48 -122.393248740719 37.7770884941977
26386 100.0 Block of NOE ST LARCENY/THEFT 1411689600 Friday PETTY THEFT OF PROPERTY 140816044 {u'latitude': u'37.7672518492254', u'needs_rec... PARK NONE 10:30 -122.433431744037 37.7672518492254
26387 2100.0 Block of GOLDEN GATE AV LARCENY/THEFT 1412294400 Friday GRAND THEFT FROM LOCKED AUTO 146205041 {u'latitude': u'37.7775919355915', u'needs_rec... PARK NONE 20:00 -122.446579550068 37.7775919355915
26388 100.0 Block of APTOS AV ASSAULT 1412035200 Tuesday BATTERY 140826849 {u'latitude': u'37.7302147819889', u'needs_rec... TARAVAL NONE 10:11 -122.466263433574 37.7302147819889
26389 300.0 Block of SANSOME ST BURGLARY 1413244800 Tuesday BURGLARY, FORCIBLE ENTRY 140869849 {u'latitude': u'37.7936628536958', u'needs_rec... CENTRAL NONE 16:00 -122.401209811207 37.7936628536958
26390 ANZA ST / 10TH AV ASSAULT 1412553600 Monday AGGRAVATED ASSAULT WITH A DEADLY WEAPON 140843417 {u'latitude': u'37.7788954624522', u'needs_rec... RICHMOND NONE 14:15 -122.468426587712 37.7788954624522
26391 18TH ST / VALENCIA ST ROBBERY 1413244800 Tuesday ROBBERY, BODILY FORCE 140871703 {u'latitude': u'37.7617007179814', u'needs_rec... MISSION NONE 11:17 -122.42158168137 37.7617007179814
26392 GEARY BL / WEBSTER ST OTHER OFFENSES 1414886400 Sunday LOST/STOLEN LICENSE PLATE 140928906 {u'latitude': u'37.7845681170633', u'needs_rec... NORTHERN NONE 09:30 -122.431206932079 37.7845681170633
26393 HAYES ST / LAGUNA ST LARCENY/THEFT 1411430400 Tuesday PETTY THEFT OF PROPERTY 146199331 {u'latitude': u'37.7764631429439', u'needs_rec... NORTHERN NONE 20:45 -122.426265334521 37.7764631429439
26394 200.0 Block of 6TH ST LARCENY/THEFT 1409788800 Thursday PETTY THEFT FROM LOCKED AUTO 140747897 {u'latitude': u'37.7793934090507', u'needs_rec... SOUTHERN NONE 14:00 -122.406668854041 37.7793934090507
26395 1000.0 Block of MISSION ST VANDALISM 1411516800 Wednesday MALICIOUS MISCHIEF, VANDALISM OF VEHICLES 140805263 {u'latitude': u'37.780538483028', u'needs_reco... SOUTHERN NONE 05:00 -122.409198533233 37.780538483028
26396 2000.0 Block of MARKET ST ASSAULT 1413504000 Friday BATTERY 140879246 {u'latitude': u'37.7691618402243', u'needs_rec... MISSION NONE 15:10 -122.426893114012 37.7691618402243
26397 0.0 Block of BRADY ST WARRANTS 1412380800 Saturday WARRANT ARREST 140835721 {u'latitude': u'37.7725522501384', u'needs_rec... SOUTHERN ARREST, BOOKED 04:00 -122.420053263143 37.7725522501384
26398 0.0 Block of MONTGOMERY ST BURGLARY 1412899200 Friday BURGLARY,STORE UNDER CONSTRUCTION, FORCIBLE ENTRY 140858030 {u'latitude': u'37.7893117469688', u'needs_rec... CENTRAL NONE 19:00 -122.402168088575 37.7893117469688
26399 1300.0 Block of MISSION ST SEX OFFENSES, FORCIBLE 1415318400 Friday FORCIBLE RAPE, BODILY FORCE 140953680 {u'latitude': u'37.776202624796', u'needs_reco... SOUTHERN NONE 22:45 -122.414690237245 37.776202624796
26400 800.0 Block of GEARY ST WARRANTS 1415404800 Saturday WARRANT ARREST 140947326 {u'latitude': u'37.7860790733126', u'needs_rec... CENTRAL ARREST, BOOKED 15:45 -122.417994352344 37.7860790733126
26401 1500.0 Block of SUNNYDALE AV ASSAULT 1414886400 Sunday INFLICT INJURY ON COHABITEE 140927522 {u'latitude': u'37.7119166759701', u'needs_rec... INGLESIDE NONE 00:50 -122.416147813865 37.7119166759701
26402 1800.0 Block of POWELL ST WARRANTS 1414713600 Friday ENROUTE TO ADULT AUTHORITY 140922340 {u'latitude': u'37.8014737363544', u'needs_rec... CENTRAL ARREST, BOOKED 09:00 -122.411125687811 37.8014737363544
26403 1000.0 Block of MARKET ST SEX OFFENSES, FORCIBLE 1414108800 Friday SEXUAL BATTERY 140912567 {u'latitude': u'37.7816258614751', u'needs_rec... SOUTHERN NONE 15:00 -122.411002140916 37.7816258614751

26404 rows × 12 columns

This DataFrame includes information on the location, time, type of incident, and the police department district that responded.

Transform

In this section, we want to begin processing the data so that it's in a standardized format that we can use.

Date

First, we transform the values in the included date variable to a Python date. While the documentation is not clear on the type of data that is returned (this might also depend on the access method), it looks like the values represent the number of seconds since the epoch, which is "the point where time starts." On Unix-based systems, such as the one I'm currently on, the epoch is January 1, 1970. For more information on times in Python, see: Time access and conversions.

To determine the epoch on your system, call the gmtime() function in the time module with an argument of 0. For an example, see below.

In [4]:
import time

time.gmtime(0)
Out[4]:
time.struct_time(tm_year=1970, tm_mon=1, tm_mday=1, tm_hour=0, tm_min=0, tm_sec=0, tm_wday=3, tm_yday=1, tm_isdst=0)

Use the .fromtimestamp() method to create a date variable from the number of seconds since the epoch.

In [5]:
import datetime

df['real_date'] = df['date'].map(lambda x: datetime.date.fromtimestamp(x))
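
A caveat: date.fromtimestamp() interprets the timestamp in the machine's local time zone. The values here look like midnight UTC, so on a machine set to Pacific time they land on the previous calendar day. In the traffic output further down, for example, 1410739200 sits next to the day "monday" but a real_date of 2014-09-14, which was a Sunday. Below is a minimal sketch of a conversion that does not depend on the local time zone, assuming the timestamps are meant as UTC; the helper name `utc_date` is hypothetical.

```python
import datetime


def utc_date(seconds):
    """Convert seconds since the epoch to a calendar date in UTC."""
    return datetime.datetime.fromtimestamp(seconds, tz=datetime.timezone.utc).date()


d = utc_date(1410739200)
print(d, d.strftime('%A'))  # 2014-09-15 Monday
```

A function like this could be passed to .map() in place of the lambda above.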

Time

Because the values in the time variable are strings with the hours and minutes separated by a colon (:), we split on the colon and create a time object from the two parts.

In [6]:
df['real_time'] = df['time'].map(lambda x: datetime.time(int(x.split(':')[0]), int(x.split(':')[1])))
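
An alternative sketch for the same conversion uses datetime.strptime(), which parses the string against an explicit format and raises a ValueError on malformed input instead of relying on a manual split. The helper name `parse_hhmm` is hypothetical.

```python
import datetime


def parse_hhmm(text):
    """Parse an 'HH:MM' string into a datetime.time object."""
    return datetime.datetime.strptime(text, '%H:%M').time()


print(parse_hhmm('13:10'))  # 13:10:00
```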

Lowercase

Here, we convert data to lowercase, where possible. We use the try and except approach because not all columns are of type string and, thus, cannot be converted to lowercase.

In [7]:
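for col in df.columns:
    try:
        df[col] = df[col].map(lambda x: x.lower())
    except AttributeError:
        # Non-string values (numbers, dicts, dates) have no .lower(); leave the column as-is
        pass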
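
A more explicit variant, sketched here on a small made-up frame, lowercases only the values that are actually strings, so no exception handling is needed. The helper `lower_if_str` and the sample data are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd


def lower_if_str(value):
    """Lowercase strings; return any other value unchanged."""
    return value.lower() if isinstance(value, str) else value


# Made-up sample with one string column and one numeric column.
sample = pd.DataFrame({
    'category': ['ARSON', 'BURGLARY'],
    'date': [1409788800, 1413244800],
})

for col in sample.columns:
    sample[col] = sample[col].map(lower_if_str)
```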

Traffic Filter

In this section, we create a new DataFrame for all traffic incidents, keeping every row whose descript value contains the word "traffic." Because the values in the location column are of type dict, and because the same coordinates are already available in the x and y columns, we drop that column. We then remove duplicate rows. By default, each row keeps its original index, which results in non-sequential index values, so we reset the index as well.

In [8]:
traffic = df[df['descript'].str.contains("traffic")]
del traffic['location']
traffic = traffic.drop_duplicates().reset_index(drop=True)
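
The same steps can also be written as a single chain, which avoids deleting a column from a slice of df in place (pandas may warn about that). This is a sketch on made-up rows; `na=False` is added on the assumption that descript could contain missing values, which the original does not state.

```python
import pandas as pd

# Made-up rows standing in for the lowercased incident data.
sample = pd.DataFrame({
    'descript': ['traffic violation', 'lost property', None,
                 'traffic violation', 'traffic violation arrest'],
    'location': [{'latitude': '37.7'}, {'latitude': '37.8'}, {'latitude': '37.9'},
                 {'latitude': '37.7'}, {'latitude': '37.6'}],
    'pddistrict': ['mission', 'southern', 'central', 'mission', 'taraval'],
})

# True where the description mentions "traffic"; missing values count as False.
is_traffic = sample['descript'].str.contains('traffic', na=False)

traffic_sample = (
    sample[is_traffic]
    .drop(columns='location')   # dicts are unhashable, so drop before de-duplicating
    .drop_duplicates()
    .reset_index(drop=True)
)
```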
In [9]:
traffic
Out[9]:
address category date dayofweek descript incidntnum pddistrict resolution time x y real_date real_time
0 mission st / pope st other offenses 1410739200 monday traffic violation 140777800 ingleside none 13:10 -122.441800386053 37.7154294159173 2014-09-14 13:10:00
1 capp st / 20th st other offenses 1410998400 thursday traffic violation arrest 140788415 mission arrest, booked 19:56 -122.41796448376 37.7586968166786 2014-09-17 19:56:00
2 200.0 block of stcharles av other offenses 1410307200 wednesday traffic violation arrest 140764976 taraval arrest, booked 23:07 -122.469238092022 37.7099130855206 2014-09-09 23:07:00
3 ridgewood av / hearst av other offenses 1412899200 friday traffic violation 140855616 ingleside none 08:33 -122.45340810993 37.7306726588748 2014-10-09 08:33:00
4 excelsior av / london st other offenses 1413417600 thursday traffic violation arrest 140877397 ingleside arrest, booked 23:35 -122.432856568129 37.7258720830037 2014-10-15 23:35:00
5 martin luther king jr dr / crossover dr other offenses 1410048000 sunday traffic violation 140754820 richmond none 20:07 -122.47736353874 37.7660604768749 2014-09-06 20:07:00
6 bryant st / 6th st other offenses 1411257600 sunday traffic violation 140798094 southern arrest, cited 23:56 -122.402527594665 37.7760382838642 2014-09-20 23:56:00
7 california st / 17th av other offenses 1409529600 monday traffic violation arrest 140736408 richmond arrest, cited 19:15 -122.476436219833 37.7842690832606 2014-08-31 19:15:00
8 19th av / sloat bl other offenses 1412121600 wednesday traffic violation 140827433 taraval arrest, cited 13:20 -122.475102006298 37.7346237083866 2014-09-30 13:20:00
9 california st / maple st other offenses 1414713600 friday traffic violation 140923821 richmond arrest, cited 19:32 -122.455052379833 37.7862078479432 2014-10-30 19:32:00
10 3rd st / newcomb av other offenses 1410652800 sunday traffic violation arrest 140775848 bayview arrest, booked 19:16 -122.390416955474 37.7355926106158 2014-09-13 19:16:00
11 4400.0 block of mission st other offenses 1415404800 saturday traffic violation arrest 140946077 ingleside arrest, booked 05:20 -122.432954114381 37.7271871916999 2014-11-07 05:20:00
12 san jose av / ocean av other offenses 1412985600 saturday traffic violation 140858961 ingleside arrest, cited 09:30 -122.444746663003 37.7229679041421 2014-10-10 09:30:00
13 capp st / 21st st other offenses 1415404800 saturday traffic violation arrest 140948120 mission arrest, booked 20:10 -122.417812558787 37.7571005796725 2014-11-07 20:10:00
14 hayes st / octavia st other offenses 1414454400 tuesday traffic violation arrest 140911713 northern arrest, booked 03:22 -122.424623480779 37.776674002552 2014-10-27 03:22:00
15 mcallister st / polk st other offenses 1414540800 wednesday traffic violation 140916478 northern arrest, booked 14:48 -122.418600974625 37.7802607511488 2014-10-28 14:48:00
16 800.0 block of bryant st other offenses 1412294400 friday traffic violation 140833537 southern arrest, cited 12:30 -122.403742962696 37.7752316978411 2014-10-02 12:30:00
17 900.0 block of market st non-criminal 1411171200 saturday traffic accident 140793389 southern arrest, cited 14:30 -122.410243678703 37.782222702064 2014-09-19 14:30:00
18 tunnel av / bay shore bl other offenses 1413590400 saturday traffic violation arrest 140882045 ingleside arrest, booked 14:25 -122.400659850547 37.713059586034 2014-10-17 14:25:00
19 golden gate av / leavenworth st other offenses 1412208000 thursday traffic violation arrest 140831036 tenderloin arrest, cited 15:54 -122.413869632554 37.7818621883318 2014-10-01 15:54:00
20 700.0 block of capp st other offenses 1413849600 tuesday traffic violation 140890979 mission arrest, booked 13:22 -122.417503523627 37.7553109114236 2014-10-20 13:22:00
21 800.0 block of bryant st other offenses 1410307200 wednesday traffic violation 140764396 southern arrest, booked 18:30 -122.403742962696 37.7752316978411 2014-09-09 18:30:00
22 15th st / mission st other offenses 1414108800 friday traffic violation arrest 140899307 mission arrest, booked 00:02 -122.419827929961 37.7666737552132 2014-10-23 00:02:00
23 masonic av / hayes st other offenses 1415232000 thursday traffic violation 140940392 park arrest, cited 12:30 -122.446091525499 37.773934479513 2014-11-05 12:30:00
24 bay st / kearny st other offenses 1413676800 sunday traffic violation 140885538 central arrest, cited 20:30 -122.407149304657 37.8064267317086 2014-10-18 20:30:00
25 3rd st / hollister av other offenses 1415404800 saturday traffic violation 140948506 bayview arrest, booked 23:45 -122.395946406663 37.721716954309 2014-11-07 23:45:00
26 taylor st / turk st other offenses 1415232000 thursday traffic violation arrest 140939771 tenderloin arrest, booked 09:09 -122.410768766343 37.7832145190309 2014-11-05 09:09:00
27 post st / stockton st other offenses 1412294400 friday traffic violation arrest 140834381 central arrest, cited 17:25 -122.406775392392 37.7884982866444 2014-10-02 17:25:00
28 600.0 block of keith st other offenses 1411516800 wednesday traffic violation 140806346 bayview arrest, cited 13:27 -122.382874729751 37.7371644263129 2014-09-23 13:27:00
29 parkpresidio bl / fulton st other offenses 1410134400 monday traffic violation 140758264 richmond arrest, cited 00:20 -122.471762293486 37.7731370564753 2014-09-07 00:20:00
... ... ... ... ... ... ... ... ... ... ... ... ... ...
580 3900.0 block of cesar chavez st other offenses 1414540800 wednesday traffic violation arrest 140920021 mission arrest, booked 18:00 -122.427253391744 37.7477934623372 2014-10-28 18:00:00
581 france av / paris st other offenses 1415232000 thursday traffic violation 140941776 ingleside arrest, booked 19:49 -122.43675507464 37.7192703128739 2014-11-05 19:49:00
582 valencia st / duboce av other offenses 1413849600 tuesday traffic violation arrest 140891654 mission arrest, booked 16:41 -122.422367409563 37.7698682392752 2014-10-20 16:41:00
583 alabama st / montcalm st other offenses 1413504000 friday traffic violation 140880124 ingleside juvenile booked 20:39 -122.410499537086 37.7455724045589 2014-10-16 20:39:00
584 gough st / bush st other offenses 1410393600 thursday traffic violation 140766148 northern none 13:00 -122.425237254245 37.7880691746724 2014-09-10 13:00:00
585 19th av / ocean av other offenses 1413676800 sunday traffic violation 140883695 taraval arrest, cited 03:31 -122.474954469109 37.7324560758732 2014-10-18 03:31:00
586 haight st / buchanan st other offenses 1415577600 monday traffic violation 140953981 northern arrest, booked 17:50 -122.427159869872 37.7725259201823 2014-11-09 17:50:00
587 grant av / sutter st other offenses 1415664000 tuesday traffic violation 140954989 central arrest, booked 01:15 -122.405402610955 37.7896302267231 2014-11-10 01:15:00
588 19th st / shotwell st other offenses 1414022400 thursday traffic violation arrest 140899250 mission arrest, booked 23:45 -122.415929849548 37.760433000405 2014-10-22 23:45:00
589 16th av / santiago st other offenses 1413244800 tuesday traffic violation 140870454 taraval none 21:17 -122.472922868983 37.7450437467702 2014-10-13 21:17:00
590 buchanan st / fell st other offenses 1410566400 saturday traffic violation arrest 140773513 northern arrest, cited 21:38 -122.427727138394 37.7753157578705 2014-09-12 21:38:00
591 powell st / lombard st other offenses 1412121600 wednesday traffic violation 140826134 central arrest, cited 00:30 -122.411393095746 37.803034284707 2014-09-30 00:30:00
592 oxford st / silliman st other offenses 1410912000 wednesday traffic violation 140785392 ingleside none 20:39 -122.419884766171 37.7273697641751 2014-09-16 20:39:00
593 1100.0 block of lasalle av other offenses 1411948800 monday traffic violation 140822825 bayview arrest, cited 22:58 -122.381912059451 37.7315458718958 2014-09-28 22:58:00
594 stanyan st / beulah st other offenses 1411948800 monday traffic violation 140822637 park arrest, booked 21:27 -122.453167315666 37.7673072115555 2014-09-28 21:27:00
595 ocean av / woodacre dr other offenses 1413590400 saturday traffic violation 140878743 taraval arrest, booked 13:15 -122.473200758245 37.7316696104689 2014-10-17 13:15:00
596 3rd st / fairfax av other offenses 1411948800 monday traffic violation arrest 140822104 bayview arrest, cited 19:00 -122.388204239979 37.7418910564289 2014-09-28 19:00:00
597 columbus av / vallejo st other offenses 1411603200 thursday traffic violation 140810096 central arrest, cited 17:59 -122.407871572784 37.7986962743293 2014-09-24 17:59:00
598 leavenworth st / ofarrell st other offenses 1411948800 monday traffic violation 140822881 tenderloin arrest, cited 23:53 -122.414619686906 37.7855830588767 2014-09-28 23:53:00
599 fillmore st / ofarrell st other offenses 1413936000 wednesday traffic violation 140894733 northern arrest, cited 17:06 -122.432706975892 37.7832909416659 2014-10-21 17:06:00
600 0.0 block of santos st other offenses 1410825600 tuesday traffic violation arrest 140779646 ingleside arrest, booked 03:35 -122.418724355518 37.7119211318901 2014-09-15 03:35:00
601 16th st / mission st other offenses 1413676800 sunday traffic violation 140885384 mission arrest, cited 18:34 -122.419671780296 37.7650501214965 2014-10-18 18:34:00
602 2000.0 block of mission st other offenses 1412380800 saturday traffic violation arrest 140836644 mission arrest, booked 13:20 -122.419566310261 37.7635154609084 2014-10-03 13:20:00
603 mission st / 17th st other offenses 1415404800 saturday traffic violation arrest 140945922 mission arrest, booked 02:58 -122.419515708406 37.7634292328793 2014-11-07 02:58:00
604 woodside av / portola dr other offenses 1410307200 wednesday traffic violation arrest 140763746 ingleside none 16:20 -122.451622869572 37.7455330433786 2014-09-09 16:20:00
605 vallejo st / sansome st other offenses 1415318400 friday traffic violation 140945615 central arrest, cited 23:40 -122.402423032514 37.7993831220372 2014-11-06 23:40:00
606 stockton st / sacramento st other offenses 1413504000 friday traffic violation 140879155 central arrest, cited 15:18 -122.407729769991 37.7931773026133 2014-10-16 15:18:00
607 2000.0 block of mission st other offenses 1411257600 sunday traffic violation arrest 140797212 mission arrest, booked 18:12 -122.419640488569 37.7642858666276 2014-09-20 18:12:00
608 alemany bl / ellsworth st other offenses 1411776000 saturday traffic violation arrest 140817133 ingleside arrest, cited 23:03 -122.418968700591 37.7322083102056 2014-09-26 23:03:00
609 400.0 block of 7th st other offenses 1410825600 tuesday traffic violation 140779492 southern none 01:00 -122.405411882091 37.7749623956038 2014-09-15 01:00:00

610 rows × 13 columns

Graphs

Now that the data is clean, we want to begin plotting. We'll create the following graphs:

  • Number of incidents by day of the week
  • Number of incidents by date
  • Number of incidents by police district

To display matplotlib plots inside the current notebook, you must first enable IPython's matplotlib mode by running the %matplotlib magic command.

For more information, see: Plotting with Matplotlib.

In [10]:
%matplotlib inline

Number of Incidents by Day of the Week

Here, we want to count up the number of incidents for each day of the week. We do this using the .groupby() method.

In [11]:
# Group by
dayofweek = traffic.groupby('dayofweek')['dayofweek'].count()

# Put into a DataFrame
dow_df = pd.DataFrame(dayofweek)

# Rename column
dow_df.columns = ['count']

# Create a new column based on the day of the week
dow_df['dayofweek'] = dow_df.index
In [12]:
# Capitalize the first letter of the weekday name
dow_df['dayofweek'] = dow_df['dayofweek'].str.title()
In [13]:
# Create a dictionary with the order of the days
weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
mapping = {day: i for i, day in enumerate(weekdays)}
key = dow_df['dayofweek'].map(mapping)

# Sort the DataFrame
dow_df = dow_df.iloc[key.argsort()]

# Drop the dayofweek index; reset the index
dow_df = dow_df.reset_index(drop=True)
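
For comparison, a shorter route to the same ordered counts is value_counts() followed by reindex() with the desired weekday order. This is a sketch on made-up data and not the method used in this notebook.

```python
import pandas as pd

weekdays = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

# Made-up lowercase day names standing in for traffic['dayofweek'].
days = pd.Series(['monday', 'friday', 'monday', 'sunday'])

counts = (
    days.str.title()                      # 'monday' -> 'Monday'
        .value_counts()                   # incidents per day, sorted by frequency
        .reindex(weekdays, fill_value=0)  # calendar order; days with no rows get 0
)
```

The fill_value=0 argument also makes days with no incidents show up explicitly instead of being dropped.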
In [14]:
dow_df['order'] = dow_df.index
dow_plot = dow_df.groupby(['order', 'dayofweek'])['count'].sum()
In [15]:
dow_df
Out[15]:
   count  dayofweek  order
0     81     Monday      0
1     91    Tuesday      1
2     94  Wednesday      2
3     84   Thursday      3
4     86     Friday      4
5     93   Saturday      5
6     81     Sunday      6
In [16]:
import seaborn as sns
rc={"figure.figsize": (10, 8), 'axes.labelsize': 18, 'font.size': 18, 'axes.titlesize': 18, 'xtick.labelsize': 12, 'ytick.labelsize': 12}
sns.set(rc=rc)

# Note: recent seaborn releases use `order`; very old releases called this argument `x_order`
sns.barplot(x='dayofweek', y='count', data=dow_df, palette="Paired", order=weekdays)
Out[16]:
<matplotlib.axes.AxesSubplot at 0x113883790>

There don't seem to be drastic differences in the number of incidents across days of the week. I expected Friday and Saturday to be the highest, but this sample shows Tuesday and Wednesday having just as many, if not more, incidents. My assumption about weekends may still hold for nighttime hours; this bar plot does not control for time of day.

Number of Incidents by Date

Next, we want to see if there are any patterns in the number of incidents across days in the sample.

In [17]:
# Group by
bydate = traffic.groupby('real_date')['real_date'].count()

# Put into a DataFrame
bydate_df = pd.DataFrame(bydate)

# Rename column
bydate_df.columns = ['count']

# Create a new column based on the date
bydate_df['date'] = bydate_df.index

# Reset the index
bydate_df = bydate_df.reset_index(drop=True)
In [18]:
rc={"figure.figsize": (10, 8), 'axes.labelsize': 12, 'font.size': 12, 'legend.fontsize': 12.0, 'axes.titlesize': 12, 'xtick.labelsize': 0}
sns.set(rc=rc)

bydate_plot = sns.barplot(x='date', y='count', data=bydate_df, palette="Blues")

There are a few days with exceptionally few traffic incidents, especially more recently, but there doesn't seem to be a discernible overall pattern. The plot seems to show greater variance in the number of traffic incidents closer to the current date. However, it's important to take note of the scale: the highest daily count during the past three months is around 19 incidents, and the average seems to be close to 5 or 6 per day.
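
One caveat when reading this plot: grouping by date only yields dates that have at least one incident, so any days with zero incidents are simply absent from the bars. Below is a plain-Python sketch of filling such gaps, using collections.Counter on made-up dates; the helper name `daily_counts` is hypothetical.

```python
import datetime
from collections import Counter


def daily_counts(dates):
    """Count occurrences per day, including zero-count days in the range."""
    counts = Counter(dates)
    day, last = min(dates), max(dates)
    filled = []
    while day <= last:
        filled.append((day, counts[day]))  # Counter returns 0 for missing days
        day += datetime.timedelta(days=1)
    return filled


sample_dates = [
    datetime.date(2014, 9, 14),
    datetime.date(2014, 9, 14),
    datetime.date(2014, 9, 16),
]
result = daily_counts(sample_dates)
```

Here the middle day, which has no entries in sample_dates, appears in the result with a count of 0.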

Traffic Incidents by Police District

Finally, we want to look at the number of incidents in each district.

In [19]:
# Group by
pddistrict = traffic.groupby('pddistrict')['pddistrict'].count()
In [20]:
# Put into a DataFrame
pd_df = pd.DataFrame(pddistrict)

# Rename column
pd_df.columns = ['count']

# Create a new column based on the police district
pd_df['pddistrict'] = pd_df.index

# Reset the index
pd_df = pd_df.reset_index(drop=True)
In [21]:
# Capitalize the first letter of the district name
pd_df['pddistrict'] = pd_df['pddistrict'].str.title()
In [22]:
import seaborn as sns
rc={"figure.figsize": (10, 8), 'axes.labelsize': 18, 'font.size': 18, 'axes.titlesize': 18, 'xtick.labelsize': 12, 'ytick.labelsize': 12}
sns.set(rc=rc)

sns.barplot(x='pddistrict', y='count', data=pd_df, palette="Paired")
Out[22]:
<matplotlib.axes.AxesSubplot at 0x115e27590>

While it's informative to know about where traffic incidents are occurring, it would be useful to have context. For example, does the Richmond area have very few traffic incidents because there are few drivers there or because it's a safe neighborhood? Geographic scale can also skew these results. If the Mission district is much larger than the other districts, the higher number of incidents may simply be due to that fact.

For Mapping

Now, we use the data to create a map of traffic incidents in San Francisco.

Create GeoJSON to Convert to JavaScript

In [23]:
import json

geo_data = {
    'type': 'FeatureCollection', 
    'features': []
}

for i in traffic.index:
    if traffic['y'][i]:
        
        # Each incident is a GeoJSON "feature"
        feature = {
            'type': 'Feature', 
            'geometry': {
                "type": "Point", 
                "coordinates": [float(traffic['x'][i]), float(traffic['y'][i])]
            },
            
            # A feature's "properties" become attribute columns in GIS
            'properties': {
                'day': traffic['dayofweek'][i], 
                'district': traffic['pddistrict'][i],
                'resolution': traffic['resolution'][i],
                'address': traffic['address'][i]
            }
        }
        
        # Add the feature into the GeoJSON wrapper
        geo_data['features'].append(feature)

# Open in text mode: json.dump() writes str, which fails on a binary file in Python 3
with open('sftraffic.geojson', 'w') as f:
    json.dump(geo_data, f, indent=2)

print('{} geotagged entries saved to file'.format(len(geo_data['features'])))
610 geotagged entries saved to file
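
Note the coordinate order in the loop above: GeoJSON positions are [longitude, latitude], which is why x comes before y. The sketch below wraps the per-row logic in a hypothetical helper, `make_feature`, and checks that the result survives a JSON round trip. The sample values are taken from the first traffic row shown earlier.

```python
import json


def make_feature(lon, lat, properties):
    """Build one GeoJSON Point feature; coordinates are [longitude, latitude]."""
    return {
        'type': 'Feature',
        'geometry': {'type': 'Point', 'coordinates': [float(lon), float(lat)]},
        'properties': properties,
    }


feature = make_feature(
    '-122.441800386053', '37.7154294159173',
    {'day': 'monday', 'district': 'ingleside',
     'resolution': 'none', 'address': 'mission st / pope st'},
)

collection = {'type': 'FeatureCollection', 'features': [feature]}

# Serialize and parse back to confirm the structure is valid JSON.
round_trip = json.loads(json.dumps(collection))
```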
