Saint Petersburg car accidents data research
Project year
2017 — 2019
Project role
Data scientist & urban designer
Made for
Citizens, activists, urban and transportation planners
How it started
Unlike many other projects, this one began purely on my own initiative, without any client. There were two sides to the story, which ultimately became two integrated parts of one project. As a citizen and urban designer, I saw how many car crashes happen in the city every day that would never have happened if our streets and intersections were designed slightly differently. As a data scientist, I was almost appalled by how the road police department and the transportation committee ignore all the data they have on car crashes. So I took the crash data and started visualizing it.
Interactive visualization of all car crashes from the database of Saint Petersburg police for the year 2018
The map shows all car crashes from the database for 2018. There is a timeline at the bottom of the map and filters on the right, and you can zoom in and out.
This visualization can help identify the most dangerous locations and prioritize which of them to redesign first.
The very first time I saw the data as a .CSV file, I wanted to put it on the map as soon as possible, because I could already picture how great it would be to see all of the crashes in the city where they actually happened. I quickly sorted the data by city district and threw it into an interface. The result was astonishing even for me: first, how vivid crashes become when you turn them into an interactive timeline animation, and second, how much better other people grasp them that way too. Within 24 hours my map was covered by five online newspapers, two of them local and three federal. It looked as if, for the first time, people actually saw how bad it was out there, behind the crash statistics.

In the process, I noticed that the data I was working with had major flaws. For example, one third of the points (each representing one crash) fell outside the city boundaries, perhaps because of bad geocoding, manual geocoding, or simply a lack of motivation among police officers and poor quality control. To give a sense of scale, one third is around 7 thousand points, or rows in the table. Fixing this by hand would have been inefficient and downright crazy, so I decided to do some programming instead. The problem was that all I had in my bag of knowledge was a faint memory of the HTML I learned in school and the basics of Python I had picked up about a year earlier. By basics I really mean basics. But Python looked like a good fit for the task (and would help me in the future), so I bought a course on Udemy and improved my skills to the point where I could manage it.
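A first pass at finding the misplaced points can be sketched in a few lines of Python. This is only a minimal illustration, not the code used in the project: the column names (`id`, `lat`, `lon`), the sample rows and the bounding-box coordinates are all assumptions, and a rough rectangle is a much cruder test than the real city boundary polygon.

```python
import csv
import io

# Rough, illustrative bounding box around Saint Petersburg.
# These numbers are an assumption, not the official city boundary.
LAT_MIN, LAT_MAX = 59.6, 60.3
LON_MIN, LON_MAX = 29.4, 30.8


def is_inside_bbox(lat, lon):
    """Return True if the point lies inside the rough city rectangle."""
    return LAT_MIN <= lat <= LAT_MAX and LON_MIN <= lon <= LON_MAX


def split_by_location(rows):
    """Split rows into (inside, outside); unparsable coordinates count as outside."""
    inside, outside = [], []
    for row in rows:
        try:
            lat, lon = float(row["lat"]), float(row["lon"])
        except (KeyError, TypeError, ValueError):
            outside.append(row)  # missing or broken coordinates
            continue
        (inside if is_inside_bbox(lat, lon) else outside).append(row)
    return inside, outside


# Made-up sample rows standing in for the police CSV export.
SAMPLE_CSV = """id,lat,lon
1,59.93,30.33
2,0.0,0.0
3,55.75,37.61
4,,30.30
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
inside, outside = split_by_location(rows)
print(len(inside), "rows look fine;", len(outside), "rows need re-geocoding")
```

The rows that land in `outside` are the ones worth re-geocoding from their address fields, rather than correcting one by one by hand.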

Soon it was obvious that Python was a good choice: not only did I tackle the analysis of the data at hand, but I also parsed the data for three whole years (2015-2017) in one go and used it for the analysis afterwards. To put this in context, the website with the original data lets you download only 14 days at a time, for some silly reason. Well, what a shame.
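Working around a 14-day download limit mostly comes down to splitting a long period into consecutive windows and requesting them one by one. The sketch below shows only that splitting step; the function name `date_windows` is hypothetical, and the actual download request is left out because it depends on the source website.

```python
from datetime import date, timedelta


def date_windows(start, end, max_days=14):
    """Yield (first_day, last_day) pairs covering start..end inclusive,
    each spanning at most max_days days."""
    current = start
    while current <= end:
        last = min(current + timedelta(days=max_days - 1), end)
        yield current, last
        current = last + timedelta(days=1)


# Three full years, as in the text above.
windows = list(date_windows(date(2015, 1, 1), date(2017, 12, 31)))

for first_day, last_day in windows[:3]:
    # A real script would download the data for this window here.
    print(first_day.isoformat(), "to", last_day.isoformat())

print("total windows:", len(windows))
```

Each pair can then be passed to whatever download routine the site requires, and the results appended to one table.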

With Python it was quite easy to analyze the data by street, season and time of day, and to fix bugs faster. When the service I intend to make goes live, the back end will be implemented in Python.
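As an illustration of this kind of breakdown, here is a small sketch that counts crashes by street, season and hour of day using only the standard library. The records, the field names (`street`, `time`) and the timestamp format are made up for the example and do not come from the real dataset.

```python
from collections import Counter
from datetime import datetime

# Map month number to season (northern hemisphere, calendar seasons).
SEASONS = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}

# Made-up records; the real table has many more columns.
crashes = [
    {"street": "Street A", "time": "2018-01-15 08:30"},
    {"street": "Street A", "time": "2018-01-20 18:05"},
    {"street": "Street B", "time": "2018-07-03 08:45"},
    {"street": "Street A", "time": "2018-10-11 23:10"},
]

by_street, by_season, by_hour = Counter(), Counter(), Counter()
for crash in crashes:
    moment = datetime.strptime(crash["time"], "%Y-%m-%d %H:%M")
    by_street[crash["street"]] += 1
    by_season[SEASONS[moment.month]] += 1
    by_hour[moment.hour] += 1

print("Most crashes on:", by_street.most_common(1))
print("By season:", dict(by_season))
print("By hour:", dict(sorted(by_hour.items())))
```

The same three counters work unchanged on a table of any size, which is what makes this kind of analysis quick to repeat after each round of data cleaning.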

Below you can see 4 pictures, generally illustrating the stages of my workflow to date.
After various iterations of cleaning, analyzing and visualizing the data, I came up with an idea for an online tool that would be based on crash data and be both visual and interactive. It would help citizens request safety improvements for streets, intersections and pedestrian crossings, and help urban planners back up their street safety proposals with good visual data. This multipurpose tool for citizens, activists and urban professionals is the aim of the project, although for now it has been put on hold.

After a new stage of analysis in 2018-2019, I was invited to speak at the International cycling congress in Saint Petersburg. My work on crashes involving cyclists was also sent to the transportation committee to learn from when building new cycling infrastructure and improving the existing one.