Automating quality tests for map data to solve urban parking


Geographic maps have been around for a very long time. They have guided sailors across oceans and caravans through deserts. One of the earliest known examples, a cave painting found in France dating back to 14,500 BC, actually depicts the location of stars rather than continents and cities. Humans in almost all cultures have independently developed and used maps.

Fast forward a couple of thousand years, and maps are hotter than ever. Google sent its StreetView cars around the globe. Facebook mapped out the world’s population with satellite imagery. The rise of autonomous vehicles further fuels the need for high-accuracy map information. However, one crucial aspect has long been insufficiently covered: parking.

Why quality tests matter more than ever

Parking maps started a few years ago as collections of static information (e.g. prices and opening hours). Today, live on-street parking maps are the trend in the industry. This new class of maps contains not only the locations and restrictions of parking options but also their predicted and live availability.


Three-fold solution to the parking pain: Information


Data quality is critical here. Parking availability information, by nature, becomes outdated extremely fast. Continuous updates and reliable accuracy are therefore needed to guide drivers to open parking spots.

But how can the data quality be evaluated? Interestingly, there are currently no common measurement standards. To shed some light on this, we decided to share our approach for an automated map data testing pipeline. This includes the collection of a ground truth dataset in a target area, as well as its comparison with the data served by the API.

This post demonstrates the pipeline setup to test parking availability predictions. We chose Berlin-Mitte as a sample target area.

Special: We published all relevant data online, open and available to everyone (here and here).

Automating manual work

Many mapping companies still send out human surveyors to collect ground truth and compare it with the content they provide. This method is not only inaccurate and error-prone, it is also slow and extremely expensive.

Thus, we decided to automate what’s possible: We collect geo-referenced imagery with a low-cost setup, analyze the data using computer vision and compute quality metrics on the results, all in one single data processing pipeline.

Collecting raw data

We mounted a smartphone in the windshield of our car and ran a simple custom app that captures a series of geo-referenced images. Every image has a corresponding set of location and movement information.
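A geo-referenced image can be thought of as a frame plus the sensor readings recorded with it. The record below is a minimal sketch under assumptions: the class name `GeoImage`, all field names and the sample values are hypothetical and do not reflect the schema of the actual app.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class GeoImage:
    """One captured frame plus location and movement data (hypothetical schema)."""
    path: str            # image file on disk
    timestamp: datetime  # capture time
    latitude: float      # WGS84 degrees
    longitude: float     # WGS84 degrees
    heading_deg: float   # direction of travel, 0 = north
    speed_mps: float     # vehicle speed in metres per second


# Illustrative values only, roughly in the Berlin-Mitte area
frame = GeoImage("frame_0001.jpg", datetime(2019, 3, 13, 14, 0), 52.52, 13.388, 90.0, 8.3)
print(frame)
```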


Standard smartphone in the windshield mount

Below is a first glimpse of the input data. During the test drive, the camera was centered a little more towards the right side of the street. We did not capture the left side of the street because the view was frequently obstructed by other cars. The resulting limitation: In one-way streets where parking is possible, we did not cover the left side of the street.


Raw data (left), availability analysis (right), captured by smartphone behind the windshield

We collected data in Berlin-Mitte in the extended area around Friedrichstraße. The location is interesting due to its great diversity: While the south is one of Berlin’s most frequented places, the south-eastern part of the covered area (Humboldt University) provides fewer parking options and is also less frequented. In the north, we see mostly residential areas, where car ownership is generally high compared to other surrounding areas. The test drive took place on Wednesday, 13 March 2019, between 2 pm and 4:30 pm. This way, we were able to obtain a realistic snapshot of the parking situation in the covered area between (late) lunchtime and the beginning of the evening rush hour.


Covered area in Berlin Mitte

Extracting a parking availability ground truth

We fed the data into a custom module of our proprietary computer vision segmentation model. We specifically designed it to count the parked cars on the street sides (counter in the lower left corner). The video below shows the result of the analysis for an example road segment.

Automated counting of parked cars with computer vision
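The comparison further down works with the share of open spots (number of open spots / total capacity) rather than the raw car count. A minimal sketch of that conversion, assuming the capacity of each parking area is known from the map data; the function name `availability_ratio` is hypothetical:

```python
def availability_ratio(parked_cars: int, capacity: int) -> float:
    """Share of open spots: (capacity - parked cars) / capacity, clamped to [0, 1]."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    # A count above capacity (e.g. illegally parked cars) means zero open spots
    open_spots = max(capacity - parked_cars, 0)
    return min(open_spots / capacity, 1.0)


print(availability_ratio(parked_cars=9, capacity=12))  # 0.25
```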

Quality metric

We wanted to compare the accuracy of our predictive parking model with the actual situation on the street. First, a quick background on the idea of predictive parking: The goal is to predict whether an individual driver will be able to find a vacant parking spot at a specific parking option upon arrival. If you want to learn more about this, check out this earlier post.

The AIPARK API (v.1.0) returns a prediction value between 0 and 100. In end-user services, this availability prediction is typically represented in the form of a color scheme with three easy categories:

  • green: high availability
  • yellow: intermediate availability
  • red: low availability


Similar color scheme as for traffic lights, Photo by Kai Pilger on Unsplash

To enable benchmarking of different services and to stick with the typical user experience, we decided to map the prediction values to the three categories using equally sized value ranges. The following function determines whether a prediction is correct:

def compare(prediction, actual_value):
    """Return True if the predicted category matches the observed one.
    prediction is a value between 0 and 100
    actual_value is the number of open spots / total capacity
    """
    d = discretize_prediction(prediction)
    # Compare strings with ==, not with the identity operator "is"
    if inRange(actual_value, 0, 1/3) and d == "low":
        return True
    elif inRange(actual_value, 1/3, 2/3) and d == "intermediate":
        return True
    elif inRange(actual_value, 2/3, 1) and d == "high":
        return True
    # No category matched: the prediction was wrong
    return False
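The helpers `discretize_prediction` and `inRange` are not shown in the post. The sketch below is one possible implementation, not the project's actual code: the thresholds follow the "equally sized value ranges" described above, while the handling of values that sit exactly on a boundary is an assumption. The comparison is repeated in compact form only so that the example runs on its own.

```python
def inRange(value, lower, upper):
    """Half-open interval [lower, upper); the top end 1 is included (assumption)."""
    if upper >= 1:
        return lower <= value <= upper
    return lower <= value < upper


def discretize_prediction(prediction):
    """Map a 0-100 prediction onto three equally wide categories."""
    if not 0 <= prediction <= 100:
        raise ValueError("prediction must be between 0 and 100")
    if prediction < 100 / 3:
        return "low"
    if prediction < 200 / 3:
        return "intermediate"
    return "high"


def compare(prediction, actual_value):
    """True if the predicted category matches the observed availability share."""
    d = discretize_prediction(prediction)
    return (
        (inRange(actual_value, 0, 1/3) and d == "low")
        or (inRange(actual_value, 1/3, 2/3) and d == "intermediate")
        or (inRange(actual_value, 2/3, 1) and d == "high")
    )


print(compare(80, 0.75))  # True: "high" predicted, 75% of spots open
print(compare(80, 0.10))  # False: "high" predicted, only 10% of spots open
```

Including the value 1 in the top range matters: without it, a completely empty street would never match a "high" prediction.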

Prediction accuracy can then be computed as the share of correct predictions:

    accuracy = number of correct predictions / total number of predictions

Running the analysis pipeline yields a prediction accuracy of 91.94% when applied to the full test data set of all 62 parking areas. While the overall prediction accuracy is already remarkable, analyzing the remaining errors is more interesting:
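The accuracy figure is the share of parking areas whose predicted category matches the ground truth. A minimal sketch, assuming the per-area comparison results are available as a list of booleans; the function name `prediction_accuracy` is hypothetical. The count of 57 correct areas is inferred from the reported numbers, since 57 is the only whole number of correct areas out of 62 that rounds to 91.94%.

```python
def prediction_accuracy(outcomes):
    """Fraction of True entries in a sequence of per-area comparison results."""
    outcomes = list(outcomes)
    if not outcomes:
        raise ValueError("need at least one comparison result")
    return sum(outcomes) / len(outcomes)


# 57 correct areas out of 62 (inferred from the reported accuracy)
outcomes = [True] * 57 + [False] * 5
print(f"{prediction_accuracy(outcomes):.2%}")  # 91.94%
```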

For four parking areas where the prediction was too positive (predicted availability is higher than ground truth), the mean category deviation is 0.8 with an estimated standard deviation of 0.02.

For those parking areas where the prediction was too negative (predicted availability is lower than ground truth), the mean category deviation is -0.8 with an estimated standard deviation of 0.06.

In fact, the predicted category values deviate by at most one category from the ground truth availability category.
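The claim that no prediction is off by more than one category can be checked by numbering the categories and taking the largest absolute difference. This is a sketch under assumptions: the index mapping and the function name `max_category_deviation` are hypothetical, the sample pairs are made up, and the post does not spell out how its mean category deviation is computed, so that metric is not reproduced here.

```python
CATEGORY_INDEX = {"low": 0, "intermediate": 1, "high": 2}


def max_category_deviation(pairs):
    """Largest absolute distance, in categories, between prediction and ground truth."""
    return max(abs(CATEGORY_INDEX[p] - CATEGORY_INDEX[a]) for p, a in pairs)


# Made-up (predicted, ground truth) pairs for illustration, not the published results
pairs = [("high", "intermediate"), ("low", "intermediate"), ("high", "high")]
print(max_category_deviation(pairs))  # 1
```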


With this post, we want to show how map quality testing for sophisticated features such as parking availability can be automated using a low-cost data collection setup and an AI-powered analysis pipeline. In the next posts, we’ll introduce some more thoughts about data quality and automation. Stay tuned!

Want to reproduce the results?

If you want to reproduce the results of this article or play around with the data, follow the instructions in the project on GitHub.

Results will be printed on the command line if you run compare_predictions.py.

python3 compare_predictions.py

Note: You will need to enter your AIPARK API key at the bottom of the file first. Sign up here to retrieve your API key: https://studio.aipark.io/sign-up/ (it’s free). Also, make sure that you are using Python 3 instead of Python 2.7. Otherwise, integer division (1/3 evaluates to 0 in Python 2) may lead to rounding errors.

Keen to check out the raw ground truth? We shared the anonymized street-level footage here.

About the author

Torgen is CIO and Co-Founder at AIPARK, a Berlin-based tech company that provides live parking maps for developers. AIPARK’s APIs extend the functionality of Connected Cars, Mobile Apps and Traffic Management systems.