Detecting Informal Dwellings in Cape Town using Deep Learning tools in ArcGIS Pro
Written by Amy Barnes, Sean Cullen, Derck Vonck
Like residents of most countries, South Africans live in an ever-changing environment. Change is especially hard to track in informal settlements, where a structure (dwelling) could be present today and moved tomorrow. Because of the country's economic climate, a large percentage of South Africa's population resides in informal settlements. These settlements require services from various government entities, such as EMS, policing and water services delivery, to ensure a functioning environment.
The private sector also requires up-to-date dwelling framework information to serve these communities. There is a need to estimate household counts, not only to fill gaps in historical data but also to support the many sectors that provide services, including banking and market research. Furthermore, spatial analysis undertaken by the private sector provides an opportunity to derive insights for product placement and market awareness.
South Africa conducts a census every ten years. However, because informal settlements are dynamic, the household counts are outdated by the time the data is released. This makes it challenging to access and maintain up-to-date dwelling frameworks.
Our Goal
With a collective need for updated information that estimates household counts in active migration areas, we initially relied on identifying dwelling footprints from aerial photography. However, this quickly became challenging, as it is a manual and tedious process that is subject to the digitizer’s interpretation of the imagery.
Why Deep Learning & Why Esri?
With the increased availability of data and improvements in technology and processing power, we wanted to use deep learning methodologies to extract the building footprints of these informal settlements. Using ArcGIS Pro, the process of developing and running your own deep learning models is as simple as running a few geoprocessing tools (no coding required). ArcGIS Pro provides tools to capture training data, train various models, run inference and then derive the needed footprints after processing.
With the ArcGIS Image Analyst extension, you can perform entire deep learning workflows with imagery in ArcGIS Pro. Geoprocessing tools are used to prepare imagery training data, train an “Object Detection model” and detect the features of interest (buildings).
Deep Learning Process
1. Identify Training Area & Create Samples
The study area we focused on is Barcelona, South Africa, an informal settlement in the Western Cape that is known for its dynamic changes over time. Using a simple script, we built a 20-metre fishnet grid over the study area and randomly selected cells from it, which gave us 24 sample areas. Within the sample areas, approximately 1600 footprints were digitised as input for training our model. Aerial photography flown by the local municipality, with a ground resolution of 8 cm, was used as the training input.
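The sampling step can be sketched in a few lines of plain Python. This is a minimal illustration, not the script we used: the extent coordinates, the function names and the random seed are hypothetical, and it assumes each of the 24 sample areas corresponds to one randomly chosen grid cell.

```python
import math
import random


def build_fishnet(xmin, ymin, xmax, ymax, cell_size):
    """Return square cells (xmin, ymin, xmax, ymax) that cover the extent."""
    cols = math.ceil((xmax - xmin) / cell_size)
    rows = math.ceil((ymax - ymin) / cell_size)
    cells = []
    for row in range(rows):
        for col in range(cols):
            x0 = xmin + col * cell_size
            y0 = ymin + row * cell_size
            cells.append((x0, y0, x0 + cell_size, y0 + cell_size))
    return cells


def pick_sample_cells(cells, count, seed=None):
    """Randomly choose `count` distinct cells; a seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(cells, count)


# Hypothetical 200 m x 100 m extent in projected metres (not the real study area).
fishnet = build_fishnet(0, 0, 200, 100, cell_size=20)
samples = pick_sample_cells(fishnet, count=24, seed=42)
print(len(fishnet), "cells in the fishnet,", len(samples), "selected for digitising")
```

Fixing the seed keeps the selection reproducible, so the same sample areas can be revisited when new imagery is flown.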
2. Train Model
We trained an object detection model (MaskRCNN with a ResNet-50 backbone) for 100 epochs; validation loss stabilized after 25-30 epochs. MaskRCNN is an instance segmentation model well suited to delineating precise objects in an image, in this case building footprints. The model converged to an accuracy of over 90%.
3. Detect Objects
Figure 2 displays the output of running the “Detect Objects Using Deep Learning” tool in ArcGIS Pro with the trained model. In total, over 5700 dwellings were detected. At this point, the results required some processing to distinctly delineate the building footprints. The raw outputs were not favourable, as some of the resultant polygons for a single structure were split, with gaps or overlaps. These errors are known to be caused by the inferencing tool inputs and the step parameter.
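One plausible way to picture this effect is to look at how an image is cut into tiles for inference. The sketch below is an illustration only and does not reproduce the internals of the ArcGIS tool: it assumes the step is the offset in pixels between successive tiles, and the image width, tile size and dwelling position are hypothetical.

```python
def tile_origins(length, tile_size, stride):
    """Start offsets of tiles along one axis so that the whole length is covered."""
    if length <= tile_size:
        return [0]
    origins = list(range(0, length - tile_size + 1, stride))
    if origins[-1] + tile_size < length:
        origins.append(length - tile_size)  # final tile sits flush with the edge
    return origins


def tiles_touching(start, end, origins, tile_size):
    """Tiles whose span [o, o + tile_size) intersects the interval [start, end)."""
    return [o for o in origins if o < end and o + tile_size > start]


def tiles_containing(start, end, origins, tile_size):
    """Tiles that see the interval [start, end) in full."""
    return [o for o in origins if o <= start and end <= o + tile_size]


# Hypothetical numbers: a 1024-pixel-wide image processed in 256-pixel tiles.
no_overlap = tile_origins(1024, 256, stride=256)
half_overlap = tile_origins(1024, 256, stride=128)

# A dwelling spanning pixels 240 to 300 straddles the tile boundary at 256.
print(tiles_touching(240, 300, no_overlap, 256))      # [0, 256]
print(tiles_containing(240, 300, no_overlap, 256))    # []
print(tiles_touching(240, 300, half_overlap, 256))    # [0, 128, 256]
print(tiles_containing(240, 300, half_overlap, 256))  # [128]
```

With no overlap between tiles, the dwelling is only ever seen in two partial views, which is one way a single structure can end up as split polygons. With overlapping tiles, one tile sees it in full, but its neighbours still see parts of it, so overlapping detections of the same structure can appear.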
4. Post Processing Results
To obtain more accurate results, we cleaned the output shown in Figure 2. Informal settlements are so dense that dwellings sit very close together, and the detected footprints tend to overlap. To mitigate this, we applied a negative buffer of 0.5 m to the output, which isolated individual dwellings by removing the overlap between neighbours. This allowed us to dissolve the overlapping polygons that belong to a single dwelling into a single polygon. A positive buffer of 0.5 m was then run on the dissolved footprints to restore the shape of the originally detected footprints. Lastly, the Regularize Building Footprint tool in ArcGIS Pro was run to create rectangular footprints.
The final results are shown in Figure 3.
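The logic of the shrink, dissolve and restore sequence can be sketched with plain Python. This is a simplified stand-in for the geoprocessing tools, not the workflow itself: footprints are reduced to axis-aligned rectangles, the dissolve merges each overlapping group into its bounding box, and the coordinates are hypothetical.

```python
def buffer_rect(rect, distance):
    """Grow (positive) or shrink (negative) a rectangle (xmin, ymin, xmax, ymax).

    Returns None when a negative buffer collapses the rectangle.
    """
    xmin, ymin, xmax, ymax = rect
    xmin, ymin = xmin - distance, ymin - distance
    xmax, ymax = xmax + distance, ymax + distance
    if xmin >= xmax or ymin >= ymax:
        return None
    return (xmin, ymin, xmax, ymax)


def overlaps(a, b):
    """True when two rectangles share interior area (touching edges do not count)."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def dissolve(rects):
    """Merge overlapping rectangles into the bounding box of each group."""
    merged = []
    for rect in rects:
        current = rect
        while True:
            hit = next((m for m in merged if overlaps(current, m)), None)
            if hit is None:
                break
            merged.remove(hit)
            current = (min(current[0], hit[0]), min(current[1], hit[1]),
                       max(current[2], hit[2]), max(current[3], hit[3]))
        merged.append(current)
    return merged


def clean_footprints(rects, distance=0.5):
    """Negative buffer, dissolve, then positive buffer to restore the extent."""
    shrunk = [buffer_rect(r, -distance) for r in rects]
    shrunk = [r for r in shrunk if r is not None]
    return [buffer_rect(r, distance) for r in dissolve(shrunk)]


# Hypothetical detections in metres: two fragments of one dwelling that overlap
# by 2 m, plus a neighbouring dwelling that overlaps them by only 0.25 m.
raw = [(0, 0, 3, 4), (1, 0, 5, 4), (4.75, 0, 8, 4)]
cleaned = clean_footprints(raw)
print(len(raw), "raw detections ->", len(cleaned), "dwellings")
```

The 0.5 m shrink removes the thin overlap with the neighbour while the two fragments still overlap, so only the fragments merge. Real footprints are arbitrary polygons, so a GIS dissolve and the regularisation step do more than this bounding-box shortcut.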
Figure 1
Figure 2
Figure 3
Last thoughts
Overall, using deep learning to detect informal dwellings in our area of interest was invaluable. Using the output has enabled us to make more meaningful decisions regarding the provision of assistance and services to the settlement. Further refinement of our current model could include incorporating more training data to make the model more regionally suitable, given the dynamic nature of informal settlements.
Further applications include extending analysis by incorporating Esri’s survey solutions to conduct sampling surveys and estimate household sizes. Using this data, one can potentially calculate a correlation with floor plan size, allowing governments to plan long-term urban infrastructure. On a broader scale, it can also be leveraged in applications such as damage assessments in the event of natural disasters for a more proactive response.
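As a rough illustration of the correlation idea, the sketch below computes a Pearson correlation coefficient between floor area and household size. The survey values are invented purely for the example and are not results from our study; the function name is also hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical survey sample: footprint area (square metres) per dwelling and
# the household size reported for that dwelling.
floor_area_m2 = [9.0, 12.0, 15.0, 18.0, 24.0, 30.0]
household_size = [2, 3, 3, 4, 5, 6]

r = pearson(floor_area_m2, household_size)
print(f"Pearson r = {r:.2f}")
```

A coefficient close to 1 on real survey data would suggest that detected footprint area is a usable proxy for household size, although that would need to be checked per settlement.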