Deep-learning Coupled with Novel Classification Method to Classify the Urban Environment of the Developing World


Qianwei Cheng1, AKM Mahbubur Rahman2,3, Anis Sarker2, Abu Bakar Siddik Nayem2, Ovi Paul2, Amin Ahsan Ali2,3, M Ashraful Amin2,3, Ryosuke Shibasaki1 and Moinul Zaber3,4,5, 1The University of Tokyo, Japan, 2The Independent University Bangladesh, Bangladesh, 3The Data and Design Lab, Bangladesh, 4University of Dhaka, Dhaka, Bangladesh, 5United Nations University, E-government Operating Unit (UNU-EGOV), Portugal


Rapid globalization and the interdependence of countries have driven a tremendous inflow of human migration towards urban spaces. With the advent of high-definition satellite imagery, high-resolution data, computational methods such as deep neural network analysis, and hardware capable of high-speed analysis, urban planning is seeing a paradigm shift. Legacy data on urban environments are now being complemented with high-volume, high-frequency data. However, the first step in understanding an urban area is a classification of the urban environment that is usable for data collection, analysis, and visualization. In this paper, we propose a novel classification method that is readily usable for machine analysis, and we show the applicability of the methodology in a developing-world setting. The state of the art is mostly dominated by the classification of building structures, building types, etc., and largely represents the developed world. Hence, these methods and models are not sufficient for developing countries such as Bangladesh, where the surrounding environment is crucial for the classification. Moreover, traditional classifications are small-scale: they give limited information, scale poorly, and are slow to compute in real time. We categorize the urban area in terms of informal and formal spaces and take the surrounding environment into account. A 50 km × 50 km Google Earth image of Dhaka, Bangladesh was visually annotated and categorized by an expert, and a map was drawn from these annotations. The classification is based broadly on two dimensions: the state of urbanization and the architectural form of the urban environment. Consequently, the urban space is divided into four classes: 1) highly informal area, 2) moderately informal area, 3) moderately formal area, and 4) highly formal area. For semantic segmentation and automatic classification, Google’s DeepLabv3+ model was used.
The model uses the atrous convolution operation to analyze different layers of texture and shape. This enlarges the field of view of the filters so that they incorporate a larger context. Imagery covering 70% of the urban space was used to train the model, and the remaining 30% was used for testing and validation. The model segments with 75% pixel accuracy and 60% mean Intersection over Union (mIoU).
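
The two technical ideas mentioned above, atrous convolution and the accuracy/mIoU scores, can be illustrated with a minimal NumPy sketch. This is not the DeepLabv3+ implementation or the evaluation code used in the paper. The function names, the 1-D setting, the dilation rates, and the choice to average IoU only over classes that occur in the data are all assumptions made for illustration.

```python
import numpy as np


def effective_kernel_size(k, rate):
    """Field of view of a k-tap filter with dilation `rate`: (k - 1) * rate + 1."""
    return (k - 1) * rate + 1


def atrous_conv1d(signal, kernel, rate):
    """'Valid' 1-D atrous (dilated) cross-correlation.

    The kernel taps are spaced `rate` samples apart, so the filter sees a
    wider context without gaining any extra weights.
    """
    signal = np.asarray(signal, dtype=float)
    kernel = np.asarray(kernel, dtype=float)
    span = effective_kernel_size(len(kernel), rate)
    n_out = len(signal) - span + 1
    if n_out <= 0:
        raise ValueError("signal is shorter than the dilated kernel")
    return np.array(
        [np.dot(signal[i:i + span:rate], kernel) for i in range(n_out)]
    )


def segmentation_scores(y_true, y_pred, num_classes=4):
    """Return (pixel accuracy, mean IoU) for integer label maps.

    Assumption: mean IoU is averaged over classes that appear in either
    the ground truth or the prediction.
    """
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    # Confusion matrix: rows are true classes, columns are predicted classes.
    cm = np.bincount(
        num_classes * y_true + y_pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    tp = np.diag(cm)
    union = cm.sum(axis=0) + cm.sum(axis=1) - tp
    present = union > 0
    accuracy = tp.sum() / cm.sum()
    mean_iou = (tp[present] / union[present]).mean()
    return accuracy, mean_iou


if __name__ == "__main__":
    # A 3-tap filter with rate 6 covers 13 samples instead of 3.
    print(effective_kernel_size(3, 6))
    # Toy labels for the four classes (0 = highly informal ... 3 = highly formal).
    acc, miou = segmentation_scores([0, 0, 1, 1, 2, 3], [0, 1, 1, 1, 2, 2])
    print(f"accuracy={acc:.3f}, mIoU={miou:.3f}")
```

On the toy labels, pixel accuracy comes out higher than mIoU because a class that is never predicted correctly contributes an IoU of zero to the mean. The same effect is one plausible reason a model can report 75% accuracy alongside 60% mIoU.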


Remote Sensing, Satellite Image, Building classification, Urban Environment, Deep Learning, Semantic Segmentation, Urban Planning, Socio-economic situation, Poverty Estimation.

Volume 11, Number 1