Learning to Count Buildings in Diverse Aerial Scenes
Counting buildings in aerial scenes is an important yet challenging
task. We propose to learn the relationship between building counts and
low-level features and to infer building counts directly from those
features. Although deep learning based approaches show promising
performance on object segmentation, our method does not require
expensive training and can deal with images where individual objects
are difficult to separate. The main contributions of this study are
described as follows.
Learning from map data
Building footprints from GIS maps combined with images provide the data
that can be used to learn the relationship between building counts and
image features. However, images and maps are very often poorly aligned.
We perform a cross-correlation between building footprints and image
gradients, which greatly reduces misalignments. The figure below shows
the alignments before (left) and after (right) correction.
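
The alignment step can be illustrated with a minimal sketch. It assumes that the misalignment is a pure translation and that the footprint outlines and the image gradient magnitude are available as 2-D arrays of the same size. The function name `estimate_shift` and the FFT-based circular correlation are illustrative choices, not necessarily what the study implements.

```python
import numpy as np

def estimate_shift(footprint_edges, grad_mag):
    """Return the (dy, dx) translation that best aligns the rasterized
    footprint outlines with the image gradient magnitude.

    Assumption: both inputs are 2-D float arrays of identical shape and
    the misalignment is a pure translation.
    """
    a = grad_mag - grad_mag.mean()
    b = footprint_edges - footprint_edges.mean()
    # Circular cross-correlation via FFT: corr[s] = sum_p a[p + s] * b[p]
    corr = np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))).real
    peak = np.unravel_index(np.argmax(corr), corr.shape)
    # Indices past the midpoint correspond to negative shifts
    return tuple(int(p) if p <= n // 2 else int(p) - n
                 for p, n in zip(peak, corr.shape))
```

The returned shift could then be applied to the footprint raster, for example with `np.roll(footprint_edges, shift, axis=(0, 1))`, before counts are paired with image features.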
Straight line extraction
We utilize straight line segments to estimate building numbers, because
a major characteristic of buildings from an aerial view is straight
edges. We follow the line support region framework proposed by Burns
et al. (1986), which identifies line support regions as spatially contiguous
pixels with consistent gradient orientations, and estimates line
parameters (orientation, centroid, and length) based on regions.
Previous work estimates line parameters based on boundary shapes of
line support regions. However, region boundaries do not always reflect
the actual orientations and locations of lines. We determine line
orientations from structure tensors and locate lines with the Hough
transform. This method utilizes the gradients of all pixels in a region
and thus generates more reliable results. In the figure below, the left
panel shows the result of the Ünsalan and Boyer method (2004) and the
right panel shows our result.
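
A rough sketch of how the orientation and location of one line might be estimated from a single line support region follows. It assumes the gradient components `gx` and `gy` and a boolean region mask are already available. The function name `fit_line`, the `rho_bin` parameter, and the restriction of the Hough vote to the one orientation given by the structure tensor are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def fit_line(gx, gy, mask, rho_bin=0.5):
    """Estimate (theta, rho) for the line in one support region.

    theta is the angle of the line normal, and the line satisfies
    x * cos(theta) + y * sin(theta) = rho. The normal is only defined
    up to sign, so (theta, rho) and (theta + pi, -rho) are equivalent.
    """
    ys, xs = np.nonzero(mask)
    wx, wy = gx[ys, xs], gy[ys, xs]
    # Structure tensor accumulated over every pixel in the region
    J = np.array([[np.sum(wx * wx), np.sum(wx * wy)],
                  [np.sum(wx * wy), np.sum(wy * wy)]])
    # eigh sorts eigenvalues in ascending order, so the last eigenvector
    # is the dominant gradient direction, i.e. the line normal
    _, evecs = np.linalg.eigh(J)
    nx, ny = evecs[:, -1]
    theta = np.arctan2(ny, nx)
    # Hough-style vote over rho only, weighted by gradient magnitude
    rho = xs * nx + ys * ny
    bins = np.round(rho / rho_bin).astype(int)
    offset = bins.min()
    votes = np.bincount(bins - offset, weights=np.hypot(wx, wy))
    return theta, (np.argmax(votes) + offset) * rho_bin
```

Because every pixel in the region contributes to both the tensor and the vote, a ragged region boundary has little influence on the estimate, which is the property the paragraph above relies on.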
Line-building relationship
We collect a large number of image tiles and corresponding building
counts. We find that for similar buildings there is a strong linear
relationship between line numbers and building numbers. Here are a few
examples.
This observation leads to a simple approach for counting buildings with
similar appearances. We estimate a linear regression model between line
numbers and building numbers from a few examples, then feed the total
number of extracted lines to the model to obtain the total number of
buildings. Below is an example of counting shelters in a refugee camp
(within the blue polygon), where the line and building numbers in the
two red windows are used to estimate the regression model.
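
The regression step can be sketched in a few lines. The function names and all numbers below are hypothetical and chosen only to show the mechanics; they are not measurements from the refugee camp example. Fitting an intercept is also an assumption, since the text only says "linear regression model".

```python
import numpy as np

def fit_count_model(line_counts, building_counts):
    """Least-squares fit of building_count = slope * line_count + intercept."""
    slope, intercept = np.polyfit(line_counts, building_counts, deg=1)
    return slope, intercept

def predict_buildings(total_lines, slope, intercept):
    """Apply the fitted model to the line count of the whole region."""
    return slope * total_lines + intercept

# Hypothetical sample windows: (lines, buildings) counted by hand
window_lines = [40, 80]
window_buildings = [10, 20]
slope, intercept = fit_count_model(window_lines, window_buildings)

# Hypothetical total number of lines extracted inside the polygon
estimate = predict_buildings(400, slope, intercept)
```

With only two sample windows the fitted line passes through both points exactly, so more windows would be needed to average out counting noise.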
For the method dealing with scenes containing different types of
buildings, please see the following paper.
Jiangye Yuan and Anil Cheriyadat, Learning to count buildings in diverse
aerial scenes, ACM SIGSPATIAL GIS, 2014. [pdf]