PARAMETER OPTIMIZATION FOR A THERMAL SIMULATION OF AN URBAN AREA

: Urban heat island (UHI) phenomenon is a significant challenge in urban planning, and accurate temperature predictions are crucial for effective decision-making. The choice of material parameters is crucial to simulate a realistic temperature distribution and identify potential UHIs. This paper introduces a framework for optimizing the material properties based on sensors boxes placed upon surfaces made up by different materials. The methodology covers an optimization approach for the material properties to achieve accurate surface temperature simulation. The results, which involved close-range validation and macro-scale validation, show a significant improvement in the agreement between the simulated and measured temperature time series, especially for tiled roofs and asphalt roads. However, the accuracy for grassland areas decreased, possibly due to differences in soil moisture. Overall, the proposed framework shows promising results for future work in improving the accuracy of thermal simulation of urban areas.


INTRODUCTION
Digital twins are increasingly being applied in order to obtain quantitative output from existing and not yet existing entities, thus providing valuable help for a very broad spectrum of applications.Probably, Google maps can be considered the most famous digital twin in the world, since it helps to assess the arrival time.However, especially in recent years, due to the additional impulse generated by the Covid 19 pandemic towards the digitalization, the concept of digital twin has been introduced and considered as promising to portfolio managers (Miskinis, 2023), cardio-surgeons (Coorey et al., 2022), and of course, urban planers (Lehtola et al., 2022).We are most interested in supporting the latter group, since the challenges of everincreasing urbanization, global warming, and sustainability are omni-present and expected to gain importance in the decades to come.The term of Urban Heat Island (UHI), appearing more and more often in this context, combines all these challenges.Since UHIs also have negative repercussions on different areas of everyday life (quality of life; energy consumption; productivity; biodiversity, and others), they deserve being treated on the multi-disciplinary level by urban planners, meteorologists, architects, and scientists.
In order to identify and to track UHIs in current and future cityscapes and urban designs, a digital twin is supposed to produce accurate -in spatial, temporal, and thermal resolution -output for the urban temperatures.Compared to the two-dimensional Google Maps, the particularity of the digital twin in the context of UHI prevention is the requirement for a four dimensional entity.Evolution in altitude is not only important for computation of shadows from buildings and trees, but also for tracing of wind direction and speed, as is done in the case of Computational Fluid Dynamics (CFD) simulations like Palm-4U (Maronga et al., 2020).Time is important because of temporal shadows, when some materials heat up quicker or cool down longer than other or conduct heat more easily than others.
However, the question is how to retrieve the material parameters for large scenes.In many previous works (Kottler et al., 2019, Bulatov et al., 2020, Guo et al., 2018, Aguerre et al., 2019) , material parameters are usually manually or automatically assigned based on literature.This approach is sufficient given a certain generalization: materials need to be globally generalized and no differentiation between local conditions, such as building materials or building techniques, considered.The wide range of urban materials is therefore not covered, to the disadvantage of simulation accuracy.Furthermore, this oversimplification comes with an additional challenge: the material parameter choice from literature itself.For example, assuming general metal roofs for simplification, actual roofing materials might be, e.g., aluminium, zirconium, or copper.The densities of copper varies from 2712 kg/m 3 up to 8940 kg/m 3 (Bulatov et al., 2020).In the long term, metallic roofs depend in their constitution more on atmosphere and environment than other materials (corrosion).Further materials, such as asphalt, concrete, or brick, are being used in many different compositions of their components depending on locality, operation purpose, producer, and other factors, which increase their physical properties' range.In summary, sufficiently accurate choice of material properties from literature is a demanding task.It can be assumed, however, that the construction of the city took place during a few distinctive epochs.Then, calibrating the material parameters using one to three close-range datasets with measurements from each of these epochs could already improve the simulation accuracy significantly.
We wish to establish the mathematical framework for computation of the unknown material parameters aiming a better adaption to the actual local conditions of the targeted area to be simulated.To do this, we will establish the correspondence between the measured and simulated temperatures.The unknown parameters are coded in a five-to-seven-dimensional vector and the surface temperature is the function of these parameters.The starting values are taken from (Bulatov et al., 2020) and literature sources mentioned therein.The error minimization takes place using the trust region method.The thus developed framework is clearly based on close range validation, and we will eventually discuss some quantitative and qualitative results on the large scale simulation of the urban area under consideration.

PREVIOUS WORKS
The review paper of (Weng, 2009) represents a good starting point for a literature review as it acts as contradistinction of two main tools of assessing urban thermal behavior, namely in-situ measurements and remote sensing methodologies.Knowledge of urban surface heat balance is fundamental to understanding of UHIs and urban thermal behavior.To quantify the urban surface heat balance, in-situ measurements (largely tower-based) can be obtained.A fundamental problem with the in-situ measurements of heat fluxes lies in the great difficulty of selecting a representative site for a larger surrounding region, due to the complexity of urban material composition (Weng, 2009).The authors of (Grimmond and Souch, 1994) outlined the method to extract surface cover information for the Multi-City Urban Hydrometeorological Database (MUHD).Still, it is challenging to investigate the detailed spatial pattern of heat fluxes in a city, if cost, time, instrument and data calibrations are considered all together.One way out is provided by simulations from the airborne remote sensing data, as was proposed e.g. in (Bulatov et al., 2020), and allowing to cover larger scenes.Unfortunately, one cannot do this without certain generalization of materials, which brings about the difficulty to assign the source to the deviation between the measured and simulation temperatures (Strauß and Bulatov, 2022).The deviation are brought about by a coarse error of the thermal remote sensing, which are commonplace in urban environments (Piringer et al., 2002).Surface parameters can be estimated incorrectly due to land cover misclassification.Finally, sometimes the model does not consider a certain natural phenomenon or there is an anomaly (such as a recently irrigated lawn).Therefore, it is an ongoing efforts to utilize quantitative surface descriptors for assessing the interplay between urban material fabric and urban thermal behavior.
In addition, these considerations have been made previously for 2D only (Zhibin et al., 2015).However, with a increasingly available computational resources, in particular, graphical engines, simulation of 3D models is becoming increasingly popular.The heat equation is extended by more complex terms which are characteristic for the meshes: convection heat flux, three-dimensional conduction and solar radiation that can be computed via occlusion analysis.Important contributions for real-time-oriented evolution of the surface temperature distribution have been made by (Peet, 2019, Kottler et al., 2019, Bulatov et al., 2020, Xiong et al., 2016, Guo et al., 2018).Other approaches are based on combined heat and computational fluid dynamics numerical simulation methods, which means that surface temperatures are computed together with air temperatures and airflow on a volumetric grid, and have the advantage that UHIs can be retrieved with a higher confidence.Due to being highly computational, only small and restricted areas on the macro-level can be simulated (Ashie andKono, 2011, Huo andChen, 2022).
Remote sensing can help define and estimate parameters for urban surface characteristics and apply them to urban atmospheric models (Voogt and Oke, 2003).The critical parameters for describing the heat balance and land surface temperature are surface characteristics such as albedo, surface roughness, soil thermal inertia, and soil moisture.A mathematical framework for parameter estimation, presented by (Bartos and Stein, 2015), minimizes the difference between predicted and measured temperature time series.On contrary, (Yang et al., 2020) aims to search for the relevant information in the internet, using python interfaces.Yet, other authors have used commercially available software, such as the ENVI-met model, which was first published in 1998 by (Bruse and Fleer, 1998) and further developed since then (Ouyang et al., 2022).These authors rely on the proposed material parameters or perform in-situ measurements to get the needed material properties.
Based on the literature review, it can concluded that there is a need for more accurate methods to estimate surface parameters from remote sensing and an understanding of the interplay between urban materials and thermal behavior.

METHODOLOGY
In this section, details about the simulation, optimization method, and implementation details are covered in three subsections (3.1 to 3.3, respectively.

Simulation
In the context of this work, the digital twin of an urban area is considered to comprise its 3D reconstruction, given as a triangular mesh, semantic class labels per triangle, environmental data including air temperature and wind speed amongst others, and the triangle-wise simulated surface temperature.To achieve this digital twin, we formally follow the procedure proposed in (Bulatov et al., 2020) from raw sensor data to the digital twin defined as above.Therein, a classification map is generated at first from raw sensor data involving multispectral imagery and LiDAR data, with classes building, terrain, and tree region.
Second, class refinement is performed yielding roof, ground and tree types.Thereon based, surface tesselation is carried out yielding a 3D mesh with class labels per triangle, cp. Figure 1.These class labels are further needed to allow the physics-based simulation of surface temperatures, which will be outlined in the following.
Figure 1.Cut-out of an semantic 3D city mesh created from raw sensor data following (Bulatov et al., 2020).Colors indicate the triangle-wise numeric semantic class label which is associated with its material.
Generally, the temporal and spatial distribution of temperature is determined by the so-called heat equation.In (Kottler et al., 2019), the authors employ a finite volume approach to analyze a 3D triangular mesh.They achieve this by virtually extending each triangle into a series of prismatic layers, with a relative depth denoted as d.Among these layers, the outermost one is referred to as the surface layer, while the rest are called the inner layers.Then, the heat equation is given by for each layer with material properties ρ (density) and cV (specific heat capacity), and heat fluxes I (conduction), A (convection), R (radiative heat exchange with the sky) and S (solar heating).Since convection and heat radiation only contribute at the surface, it is A = R = S = 0 for all sub-surface layers, and at the surface with its temperature Ts = T0, temperature T1 of the first inner layer, the convective heat transfer coefficient h, the Stefan-Boltzmann constant σ, the thermal emissivity ε, exposure factors to the sky γ sky and to the sun γsun, the solar abledo a, total sky temperature T total , solar irradiance at ground Esun, the surface triangles' orientation φs relative to incoming sunlight, and the heat conductivity κ.The convection parameter h is given by h = h1 + h2v wind with constant (for a given material) parameters h1,2.For the inner layers, denoted by i ∈ [1, .., δ − 1], I is expanded to whereby T δ is constant, following the Dirichlet condition.The lateral heat conduction for each layer u ∈ {0, 1, . . ., δ −1} with the number of layers is given by with the area As of the surface triangle, its neighboring triangles Nj, length L jj ′ of the edge between the triangle at its respective neighbor, and the corresponding temperature gradient (∇T ) jj ′ along the edge normal n jj ′ .For details, we refer to references (Kottler et al., 2019, Bulatov et al., 2020) We store the material-dependent parameters into a vector x, where C denotes the unique semantic class label given by the previously introduced procedure of the 3D mesh generation.Since C is assigned to each triangle, the collective structure of the 3D mesh with material properties x is referred to as the parameter mesh M (x).
Another important macro-factor influencing the temperatures of the surface is the environment.There are several environmental variables explicitly (T air ) or implicitly mentioned in (2) to (5).To the latter, we can count, for example, the wind speed, important for computation of h in (2), or the time at the location in question, which allows computation of Esun and ϕs in (4).Usually, this parameters are provided by weather servers or, in the case of close-range simulation, by some measurement equipment.

Optimization
At the beginning of the optimization procedure, an initial value is chosen for the parameter vector x to be optimized.As explained in the previous section, the parameter mesh object M is created with the current parameter vector.The environment object E contains the important parameters such as air temperature, wind speed, relative humidity, cloud coverage as well as the time and location of the simulation.Due to potential mismatches between the time points of the measured data and the simulation, missing intermediate values are linearly interpolated.Then, M and E are passed to the simulation object T to perform the simulation of the surface temperature over the training period according to Section 3.1.For the rest of this paper, we will use the notation T (x) = T (M (x), E).The simulated surface temperatures are compared to the measured values to assess the goodness of the current parameters.In our case, the objective function that is minimized is the mean square error between simulated and measured surface temperatures: where Tn, Tn are the measured and the simulated surface temperatures at the n-th time point, respectively, and N is the number of time points.
For parameter optimization, the trust-region method (Byrd et al., 1999) is used.It is an iterative method based on interior point techniques and expects as input the objective function depending on the unknown parameter vector.
The optimization problem is bounded by constraints in the form of lower and upper bounds.These bounds are given by the underlying physics of the assumed materials and their empirical values from the literature.For each xi holds: where li and ui are the lower and upper bounds for the i-th element of x from (8).This ensures that the optimized parameter values are within realistic and reasonable bounds.

Implementation details
While placing temperature boxes to retrieve T , we always paid attention to choose sunlit locations in the middle of the surface occupied by a certain material.This approach eliminates the need for the occlusion analysis step from (4), and the lateral conduction term from ( 7) is unnecessary as well.As a consequence, we reduce the whole workflow to one triangle per measurement unity.Since the weather conditions are the same for each iteration step, the environment E is created once for all before the optimization.All this allows for a faster simulation since only the parameter mesh M (x) is recreated with the corresponding parameters at each iteration.
The upper and lower bounds of these parameters were obtained from extensive literature research conducted by (Bulatov et al., 2020).One additional parameter, which indeed should be chosen with care is the number of layers, referred to as depth discretization δ, i.e. spatial step size, while time discretization t refers to the time step used in the simulation, which is inversely proportional to the number of measurements N .A higher value of δ can produce more accurate results, but there is a threshold value beyond which increasing it does not significantly improve the results.Contrarily, a lower value of δ requires a higher value of N to prevent the simulation from getting unstable, resulting in a higher computational cost.Unfortunately, the optimal choice of δ depends strongly on material being simulated.We thus determined appropriate value of δ using exhaustive search with exponentially varying ranges.
Another important contribution concerns the metal roofs.Metal has a high thermal conductivity, which results in a strong coupling with the core temperature.This would not make much sense; Rather the model from Section 3.1 should be changed to take into account the specifics of these roofs.We now allow for a very low thermal conductivity inside the insulation and roof mix, while retaining the surface properties of metal.This is where the advantage of the layer model is obvious: we take the metal properties of the exterior layer and the values for conductivity κ, density ρ and specific heat capacity cv for tile in all other layers.
To improve the performance of Box4 on a grass area, it has been recognized that the parameters used in equation ( 8) do not yield satisfactory results.This is because grass, being a living organism, possesses different thermal properties compared to non-living materials such as stone or metal.For instance, grass can absorb water from the ground and undergo transpiration.
In light of this, the heat transfer coefficient within the convection term is set to h = h1 + h2v wind .Therefore, the parameter vector x is now extended by h1andh2 while for other boxes, not measuring wind speed, empirical values of h have been used.

RESULTS
In this section, the results of the parameter optimization for the thermal simulation of an urban area are reported.Five boxes with located on the surface of different roof materials and colors are used to measure temperatures and compute them with the simulated The boxes are described in Section 4.1 and our main findings on close range validation of our method in Section 4.2.In Section 4.3 our macro-scale validation will prove that the assumption of similar materials within a considerable urban area thus confirming the sensibility of parameter optimization.

Experimental setup
The temperature boxes considered in this work and visualized in Figure 2 have an approximate dimension of 15×7.5×7.4 cm for length, width, and height, respectively, and consist of the following five Arduino modules: The DS1307 real-time clock, veml7700 light sensor, si7021 air temperature and humidity sensor, ds18b20 temperature sensor and the sen0170 wind speed sensor next to a SD card reader.The light sensor was included to determine if the box was shaded at a given time (e.g.cloud coverage or shaded by a tree or a building).The power is provided by a power bank.The Arduino can be programmed to specify the time interval in which the sensor data should be read.Larger time intervals allow for a longer overall recording.
The box is sealed so that no rain or moisture can destroy the electronics.After connecting the power bank to the Arduino, the device runs a routine and beginns recording data.This involves initializing the sensors, resetting the clock, and creating a text file on the SD card where the data will be saved.The recording continues until the power bank is depleted.All sensor readings are stored on the SD card.Compared to the box shown Box 0 The temperature sensor was attached to white/grey-ish paving Box 1 Placed on an orange/red tile roof.
Box 2 Placed on a metal garage roof (silver-ish colour).
Box 3 Placed on a metal shed roof (silver-ish colour).
Box 4 Placed on grass Table 1.Boxes placed upon different materials in the Figure 2, the boxes placed in the field were painted white to reduce the risk of overheating when placed in the sun.
The sensors record time, air temperature, solar exposure, humidity, and wind speed in 2.5-minute intervals over a three-day period.The experiments took place in a suburb of the Australian city of Perth in December 2021.This was a dry summer day, on which issue of overheating has been very acute.
In Table 1, an overview of the installation location and material of the five measurement boxes together with the underlying surface materials is presented.In order to avoid overfitting, certainly a danger in close-range validation scenarios, the measured values are divided into two parts.The measurements from the first two days are used to optimize the material parameters while the last day will be our validation set.

Close Range Validation using in-field measurements
We utilized an existing implementation of the interior-point algorithm, namely, the MATLAB function of fmincon, and set the tolerance of the first-order optimality to ε = 10 −6 .The optimization stopped for Box0 and Box4 after 30 iterations and for Box4 after 20 iterations.In view of the high computing time, the optimization for Box2 and Box3 was terminated after 15 steps.The computing time needed for optimization to complete varies depending on the material being used.Box0, positioned on the pavement, and Box1, positioned on a tile roof, both took around 15 minutes to complete, but Box1 was faster because it went through 30 iterations while Box0 only had 20.
While these boxes were relatively fast, the metal roof boxes Box2 and Box3 posed a problem as they took multiple hours (around 12h) to complete 15 iterations.The fourth box, placed on grass, was faster than the metal roof boxes but not as fast as Box0 and Box1, with optimization using 20 iterations completed in about 25 minutes.One factor that influenced the computing time was the time discretization.For stability reasons, the higher the depth discretization, the smaller the time increments had to be chosen.
The Table 2 shows an overview of the parameters for five different boxes (Box0 to Box4) as well as the function values at the beginning and at the end of the simulation.Encouraging decays of the square root of the objective function f for the training data have been achieved, ranging from 2.2K in Box0 to 16.7K in Box2.We can see that the temperatures from the first two boxes have been modeled quite well before optimization.The proves the soundness of the simulation framework for these two classes and that the parameter values are rather close to default.For metallic roofs, this was not the case.On the contrary, the model had to be revised in order to adjust the output to the measurements.After this modifications, Box2 and Box3, which previously showed the largest deviations, achieve temperature deviation close to those of the other boxes after parameter optimization.Currently Box4, which was placed upon grass, has the largest deviation from T ; this could be because some the model is incomplete since it does not include latent heat fluxes.This problem was addressed by optimizing the convective heat parameters additionally.However, this approach led to non-realistic parameters.The difference between training and testing measurements has been slightly larger in Box1 than in Box0.Since the temperatures are measured in K during optimization, these differences are, however, have a low impact on the results.It can also be observed in Figure 3 that the measured temperatures on the third day exhibit more fluctuations (for all boxes) which may be due to the dying battery.
The left column of Figure 3 shows the non-optimized temperature characteristic curve of the simulation with the parameter set from the database in red and the measured temperature time series from the measurement boxes in black.The optimized temperatures are shown in green on the right, while the grey boxes show the measurements not used for computation of x.
The temperature curves for Box0 and Box1 exhibit qualitative similarities, with the exception of a drop in temperature just before sunrise, possibly due to morning dew, which is not yet taken into account in the model sufficiently.In contrast, the old model yields poor results for Box2 and Box3 (metal roof), while the new model's optimized daytime temperatures well.However, the model does not reflect well the distinctive temperature variation in the second half of the second night, indicating unaccounted environmental factors, such as clouds or dust, which have an insulating effect and are not captured by sensors.A possible solution is to incorporate additional weather data, such as cloud radar, which can detect cloud cover even at night.
For the old model of Box4, i.e., without the optimization of the convection parameters, the optimized temperature curve roughly follows the measured data in the morning.However, the temperature exceeds the expected values in the afternoon, and the cooling during the night takes longer.One possible explanation for this deviation is that grass, being a living organism, has the ability to draw water from the ground and undergo transpiration, which allows it to cool itself.2. Overview of the parameters, considering the initial value x0, the lower bound case l, the upper bound case u, and the optimized case x.The measure units for the parameters are: Density ρ in kg/m 3 ; Specific heat capacity cv in J/kg/K; Conductivity κ in W/m/K; Emissivity ϵ ∈ [0; 1]; Albedo a ∈ [0; 1]; The square root of the objective function f (9) in K.For optimized boxes two values for training and validation cases are shown.In the grey cells, we specified the number of layers δ and time discretization t in seconds, which were determined empirically.

Validation using remote sensed thermal images (city scale)
The simulation has been carried out twice for a district in the City of Melville covering an area of 2 × 1 km 2 .The district data was provided by the City of Melville and was previously discussed in the publication by Bulatov et al. (2020) [reference: (Bulatov et al., 2020)].The local conditions in terms of weather, construction, and materials in this district are quite similar to the location where the measurement boxes were utilized for optimization.The first run of the simulation was carried out with the original material parameters while the second run involved the newly optimized values.Figure 4 shows a part of the 3D city model with the color-coded temperature differences per triangle which results from the simulation with the original and optimized material parameters.Note that only three classes had been optimized in their physical values, i.e. tiled roofs, grass, and street, while the full simulated thermal digital twin contains eight semantic classes.In particular, we excluded the metal roofs from the validation because of three reasons: first, they exhibit high variety of color, and with it, albedo values.Second, their inner model, rethought and revisited in Section 3.3, may not be applicable to the multitude of roofs.In order to estimate the most suitable inner model, some statistics over the Figure 3. Temperature differences in the 3D thermal digital twin between the simulation results based on the original and optimized material parameters.In the grey rectangles we show that measurements that were not used for computation of the optimized parameters x and where the simulated values of T were computed using x and plotted into the graph.
region of interest would have been required.Third, the thermal signatures for metal roofs are falsified because of multiple reflections (Burkard et al., 2020).We compare the surface temperatures resulting from original and optimized material parameters against an aerial thermal image of the area, which was recorded at around 8pm on a summer night several years ago.Thus, the environmental conditions in the simulation had to be chosen to be comparable to the conditions of the thermal image, and not identical to the data from the measurement boxes.We validated the root-mean-square error (RMSE) and mean average deviation (MAD) for those classes on which we performed the material parameter optimization.
Table 3 summarizes our findings and shows the positive effect of parameter optimization.Tiled roofs and streets profit clearly from our proposed parameter optimization framework.Yet for grass areas, both error measures degrade.This might be caused due to differences in soil moisture.The applied simulation framework is not yet able to take into account the soil moisture variations.Hence, differences between the environmental conditions at thermal image and measurement box recordings might yield another decrease in simulation precision.However, given the limitation of measurement box recordings and the improvement in the areas of roofs and streets, the proposed optimization framework is proved in its principle and represents a promising tool to be applied to further datasets.

CONCLUSION
We presented a mathematical framework for recovering the unknown material parameters using the thermal simulation, whose output was compared to the temperature measurements.In total, very different materials were investigated and the temperature curves after parameter optimization fit the measurements much better for all materials, which demonstrates the soundness of the proposed approach.It is worth to note that for most materials with exception of the metallic surfaces, also the initial parameters do match quite well with the simulation output, which implies that the assumptions made in our simulation are, for the most part, justified.Metallic surfaces represented a challenge since the model assumptions of constant material along the layer does not hold.To cope with this, we adjusted the range of the inner model parameter to address the lower thermal conductivity inside the insulation and roof mix and account for insulating air layers, leading to improved results.
For the large scale comparison, we used the thermal image recorded several years ago, but to our satisfaction, the result for tiled roofs and roads has slightly improved with respect to the default parameters.For grass, this was not the case.Probably, the soil moisture and evapotranspiration of natural surfaces and vegetation must be taken into account in the future.
It has still to be verified if the simplified simulator model can be used for identification of Urban Heat Islands where the simulation may have to be carried out for many consecutive days.
In this regard, we consider the most encouraging finding of our contribution that once calibration of parameters is carried out for a couple of days, the accuracy of simulation of previously unknown data remains quite high.

Figure 2 .
Figure 2. Overview of the components of the temperature box

Figure 4 .
Figure 4. Temperature differences in the 3D thermal digital twin between the simulation results based on the original and optimized material parameters. Table