LOSSLESS DATA COMPRESSION OF GRID-BASED DIGITAL ELEVATION MODELS: A PNG IMAGE FORMAT EVALUATION

ABSTRACT: At present, computers, lasers, radars, planes and satellite technologies make possible very fast and accurate topographic data acquisition for the production of maps. However, the problem of managing and manipulating this data efficiently remains. One particular type of map is the elevation map. When stored on a computer, it is often referred to as a Digital Elevation Model (DEM). A DEM is usually a square matrix of elevations. It is like an image, except that it contains a single channel of information (that is, elevation), and it can be compressed in a lossy or lossless manner by way of existing image compression protocols. Compression reduces memory requirements and increases the speed of transmission over digital links, while maintaining the integrity of the data as required. In this context, this paper investigates the effects of the PNG (Portable Network Graphics) lossless image compression protocol on floating-point elevation values for 16-bit DEMs of dissimilar terrain characteristics. The PNG is a robust, universally supported, extensible, lossless, general-purpose and patent-free image format. Tests demonstrate that the compression ratios and decompression run times achieved with the PNG lossless compression protocol can be comparable to, or better than, those of proprietary lossless JPEG variants, other image formats and available lossless compression algorithms.


INTRODUCTION
DEMs are generally used to describe the surface of the earth or planets, and virtual worlds in video games. They are produced in a number of ways, most of them by direct field measurements of elevations at specific locations using, for example, LIDAR (Light Detection and Ranging), photogrammetry or Interferometric Synthetic Aperture Radar (Fujisada et al., 2012).
DEMs represent a continuous surface and, depending on the application, they can be very dense, with grid point distances ranging from one metre to less than one hundred metres, thus incorporating large amounts of data (Poli and Soille, 2011). It should be pointed out that this work relates to rectangular, regularly sampled elevation datasets. Other DEM representations include irregularly sampled elevation points stored in a TIN (Triangulated Irregular Network) and the contour representation (Maune, 2007). Spatial data structures such as the quadtree (Samet, 1987) can also be used for constructing and/or representing a DEM.
A basic process operating on DEMs is data compression. Compressing a DEM reduces the storage space and facilitates faster access to the data as well as faster transfer. Image compression algorithms are relevant to DEM compression because they deal with comparable issues. Image compression algorithms target natural images, where the information loss brought by imperfect reconstruction is usually not a problem because it is human visual perception that is being targeted by the compression process. This is not the case for DEMs. In fact, users of DEMs are very reluctant to use altered datasets. Some of the issues are altered slope, altered hydrology and altered visibility.
In addition, grid-based DEMs are usually estimated from interpolated values. These estimates will always be affected by several sources of error (sampling, measurement, interpolation methods, etc.), and an inevitable disparity will occur between observations and DEM-reconstructed elevation values (Owen and Grigg, 2004). Traditionally, DEM errors are reported by summary statistics, using a single value such as the Root Mean Square Error (RMSE), which quantifies the average deviation between ground-based observations and DEM values at a set of control locations. These statistics are global measures of DEM accuracy and are not specific to any particular location. Therefore it is assumed that the error rates are similar everywhere, from the highest sub-areas to the flattest ones.
However, if the sampling data is evenly and densely distributed over the grid (as is the case for the DEMs considered in this work), these error measures may indeed be indicative of DEM accuracy. A complete evaluation of accuracy assessment of DEMs is beyond the scope of this work, which relates to the compression of the DEM once it has been generated or constructed according to specific accuracy requirements, applications and uses. For a comprehensive study of local and global probabilistic accuracy assessments of DEMs the reader is referred to Harvey (2008).
Because DEM data can be very dense, it requires a relatively large amount of memory for its storage. Costs for storage space may be considered as no longer crucial, since the price of storage media has decreased almost exponentially (at present <$50 per Terabyte). Nonetheless, system-dependent parameters such as bandwidth and CPU speed are more significant. Time is crucial for Internet and real-time applications. The largest bottleneck in modern computers is the transfer of data over the buses between memories (primary and secondary) and processors (central and graphic processors). Transferring large amounts of data using external communication (cables or wireless interfaces) is even more critical. Processors are sometimes so fast that transferring and decoding compressed data may take less time than transferring uncompressed data (Yea and Pearlman, 2004).
Very dense DEM data may contain numerous redundancies, which can be encoded efficiently with modern image compression techniques (Moshe et al., 2007). Hence, this paper evaluates the effects of the PNG (Portable Network Graphics) lossless compression protocol on floating-point elevation values for 16-bit DEMs of dissimilar terrain characteristics. Since DEMs can be displayed and processed using an image format, they can be read, exported and imported directly by a scientific application supporting that format and, therefore, can be processed in a lossy or lossless compressed form as required (Kidner and Smith, 2003).
The majority of the research in the compression of raster images for DEM representation has focused on lossy compression algorithms. Lossy compression schemes may only achieve modest compression before significant information is lost. Even if greater compression could be achieved, and if some loss of data is acceptable, there is still controversy over the role of lossy compression for particular applications (Isenburg et al., 2004). Indeed, if only mild levels of lossy compression are attained, then significantly improved lossless compression techniques might be more appropriate.
Furthermore, complex compression schemes are more costly to develop, implement and deploy, and the use of proprietary schemes (e.g. JPEG2000) may have a cost (and risk) associated with the end of life of equipment, especially for long-term archives and storage (Remondino, 2003). Proprietary compression schemes may also compromise interaction among different software packages. Hence, the use of popular consumer industry standards such as the PNG image format can reduce the cost and risk of using lossless image compression for efficient storage and representation of mass-generated DEMs.
An important consideration for the use of raster images for DEM representation and compression is the floating-point nature of the height values. These values are usually converted to integers, thus requiring image formats that can handle sufficiently large bit-depths to represent vertical data with sufficient resolution and accuracy. This work evaluates the performance of traditional and state-of-the-art lossless compression techniques for 16-bit greyscale images representing DEMs. Emphasis is placed on those techniques that have been adopted or proposed as international standards, and particular attention is directed to the performance of the popular PNG lossless compression protocol. This protocol is fully lossless, and since it supports up to 48-bit truecolour or 16-bit greyscale values (65,536 grey shades) it is suitable for depicting terrain models with high efficacy, accuracy and resolution while preserving the full integrity of the elevation data it represents.
Additional features that make the PNG image format suitable for DEM representation and compression are presented in section 3. Tests with numerical examples and statistical measures that validate the use of the PNG image format are given in section 4. In comparison with other image file formats that support 16-bit coding reported in the literature, including tests carried out with the popular TIFF (with LZW compression) standard and with the proprietary compression schemes of JPEG variants such as JPEG2000 and JPEG-LS, the PNG compression protocol achieves comparable and consistent results at a similar or lower computational cost.

RELATED WORK
A limited number of papers have examined lossless DEM compression from a digital image processing perspective. Indeed, very few image formats are suitable for representing a DEM as a raster image with sufficient accuracy and resolution. The majority of these papers deal with DEM compression by using variants of the JPEG compression algorithm. By way of example, Shantanu and Sapiro (2001) investigated the lossless compression of terrain images using the JPEG-LS standard, whereas Bjorke and Nilson (2002) and Oimoen (2005) detailed wavelet-based compression schemes for terrain images. These schemes support elevation query mechanisms while allowing the compression and/or decompression of specific terrain areas of interest within an elevation range.
Owen and Grigg (2004) demonstrated the use of JPEG2000 for compressing and querying DEMs, and provided comparisons with compression utilities such as winzip and winrar. Moshe and Shamir (2007) presented an image compression terrain generalisation algorithm based on Discrete Cosine Transforms (DCT) and Discrete Wavelet Transforms (DWT) that were specifically adjusted to fit DEMs. Bjørke and Nilsen (2003) carried out similar work using wavelet transforms as applied to the simplification and compression of digital terrain models. Alternatively, Hilbring (2004) used 16-bit PNG images in the integration of high-resolution DEMs into 3D GIS applications for environmental systems.
Other research papers have studied the compression of floating-point 3D data using image formats by focusing on maximising the compression ratio, as the decompression speed was not relevant. For example, Usevitch (2003) and Gamito and Dias (2004) proposed extensions and modifications to the JPEG2000 standard that allow floating-point data to be efficiently encoded with bit-plane coding algorithms. In these papers, the floating-point values are represented as "big integers", where decimal figures are converted to integers by multiplying a given height by a factor of 10.
The image compression protocol JPEG2000 can achieve both lossless and lossy compression (Taubman and Marcellin, 2004). Its compression gains over other image formats are attributed to the use of Discrete Wavelet Transforms (DWT) and a more sophisticated entropy encoding scheme. While the lossy compression option of JPEG2000 is superior to ordinary JPEG compression, it is recognised as unsuitable for terrain datasets, as it compromises subsequent data processing (Florinski, 2011).
The lesser-known JPEG variant (JPEG-LS) can provide an error bound on its output. Algorithms like JPEG-LS, which can provide an error guarantee, are called near-lossless (Russ, 2011). Unfortunately, these image formats are not yet supported by web browsers and are proprietary and license protected.
TIFF images have also been utilised for representing DEMs. The greatest strength of TIFF is that it can support the full range of image sizes, resolutions and colour depths. TIFF incorporates support for the LZW compression technique. Although the LZW technique is one of the most popular compression algorithms, its use may also be restricted due to proprietary limitations. The primary weakness of TIFF is the large file size, which slows overall performance and limits its use for storage and internet applications.
A modification of the TIFF format referred to as GeoTiff has also been implemented for DEM representation in GIS applications. GeoTiff DEMs are similar to TIFF images or graphics files except that, instead of colour pixel values, the file contains a grid of 16-bit or 32-bit elevation measurements.
However, GeoTiff DEMs cannot be read as a graphics file by image editing programs due to the metadata associated with them. They can only be read by specialised programs which are designed for their use (e.g. 3DEM, www.hangsim.com/3dem).

THE .PNG IMAGE FILE FORMAT
The PNG is a popular image format used for storing compressed raster images in a lossless manner. The compression engine is based on the Deflate method (Miano, 1999), which is a widely used, patent-free algorithm for universal, lossless data compression. The format is defined by the specifications outlined by the PNG Development Group. It is an International Standard published under the formal name ISO/IEC 15948. Apart from being a patent-free standard, the PNG format is also endorsed by the World Wide Web Consortium (W3C).
The compression works in a pipeline approach in which the image pixels are passed through a lossless arithmetic transformation named delta filtering, or simply filtering, and processed further as a (filtered) byte sequence (Roelofs, 1999). Filtering does not compress or otherwise reduce the size of the data, but it makes the data more compressible.
For instance, a sequence of bytes increasing homogeneously from 1 to 255 will compress either very poorly or not at all.But a minor modification of the sequence, that is, leaving the first byte alone but substituting each successive byte by the difference between it and its precursor converts the sequence into a highly compressible set of 255 equal bytes, each having the value 1.
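The byte-ramp example above can be reproduced with a few lines of Python. This is only a sketch: the helper names `delta_filter` and `delta_unfilter` are hypothetical, the difference is taken modulo 256, and `zlib` from the standard library stands in for the Deflate stage. It resembles, but is not a full implementation of, the filtering step of a PNG encoder.

```python
import zlib


def delta_filter(data: bytes) -> bytes:
    """Keep the first byte; replace every later byte by its difference
    from the preceding byte (modulo 256)."""
    if not data:
        return b""
    out = bytearray([data[0]])
    for prev, cur in zip(data, data[1:]):
        out.append((cur - prev) % 256)
    return bytes(out)


def delta_unfilter(filtered: bytes) -> bytes:
    """Invert delta_filter with a running sum modulo 256 (lossless)."""
    out = bytearray()
    total = 0
    for value in filtered:
        total = (total + value) % 256
        out.append(total)
    return bytes(out)


ramp = bytes(range(1, 256))      # the sequence 1, 2, ..., 255
filtered = delta_filter(ramp)    # 255 bytes, each with the value 1

size_raw = len(zlib.compress(ramp, 9))
size_filtered = len(zlib.compress(filtered, 9))
print("unfiltered:", len(ramp), "->", size_raw, "bytes")
print("filtered:  ", len(filtered), "->", size_filtered, "bytes")
```

The filter itself leaves the length unchanged at 255 bytes; the gain appears only once the filtered sequence is passed to the Deflate stage, and `delta_unfilter` restores the original ramp exactly.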
Apart from being an effective lossless compression process, the PNG format has many useful features such as alpha transparency and gamma correction. Often, gamma differences between platforms can make a DEM image appear darker or lighter. The PNG format stores the original gamma information to ensure that the image is displayed correctly in any gamma-aware environment in which it is viewed (Roelofs, 1999). This gamma correction feature solves the problems related to different DEM rendering methods for images with an inadequate balance between brightness and contrast. Also, a PNG image can be stored in interlaced order to allow progressive display. The purpose of this feature is to allow images to "fade in" when they are being displayed on-the-fly. Interlacing slightly expands the file size on average, but it gives the user a meaningful display faster. These characteristics make this image format ideal for storing and visualising DEM data and for web-based GIS applications.
Moreover, the PNG format has become very popular amongst graphic artists and web developers as most browsers and image editing/processing programs support it. A comprehensive description of the background, theory and additional applications of this image format is beyond the scope of this work and the reader is referred to Memon et al. (1997) and Russ (2011) for more detailed information.

TESTS
This section presents the results of using the PNG compression protocol for four DEM datasets of dissimilar terrain characteristics. The results are compared with the compression performance of JPEG2000, JPEG-LS and TIFF for 16-bit depth. Although one of the major (and often only) concerns in image coding techniques is that of compression efficiency, it is not the only comparison factor used here. Attention is also given to evaluating the complexity of the compression. In these tests, the complexity is evaluated in terms of the decompression run times as computed on an Intel PC with 2.83 GHz and 3.00 GB of RAM, under a Windows XP operating system.
Different and varied geographic features (i.e. ridges, peaks, valleys and water bodies) characterise the selected four DEMs. Figure 1 shows the elevation images and the corresponding aerial photography for each site investigated. Accordingly, the data sets considered are named respectively: ridges, peaks, valley and lake. This selection was based on the continuity of the terrain, their uniqueness in geo-morphological form and the terrain complexity as defined by the vertical variation in heights per unit area.
Having the largest variations and standard deviation, ridges and peaks are the most topographically complex. Valleys and bodies of water, on the other hand, are the gentlest; that is, their height differences are the smallest of the four sites. As the terrain datasets investigated have different characteristics and topography, it is reasonable to expect different compression ratios. It is also reasonable to expect that a compression algorithm that performs best on a certain dataset may not be the best choice for another.
The raw data for these tests originated from points scattered in 3D and stored in .csv (comma separated values) files as x, y, z non-uniformly spaced vectors. The average size of these .csv files (.zip compressed) was 30 Mb and the average number of x, y, z points was approximately 3.2 million. The DEMs were created by fitting/interpolating the data of the non-uniformly spaced x, y, z points to determine the height value (Z) that would exist at each intersection of a regularly spaced XY grid. The four DEM surfaces generated by the interpolation process always passed through the original data points and were created by a method referred to as triangulation with linear interpolation (Watson, 1992). The software used for this interpolation process was Matlab 7.1 from www.mathworks.com.
Triangulation with linear interpolation was selected because of the highly dense and even distribution of the xyz points of the data sets. This interpolation method is most effective when the source data is evenly distributed over the grid area. Also, the method does not extrapolate elevation values beyond those found in the source data.
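The linear step of this interpolation method can be illustrated for a single triangle. The sketch below is an assumption-laden illustration, not the Matlab routine used for the tests: the function name `interpolate_in_triangle` and the sample coordinates are hypothetical, and the construction of the triangulation itself is omitted. Heights are interpolated with barycentric weights, so the surface passes through the three vertices, and points outside the triangle are rejected rather than extrapolated.

```python
def interpolate_in_triangle(p, a, b, c):
    """Linearly interpolate the height at p = (x, y) inside the triangle
    with vertices a, b, c, each given as (x, y, z).

    Returns None when p lies outside the triangle (no extrapolation)
    or when the triangle is degenerate.
    """
    x, y = p
    (xa, ya, za), (xb, yb, zb), (xc, yc, zc) = a, b, c

    det = (yb - yc) * (xa - xc) + (xc - xb) * (ya - yc)
    if det == 0:
        return None  # the three vertices are collinear

    # Barycentric weights of p with respect to a, b and c.
    wa = ((yb - yc) * (x - xc) + (xc - xb) * (y - yc)) / det
    wb = ((yc - ya) * (x - xc) + (xa - xc) * (y - yc)) / det
    wc = 1.0 - wa - wb

    if min(wa, wb, wc) < 0:
        return None  # p is outside the triangle
    return wa * za + wb * zb + wc * zc


# Hypothetical scattered points (x, y, z) forming one triangle.
A = (0.0, 0.0, 100.0)
B = (2.0, 0.0, 104.0)
C = (0.0, 2.0, 110.0)

print(interpolate_in_triangle((0.0, 0.0), A, B, C))  # height at vertex A
print(interpolate_in_triangle((1.0, 1.0), A, B, C))  # midpoint of edge B-C
print(interpolate_in_triangle((3.0, 3.0), A, B, C))  # outside the triangle
```

Evaluating such a function at every node of a regular XY grid, using the triangle that contains each node, would yield a gridded surface of the kind described above.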
Relatively small RMSE values were obtained for the four data sets considered in this section, that is, +/-0.05 m, +/-0.06 m, +/-0.03 m and +/-0.04 m for ridges, peaks, valley and lake respectively. These figures were computed by comparing the elevations at each original xy position with those produced by the constructed DEMs at the same locations.
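For reference, the RMSE between observed and DEM-derived elevations can be computed as in the following sketch. The function name `rmse` and the sample values are hypothetical and serve only to illustrate the calculation; they are not data from the four test sites.

```python
import math


def rmse(observed, modelled):
    """Root Mean Square Error between two equally long sequences of
    elevations (same units, same control locations)."""
    if len(observed) != len(modelled) or len(observed) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(o - m) ** 2 for o, m in zip(observed, modelled)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical control points: surveyed heights and DEM heights in metres.
surveyed = [400.30, 401.10, 399.80, 402.00]
from_dem = [400.25, 401.16, 399.83, 401.96]

print(f"RMSE = {rmse(surveyed, from_dem):.3f} m")
```

Because the statistic is a single global value, it gives no indication of where in the grid the larger deviations occur.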
Each DEM covered an area of 4 km² with overall height differences ranging between 100 and 1200 metres above sea level. For the purpose of the compression process the height information in all data sets was rounded to the first decimal place for a DEM resolution of 1 m. Hence, regular grids (2 km x 2 km) of interpolated elevation points were created for each data set. The DEMs' floating-point data was converted to integers to ensure compatibility with image format standards. As conversion to integer by mere truncation results in a loss of information (i.e. the information content after the decimal place would be lost), the floating-point height values of the four DEMs were first multiplied by a factor of 10. For instance, an elevation value of 400.3 would be converted to 4003 (400.3 x 10 = 4003).
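The conversion can be sketched as follows. The names `height_to_pixel`, `pixel_to_height`, `SCALE` and `MAX_UINT16` are hypothetical, and the use of `round` (rather than truncation) and the range check against the 16-bit limit are assumptions added for safety, not steps stated in the text.

```python
SCALE = 10          # one decimal place is kept, as described in the text
MAX_UINT16 = 65535  # largest value of a 16-bit greyscale sample


def height_to_pixel(height_m: float) -> int:
    """Scale a height in metres to an integer sample for a 16-bit image."""
    value = round(height_m * SCALE)  # round() avoids floating-point truncation effects
    if not 0 <= value <= MAX_UINT16:
        raise ValueError(f"{height_m} m does not fit in 16 bits at scale {SCALE}")
    return value


def pixel_to_height(pixel: int) -> float:
    """Recover the height in metres from a stored sample."""
    return pixel / SCALE


sample = height_to_pixel(400.3)
print(sample, pixel_to_height(sample))
```

With a factor of 10, a 16-bit sample covers heights from 0 to 6553.5 m, which comfortably contains the 100 to 1200 m range of the test sites. With a factor of 100 the upper limit would drop to 655.35 m, so keeping two decimal places would require an offset or a larger bit-depth for these data sets.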
The DEMs were then transformed to the image formats selected for the tests (i.e. TIFF, JPEG2000, JPEG-LS and PNG), with each image format capable of handling a bit-depth of 16 bits/pixel. The software implementation used for importing, exporting and displaying the images was Matlab 7.1. Compression effectiveness was evaluated by comparing the size of the compressed output with the size of the raw pixel data (i.e. the compression ratio), whereas compression efficiency was measured in terms of decompression execution time only. By way of example, all the DEM images required 7.8 Mb of memory as bit-map images.
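To make the compression ratio concrete, the sketch below writes a 16-bit greyscale PNG using only the Python standard library and compares its size with that of the raw samples. It is a minimal illustration rather than the Matlab export used in the tests: the names `encode_png16` and `compression_ratio` are hypothetical, only filter type 0 (no filtering) is applied to each scanline, and the small synthetic grid is not one of the test DEMs.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def encode_png16(rows):
    """Encode a rectangular grid of integers (0..65535) as a 16-bit
    greyscale, non-interlaced PNG and return the file contents."""
    height = len(rows)
    width = len(rows[0])
    raw = bytearray()
    for row in rows:
        if len(row) != width:
            raise ValueError("all rows must have the same length")
        raw.append(0)                            # filter type 0: None
        raw += struct.pack(f">{width}H", *row)   # big-endian 16-bit samples
    # IHDR: width, height, bit depth 16, colour type 0 (greyscale),
    # compression 0, filter method 0, interlace 0.
    ihdr = struct.pack(">IIBBBBB", width, height, 16, 0, 0, 0, 0)
    idat = zlib.compress(bytes(raw), 9)          # maximum Deflate setting
    return (PNG_SIGNATURE + _chunk(b"IHDR", ihdr)
            + _chunk(b"IDAT", idat) + _chunk(b"IEND", b""))


def compression_ratio(rows, png_bytes):
    """Raw size (2 bytes per sample) divided by the PNG file size."""
    raw_size = len(rows) * len(rows[0]) * 2
    return raw_size / len(png_bytes)


# Synthetic 64 x 64 grids: a flat surface and a gentle slope (scaled heights).
flat = [[4003] * 64 for _ in range(64)]
slope = [[4000 + x + y for x in range(64)] for y in range(64)]

for name, grid in (("flat", flat), ("slope", slope)):
    png = encode_png16(grid)
    print(name, len(png), "bytes, ratio", round(compression_ratio(grid, png), 1))
```

A production encoder would normally also choose a filter type per scanline, which tends to help on sloping terrain; that refinement is left out here to keep the sketch short.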
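The compression ratio shown for the Peaks data set for the case of the PNG in Table 1 (i.e. 2.8) was obtained by dividing the 7.8 Mb bit-map size by the 2.78 Mb needed to store the same image in PNG format. For JPEG2000, the reversible DWT filter (Taubman and Marcellin, 2004) was used. In the case of JPEG-LS the default options were chosen (Shantanu and Sapiro, 2001). The LZW compression protocol was applied to the TIFF images whereas the maximum compression setting was considered for the PNG files.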
Table 2 shows the execution times, relative to PNG, for decompression. It shows that JPEG-LS, in addition to providing the best compression ratios, is close to the fastest algorithm and therefore apparently of low complexity. JPEG2000 is considerably more complex while PNG is close to JPEG-LS. It should be noted that while JPEG-LS and JPEG2000 are symmetrical (i.e. encoding and decoding times are similar), this is not the case for PNG, which is strongly asymmetrical in that the encoding time is longer than the decoding time (Miano, 1999).
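The asymmetry between encoding and decoding can be observed directly on the Deflate stage. The sketch below times `zlib.compress` at the maximum setting against `zlib.decompress` on a synthetic 16-bit surface; the function name `time_deflate` and the test surface are hypothetical, and the timings it prints depend on the machine and are not the figures of Table 2.

```python
import time
import zlib


def time_deflate(raw: bytes, level: int = 9):
    """Return (encode seconds, decode seconds, compression ratio)
    for one Deflate round trip over raw."""
    t0 = time.perf_counter()
    packed = zlib.compress(raw, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(packed)
    t2 = time.perf_counter()
    if restored != raw:
        raise RuntimeError("round trip altered the data")
    return t1 - t0, t2 - t1, len(raw) / len(packed)


# Hypothetical surface: 512 x 512 samples stored as big-endian 16-bit values.
samples = bytearray()
for y in range(512):
    for x in range(512):
        value = 4000 + (x * y) % 997
        samples += value.to_bytes(2, "big")

encode_s, decode_s, ratio = time_deflate(bytes(samples))
print(f"encode {encode_s:.4f} s, decode {decode_s:.4f} s, ratio {ratio:.2f}")
```

The round-trip check inside `time_deflate` also confirms that the decoded bytes are identical to the input, which is the property that makes the scheme lossless.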
One notable exception to the general trend is the valley image, which contains mostly patches of constant colour levels as well as gradients. For this type of image, PNG provides by far the best results. Another exception is lake, in which JPEG-LS and PNG achieve much larger compression ratios. The majority of the image contains a flat area representing a water body (see Figure 1-b).
The JPEG compression/reconstruction variants proved to be acceptable for the overall compression of all DEMs considered in these tests. However, these variants introduced spurious oscillations, or "ringing" artifacts, into the flat areas of all DEMs. An effective solution was to extract flat areas from the original data prior to compression. Flat areas may be detected and delineated by noting where the derived aspect is undefined, or where the local variance is zero.
These criteria were used to create a mask that, when intersected with the original DEM, provided a raster of flat areas only, with their elevations intact. This raster was then compressed without any further processing, as it was composed of a finite number of contiguous areas of constant elevation values. By retaining this data as a separate layer along with the compressed data, flat areas could be easily restored after the DEMs were reconstructed while retaining full floating-point accuracy.
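A minimal sketch of the zero-local-variance criterion is given below. The helper names `flat_mask`, `extract_flat` and `restore_flat`, the 3 x 3 neighbourhood and the small sample grid are all assumptions made for illustration; the text does not specify the window size or the software used for this step.

```python
def flat_mask(dem):
    """Mark cells whose 3 x 3 neighbourhood (clipped at the grid edges)
    holds a single elevation value, i.e. has zero local variance."""
    rows, cols = len(dem), len(dem[0])
    mask = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            centre = dem[r][c]
            mask[r][c] = all(
                dem[rr][cc] == centre
                for rr in range(max(r - 1, 0), min(r + 2, rows))
                for cc in range(max(c - 1, 0), min(c + 2, cols))
            )
    return mask


def extract_flat(dem, mask):
    """Keep elevations of flat cells only; other cells become None."""
    return [[v if m else None for v, m in zip(drow, mrow)]
            for drow, mrow in zip(dem, mask)]


def restore_flat(reconstructed, flat_layer):
    """Overwrite reconstructed cells with the stored flat elevations."""
    return [[f if f is not None else v for v, f in zip(rrow, frow)]
            for rrow, frow in zip(reconstructed, flat_layer)]


# Hypothetical grid: a water body (250.0 m) on the left, rising ground on the right.
dem = [
    [250.0, 250.0, 250.0, 251.2, 253.0],
    [250.0, 250.0, 250.0, 251.5, 253.4],
    [250.0, 250.0, 250.0, 251.9, 254.1],
    [250.0, 250.0, 250.0, 252.3, 254.8],
]
mask = flat_mask(dem)
layer = extract_flat(dem, mask)

# Simulated reconstruction with small oscillations over the water body.
noisy = [[v + 0.1 if v == 250.0 else v for v in row] for row in dem]
restored = restore_flat(noisy, layer)
print(mask[0])
print(restored[1])
```

Note that under this criterion the cells on the edge of the water body, which touch sloping ground, are not classified as flat and therefore keep their reconstructed values.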
By contrast, this additional process (which required additional coding time) was not required when using the PNG protocol, as the Deflate algorithm used by this image format is designed to detect and efficiently compress areas of constant elevations (pixel values) without any loss of information. A linear example was described in Section 3. This has the advantage of improving coding/decoding processes for speed of transmission over digital links.
From Table 1, on average, PNG performs the best, although this is solely due to the large compression ratio it achieves on the valley image. JPEG-LS provides the best compression ratio for three of the four images. This shows that, as far as lossless compression is concerned, PNG seems to perform reasonably well in terms of its ability to deal efficiently with various types of terrain. However, in the case of abrupt pixel (elevation) variations such as ridges and peaks, PNG is outperformed by the JPEG-LS algorithm.
Due to the lossless nature of the image compression protocols used in these tests there was no need to evaluate the post-compression accuracy of the DEMs, as the integrity of these DEMs was unaltered by the process. In other words, the difference between the original constructed DEMs and the decompressed versions was zero, as expected. It may also be added that the compression ratios given in Table 1 are indicative of the compression ratios that may be expected when compressing, in a lossless manner, DEMs of different terrain characteristics.

Recommendations or suggestions regarding what compression ratio is required, depending on the nature of the terrain being considered, may be ascertained if the type of compression adopted is lossy. In this instance, areas of interest can be compressed at different compression ratios and an error estimate or accuracy assessment may be determined in each case. With lossless compression, the compressed DEM will decompress to an exact duplicate of the original, mirroring its quality and integrity.
To further evaluate the .PNG image file format, a comparison was carried out with the compression achievable using other available proprietary lossless data compression algorithms. Three such algorithms were tested and the results are shown below in Table 3.

DISCUSSION AND CONCLUSIONS
Elevation datasets that are usually referred to as DEMs (Digital Elevation Models) can be depicted as greyscale images, where each elevation sample is translated to a greyscale 'pixel' value. However, since digital image formats were originally designed for the compression of natural images and not terrains, their parameters need some modification prior to their application to elevation data. In this context, this paper demonstrated that the compression of floating-point DEM data using the PNG image format generates acceptable compression results.
These results are based upon comparisons with other proprietary image formats supporting 16-bit depth. This assessment took into consideration terrain models of various characteristics and formations, and was based on two significant factors: (a) the compression efficiency (i.e. compression ratio) and (b) the complexity (decoding execution time). Results have shown that the PNG scheme will save lossless elevation images, in some instances, more efficiently than JPEG2000, JPEG-LS and TIFF (LZW). This is the case if the terrain comprises uniform or gently varying height gradients, as in the case of valleys and water bodies.
Future work includes research into the effect on the compression capability of the PNG when increasing/decreasing the resolution of DEM data. Higher/lower resolution may perhaps provide a better correlation between adjacent pixel (elevation) values and therefore a better performance of the PNG compression protocol. Similarly, further studies may be directed to determine whether partitioning elevation images into tiles containing defined regions of similar height concentration may improve compression outcomes.
As future DEMs will continue to improve in spatial resolution and thereby richness in detail (i.e. improved dynamic ranges), a more detailed analysis will be considered in relation to how the compression methods presented here perform as a function of DEM resolution and not just pixel size. Also, a factor that requires further investigation is the conversion of floating-point height values to integers. Considering only one decimal place (e.g. 400.3 to 4003) requires a sound argument for not going to two or more places, together with an assessment of how this choice affects compression results and how the PNG compression protocol responds to it (compared to other methods). It appears that JPEG-LS is favourable for detail-rich areas. Hence, the future applications of the PNG format where detail-richness will continue to increase also need to be addressed.
To conclude, and in view of the immediate accessibility of PNG, the results here reported support its adoption for the compression of elevation data for a number of applications (e.g., storage, retrieval and web based applications).This does not mean that PNG provides a complete solution to the problem, and indeed, the development of compression algorithms tailored to elevation data is still an open area of research.
Despite the rapid growth of the Internet for storage and display of World Wide Web based GIS applications, the available image file formats have remained relatively limited. The PNG format is versatile and offers a network-friendly, patent-free, lossless compression scheme that is truly cross-platform compatible. The widespread acceptance of PNG by the World Wide Web Consortium and by the most popular web browsers and graphic manipulation software companies suggests an ever-expanding role of the PNG for terrain representation, storage, retrieval and display.

Figure 1 -Aerial view (left) and c obtained from grid DEMs of 1 me

TABLE 1 - Lossless compression ratios. This table summarises the lossless compression efficiency of JPEG-LS, JPEG 2000, TIFF (LZW compressed) and PNG for all the test images. For JPEG 2000 the reversible DWT filter, referred to as JPEG 2000R (Taubman and Marcellin, 2004), was used. The PNG ratio for the Peaks data set (i.e. 2.8) was determined by dividing 7.8 Mb by the memory needed to store the same image in PNG format, which in this case was 2.78 Mb (i.e. 7.8 Mb / 2.78 Mb = 2.8). The same process was applied to determine the remaining figures in Table 1.

TABLE 2 - Lossless decoding times (secs.) for the PNG compression protocol as compared to TIFF (LZW), JPEG-2000 and JPEG-LS.

TABLE 3 - Lossless compression ratios from using other standard compression programs as compared to the .PNG.
In Table 3 the best file compression algorithm, WinRar 3.0, is only able to achieve a compression ratio of 3.79:1. It can also be seen from these results that the .PNG lossless protocol is able to compress the data more efficiently than the winrar, winace and winzip file compression standards.