Earth Sci Inform DOI 10.1007/s12145-016-0271-5
RESEARCH ARTICLE
Iterative selective spatial variance reduction of MYD11A2 LST data George Ch. Miliaresis 1
Received: 8 May 2015 / Accepted: 8 September 2016 # Springer-Verlag Berlin Heidelberg 2016
Abstract The aim of this research effort is to develop a method that will allow to map and evaluate thermal anomalies in SW USA from the MYD11A2 night land surface temperature (LST) imagery being available for the year 2014, that present higher spatial (1 km) and temporal (46 images per year) resolution than the MYD11C3 LST data (12 images per year at 5.6 km spatial resolution). The fact that is MYD11A2 LST imagery is projected to a rectangular grid did not affect the X, Y and elevation (H) spatial decorrelation stretch. Principal component analysis and linear regression models isolated and removed the X, Y, H (spatial) dependent variance included in the data while metrics devised verified the selective spatial variance reduction. The reconstructed 46 LST images represent the amount the LST deviates from the X, Y and H predicted for the year 2014. The thematic information content of the reconstructed LST images is verified by cluster analysis and mapped the spatial extend and the temporal variability of thermal anomalies within the study area. The positive thermal anomaly clusters are spatially arranged mainly west of Sierra Nevada in Great Basin Section where extensional tectonics create a series of titled elongated mountain blocks along the N to S direction in between basins bounded by normal faults, while the negative thermal anomaly clusters are spatially arranged along the coastal region, further north and in the Communicated by: H. A. Babaie Electronic supplementary material The online version of this article (doi:10.1007/s12145-016-0271-5) contains supplementary material, which is available to authorized users. * George Ch. Miliaresis
[email protected] 1
Environmental Conservation & Management, Open University of Cyprus, 38, Tripoleos Str, Athens 104-42, Greece
western region far from the tilted mountain tectonic blocks of the Great Basin Section. The spatial maps that define regions with (positive or negative) thermal anomalies and distinct mean land response could assist landcover studies and support urban and rural planning in the context of emerging climatic change. Keywords Land surface temperature . MODIS . Environmental terrain analysis . Thermal anomaly mapping . Geothermal processes . Great Basin section
Introduction Land surface temperature (LST) is a key factor in environmental assessment (Miliaresis 2014). Due to climatic change, night temperatures are expected to increase at a faster rate than day temperatures due to less radiant heat loss because of increased cloudiness (Alward et al. 1999). The duration of a crop growth cycle is conditioned by the daily temperatures the plant is exposed to. Therefore, an increase in daily temperature will speed up plant development by reducing the duration between sowing and harvesting (Hertel et al. 2010). Thus, crop productivity may fall with the shortening of a cycle. In addition, high night temperature decreases production by decreasing the photosynthetic function (Turnbull et al. 2002). Land surface temperature (LST) data sets are computed from the satellite-based remotely sensed images with high temporal resolution at a moderate resolution scale, allowing the day and night monitoring of earth’s surface. An example being the LST acquisitions from MODerate-resolution Imaging Spectroradiometer (MODIS) on board the two EOS satellites, Terra and Aqua, presenting sun-synchronous, nearpolar, circular orbits (Wan 2013).
Earth Sci Inform
MODIS has a viewing swath width of 2330 km. The local equatorial crossing time is approximately a) around local solar time 10:30 a.m. in a descending node for Terra, and b) around local solar time 1:30 p.m. in an ascending node for Aqua (Wan 2014). Thus, MODIS data is comprised of day-time (10.30 AM, 1:30 PM) and night-time (1:30 AM, 22:30 PM) LSTs. For the coarsest resolution, MODIS LST data produced at 0.05 degrees (~ 5.6 km at the equator), a geographic latitude/longitude grid is used (MYD11C3 2015) which is referred to as the Climate Modeling Grid (CMG). On the other hand in order to maintain reasonable file sizes for the higher resolution (1-km) products (MYD11A2 2015) a tiled sinusoidal grid is used. The MODIS global LST products are available at various temporal resolutions. The various datasets are composed from the daily 1-km Terra (MOD11A1) and the Aqua (MYD11A1) products (Table 1). More specifically the days/nights for the MYD11A1/MOD11A1 product in clear-sky conditions and with validated LST values are averaged to form the 8-day (MYD11A2 2015) and the monthly (MYD11C3 2015) LST products. The identification and mapping of LST anomalies from multi-temporal imagery is a key issue in the: 1. Environmental terrain characterization and terrain analysis studies (Lillesand et al. 2004; Miliaresis and Tsatsaris 2010, 2011; Miliaresis 2014), 2. Monitoring of geothermal and geotectonic phenomena (Blackett 2014; Miliaresis and Seymour 2011; Zouzias et al. 2011; Miliaresis 2013a), 3. Recognition of the infrared emissions seen in night-time satellite images of the land surface before major earthquakes (Blackett et al. 2011; Freund et al. 2007; Ouzounov et al. 2007; Piroddi et al. 2014). Miliaresis (2009) defined LST anomalies from a time series of MYD11C3 LST imagery as regions presenting significantly higher or lower LST than their surrounding area. The assessment and interpretation of thermal anomalies is based on both the temporal (seasonal variation) and the spatial pattern. LST is correlated to elevation (H) and latitude (LAT), so the Table 1
MODIS LST products
Resolution Grid CMG CMG CMG Tile Tile Tile
Product name per Satellite Spatial 5600 m 5600 m 5600 m 1000 m 1000 m 5600 m
Temporal Daily 8 day Monthly Daily 8 day Daily
Terra MOD11C1 MOD11C2 MOD11C3 MOD11A1 MOD11A2 MOD11B1
Aqua MYD11C1 MYD11C2 MYD11C3 MYD11A1 MYD11A2 MYD11B1
quantification of thermal anomalies in vast regions is difficult (Miliaresis 2012a, 2012b). Towards this end various research efforts have been developed to map thermal invariant regions in both space and time from the coarsest in both spatial resolution (5.6 km) and temporal resolution (monthly averages) MYD11C3 LST imagery. In this context Miliaresis (2012a, 2012b) presented a method for selective variance reduction of LST that is based on elevation and latitude decorrelation stretch of multi-temporal MYD11C3 LST (monthly) imagery. The method was extended to account for longitude (LON) and distance form the coastline and applied in vast regions in a) Arabian Peninsula and Zagros Ranges (Miliaresis 2013a) and b) USA (Miliaresis 2013b) where LST anomalies are revealed. From the methodological point of view, principal component analysis (Mather and Koch 2011) is applied to the multitemporal MYD11C3 LST dataset. Then, linear regression models (Landam and Everitt 2004) are applied to the first two principal components (PCs), in attempt to isolate and remove the effects of H, LAT and LON (Miliaresis 2012a, 2012b). Although the first 2 PCs account for almost the 95 % of the total variance evident in the time series, there is not an account on the amount of variance related to H, LAT, LON in the remaining PCs. The resolution of MYD11C3 data set is regional in both space (5600 m) and time (monthly averages) while MYD11A2 product is composed from the daily 1-km MODIS LST product and stored on a 1-km Sinusoidal grid as the average values of clear-sky LSTs during an 8-day period (MYD11A2 2015). Thus, the MYD11A2 presents a spatial resolution that is almost 34 times greater than the spatial resolution of MYD11C3 product. There are 46 images per year for the MYD11A2 product in comparison to the 12 images per year being available for the MYD11C3 product. The increased temporal and spatial resolution of MYD11A2 imagery is crucial in the mapping and interpretation of LST anomalies. Thus, the implementation of the H, LAT, LON de-correlation stretch on a time series formed my 46 MYD11A2 images per year is a challenging issue. More specifically: 1. From the methodological point of view, the variance will be distributed in 46 PCs evident in MYD11A2, instead of the 12 PCs evident in MYD11C3 dataset. If only the first 2 PCs are considered in the linear regression models then an account of the amount of variance (that possibly persists in the remaining 44 PCs) related to H, LAT, LON, is required. 2. The pixel co-ordinates of an MYD11A2 image are not related in a straightforward manner to a latitude and longitude grid since pixels are projected to MODIS Sinusoidal grid (a kartesian grid) instead of a latitude, longitude grid.
Earth Sci Inform
3. Some problems might arise due to the increased temporal resolution of MYD11A2 dataset. The average values of clear-sky LSTs, during an 8-day period form the pixel values in each of the 46 images. Thus, the percentage of no data value pixels per image (due to cloud coverage during the 8-day sampling period) should be greater in MYD11A2 dataset than in the MYD11C3 (monthly images) dataset. 4. MYD11C3 is a seamless dataset of global extent while MYD11A2 is a global tilled dataset. The aim of this research effort is to develop a method that will allow to map and evaluate thermal anomalies in SW USA from the 46 MYD11A2 night LST images being available for the year 2014. Such an effort will allow terrain characterization and environmental assessment at a spatial resolution of 1 km and with temporal resolution of 8 days from MODIS night LST imagery. In addition, added value products in the field of thermal anomaly mapping are expected to be developed for the 1-km tiled sinusoidal grid MODIS datasets that will allow the further exploitation and application of MODIS imagery in the fields of terrain analysis and environmental terrain characterization.
Methodology The temporal, the projection and the accuracy characteristics of the data within the study area are presented. Then, the appropriate modifications of elevation, latitude, longitude decorrelation stretch method are derived and implemented.
Fig. 1 a MODLAND Integerized Sinusoidal Grid and the tile (5, 8) used in this research effort. b The latitude/longitude grid of the tile (5, 8). A shaded relief map is presented within the data tile (SRTM30 2009). c No-
Study area and data The Version-5 MODIS/Aqua global Land Surface Temperature (LST) and Emissivity 8-day data are composed from the daily 1-km LST product (MYD11A1) and stored as the average values of clear-sky LSTs during an 8-day period (Wan 2008). In this research effort, the night (around local solar time 1:30 AM) data layer acquired over the southwestern coast of the United States for the year 2014 is used. There are 46 LST images per year. The ID, the time interval (starting and ending date) as well as the filename of each image is provided in the Online Resource-Table A. Note that the Image 46 (an exception) includes the 5-day average values of clear-sky LSTs in the period of December 27 31, 2014. The data is stored on the MODLAND Integerized Sinusoidal Grid, based on the sinusoidal projection (MYD11A2 2015) while the radius of the reference sphere equals to 6,371,007.181 m. A sinusoidal projection shows relative sizes accurately (equal-area), but distorts shapes and directions (Snyder 1987). The land LST 1-km product is produced and distributed in adjacent non-overlapping tiles (Fig. 1a) that are approximately 10 degrees square at the equator (MYD11A2 2015). The tile coordinate system starts at (0, 0) (horizontal tile number, vertical tile number) in the upper left corner and proceeds right (horizontal) and downward (vertical). The tile in the bottom right corner is (35, 17). Thus, each MYD11A2 image tile is projected to a rectangular grid composed of 1200 rows and 1200 columns and presents a spatial resolution equal to 927.4 m. The tile used in this research effort (Fig. 1a) corresponds to row 5 (vertical tile number) and column 8 (horizontal tile number). The bounding coordinates of the tile (v05, h08) present longitude in the range − 130.5407 to −103.9134 degrees and latitude in the range 30 to 40 degrees. Parallels are equally
data mask (black pixels) superimposed over the color shaded relief map of the study area (Fig. 1b)
Earth Sci Inform
spaced straight lines parallel to each other (Fig. 1a, b), while the central meridian is a straight line and all other meridians are shown as equally spaced sinusoidal curves (Snyder 1987). The study corresponds to SW USA (portions of California, Nevada, Arizona etc. etc. are included) and NW Mexico. Version-5 MODIS/Terra Land Surface Temperature/ Emissivity products are validated to Stage 2 (Wan 2008), which means that their accuracy has been assessed over a widely distributed set of locations and time periods via several ground-truth and validation effort. Accuracy is better than 1 K (0.5 K in most cases), as expected pre-launch (Wan 2008). For the daily 1 km products, slightly higher errors may occur at large view angles and in semi-arid regions. The error in LSTs contaminated by clouds and heavy aerosols can be very large. The LSTs that are severely contaminated by clouds and heavy aerosols are removed from Version 5 MODIS LST products using empirical constraints on temporal variations in clear-sky LSTs (Wan 2008). The land (sea is excluded) in data tile (v05, h08) correspond to 1,153,363 pixels (80.95 % of the data tile area extent). No data values limit the coverage. In the current implementation, a pixel is excluded from computations if it is flagged as no-data value pixel in one of the 46 MYD11A2 nighttime LST images of 2014. 343,161 pixels (23.83 % of the area extent of the data tile) were flagged as no data pixels. Thus pixels included in the computations correspond to the 56.26 % of the area extent of the data tile. The no-data mask is superimposed over the color shaded relief map (SRTM30 2009) of the study area in Fig. 1c, in attempt to visualize the spatial extent of no data pixels. The 46 MYD11A2 LST images for the year 2014 are presented in Fig. 2. NASA collected elevation data for much of the world using a radar instrument aboard the space shuttle that orbited the earth from the 11th through the 22nd of February 2000 (Farr and Kobrick 2000). NASA released a global elevation dataset called SRTM30, referring to the name of the mission and the spatial resolution of the data, which is 30 arc-seconds. It is the updated version of the GTOPO30, the US Geological Survey 1-km global digital elevation model, with SRTM data used in place of the original data, when possible (SRTM30 2009). The SRTM30 DEM is projected to a latitude/longitude grid referenced to WGS-84 (horizontal datum) while elevations are referenced to EGM96 geoid (vertical datum). It presents spatial resolution of 0.5 min (approximately 926 m at the equator). The DEM of the study area is reprojected to the MODIS sinusoidal grid while elevations are resampled by bilinear interpolation (Fig. 1b). The elevations after the no data mask implementation (Fig. 1c) present a range from −83 m to 4105 m. Negative elevations indicate the Salton Trough (in between USA and Mexico) and the Death Valley’s Badwater Basin (in between California and Nevada) (Miliaresis and Argialas 1999, 2000). The landcover in the study area is accessed by the MCD12Q1 product which describes land cover properties derived from
Fig. 2 The 46 LST images for the year 2014. A common look up table is used with LST in the range − 10 to 30 degrees Celsius (the lighter a pixel is the greater the LST). Pixels with LST less than −10 degrees Celsius are depicted black while pixels with LST greater than 30 degrees Celsius are depicted white
observations spanning a year’s input of Terra- and AquaMODIS data (MCD12Q1 2015). The MOD12Q1 classification schemes are multitemporal classes describing land cover properties as observed during the year (12 months of input data). The data is available at 500 m spatial resolution (image dimensions 2400 * 2400 rows/columns) projected to the MODIS sinusoidal grid. The primary land cover scheme identifies 17 land cover classes defined by the International Geosphere Biosphere Programme (IGBP), which includes 11 natural vegetation classes, 3 developed and mosaicked land classes, and 3 nonvegetated land classes (Friedl et al. 2010). The 2012 MCD12Q1 data tile (v05, h08) for the western Unites States is used since there are no data available for the 2014 and 2013. Iterative selective spatial variance reduction In previous research efforts (Miliaresis 2012a, 2012b, 2013a, 2013b) principal components analysis (PCA) transformation is applied to the multi-temporal monthly LST dataset in order to minimize the effect of H, LAT and LON. PCA is a linear transformation technique that produces a set of images known as PCs that are uncorrelated with one another and are ordered
Earth Sci Inform
in terms of the amount of variance they explain from the original image set (Maaten and Hinton 2008). PCAs are computed from the linear combination of eigenvectors and the corresponding pixel values of the initial images (Mather and Koch 2011). PCA has traditionally used in remote sensing as a means of data compaction since it is common to find that the first 2 or 3 components are able to explain the majority of the variability in data values. Later components thus, tend to be dominated by noise effects. By rejecting these later components, the volume of data is reduced, with no appreciable loss of information (Siljestrom et al. 1997). It is assumed that the first 2 PCAs are correlated in a straightforward way to H, LAT and LON (Miliaresis 2012a, 2012b, 2013a, 2013b). In order to quantify the contribution of the independent variables (H, LAT, LON) to PCAs, empirical models based on multiple linear regression analysis are used while ANOVA table for each regression verify the statistical significance (Miliaresis 2012a, 2012b, 2013a, 2013b). The model performance is further assessed by the R2 (R is the multiple correlation coefficient between the independent variables and the dependent variable) that represents the extent of variability in the dependent variable explained by all the independents variables (Landam and Everitt 2004). Finally the multi-temporal LST imagery is reconstructed by considering the residual images of PC1 and PC2 as well as the remaining PCs (PC3 to PC12) (Miliaresis 2012a, 2012b, 2013a, 2013b). In order to adjust the method to the data spatial and temporal characteristics, the following considerations should be made: &
The LST data projection characteristics (a rectangular grid) is a critical factor. The LST data should be not reprojected to a latitude, longitude geographic grid, in order to avoid to resample the LST data values per pixel for the 46 image time series. Thus X and Y pixel co-ordinates instead of the corresponding latitude, longitude pairs are considered to represent the independent variables in the linear regression models applied to the first two PCs (Fig. 1a). Y co-ordinate according to the projection characteristics is straightforward related to the geographic latitude (Fig. 1b).
&
&
&
In the current implementation that are 46 LST images per year. So, the first 2 PCs might not isolate the total amount of variance that is X, Y, and H dependent. Thus, the method will be implemented iteratively. In each step, the variance that is X, Y and H dependent is removed from the 46 LST images by modeling the first 2 PCs. The data is reconstructed and the X, Y, H decorrelation stretch is implemented again to the new PC-1 and/or PC-2 image until no X, Y and H variance is evident. In addition a metric should be defined that will account for the X, Y, H dependent variance in each step. The metric will be based on the cross-correlation matrix of the 46 reconstructed LST images and the linear regression models applied to the first 2 PCs. In the previous implementations (Miliaresis 2012a, 2012b, 2013a, 2013b) standardized principal components analysis (Eastman and Fulk 1993) is applied. Due to the selected iterative implementation of the method, not standardize PCA is applied. Thus in each step, the reconstructed LST imagery will presented thermal anomalies expressed in degrees Celsius, instead of deviations from the mean. The method (Miliaresis 2016a) is implemented in both Octave (Miliaresis 2016b) and Python (Miliaresis 2016c) while the data sets (Miliaresis 2016d) are also available for testing and evaluation. In the previous implementations each reconstructed LST image presents a mean equal to 0 and a standard deviation equal to 1 (Miliaresis 2012a, 2012b, 2013a, 2013b).
Data analysis The cross correlation matrix (Online Resource-Table B) indicates negative correlation in between LST and H, X, Y throughout the year. Thus the greater the H and the latitude (Y), the less the LST for a certain pixel is. The X versus LST negative correlation indicates (according to the projection characteristics) that LST decreases from the SE towards the NW direction (Fig. 1b). Figure 3 visualizes the LST image
Fig. 3 The correlations in between X, Y, H versus LST for the 46 images, indicating that the absolute correlations are maximized for H. The X axis express time in the form of the 46 MYD11A2 LST images synthesised over an 8 day sampling period for the year 2014
Earth Sci Inform Table 2 Mean and standard deviation of correlation in between LST and X, Y, H through out the year 2014 Annual LST Correlation
H
Y
X
Mean
-0.70
-0.55
-0.50
0.05
0.08
0.14
Standard deviation
correlation versus X, Y, and H, indicating that the absolute correlations are maximized for H. Mean (annual) and standard deviation of LST correlation versus X, Y, and H are presented in Table 2. The eigenvalues, the eigenvectors and the PCs (a linear combination of the eigenvectors weighted by the eigenvalues) are presented in the Online ResourceTable C. Note that in the current implementation, the eigenvectors and the eigenvalues are computed by considering the variance-covariance matrix and not the correlation matrix. Thus, non standardized PCA is implemented according to Eastman and Fulk (1993). The eigenvalues indicate that first 2 PCs accounts for the 93.2 % of the variance evident within the multi-temporal dataset. In order to quantify the contribution of the independent variables (X, Y, H) to PCs (dependent variable), empirical models based on multiple linear regression analysis are used (Table 3, Table 4). For PC-1 the 69.2 % (R2) of the variance is explained by X, Y, H (Table 3) while for PC-2 the independent variables account for the 49.3 % (R 2) of the variance (Table 4). The analysis of variance (ANOVA) tables for the multiple linear regression equation of best fit for PC-1 and PC-2 are presented in Table 3 and Table 4 respectively. The F-statistic and the t-statistic (Landam and Everitt 2004) combined are used in estimating the success of the regression models and for adding or deleting variables. More specifically
Table 3 Multiple linear regression of PC-1 versus H, Y, X
&
&
The F value indicates the overall significance of the regression. For PC-1 the F-test far exceeds the F-critical value (26.12) at the 0.01 significance level. Hence the overall regression is significant (Table 3). For PC-1 the coefficients for X, Y and H express the individual contribution of the independent variable to the dependent variable (Table 3). The absolute values of the ttest for the 3 independent variables and the constant term far exceed the t-critical value (2.58) at two tailed 0.01 significance level and hence the 3 independent variables and the constant term depart from 0 (Table 3).
The ANOVA table for PC-2 also verifies the overall significance of the regression model as well as the significance of the 3 independent variables and the constant term (Table 4). The images that correspond to the residuals and the predicted values for the multiple linear regression models are presented in Fig. 4 for PC-1 and in Fig. 5 for PC-2. The two residual images visualize the spatial distribution of the unexplained variance of each regression model (Table 3, Table 4). The positive residuals (Fig. 4, Fig. 5) indicate regions for which the multiple linear regression model under-predicts the actual PC values. It is concluded that: &
&
PC-1 accounts for the 86.6 % (Online Resource-Table C) of the total variance evident in the multi-temporal dataset while 69.2 % (Table 3) of the variation is explained by the linear regression model. Thus, the 59.94 % of the total variance evident in the multi-temporal dataset is explained by the linear regression model while 26.66 % of the variance is quantified by the residual image (Fig. 4). PC-2 accounts for 6.6 % (Online Resource-Table C) of the total variance evident in the multi-temporal dataset while 49.3 % (Table 4) of the variation is explained by the linear regression model. Thus, the 3.26 % of the total variance
Multiple Regression Equation: PC1 = a * H + b * X + c * Y + d Regression Statistics ANOVA Regression Table
Apparent R2 = 0.692118 Adjusted R = 0.831936 Adjusted R2 = 0.692117 Source Degrees of freedom sum of squares Mean square Regression 3 584,465,586 194,821,856 Residual 810,198 259,994,166 320.9 Total 810,201 844,459,752 F(3, 810,198) = 607,107 Regression Coefficients t-test d 210.9 196.9 a -0.027461 -738.3 c -0.000043 -552.8 b -0.000005 -50.8 Apparent R = 0.831936
Earth Sci Inform Table 4 Multiple linear regression of PC-2 versus H, Y, X
Multiple Regression Equation: PC2 = a * H + b * X + c * Y + d Regression Statistics ANOVA Regression Table
Apparent R = 0.654002 Adjusted R = 0.654001 Source Degrees of freedom Regression 3
Apparent R2 = 0.427719 Adjusted R2 = 0.427717 sum of squares Mean square 27,504,723 9,168,241
Residual
810,198
36,800,912
Total
810,201
64,305,635 F(3, 810,198) = 201,845
Regression Coefficients
evident in the multi-temporal dataset is explained by the linear regression model of while the 3.34 % of variance is quantified by the residual image (Fig. 5).
The predicted PC-1 and PC-2 images (Figs. 4 and 5) account for the 63.2 % (59.94 % + 3.26 %) of the total variance evident in the multi temporal dataset (Fig. 2) that is X, Y, H dependent. If PC-1 and PC-2 are omitted then a significant portion 30 % (26.66 % + 3.34 %) of variance that is included within the two residual images (Figs. 4 and 5) that is independent from X, Y, and H will be subtracted. So the PC-1 and PC-2 residual images (Figs. 4 and 5) as well as the PCA-3 to PCA-46 (Online Resource-Table C) images are considered for the reconstruction of the multitemporal LST dataset. The reconstructed dataset account for the 36.8 % of the total variation evident in the initial LST imagery (Fig. 2). The portion of the variance (30 %) derived from the two residual PC-1 and PC-2 images is independent from X, Y, H while for the remaining 6.8 % (derived from PC-3 to PC-46) X, Y, H dependencies might be evident. In order to eliminate such a dependency, the method should be applied repetitively. At first the cross- correlation matrix of the reconstructed dataset (Online Resource-Table D) indicate no X, Y, and H Fig. 4 The PC-1 image as well as the predicted and the residual images (Table 3). The darker a pixel is the lower its value. The greyscale range is (−51.1, 160.0), (−31.4, 119.8), and (−119.9, 105.01) for the PC-1, the predicted and the residual images respectively
45.42
t-test
d a
337.9 -0.006367
838.5 -455.0
c b
-0.000001 0.000027
-51.1 775.2
dependency since the correlation coefficients of the 46 LST images versus X, Y, Z approach 0. On the other hand, although the effect of X, Y, H to correlation coefficient is minimized, it might be possible for the PC-1 of the reconstructed LST dataset to present a significant portion of variance that is X, Y, H dependent. For once more, eigenvalues, eigenvectors and PCs are computed and presented in the Online Resource-Table E. Note that non standardized PCA is implemented again. The eigenvalue indicate that PC-1 accounts for the 71.5 % of the variance evident in the reconstructed LST dataset. In order to quantify the contribution of the independent variables (X, Y, H) to PC-1 (dependent variable), a multiple linear regression analysis is used (Table 5). For PC-1 of the reconstructed dataset, the 0 % (R2) of the variance is explained by X, Y, H (Table 5). The analysis of variance (ANOVA) table for the multiple linear regression equation of best fit for PC-1 is presented in Table 5 while the F-statistic and the t-statistic combined are used in estimating the success of the regression model and for adding or deleting variables. More specifically: &
The F-test for (3, 810,198) degrees of freedom, do not exceeds the F-critical value (26.125) at the 0.01 significance level (Conf. Level 99 %). Hence the overall regression is not significant.
Earth Sci Inform Fig. 5 The PC-2 image as well as the predicted and the residual images (Table 4). The darker a pixel is the lower its value. The greyscale range is (3.6, 72.2), (19.0, 59.6), and (−56.0, 36.4) for the PC-2, the predicted and the residual images respectively
&
&
The absolute values of the t-test for the 2 independent variables (H and X) and the constant term do not exceed the t-critical value (2.576 for 810,198 degrees of freedom) at two tailed 0.01 significance level (Conf. Level 99 %). Hence the 2 independent variables and the constant term do not depart from 0. For Y variable the t-test exceed the t-critical value (2.576 for 810,198 degrees of freedom) at two tailed 0.01 significance level (Conf. Level 99 %), but since the coefficient C equals to 0 (Table 5), C do not depart from 0 too.
Thus, the reconstructed dataset is sufficiently decorrelated after the first iteration, as far as the spatial position (X, Y, H) is concerned. Thematic information content The reconstructed dataset is formed by 46 images depicting thermal anomalies either negative or positive. The thermal anomalies are expressed in degrees Celsius since unstandardized PCA is applied. In each of the 46 images, at a certain pixel, the provided residual LST value indicates the amount the LST at the pixel deviates from the X, Y, H predicted LST during the 8 day sampling period. In order to assess the
Table 5 Multiple regression of PC-1 of the reconstructed dataset versus H, Y, X
information content of this long time series formed by 46 images, K-means cluster analysis (Mather and Koch 2011) is applied. The underlying idea is that cluster analysis will group the pixel pattern (the 46 values per pixel sampled at the 46 reconstructed LST images) to classes. Each class will represent pixels presenting an analogous temporal and spatial pattern. The temporal pattern is expressed by the cluster centroid curves while the spatial pattern is expressed by the spatial extent of each cluster (Miliaresis 2009). In a recent research effort (Miliaresis 2016e) the local minima of cluster centroid curves were interpreted to correlate to the Daymet (Thornton et al. 1997) total accumulated precipitation for the cluster and the time interval (month) under study (Miliaresis 2016f). Daymet data (Thornton et al. 1997) consider the total accumulated precipitation and so it is the sum of all forms of precipitation converted to water equivalent. The 46 images representing the reconstructed LST is expected to generalize to a) a cluster spatial map and to b) a set for centroid curves. K-Means cluster analysis (Mather and Koch 2011) is applied. It begins by initializing centroids, assigns each pixel to the cluster whose centroid is nearest, updates centroids, then repeats the process until there is no overall change in cluster
Multiple Regression Equation of the reconstructed dataset: PC-1 = a * H + b * X + c * Y + d Regression Statistics ANOVA Regression Table
Apparent R = 0.008011
Apparent R2 = 0.000064 Adjusted R = 0.007855 Adjusted R2 = 0.000062 Source Degrees of freedom sum of squares Mean square Regression 3 16,873 5624 Residual 810,198 262,907,614 325 Total 810,201 262,924,487 F(3, 810,198) = 17.332 Regression Coefficients t-test d -1.9 -1.766 a 0.000056 1.486 c 0 6.331 b 0 -1.300
Earth Sci Inform
Fig. 6 The 8 cluster centroids on the basis of the reconstructed LST images (thermal anomalies). The X axis express time in the form of the 46 MYD11A2 LST images synthesised over an 8 day sampling period for the year 2014
centers during the current iteration. Eight clusters were mapped after 100 iterations. The cluster centroids (indicating the temporal pattern for each cluster in 2014), are presented in Fig. 6 and in Online Resource-Table F. The centroid co-ordinates are expressed in degrees Celsius and indicate the thermal anomaly (positive or negative) at the 46 sampling points of the year 2014. The mean land response (MLR) that is the mean LST of all the pixels located within a specific cluster is presented in Fig. 7 and in Online Resource-Table F. In addition each cluster is represented (Table 6) on the basis of its area extend (occurrence), elevation statistics as well as the mean (annual) centroid co-ordinate and st. dev. (on the basis of Fig. 6) indicating the degree the cluster centroid curves differ in thermal anomaly terms. In Table 7, the percent of IGBP landcover classes per cluster (according to the 2012
MCD12Q1 product) is presented. Note that only the IGBP landcover classes with occurrence greater than 0.01 % are considered. The spatial pattern of the 8 clusters is indicated in Fig. 8 and in Online Resource-Fig. G. The clusters are merged in 3 groups (Fig. 9) according to the mean centroid co-ordinate presented in Table 6.
Discussion of results The reconstructed imagery encompasses the Binformation^ for the purposes of model building while the statistical variance that is correlated to X, Y, and H is filtered. Thus, the reconstructed imagery presents the magnitude the
Fig. 7 The 8 cluster centroids on the basis of the initial LST images of Fig. 2 (mean land response (MLR) per cluster). The X axis express time in the form of the 46 MYD11A2 LST images synthesised over an 8 day sampling period for the year 2014
Earth Sci Inform Table 6 Percent area extent, elevation statistics, annual (mean) cluster thermal anomaly (mean centroid co-ordinate and st. dev.) per cluster
Cluster
Area Extent (%)
Annual thermal anomaly (Degrees Celsius)
Mean
Mean
St. dev.
St. dev.
1
18.1
1053
693
1.6
3.4
2
12.8
1133
662
3.7
3.4
3
20.9
1050
694
-0.4
3.4
4 5
11 11.9
1245 1118
567 767
0.5 -1.8
3.3 3.2
6 7
3.3 5.7
1471 1416
694 725
7.1 -4.7
3.4 3.3
8
16.3
1101
693
-2.3
3.2
standardized LST value per pixel deviates from the X, Y and H predicted. The standard deviation of correlation (Table 2) as well as the X and Y curve fluctuations in Fig. 3 indicate that the effect of local meteorological phenomena during the 8-day sampling interval of LST is more severe to the X, Y dependency of LST than to the H dependency. Clustering induced a degree of generalization in the 46 reconstructed LST images for year 2014 and allowed the spatial definition of thermal anomalies (Fig. 8) and their assessment according to cluster centroids (temporal pattern) (Fig. 6). MLR centroids (Fig. 7) are biased since the spatial framework for their computation are the spatial clusters defined on the basis of the 46 reconstructed LST images. On the hand, since the initial 46 LST images (Fig. 2) are used for the computation of the centroid values, MLR indicate the actual LST that is applied to the biosphere in each cluster. Such a consideration is very useful in rural and urban planning studies. The local minima and the local maxima are common for the 8 cluster centroid curves in Fig. 6 and the MLR curves in Fig. 7. So it is concluded that they represent LST fluctuations Table 7 Landcover (Friedl et al. 2010; MCD12Q1 2015) occurrence per cluster
Elevation (m)
induced by continental in scale meteorologic and weather phenomena affecting the entire study area. For cluster 6, a positive thermal anomaly (LST is greater than the X, Y, H predicted one) is observed throughout the year, while for cluster 7 a negative thermal anomaly is valid (Fig. 6). These two clusters present almost identical elevation statistics (Table 6) while they differ 12 degrees Celsius in the observed annual LST anomaly (Table 6). Their spatial extent indicates a N to S (according to latitude, longitude grid visualized in Fig. 1b) elongated pattern of objects (Online Resource-Fig. G). The 3 groups of clusters in Table 7 and Fig. 9, present positive (clusters 1, 2, and 6), negative (clusters 5, 7, and 8) and almost 0 (clusters 3 and 4) thermal anomaly. More specifically: &
The positive thermal anomaly clusters (Fig. 9) are spatially arranged mainly west of Sierra Nevada in Great Basin Section where extensional tectonics create a series of titled elongated mountain blocks along the N to S direction in between basins bounded by normal faults (Miliaresis and Argialas 1999, 2000; Nash and Johnson 2003).
MCD12Q1 LandCover
CLUSTERS
ID
IGBP
1
1 5 6
Evergreen Needleleaf forest Mixed forest Closed shrublands
1.8 0.0 0.7
2.1 0.2 1.6
1.6 0.0 0.6
5.2 0.5 7.1
8.9 0.3 6.4
1.7 0.5 1.3
6.6 0.0 1.5
0.8 0.0 0.3
7 8 9 10 11 12 13 14 16
Open shrublands Woody savannas Savannas Grasslands Permanent wetlands Croplands Urban and built-up Cropland/Natural vegetation Barren or sparsely vegetated
70.9 2.3 0.4 9.8 0.0 0.8 1.1 0.0 12.2
71.8 3.5 0.7 7.6 0.0 0.6 3.8 0.1 8.0
62.9 2.0 0.1 12.9 0.0 2.3 0.7 0.1 16.8
58.7 11.2 2.8 8.7 0.1 1.6 3.8 0.2 0.1
39.0 13.4 3.3 15.6 0.0 9.8 2.6 0.4 0.3
61.6 3.3 0.4 12.3 0.1 0.4 2.2 0.1 16.0
37.4 7.2 0.9 33.3 0.0 5.1 0.2 0.2 7.4
54.8 1.1 0.0 16.2 0.0 6.5 0.5 0.1 19.7
2
3
4
5
6
7
8
Earth Sci Inform
Fig. 8 The spatial extent of the 8 clusters
&
&
The negative thermal anomaly clusters (Fig. 9) are spatially arranged along the coastal region, further north and in the western region far from the tilted mountain tectonic blocks of the Great Basin Section. For the third group of clusters presenting almost zero thermal anomalies, there is not a favorite geographic location as it is depicted in Fig. 9.
Table 7 captures some minor differences in the landcover pattern among the negative and positive thermal anomaly groups of clusters. More specifically, the occurrence of Fig. 9 Merging of clusters on the basis of the temporal variation of LST (cluster centroids) presented in Table 6 (annual thermal anomaly per cluster)
open shrublands is slightly greater for the positive thermal anomaly clusters while for negative anomaly clusters the occurrence of grasslands and croplands is definitely greater. Are these differences in landcover pattern enough to explain the difference in the observed annual LST anomaly (Table 6) among the positive and negative thermal anomaly clusters? Atmospheric circulation, landscape pattern and physiography (proximity to sea, mountain ranges, regional peneplains), landcover and landuse are some of the factors controlling the regional in scale and in time thermal anomalies identified in MODIS imagery (Miliaresis and Tsatsaris 2010, 2011). Nevertheless it should be noted that regional scale geothermal activity (Nash and Johnson 2003; Tester et al. 2007) that is related to plate tectonics (Miliaresis and Argialas 1999, 2000) might be one of the processes that create positive throughout the year thermal anomalies (Miliaresis 2009, 2013a, 2013b) in the night MODIS LST imagery in California and Nevada. Such an observation is supported by the spatial extent (within the Great Basin Section occupied by the tilted mountain blocks) and the shape of the objects forming the clusters observed in the Online Resource-Fig. G that are developed in a N-S direction (according to latitude, longitude grid visualized in Fig. 1b). In the light of the recent research effort (Miliaresis 2016e) that correlates the local minima of cluster centroid curves at
Earth Sci Inform
the 5.6 km spatial resolution (climatic modeling grid) to the Daymet total accumulated precipitation (Thornton et al. 1997), new applications reveal. In this context the high frequency fluctuations (at an 8-day sampling interval) observed in Fig. 6 at the 1 km spatial resolution of the MYD11A2 LST data should be organized a) to local minima representing the impact of precipitation to LST and b) local maxima representing the current status of LST anomaly at a certain cluster and at a certain time lag. The influence of precipitation to LST according to Fig. 6 tends to be restored shortly after the event to its seasonal/monthly maximum. The local maxima might be depend on lithologic conditions (thermal inertia), geothermal activity, ground water resources and water table depth, irrigation, soil moisture etc. etc. Thus, in the future research effort the DAYMET data that is assimilated daily at 1 km spatial resolution should be correlated to the 8-day at 1 km spatial resolution MYD11A2 LST data in attempt the verify the precipitation dependency of local minima of LST cluster centroids. Data from various satellite missions could also further refine the dependency of thermal anomalies from soil moisture and ground water storage since DAYMET precipitation consider the total accumulated precipitation and so it is the sum of all forms of precipitation converted to water equivalent. For example: &
&
ESA’s Soil Moisture and Ocean Salinity (SMOS) mission is dedicated to making global observations of soil moisture over land every three days at a spatial resolution of 50 km (Drusch et al. 2012) while Groundwater and drought indicators are also derived from terrestrial water storage observations derived from GRACE satellite data (Houborg et al. 2012) describing current wet or dry conditions at a pixel size of 150,000 km2 with almost a month time lag.
Conclusion MYD11A2 LST data provide a constant in space and a regular in time sampling of earth’s surface that is of higher spatial (1 km) and temporal (46 images per year) resolution than the MYD11C3 data (12 images per year at 5.6 km spatial resolution). The fact that is MYD11A2 LST imagery is projected to a rectangular grid did not affect the X, Y, H (spatial) decorrelation stretch that was successful. The first 2 PCs were proved to isolate the X, Y, H dependent variance included in the data. Linear regression models of the first two PCs decomposed the X, Y, H dependent variance to the corresponding predicted images. The PC-1 and PC-2 residual images as well as the PCA-3 to PCA-46 images were used to reconstruct the multi-temporal LST dataset. The reconstructed
46 LST images represent the amount the LST deviates from the X, Y and H predicted for the year 2014. The metrics devised on the basis of the cross-correlation matrix and the regression model of PC-1 of the reconstructed data verified the selective spatial (X, Y, H) variance reduction. The thematic information content of the reconstructed LST images is verified by cluster analysis and mapped the spatial extend and the temporal variability of thermal anomalies in the study area. The spatial maps that define regions with (positive or negative) thermal anomalies and distinct mean land response (MLR) could assist landuse and landcover studies and support urban and rural planning in the context of climatic change. In addition, a study that will include a longer image time series of the last decade (in an attempt to filter seasonal meteorological phenomena, the climatic trend, and landuse/ landcover change) combined by detailed geothermal, heat flow data and field work, might identify a possible regional geothermal component of tectonic origin in MODIS LST image time series data. Acknowledgments The author is grateful to the 2 anonymous reviewers for their corrections, comments and suggestions during the revision process.
References Alward RD, Detling JK, Milchunas DG (1999) Grassland vegetation changes and nocturnal global warming. Science 283:229–231 Blackett M (2014) Early analysis of Landsat-8 thermal infrared sensor imagery of volcanic activity. Remote Sens 6:2282–2295 Blackett M, Wooster MJ, Malamud B (2011) Exploring claims of land surface temperature precursors to the 2001 Gujarat earthquake. Geophys Res Lett 38:L15303. doi:10.1029/2011GL048282 Drusch M, Mecklenburg S, Kerr Y (2012) SMOS over land, European Space Agency, B152: 39–49. https://earth.esa.int/c/document_ library/get_file?folderId=131556&name=DLFE-4401.pdf . Accessed 30 August 2016 Eastman JR, Fulk M (1993) Long sequence time series evaluation using standardized principal components. Photogramm Eng Remote Sens 59:1307–1312 Farr TG, Kobrick M (2000) Shuttle radar topography mission produces a wealth of data. Am Geophys Union Eos 81:583–585 Freund FT, Takeuchi A, Lau BS, Al-Manaseer A, CC F, Bryant NA, Ouzounov D (2007) Stimulated infrared emission from rocks: assessing a stress indicator. eEarth 2:7–16 Friedl MA, Sulla-Menashe D, Tan B, Schneider A, Ramankutty N, Sibley A, Huang X (2010) MODIS collection 5 global land cover: algorithm refinements and characterization of new datasets. Remote Sens Environ 114:168–182 Hertel TW, Burke MB, Marshall B, Lobell DB (2010) The poverty implications of climate-induced crop yield changes by 2030. Glob Environ Chang 20:577–585 Houborg R, Rodell M, Li B, Reichle R, Zaitchik B (2012) Drought indicators based on model assimilated GRACE terrestrial water storage observations, water. Resources. Research 48:W07525. doi:10.1029/2011WR011291
Earth Sci Inform Landam S, Everitt BS (2004) A Handbook for Statistical Analyses Using SPSS. Chapman & Hall/CRC Press, New York Lillesand TM, Kiefer RW, Chipman JW (2004) Remote sensing and image interpretation, 5th edn. John Wiley, New York Maaten L, Hinton G (2008) Visualizing high-dimensional data using tSNE. J Mach Learn Res 9:2579–2605 Mather PM, Koch M (2011) Computer processing of remotely-sensed images, 4th edn. John Wiley and Sons, New York MCD12Q1 (2015) MODIS/Combined (Terra + Aqua) Land Cover Type Yearly L3 Global 500 m SIN Grid. U.S. Geological Survey, LP DAAC. https://lpdaac.usgs.gov/products/modis_products_ table/mcd12q1. Accessed 30 August 2016 Miliaresis G (2009) Regional thermal and terrain modeling of the afar depression from multi-temporal night LST data. Int J Remote Sens 30:2429–2446 Miliaresis G (2012a) Selective variance reduction of multi-temporal LST imagery in the East Africa rift system. Earth Sci Inf 5:1–12 Miliaresis G (2012b) Elevation, latitude/longitude decorrelation stretch of multi-temporal LST imagery. Photogramm Eng Remote Sens 7: 151–160 Miliaresis G (2013a) Terrain analysis for active tectonic zone characterization, a new application for MODIS night LST (MYD11C3) dataset. Int J Geogr Inf Sci 27:1417–1432 Miliaresis G (2013b) Thermal anomaly mapping from night MODIS imagery of USA, a tool for environmental assessment. Environ Monit Assess 185:1601–1612 Miliaresis G (2014) Daily Temperature Oscillation Enhancement of Multi-temporal LST Imagery. Photogramm Eng Remote Sens 80: 423–428 Miliaresis G (2016a) An unstandardized selective variance reduction script for elevation, latitude & longitude decorrelation stretch of multi-temporal LST imagery. Modeling Earth Systems & Environment 2. doi:10.1007/s40808–016–0103-0 Miliaresis G (2016b) SourceForge Project: svr-py, Selective Variance Reduction MeGa py script with clustering & classification support https://sourceforge.net/projects/svr-py/. Accessed 30 August 2016 Miliaresis G (2016c) SourceForge Project: selective-variance-reduction. Octave Implementation. http://selective-variance-reduction. sourceforge.net. Accessed 30 August 2016 Miliaresis G (2016d) SourceForge Project: test-sites-datasets https://sourceforge.net/projects/test-sites-datasets/. Accessed 30 August 2016 Miliaresis G (2016e) Revealing the precipitation dependency of regional in time and in space thermal anomaly peaks in SW USA. Model Earth Syst Environ 2. doi:10.1007/s40808–016-0093-y Miliaresis G (2016f) Spatial decorrelation stretch of annual (2003–2014) Daymet precipitation summaries on a 1-km grid for California, Nevada, Arizona & Utah Environmental Monitoring & Assessment, 188. doi:10.1007/s10661–016–5365-5 Miliaresis G, Argialas D (1999) Segmentation of Physiographic Features from the Global Digital Elevation Model/GTOPO30. Comput Geosci 25:715–728 Miliaresis G, Argialas D (2000) Extraction & Delineation of Alluvial Fans from DΕμs & Landsat TM Images. Photogramm Eng Remote Sens 66:1093–1101 Miliaresis G, Seymour KST (2011) Mapping the spatial & temporal SST variations in Red Sea, revealing a probable regional geothermal anomaly from Pathfinder V5 data. Int J Remote Sens 32:1825–1842
Miliaresis G, Tsatsaris A (2010) Thermal terrain modeling of spatial objects, a tool for environmental and climatic change assessment. Environ Monit Assess 164:561–572 Miliaresis G, Tsatsaris A (2011) Mapping the spatial and temporal pattern of day-night temperature difference in Greece from MODIS imagery. GIsci Remote Sens 48:210–224 MYD11A2 (2015). MODIS/Aqua Land Surface Temperature and Emissivity 8-Day L3 Global 1 km. U.S. Geological Survey, LP DAAC. https://lpdaac.usgs.gov/products/modis_products_ table/myd11a2. Accessed 30 August 2016 MYD11C3 (2015) MODIS/Aqua Land Surface Temperature and Emissivity Monthly L3 Global 0.05 Deg CMG. U.S. Geological Survey, LP DAAC. https://lpdaac.usgs.gov/products/modis_ products_table/myd11c3. Accessed 30 August 2016 Nash GD, Johnson GW (2003) Conceptualization and implementation of a tectonic geomorphology study for geothermal exploration in the Great Basin. U.S.a. transactions - geothermal. Resources 27:663– 667 Ouzounov D, Liu D, Kang C, Cervone G, Kafatos M, Taylor P (2007) Outgoing long wave radiation variability from IR satellite data prior to major earthquakes. Tectonophysics 431:211–220 Piroddi L, Ranieri G, Freund FT, Trogu A (2014) Geology, tectonics and topography underlined by L’Aquila earthquake TIR precursors. Geophys J Int 197:1532–1536. doi:10.1093/gji/ggu123 Siljestrom PA, Moreno A, Vikgren G, Caceres LM (1997) The application of selective principal components analysis to a thematic mapper image for the recognition of geomorphologic features configuration. Int J Remote Sens 18:3843–3852 Snyder J (1987) Map Projections—A Working Manual. USGS P r o f e s s i o n a l P a p e r 1 3 9 5 . h t t p : / / p u b s . e r. u s g s . gov/usgspubs/pp/pp1395. Accessed 30 August 2016 SRTM30 (2009) Near Global Digital elevation model. US Geological Survey. http://dds.cr.usgs.gov/srtm/version1/SRTM30/SRTM30_ Documentation. Accessed 30 August 2016 Tester JW, Anderson BJ, Batchelor AS, Blackwell D, DiPippo R, Drake EM, Garnish J, Livesay B, Moore MC, Nichols K, Petty S, Toksoz MN, Veatch RW, Baria R, Augustine C, Murphy E, Negraru P, Richards M (2007) Impact of enhanced geothermal systems on US energy supply in the twenty-first century. Phil Trans R Soc A 365: 1057–1094. doi:10.1098/rsta.2006.1964 Thornton PE, Running SW, MA W (1997) Generating surfaces of daily meteorological variables over large regions of complex terrain. J Hydrol 190:214–251 Turnbull MH, Murthy R, Griffin KL (2002) The relative impacts of daytime and night-time warming on photosynthetic capacity in Populus detoides. Plant Cell Environ 25:1729–1737 Wan Z (2008) New refinements and validation of the MODIS landsurface temperature/emissivity products. Remote Sens Environ 112:59–74 Wan Z (2013) MODIS Land Surface Temperature Products Users’ Guide, Collection-6. ICESS, University of California, Santa Barbara. http://www.icess.ucsb.edu/modis/LstUsrGuide/MODIS_LST_ products_Users_guide_Collection-6.pdf. Accessed 30 August 2016 Wan Z (2014) New refinements and validation of the collection-6 MODIS land-surface temperature/emissivity products. Remote Sens Environ 140:36–45 Zouzias D, Miliaresis G, Seymour KST (2011) Probable regional geothermal field reconnaissance in the Aegean region from modern multi-temporal night LST imagery. Environ Earth Sci 62:717–723