Descarga la aplicación para disfrutar aún más
Vista previa del material en texto
UNIVERSIDAD NACIONAL AUTÓNOMA DE MEXICO PROGRAMA DE MAESTRÍA Y DOCTORADO EN GEOGRAFÍA EVOLUCIÓN DE LAS SEQUÍAS EN LA PENÍNSULA DE YUCATÁN, MÉXICO TESIS QUE PARA OPTAR POR EL GRADO DE: DOCTOR EN GEOGRAFÍA PRESENTA: DAVID ROMERO DIRECTORA DE TESIS Dra. María Engracia Hernández Cerda Instituto de Geografía UNAM CODIRECTOR Dr. Roger A. A. Orellana Lanza Centro de Investigación Científica de Yucatán Ciudad de México, septiembre 2018 UNAM – Dirección General de Bibliotecas Tesis Digitales Restricciones de uso DERECHOS RESERVADOS © PROHIBIDA SU REPRODUCCIÓN TOTAL O PARCIAL Todo el material contenido en esta tesis esta protegido por la Ley Federal del Derecho de Autor (LFDA) de los Estados Unidos Mexicanos (México). El uso de imágenes, fragmentos de videos, y demás material que sea objeto de protección de los derechos de autor, será exclusivamente para fines educativos e informativos y deberá citar la fuente donde la obtuvo mencionando el autor o autores. Cualquier uso distinto como el lucro, reproducción, edición o modificación, será perseguido y sancionado por el respectivo titular de los Derechos de Autor. 1 Agradecimientos Esta tesis no habría sido posible sin el apoyo de varias organizaciones y personas, de las cuales espero no haber omitido ningún nombre: A mis directores, la Dra. María Engracia Hernández y el Dr. Roger Orellana Lanza, quién debería estar registrado así si el sistema lo permitiera, por su apoyo, sus enseñanzas, su crítica constructiva y también por dar acceso a sus bases de datos climáticos y de NDVI. A la Dra. Rosalía Vidal Zepeda de mi comité tutoral. Al Dr. Edgar Torres Irineo por su apoyo en la programación de R. Al Dr. Alfonso Condal y el Dr. Pedro Luis Ardisson Herrera por guiarme en los estudios geoestadísticos. Al Dr. Stefan Kern por compartir los datos de ASCAT y colaborar conmigo en el artículo sobre la recesión de humedad del suelo. A la Dra. Claudia Tania Lomas Barrié y el M.C. Jesús Manuel Soto Rocha de INIFAP por compartir amablemente los datos de humedad del suelo medidos in situ. A la Dra. Ana García de Fuentes por su apoyo permanente. A CONAGUA por compartir los datos de la base ClicCom. A CONACYT por la beca otorgada (número 496322/329960). 2 Índice Agradecimientos 1 Índice 2 Introducción 4 Capítulo 1 - Variaciones escalares para el modelamiento geoestadístico de los datos climatológicos 12 Resumen 12 Abstract 13 1. Introduction 13 2. Data and Methodology 15 2.1 Data 15 2.2 Data exploration and pre-treatment 17 2.3 Variogram calculation 18 2.4 Model evaluation 19 3 Results 20 3.1 Data exploration 20 3.2 Variographic scale analyses 23 4 Discussion and Conclusions 28 5 References 30 Supplementary material 38 Capítulo 2 - Determinación de la constante de disminución de la humedad del suelo por datos satelitales, el caso de la península de Yucatán 52 Resumen 52 Abstract 53 1. Introduction 53 2. Materials and Methods 55 2.1. Study area 55 2.2 Data 57 3. Theory and calculation 60 3.1. Theory 60 3.2. Extraction of recession phases and estimation of a recession constant 61 4. Results 65 4.1. Recession constants from ASCAT data 65 4.2. Validation of recession constants 68 3 5. Discussion 69 6. Conclusions 71 Acknowledgments 71 References 72 Supplement 1: Satellite-based soil moisture data 80 References 82 Capítulo 3 - Modificación del Índice Estandarizado de Precipitación para la evaluación de la sequía pre-estival en zonas de clima tropical, aplicación a la península de Yucatán 85 Resumen 85 Abstract 85 1. Introducción 86 2. Área de estudio 87 2.1 Localización geográfica 87 2.2 Tipos de sequía 88 3. Materiales y métodos 88 3.1 Materiales 88 3.2 Métodos 89 4. Resultados 95 4.1. Aplicación de los índices 95 4.2. Validación de los índices con NDVI 97 4.3. Identificación de los episodios extremos 97 5. Discusión 98 6. Referencias 100 Anexos: Mapas de representación de la severidad de la sequía pre-estival (1955, 1963, 1970, 1971, 1975, 1976, 2005 y 2011) 106 Capítulo 4 - Discusión y conclusiones generales 111 1. Análisis espacial 111 2. Sensibilidad a la sequía 114 3. Severidad de las sequías 115 4. Conclusiones 120 Tabla de figuras 122 Lista de tablas 124 Bibliografía 125 4 Introducción La sequía es un fenómeno natural que corresponde la falta de precipitaciones en un lugar con respecto a lo esperado. Esta carencia en agua, liquido de absoluta necesidad para todas las formas de vida terrestre, conlleva a deficiencia del recurso en diferentes índoles, causando problemas sociales y ambientales. De hecho, la ocurrencia de sequías repetidas podría ser el factor que originó el colapso de la civilización Maya en el siglo IX (Medina-Elizalde & Rohling, 2012), también, la sequía fue responsable de hambrunas en la sierra chihuahuense a principios de esta década (Villalpando, 2012). Por esto, es de vital importancia el estudio, análisis e investigación de la sequía con el objetivo de prevenir esta problemática de interés primordial para el país. El interés en la sequía es también relativo a su posible aumento por teorías actuales del cambio climático por causas antropogénicas derivadas de la revolución industria (Seneviratne et al., 2012)Parece entonces necesario estudiar su evolución espacial y temporal durante las últimas décadas, viendo el fenómeno en la fase de calentamiento global e identificando inicios de cambio. La península de Yucatán, México, es una zona de clima tropical, cálido mayormente subhúmedo. El 97.2% del espacio pertenece a climas de tipos Aw y Ax’(w) (0, 1 o 2) de la clasificación de climática de Köppen modificada para México por García (2004) mientras la franja costera noroeste (1.4%) es árida y semiárida BS(0 y 1) y la zona campechana limítrofe con Tabasco así como la isla de Cozumel (1.4%) tienen un clima más húmedo Am (Vidal, 2005). La sequía es un elemento normal y frecuente de estos climas: la zona está sometida a un régimen de lluvias principales de verano –de manera casi exclusiva en zona de clima Aw en el occidente de la península y, con una distribución intra-anual un poco más homogénea de las lluvias en el oriente (Vidal, 2005) con clima Ax’(w)– mientras que los vientos alisios, regulares el resto del año dan muy pocas precipitaciones por lluvia convectiva en razón de la falta de orografía y de la posición a sotavento que tiene la península con respecto a los alisios (McGregor & Nieuwolt, 1998). Las sequías recurren anualmente con variaciones espaciales y temporales. Hernández-Cerda y Valdez-Madero (2004) cartografiaron la intensidad general del fenómeno en México usando el Índice de Severidad de la sequía meteorológica. En la península de Yucatán, indican una severidad “leve” (1er rango de 6) en la zona cercana a la laguna de Términos, Campeche, a “severa” (4o rango) en la costa noroeste. Esta última franja, particularmente seca, debe su aridez a la presencia cercana de un centro de altas presiones por su ubicación cerca del trópico de Cáncer, en la zona de contacto de las celdas de Ferrel y Hadley. Varios estudios paleoclimáticos fueron realizados para estudiar las sequías en la península de los últimos milenios, lo que llevó a encontrar la correspondencia entre períodos secos y el 5 ciclo solar de Suess (Hodell, Brenner, Curtis, & Guilderson, 2001) o la responsabilidad del colapso maya a un período plurianual con repetición de sequías particularmente severas (Medina-Elizalde & Rohling, 2012). La relación entre la proporción de agua utilizada y la cantidad de recurso hídrico renovable disponible por habitante es una recta inversa con escala logarítmica (Margat, J., 2005) lo que significa que el hombre se adapta alas condiciones normales de su medio requiriendo menos agua en zonas áridas a pesar de la fuerte evapotranspiración. En la península de Yucatán, la población rural, todavía muy numerosa –aproximadamente 741,000 personas en 2010 según el Instituto Nacional de Estadística y Geografía(INEGI, 2018)–, permanece dependiente de la agricultura de temporal, principalmente de maíz –el Servicio de Información Agroalimentaria y Pesquera (SIAP), reporta 427,528 ha de cultivo de temporal de los cuales 340,560 ha de maíz en primavera-verano en la península para el año 2017 (SIAP, 2018)–, excepto en la zona de clima árido. Este sector está acostumbrado a los fenómenos de sequía, pero sigue sujeto a sufrir el efecto de oscilaciones notables o de una tendencia más general. Lo anterior lo sugieren la mayor parte de los escenarios de cambio climático. Los agricultores yucatecos señalan sequías más intensas vinculadas con un retraso en la ocurrencia de las lluvias, todo esto afectando a las cosechas. También, la sequía tiene un fuerte impacto en las actividades pecuarias; la ganadería porcina y bovina son muy demandantes en recursos hídricos y, son sectores de peso creciente en la economía regional. La sequía es un término para el cual no existe verdaderamente consenso porque puede referir a situaciones distintas según la persona que lo está empleando. El Glosario Hidrológico Internacional (World Meteorological Organization, 2015) retiene la definición de “Ausencia prolongada o escasez acusada de precipitación.” Pero la realidad es más compleja: Charre (1977) considera la sequía como un déficit bastante importante de agua con respecto a las condiciones normales del medio de tal forma que provoque dificultades de tipo socioeconómico. Según él, “las necesidades que dependen de caracteres económicos, sociales y técnicos de las sociedades, están establecidas a partir de los recursos habitualmente disponibles; tienen tendencia a aumentar mientras que los recursos pluviométricos, corregidos por la evaporación y el escurrimiento, no tienen esta tendencia en la misma escala temporal, pero presentan fluctuaciones interanuales: de la contradicción entre estos dos modos de evolución nace el riesgo de sequía.” Esta idea es de suma importancia porque si bien la demanda de agua sigue aumentando como lo señaló Charre desde 1977 –el 95% de los pastizales del estado de Yucatán tienen riego (Delgado Carranza, 2010) tanto como medida de mitigación de la sequía como para la intensificación del uso–, es probable que la sequía tenga tendencia a ser más intensa, aumentando entonces el riesgo e implicando una relativa inadaptación de las actividades humanas en la zona. La idea de desequilibrio esta retomada por la definición de la Organización de las Naciones Unidas, que explica a la sequía como: “fenómeno que se produce naturalmente cuando las lluvias han sido considerablemente inferiores a los niveles normales registrados, causando un agudo 6 desequilibrio hídrico que perjudica los sistemas de producción de recursos de tierras” en su Convención de Lucha Contra la Desertificación (ONU, 1994). Como se observa, un concepto generalizado como es el de “escasez de precipitación” con valores promedio, en un tiempo determinado, denominado como “comportamiento normal”. También existen diferencias semánticas ya que es un concepto operacional, que explica como la sequía impacta al ecosistema en diferentes formas, desarrollando así diferentes conceptos que varían entre el inicio y fin del momento del suceso, su efecto y su elevada frecuencia. Otras definiciones son más acotadas y determinan conceptos específicos. Wilhite y Glantz (1985) como Lambert (1975) definen varios tipos de sequía que no son idénticos. Según los primeros existen: - la sequía meteorológica: está referida al grado de desviación de la precipitación en comparación a un comportamiento “normal”, de una serie de tiempo preestablecida. Sin embargo, la magnitud de la desviación y del tiempo no son fijos, más bien dependen de la forma como regionalmente evalúan el fenómeno, llevando al establecimiento de umbrales específicos para diferentes regiones del mundo que consideren totales de lluvia o periodos sin lluvias importantes o bien valores estadísticos. - la sequía agrícola se relaciona con el impacto en los cultivos, considera el proceso en términos de balance hídrico y considera la especificidad del cultivo en cuanto a sus requerimientos de humedad, en función de la etapa de crecimiento y la biología de la planta. Este tipo de sequía puede ocurrir después de una sequía meteorológica. - la sequía hidrológica toma en cuenta la falta de agua en toda la red hidrológica, incluyendo al subsuelo, y la falta de recarga de acuíferos, lagos y presas. - la sequía socioeconómica se refiere al suministro de agua y demanda por la sociedad. Este tipo considera la suma de los anteriores puesto que considera la agricultura como las extracciones en capas freáticas, por tanto, está muy relacionada con los efectos de corto y largo plazo de los otros tipos de sequía. De hecho, se sabe que la demanda de agua del subsuelo más profundo está muy elevada en la península de Yucatán, por la ausencia de ríos y la elevada contaminación de la capa más superficial. La demanda no es significativamente más elevada que la oferta de agua, pero ambas son ampliamente superiores a la recarga de las reservas hidrogeológicas (Villasuso & Méndez Ramos, 2000). Según Lambert (1975) o en Tallaksen & Van Lanen (2005) (Figura 1), “la sequía puede ser atmosférica: es el aire que está seco y absorbe más agua de lo que aporta a la superficie del suelo”. Si el suelo está seco, ocurre “la sequía edáfica”. La “sequía hidrológica” se refiere a la disminución de agua en los ríos, el estiaje. Mientras si “las capas subterráneas se encuentran por debajo del nivel de los pozos” hay “sequía freática” y esta se convierte en “sequía geológica [cuando] la falta de agua afecta el substrato”. Este autor es de los primeros en referirse al elemento del medio que está afectado por la situación seca. Cada tipo de sequía no es independiente de los demás sino existe una cadena de causalidades entre ellos. 7 Para la península de Yucatán, se puede hacer la distinción entre las etapas resultando de los efectos de la ausencia prolongada de lluvia en los diferentes elementos del ciclo de agua. Al origen, ocurre la sequía atmosférica cuando la atmosfera extrae más agua de lo que aporta, es decir cuando la evapotranspiración es superior a la precipitación. Esta sequía atmosférica lleva a los suelos a desecarse así que lluvias débiles son interceptadas y retenidas en las partes más superficiales sin recargar el suelo en profundidad ni alimentar las reservas freáticas. Prolongándose, la sequía atmosférica termina por desecar el suelo quitando la totalidad de su agua capilar conllevando a la disminución de la actividad vegetativa y hasta alcanzar eventualmente el punto de marchitez de las plantas; es la sequía fisiológica. En consecuencia, las reservas freáticas de agua dulce disminuyen por el proceso de vaciado y la presión que ejerce en ellas la intrusión salina. Figura 1: Propagación de la sequía en el ciclo hidrológico sin el efecto sinérgico del cambio climático; en Tallaksen & Van Lanen, (2005) (traducido y modificado) Como lo hacen notar en el National Drought Mitigation Center (2013), la definición de la sequía debe de ser tanto conceptual como operacional. Así, se han establecido estadísticas 8 que permiten diferenciar un periodo seco. Gaussen y Bagnouls (1957) definen la sequía con la relación P<2T donde P es normal de precipitación del mes en milímetros y T es la normal de temperatura en grados centígrados. La representan gráficamente en el diagrama ombrotérmico. Otros autores discuten este valor de 2T como aproximación de la evapotranspiración,no obstante, es muy difícil calcular de manera eficiente este fenómeno (Delgado Carranza, Bautista, Orellana Lanza, & Reyes Hernández, 2011). Para México, García, Hernández y Cardoso (1983) modificaron el sistema de Gaussen con la aplicación de P<2T+14, 21 o 28 según el régimen de lluvias y la tasa de lluvias invernales. En la Península de Yucatán, se encuentran los tres tipos de zonas con P<2T+14 en la mitad oriental de la península, P<2T+21 en Cozumel y la zona cercana al río Usumacinta, y, P<2T+28 en la porción occidental de la zona de estudio. Cabe señalar que anualmente ocurren dos episodios secos importantes. Primero, la sequía pre-estival es de larga duración, pero de evolución lenta; inicia en otoño o invierno y culmina generalmente en el mes de mayo. Segundo, la sequía intra-estival, también llamada canícula, es de duración inferior a un mes, generalmente alrededor de dos semanas, pero con intensidad extrema. Esta última puede tener unos efectos catastróficos por desarrollarse en pleno periodo vegetativo de los cultivos, así como por favorecer el desarrollo de enfermedades como el dengue (Oswald Spring, 2016). mientras la otra se vincula más con la situación atmosférica en el océano Pacífico (Magaña, Amador, & Medina, 1999) y entonces también particularmente con El Niño. La sequía pre-estival es típica de todos los climas tropicales que tienen estación de lluvias marcada. Ocurre cuando la zona está sometida a los vientos alisios, pero por su situación lejana en estas estaciones de la zona de convergencia intertropical (ZCIT), no recibe aportes de precipitaciones por las ondas tropicales. La sequía es débil en otoño o invierno por las temperaturas que no son muy altas, así como por las entradas de “Nortes”. En primavera, aumenta la temperatura y, mientras la ZCIT va hacia el Norte, los frentes fríos no pueden penetrar hacia el sur. Estos dos factores llevan a una sequía intensa, las fases intensas de la Oscilación del Sur El Niño (Salas-Flores, Hernández-Cerda, Villicaña-Cruz, Azpra-Romero, & Lomas Barrié, 2014) también afectan las precipitaciones de esta temporada. En cuanto a la canícula, está asociada a “la invasión, al territorio nacional, de una enorme celda de alta presión que acarrea la entrada de aire seco continental, subsidente, que impide la formación de nubes y consiguientemente de precipitaciones pluviales sobre la mitad oriental de la República Mexicana” (Mosiño & Reyna-Trujillo, 1992). Para la península, esta celda de aire caliente y alta presión, como parte del centro anticiclónico del Atlántico Norte genera vientos con polvo del Sahara, disminuyendo las probabilidades de precipitaciones de los alisios (Curtis & Gamble, 2008). El diagrama siguiente (Figura 2), realizado según las modificaciones antes presentadas, muestra que la sequía pre-estival puede empezar desde noviembre, pero es intensa 9 únicamente a partir de marzo para terminar “normalmente” a principios de mayo. En cuanto a la sequía intra-estival aparece como “relativa”, el diagrama permite verla como un periodo de lluvias menos excesivas en promedio en julio y agosto con respecto a junio y septiembre. Esto significa que, en Chetumal, la sequía intra-estival ocurre generalmente durante estos meses menos húmedos y, principalmente en agosto. A partir de la gráfica, se puede evaluar la canícula gracias al Índice de Sequía Relativa. Figura 2: Diagrama ombrotérmico de Chetumal (fuente: CONAGUA) El método de los diagramas ombrotérmicos modificados considera un mes seco si P < 2T+28, 21 ó 14 según la proporción de lluvia invernal para los climas de tipos A. Pero, si bien es interesante ver las condiciones promedio del medio, el objeto de este estudio es analizar la variabilidad interanual que no se refleja de esta manera. Entonces, usando valores mensuales reales de precipitación (p) y de temperatura (t) en lugar de promedios, con la finalidad de tomar en cuenta las irregularidades, Lambert (1975) estima posible evaluar la intensidad de las sequías. Según él, cuando p < 4t corresponde a la sequía atmosférica mientras hay sequía edáfica para p < 3t y sequía freática cuando p < 2t durante varios meses consecutivos. Las diferentes formas de definir la sequía, sus diversas fuentes de información e interpretación de las regiones de este país dan como resultado diversas formas de evaluar el proceso de esta. En esta gama de variables se considera a conceptos como corrientes superficiales, evaporación, temperatura, tiempo, capacidad de almacenamiento del suelo, etc. Internacionalmente, los Índice de Severidad de Sequía de Palmer (PDSI por sus siglas en inglés) e Índice Estandarizado de Precipitación (SPI) son los más utilizados. El PDSI se elabora con un balance de humedad mensual, usando la precipitación y la evaporación de Thornthwaite. Conceptualiza la capacidad de almacenamiento de agua en el suelo en dos capas con profundidad no definida y con propiedades hídricas propias; es decir, 10 Palmer señala que la capa superficial del suelo almacena una pulgada de agua como límite (25 cm) y asume a todos los casos con un valor constante, por lo que la segunda capa subsuperficial almacena toda la capacidad potencial del suelo, según diferentes factores, menos 25 mm. También considera que la primera capa tiene un comportamiento hídrico como de buffer. Los suelos de la península de Yucatán son muy irregulares y, a veces, tan delgados que es difícil considerar valido la presencia de una capa de suelo con una capacidad de 25mm de agua. Alley (1984) investigó sobre las limitaciones y supuestos del método Índice de Severidad de Sequía de Palmer, sin embargo, la falta de exactitud y rigurosidad en sus resultados, delatan ineficiencia en la cuantificación de algunas propiedades de la sequía, en factores como: intensidad, inicio y fin, por lo que la necesidad de corrección de este método es innegable. No obstante, hasta hoy, sigue siendo este, uno de métodos más aplicados de evaluación al nivel global. El Índice Estandarizado de Precipitación es muy sencillo de elaborar, y, por la estandarización permite una comparación con otros lugares y/o momentos. Se considera que empieza una sequía cuando el SPI alcanza -1 y se termina cuando es >0. Guttman (1999) realizó una comparación entre los métodos PDSI y SPI para diferentes escalas de tiempo y concluyó que SPI es más fácil de interpretar. De la misma manera, Keyantash & Dracup (2002) realizaron una comparación de índices de sequía y concluyeron que el método SPI es el correcto para analizar la severidad de sequía. Por otro lado, en México se da seguimiento a la sequía con el método de SPI en el Centro de Investigaciones sobre la Sequía del Instituto de Ecología (CEISS-INECOL) y, para Estados Unidos de América, en el Centro Nacional de Mitigación contra la Sequía. No se ha trabajado directamente acerca de las sequías para la península de Yucatán, solamente se contempla en trabajos nacionales por ejemplo (García de Miranda, Hernández Cerda, & Cardozo, 1992; García, 2004; Hernandez-Cerda & García de Miranda, 1997). Por sus reservas subterráneas de agua erróneamente consideradas como inagotables, no existe en la península una cultura de la prevención de este riesgo. Parece entonces necesario entender las variaciones espaciales y temporales de las sequías durante los últimos años paradójicamente poco estudiadas con el objetivo de deducir mejor las probables situaciones futuras. Esta tesis se propone identificar, usando herramientas especialmente adaptadas para el propósito, el comportamiento espacial y la evolución temporal de las sequías pre-estivales e intra-estivales desde 1950. El patrón espacial de la sequía que muestra un gradiente en la intensidad con una situación extrema en el Noroeste y una disminución de la severidad hacia el Este o el Suroeste no es representativo de las variacionesespaciales interanuales en el desarrollo de las sequías. De hecho, se ha denotado la ocurrencia de épocas “secas” marcadas por grandes oscilaciones como el ciclo de Suess y se supone que también existen ciclos más cortos a los cuales se añade 11 el efecto sinérgico del “cambio climático”. Espacialmente, es preciso identificar otros patrones propios a los eventos extremos, así como la tendencia de evolución de la amplitud de las sequías. A escala geológica, los climas del mundo han estado cambiando de manera muy frecuente “como consecuencia de procesos astronómicos, geológicos y atmosféricos” (Orellana, Islebe, & Espadas, 2003). Su evolución es naturalmente constante, pero puede estar afectada por las actividades humanas, sea la deforestación que puede llevar a procesos de desertificación o por la liberación de gases de efecto invernadero como el dióxido de carbono o el metano. Existen varios posibles escenarios de cambio climático que han sido aplicados a la península de Yucatán (Orellana, Espadas, Conde, & Gay, 2009). De estos, muchos apuntan hacia un aumento de los fenómenos de sequías. No obstante, todavía no se han comprobado cambios en la zona ni se sabe cuál va a ser la respuesta global o local a los cambios de concentración de estos gases. En consecuencia, es necesario abordar el estudio de la sequía desde una perspectiva espacial, temporal y así diferenciar las intensidades de sequía. Los datos de medición relativos a la sequía se extraen de estaciones meteorológicas, las cuales tienen siempre una dispersión heterogénea en el territorio (Mendelsohn, Kurukulasuriya, Basist, Kogan, & Williams, 2007). Con el objetivo de tener conocimiento del comportamiento espacial de los elementos críticos que determinan la severidad de la sequía (temperatura y precipitaciones) (Tallaksen & Van Lanen, 2004), realizamos en el primer capítulo un estudio de las variaciones escalares para el modelamiento geoestadístico de los datos climatológicos, temperatura y precipitaciones por separado, observando las escalas correspondientes a su variación espacial y evaluando la distancia de la zona de influencia, zona donde existe autocorrelación espacial. Este conocimiento es útil para modelar espacialmente la sequía y estimarla, con error mínimo, donde no se puede medir. En el segundo capítulo se desarrollan herramientas metodológicas para la evaluación espaciotemporal de la sequía, principalmente de la canícula en la península de Yucatán ya que se dedica a la determinación de la constante de disminución de la humedad del suelo por datos satelitales y datos tomados in situ. De hecho, la constante extraída corresponde a la velocidad de vaciado de la reserva de agua del suelo, o sea a la velocidad de desarrollo de la sequía y, su variación espacial puede ser un proxy para vulnerabilidad de la zona a la sequía. También, por usar datos satelitales, la cobertura de los datos puede extenderse a toda la península mientras los datos de humedad del suelo in situ son muy escasos. El tercer capítulo analiza el uso de los índices SPI y PDSI para la sequía pre-estival en la península de Yucatán y las zonas tropicales en general. Así propone dos modificaciones al SPI para su uso en zonas de clima tipo A, comparando los resultados de los índices para series de tiempo en la península de Yucatán, así como la aplicabilidad del índice. Gracias a estos índices, se cartografían las sequías extremas y evalúan las tendencias para cada estación. Finalmente, se integra y se discute la información de los capítulos anteriores. 12 Capítulo 1 - Variaciones escalares para el modelamiento geoestadístico de los datos climatológicos Romero D., Ardisson P.L., Orellana R. y Hernandez-Cerda M.E., Spatial scale variability of normal mean yearly temperature and precipitation from variographic analysis of the mexican hydroclimatic daily surface data set. Sometido a la revista “Theoretical and applied climatology” Resumen Se estudió la estructura espacial de la varianza relacionada con los conjuntos de datos hidroclimáticos en escalas de tiempo de varias décadas para México y en particular para la península de Yucatán. El comportamiento de la varianza a diferentes escalas espaciales fue investigado para precipitaciones y temperaturas normales mediante análisis variográfico. El objetivo principal era encontrar escalas adecuadas para estudiar las variaciones espaciales de los elementos climáticos clave, es decir, la precipitación y la temperatura medias anuales. Isotropía, anisotropía, 12 modelos diferentes y 50 distancias de 0.1 a 100% del diámetro de la zona de estudio fueron probados para temperaturas crudas y normales de precipitación, así como para los residuos resultantes del detrimento polinómico. Se evaluó cada variograma y se utilizaron los adecuados para describir y comprender la variación espacial de los fenómenos. La estructura de varianza en los datos de cada elemento climatológico varía en función de la escala, relieve y detrimento porque la evolución espacial de los fenómenos es compleja. Varias distancias de mesoescala resaltan diferentes estructuras que tienen significados físicos. Además, las distancias de retardo activas relacionadas con las estructuras que mejor se ajustan a los datos raramente corresponden al radio de la zona de estudio. En general, las distancias cortas pertenecientes a la escala meso- son más adecuadas. También se puede notar que las escalas que permiten una buena descripción de la evolución espacial de cada elemento hidroclimático para México no son necesariamente aplicables al subconjunto de Yucatán. 13 Abstract The spatial structure of the variance related to hydroclimatic datasets over multi-decadal timescales were studied for Mexico and in particular the Yucatan peninsula. The behaviour of variance at different spatial scales was investigated for precipitations and temperature normals by variographic analysis. The main objective was to find adequate scales for studying spatial variations of key climatic elements, i.e. mean yearly precipitation and temperature. Isotropy, anisotropy, 12 different models, and 50 distances from 0.1 to 100% of the zone study diameter were tested for raw temperature and precipitation normals as well as for the residuals resulting from polynomial detrending. Each variogram was evaluated and the suitable ones were used to describe and understand the spatial variation of the phenomena. The variance structure in the data of each climatological element varies as a function of the scale, relief and detrending because spatial evolution of the phenomena is complex. Various mesoscale distances highlight different structures that have physical meanings. In addition, the active lag distances related to structures that best fit the data rarely correspond to study zone radius. In general, short distances belonging to meso- scale are more suitable. We could also note that the scales permitting a good description of the spatial evolution of each hydroclimatic element for Mexico are not necessarily applicable to the Yucatan subset. 1. Introduction Knowledge of the spatial behaviour of hydro-climatic information is critical for understanding climate (Millán, Estrela, & Miró, 2005), particularly within the context of climate change and for the related socio-economic issues. Multi-decadal data sets are useful for long-term climate prediction. All climate change models ideally use the normal period as a base to calculate anomalies (Hulme et al., 1995). This information can be obtained by the spatial analysis of observed data from the meteorological stations network. Methods from geostatistics, for example studying the variogram used for kriging interpolation, can provide valuable information on an aleatory variable´s spatial behaviour (Baccour, Slimani, & Cudennec, 2012). However, these stochasticmethods are not generally used to study and interpolate regional climatic data (Hong, Nix, Hutchinson, & Booth, 2005), particularly temperature (Aalto, Pirinen, Heikkinen, & Venäläinen, 2012). Geostatistics are particularly underused for climatic stations data. In contrast, kriging is mainly used in the analysis of satellite climate data products (Kottek & Rubel, 2007) or with the combination of both types of data sets (Florio, Lele, Chang, Sterner, & Glass, 2004; X. Sun, Manton, & Ebert, 2003; T. Wu & Li, 2013). Because of a high spatial variation, climate data are often considered noisy. Therefore, smoothing methods, including the thin-plate smoothing spline interpolation technique, are applied (Cuervo-Robayo et al., 2014; Tait, Henderson, Turner, & Zheng, 2006; W. Wu, Xu, & Liu, 2015) to reduce the noise and establish surfaces for daily, monthly or annual average 14 rainfall and temperature. These interpolations are carried out at meso-γ or meso-α climatological scales (Oke, 1987; Orlanski, 1975). However, each climatic station is representative of a specific location and its climate results from the integration of micro to macro-scales. Therefore, local spatial variation should not be considered to be noise because it has a real climatological meaning at a microscale. (Perica & Foufoula-Georgiou, 1996) denote the heterogeneity of rainfall at a local scale as a function of the physical characteristics of the rainfall events. Climate time series are studied with spectral analysis for time scale variation (Mudelsee, 2014; Yiou, Baert, & Loutre, 1996) but variographic analysis is an ideal method for spatial scale observations (Esbensen, Friis-Petersen, Petersen, Holm-Nielsen, & Mortensen, 2007). Furthermore, (Pereira, 2002) recommended that spatial scale be studied using extent and grain; he and other authors (King, 1997; Wiens, 1989) refer to extent as the spatial area of the study object and to grain as the elementary sampling unit (Legendre & Legendre, 1998). The problem with regional variographic analysis of temperature and rainfall data from climate stations is their high spatial variation, dispersion and the required number of data. Although authors have calculated variograms with a reduced sample size (Baccour et al., 2012; Diem, 2003), using only 10 and 23 points, Webster and Oliver (R. Webster & Oliver, 1993) contemplate a minimum sample size of 100 and ideally at least 150 points for a robust isotropic variogram. The same authors recommend 250 points for an anisotropic variogram. This suggests the need for a rather high number of points, preferably well spatially distributed. Indeed, the positioning of weather stations at optimal locations provides better results in climate modelling than a high-density network (Arsenault & Brissette, 2013). In previous studies in Mexico, authors used manual techniques (Gómez et al., 2008; Vidal, 2005) or the thin plate spline interpolation method (Cuervo-Robayo et al., 2014; Sáenz- Romero et al., 2009; Téllez-Valdés, Hutchinson, Nix, & Jones, 2011) to establish surfaces for temperatures and precipitation for Mexico or Mexican regions. Therefore, the scale depending spatial variations of the climate data variance remain unknown. As part of a synoptic characterization of the climate of Mexico, this paper presents a variographic analysis of the mean yearly precipitation and temperatures for Mexico as a whole and for the Yucatan Peninsula. The main interest in studying both regions is due to differences in relief. The effects of mountains dominate Mexican climates, which is definitely not the case for the Yucatan Peninsula. The Yucatan peninsula is a tropical and flat (altitudes < 450 m) region between the Gulf of Mexico and the Caribbean Sea (86°45' - 92°30' W and 17°45' - 21°45' N). This Mexican region encompasses approximately 145,000 km2. The main aim of the present paper is to analyse the behaviour of variance at different spatial scales for datasets over multi decadal timescales. The main objective is to find adequate scales for studying spatial variations of key climatic elements, i.e. mean yearly precipitation and temperature, in Mexico and in particular the Yucatan peninsula. The examination of theoretical variogram models is useful for describing the spatial structure of climatological processes. The shapes of the variogram and models denote the 15 spatial behaviour of the variables and so supply important information (Christakos, 2005) with differences between temperatures and precipitation and both spatial domains. In particular, the sequence of different models along the pattern of Active Lag Distances makes it possible to determine climatological scales (Orlanski, 1975) corresponding to the spatial structure of each climatological phenomena. The present study is organized as follows: climatic data will be described along with the methodology for the variograms calculations and the relative results evaluation. Subsequently, we presented the spatial structure of temperature and precipitations as well as the scale variations in the variograms. Finally, we determined climatological scales relative to the spatial evolution of the climatic variables. 2. Data and Methodology 2.1 Data The National Meteorological Service has integrated the daily and monthly hydroclimatic surface data set for Mexico in the CLImate COMputing project database (CLICOM). The CLICOM database is composed of 5389 climate stations distributed throughout Mexican territory. Their daily readings are observed at 08:00 LST and the value reported for the daily observation represents data collected during the previous 24 hours, ending at 08:00. The CLICOM time-series starts in 1902 and ends in 2013. However, the surveying periods of each station vary from 30 days to 107 years with a median of 29 years. The data included are daily and monthly values of observed, minimum and maximum temperature (° Celsius), precipitation and evaporation (millimetres), the daily occurrence of electric storms, hail, fog, frost and two levels of cloud cover. A station catalogue, which contains longitude, latitude (in Degrees Minutes Seconds format) and altitude (metres), is part of the database. Despite the important number of observation stations with short periods of data, we used the last climate normal period (1981-2010), a three-decade average applied to temperature and precipitation in order to characterize the climate at a precise location (Arguez & Vose, 2011). We computed the annual averages of precipitation and temperature for each station, i.e. the mean yearly precipitation and temperatures, elements that are often used for interpolation and zonal climate description (Li & Bai, 2012; Zuo, Lü, & Hu, 2004). The monthly data were averaged by month for the entire data period when the number of years exceeded 20, since this is the minimum recommended period to calculate climate normals (World Meteorological Organization, 2009); shorter periods were not considered in the present study. The twelve mean monthly values were subsequently averaged to establish annual average precipitation (P) and temperature (T). For the variographic analysis, a total of 2117 climate stations were retained for P data and 1973 for T data; their spatial distribution is shown in Figure 1. From each of these datasets, 102 stations belong to the Yucatan Peninsula. 16 Figure 1: Elevation map of Mexico. Black dots correspond to climatological stations with precipitation data (a) and temperature data (b). 17 In order to calculate the ground distance between stations, the geographical coordinates of the station database need to be converted to Cartesian values (Olea, 2006). The coordinates of the CLICOM stations are geographical; their latitude and longitude are DMS with the International Terrestrial Reference Frame 1992Datum. The r-package rgdal (R. Bivand, Keitt, & Rowlingson, 2015) was used to change the map system to the projected system most suitable for Mexico and used by the Instituto Nacional de Estadística y Geografía (INEGI) for national maps; the Lambert Conformal Conic (LCC) and metric coordinates are calculated for each climate station. The Lambert Conformal Conic Projection for Mexico has standard parallels at 17.5° and 29.5° N, a longitude of central meridian at 102° W, a latitude of projection origin of 12° N, a false easting of 2500 km and no false northing. 2.2 Data exploration and pre-treatment In order to analyse the global spatial autocorrelation, we computed Moran's Index (Moran, 1950) and produced Moran's plots and maps for P and T databases. We used R version 3.2.1 (R Core Team, 2015) and package GeoXp (R. S. Bivand, Pebesma, & Gómez-Rubio, 2013; Laurent, Ruiz-Gazen, & Thomas-Agnan, 2012) to evaluate Moran’s Index, which was calculated considering the 4th nearest neighbour weighted with a row standardized coding scheme (R. Bivand, Hauke, & Kossowski, 2013; R. Bivand & Piras, 2015; Tiefelsdorf, Griffith, & Boots, 1999). Topography has an important effect on temperature and precipitation. The adiabatic lapse-rate has been measured since the early 20th century (Brunt, 1933). Furthermore, rainfall is known to increase with elevation; for a mountainous island, authors evaluate precipitation using cokriging with DEM data (P. Goovaerts, 2000; Sarangi, Cox, & Madramootoo, 2005) or regression kriging for southern Europe (Bajat et al., 2012; Tadić, 2010) and a mountainous region of Mexico (Boer, de Beurs, & Hartkamp, 2001). Other authors note that precipitation does not have the same linear relationship with elevation as temperature, particularly in tropical regions, and it depends on crest, valley and orientation effects (Wotling, Bouvier, Danloux, & Fritsch, 2000). Therefore, in order to remove the geographical tendency of data and to explore the relationship between altitude and P or T we applied regressions considering LCC coordinates and elevation. We applied linear and polynomial regressions to original datasets and judged the models considering P-value (Murtaugh, 2014) for each factor and model general Akaike information criterion (Akaike, 1974; Claeskens & Hjort, 2008). We retained residuals datasets resulting from the most suitable models. The regionalized variable theory assumes second order stationarity in the data (Oliver & Webster, 2015), where the mean has to remain constant and independent of the location within the region of stationarity, i.e. data should be detrended. Considering that the data structure depends on climatological phenomena and relief, we expect differences between the spatial structure of raw data and residuals. In order to explore these spatial structure differences between the original and the detrended data, we also conserve the original data for variographic analysis. 18 2.3 Variogram calculation Variograms were calculated using r-package geoR (Diggle & Ribeiro Jr., 2007; Ribeiro Jr & Diggle, 2015). This open source software particularly allows extended spatial pattern exploration and has been used for climatological studies (Fowler & Ekström, 2009; Wilson & Silander, 2014). In order to determine the data variance at different scales, we performed isotropic and anisotropic variograms for each of the climatic datasets for 50 equidistant log- scale active lag distances (ALDs). The ALDs tested were established by a geometric progression which has a common ratio of exp[ln(1000)/49] from 0.1% to 100% of the zone diameter (3564.86 km, distance between South-eastern and North-western stations: "Tecnológico Chetumal" (88°18'03" W - 18°31'08" N) and "Presa Rodríguez - Tijuana" (116°54'28" W - 32°26'49" N). We decided to test a large range of ALDs despite the fact that (Arnaud & Emery, 2000) reported that an experimental variogram is not reliable for distances greater than the zone radius because pairs of values are not numerous enough, hence the highest interval variance values are over estimated. Furthermore, we were aware that the shape of the study area, with large peninsulas in the far east and west (Figure 1), might increase this problem. We had the same issue with the minimum lag distance values. Although an elevated number of spatial data could not compensate for these limitations, it still proved interesting to explore the variance structure at these scales, since finding the limitations of the applied methodology was part of the objectives of the present study. In addition, this problem can be controlled by establishing a minimum threshold of data pairs to be compared. Hence, we considered at least 15 data pairs for each of the 15 lags. The lag class distance interval used in each variogram remained at 1/15 of the active lag distance as the default parameter for the geoR package; this could be considered a wide interval but the aim was to have many pairs of data from climate stations in order to calculate the variance represented by each point on the graph. The fact that a wide interval reduces the sharpness of the structure for the first points in a variogram was not relevant for this study since it considers variograms for different scales. Because of the eventual presence of a spatial trend or zonal anisotropy in the data and zonal anisotropy in the residuals, anisotropic variograms were also calculated for every direction, i.e. each integer degree direction from 0 to 179 with a 22.5° tolerance. The same methods used to elaborate isotropic variograms were applied. Angles between 180 and 359 give the same results as from 0 to 179° for any pair of measurements. The models tested all corresponded to positive definite functions. These were Cauchy, circular, cubic, exponential, Gaussian, Gneiting, linear, power, powered exponential, pure nugget, spherical and wave (Diggle & Ribeiro Jr., 2007). We chose to test 12 models since this provided us with greater flexibility and enabled us to define the spatial behaviour of our values. It is worth noting that the Gneiting model was developed for atmospherical analysis (Gneiting, 1999). Each model was fitted using the least Weighted Sum of Squares (WSS) of the residuals between the model and the variogram. We chose WSS weighted by the Cressie 19 method (Cressie, 1985) because of the assumption that the variogram consistency increases with a greater number of data and the fact that the first lags are the most important, before reaching the sill (Pierre Goovaerts, 1997). We applied the same method to the data and residuals for Mexico as a whole and for the Yucatan Peninsula to explore the presence of structure in the variance at any scale. With regards to the Yucatan Peninsula, we elaborated the variograms for adapted distances i.e. 25 ALDs from 2595 to 648743 m (i.e. 1/250 to 1 zone diameter) respecting a geometric progression. 2.4 Model evaluation The outputs from the variograms constructed with LCC coordinates for data from Mexico consisted of 108,000 anisotropic variograms and 600 isotropic variograms (+54,000 and 300 respectively for the Yucatan peninsula) for each climatic variable tested (P, T, P residuals and T residuals). Each graph is accompanied by the values of the weighted residual sum of squares (WSS), coefficient of determination (R2), nugget (C0), amount of the structured variance (C), model's sill (C+C0), proportion of the structured variance [C/(C+C0)], range (A), number of pairs and lags. Such a quantity of data leads to establish rules in order to select the suitable variograms. First, since we considered ALDs from 0.1 to 100% of the study zone diameter, we discarded the variograms that did not have variance estimated for each of the computed lags; this was particularly common for the low ALDs and anisotropicvariograms. The second rule established concerns the relationship between the theoretical range and the active lag distance. If a variogram is bounded and covariance exists, (Journel & Huijbregts, 2004) consider that there is a second-order stationarity, hence the sill value is equal to the variable variance. Inversely, when the model does not reach the sill within the active lag distance, the variance cannot be defined. That is why we excluded variograms with an inexistent range or one that exceeded the active lag distance. On the other hand, taking into account the fact that three points are needed to evaluate an alignment, we discarded variograms when A / ALD < 0.2. The R2 and WSS values indicate how well the model fits the data. The latter determines the amount of variance in the data that is not explained by the fitted model (Draper & Smith, 1998). However, WSS mainly vary with active lag distance because the total of variance explained is higher for large ALDs than for small ALDs. Consequently, the use of WSS to evaluate the optimum scales for the climate data is not appropriate but it can help to determine the most relevant models for each active lag distance. For anisotropic models, we determined the direction by using the axis of maximum continuity based on the best structure and largest range (Isaaks & Srivastava, 1989), i.e. A / ALD must ideally be close to 1. Finally, we applied a cross-validation to all the models resulting in R2 ≥ 0.8. This technique allowed us to judge how well the model fits the data (Delfiner, 1976). In particular, we 20 retained the goodness of fit and the Mean Standard Error (MSE) of the estimates resulting from the model. As this process is time consuming, for anisotropic models, we first validated 100 random points before a validation using all the points concerning the 200 models with the least MSE. We applied the same methods for the subset of data corresponding to the climate stations in the Yucatan Peninsula and used the 102 points for validation of all the available models. In order to confirm the independence of the variogram data from the spatial transformation of Geographical coordinate values to the Lambert Conformal Conic System in a such extended territory as Mexico is, we also applied the methods of this study, except cross-validation, after having changed the coordinates to Universal Transverse Mercator (UTM). UTM is a frequent projection used in Mexico for regional studies. Zone 14 was chosen because of its central position in the country. 3 Results 3.1 Data exploration The distribution of T for Mexico shows data near normality: skewness=-0.2, kurtosis=- 0.91 whereas the distribution of P for Mexico shows that data are not normal: skewness=2.32, kurtosis=8.08. The distribution was obtained in accordance with Barger and Thom (Barger & Thom, 1949) who suggested the incomplete gamma function for the representation of rainfall. Scatter plots of data against X or Y coordinates denote spatial trends (Figure 2), with data decreasing towards the North (b) and (d) and increasing towards the West with regards to P (a) and to a lesser extent T (c). 21 Figure 2: Relationship between precipitations normals (P) (a) (b) and temperatures normals (T) (c) (d) and LCC coordinates X and Y Data are spatially correlated; the Moran´s Index evaluation gave values of 0.79 and 0.83 for P and T respectively. P-values were equal to 0.0001 for both datasets. Moran's maps, elaborated from the Moran scatterplots (c.f. Supplementary material Figures S1, S2) (Anselin, 1996), show that there are no real outliers. A few data points seem to be outliers but they belong to climate stations situated in transition zones, particularly on the side of the "Cordilleras Madres" where elevation variation can exceed 2000 m within a very short distance (Figure 1). Linear regression P-values show that P and T are globally correlated with coordinates and elevations. For T, elevation explains 73.75% of the data; hence, we were able to estimate a general adiabatic lapse-rate of about 4.416°C/1000 m for all of Mexico. However, polynomial models permitted a better definition of the variables evolution regarding to coordinates and elevation (Table 1). 22 Table 1: Factors of polynomial regressions between data and LCC X and Y coordinates (m) and elevation (m), Residual Standard error and R2 are 424.6 and 0.5003 for precipitation normals (P), and 1.412 and 0.8914 for temperature normals (T) respectively. P (mm) T (°C) Estimate Std. Error T-value P-value Estimate Std. Error T-value P-value Intercept 1261 190.4 6.625 4.4E-11 11.73 1.085 10.807 <2E-16 X 7.19E-04 1.46E-04 4.925 9.09E-07 Y -2.19E-03 1.23E-04 -17.763 < 2E-16 Elevation -0.313 0.04 -7.829 7.74E-15 -3.32E-03 2.18E-04 -15.231 <2E-16 X^2 -7.84E-11 2.671E-11 -2.936 3.36E-03 1.72E-11 1.25E-12 13.705 <2E-16 Y^2 6.52E-10 5.136E-11 12.688 < 2E-16 Elevation^2 6.25E-05 1.55E-05 4.038 5.59E-05 -6.50E-07 1.22E-07 -5.33 1.09E-07 X^3 -7.80E-18 6.44E-19 -12.109 <2E-16 Y^3 -1.23E-17 1.83E-18 -6.734 2.16E-11 Y^4 1.53E-23 2.86E-24 5.346 1.01E-07 Elevation^4 1.45E-14 7.16E-15 2.022 4.33E-02 X^5 4.81E-31 5.37E-32 8.952 <2E-16 Y^5 -7.01E-30 1.53E-30 -4.584 4.84E-06 X^6 -6.12E-38 8.07E-39 -7.586 5.08E-14 Y^6 1.15E-36 2.75E-37 4.175 3.11E-05 P residuals become necessarily more centred than the original data and the number of values close to 0 is particularly high: skewness = 2.30, kurtosis = 13.56. Residuals were also clustered; Moran's Index was 0.6752 with a P-value equal to 0.0001. The T residuals present a near-normal distribution with skewness = 0.05 and kurtosis = 8.21. Residuals were also slightly clustered; Moran's Index was 0.2282 with a P-value equal to 0.0001. The lower spatial autocorrelation in the residuals than in the data, particularly for T, is due to the fact that linear regression removed a significant amount of the variation in the data, dependent on geographical position and elevation, leaving only unexplained variation in which noise and microclimate fluctuations become important. We also calculated the spatial correlation and residuals for the Yucatan Peninsula data. Moran’s I was low for P 0.4244 and particularly low for T 0.1062, indicating important random local variability. With respect to the removal of spatial tendencies, P-value and goodness of fit values denote a lower explanation of the data because the region was much smaller than Mexico and flat. However, trends were present on the three axes for both datasets. The main tendencies shown were the South-North reduction in rainfall with - 1.321 mm/km and P-value = 1.8e-10, which is related to elevation. A spatial correlation test for residuals gave reduced Moran’s I values of 0.2533 for P residuals and -0.0703 for T. Concerning the T residuals, the results indicated a random spatial pattern; therefore, 23 geostatistical analysis of these data may not provide relevant results. We consequently excluded the dataset from the variographic analysis. 3.2 Variographic scale analyses 3.2.1. Mexico variograms For Mexico, including the Yucatan Peninsula, the minimum distance tested as ALD was 3242 m. However, variograms did not have 15 lag points for the ALDs less than 10 km because of the lack of pairs for shortest distances. Taking into account the 15 pairs per point minimum for the validity of a variogram, the shortest available ALDs are 6560 m for isotropic variograms of precipitation, 10014 m for isotropic variograms of temperature and 11530 and 13276 m respectively for the anisotropic variograms. Graphs showing the variation of the average standard deviation of variogram points for lag ranges (c.f. Supplementary material Figures S3 – S6) demonstrate the spatial response of the variables. Ifthe lowest values were found for short distances and the highest values for long distances, the increase in average standard deviation is irregular; there is evidence of steps and a decrease in the variance from 400 km for T. For P, values increase notably from 1500 km. In addition, these boxplots show high relative variation in the average standard deviations for the first and last ranges. In order to explore the spatial scale related variogram and model variations, we plotted variograms with the most adequate model for 10 selected distances (c.f. Supplementary material Figures S7 – S14). We also plotted the sequence relative to ALDs of the least WSS and the highest R2 from the valid variogram and its models (Figure 3). The increase in WSS is regular and marked by steps (Figure 3 (b) and (d)). In contrast, the higher values of R2 indicated a better fit for median scales. The reduced number of pairs for the shortest and longest ALDs resulted in a poor fit to the model line. In general, R2 values for the models concerning residuals were lower than for raw data (Figure 3 (a) and (c)), particularly for T. However, WSS may simultaneously denote a lower discrepancy between the variogram data and the model. 24 Figure 3: Lowest RSS and highest R2 for valid variograms for each Active Lag Distance, relative to Precipitation (a) (b) and Temperature (c) (d). From all the isotropic variogram models tested for P data, 142 models were validated by filtering. These correspond to ALDs between 40 and 600 km and greater than 1000 km. Gaussian-like models (Gaussian, Cauchy and Gneiting) represent half of the available models and are present for each of these ALDs except 1049 km for which there is only a wave model. It is interesting to note that Gaussian or Gneiting sill values for ALD ≈ 120 km are similar to nugget values for ALD ≈ 1200 km, about 140,000. The exponential model is found in valid variograms from 60 to 600 km and results in the best R2 for distances between 250 and 450 km. The models with the lowest validation MSE (217.3) are the Cauchy models applied to the shortest ALDs (41 and 47.2 km). For both, the sill is reached at 71% ALD, C / (C0+C) is about 0.85 and they result in a cross validation R2 = 0.87. These Cauchy models also give good results for longer distances (54-72 km). For longer distances, a powered exponential model, which corresponds to an ALD of 519 km i.e. 16% of the maximum ALD, gives a MSE = 224. Furthermore, this model has a low nugget (2578) and consequently a high proportion of structured variance 0.989. The sill is reached at 34.4% of the ALD. The available isotropic variogram models tested for P residuals correspond to ALDs between 37.1 and 603.8 km and between 1003 and 1138 km. The number of suitable models (96) is lower than the number of models we found for raw data, also there are fewer Gaussian 25 like models compared to previous variograms. This mainly occurs for low and high ALDs. The circular model is the most frequent (17 times), found for each ALD between 54.2 and 169.9 km, 248.5 and 468.5 and for 1138 km. Gaussian or Gneiting sill values are similar to nugget values for larger ALDs (ALDs between 37.1 and 116.1 km vs. greater than 1000 km). The model with the lowest validation MSE (227.6) is an exponential model, which corresponds to a 54.2 km ALD i.e. only 1.5% of the maximum ALD. It has an average nugget (22735) and a good proportion of structured variance 0.878. The sill is reached at 93.5% of the ALD. This models also provide the best cross validation coefficient of determination with adjusted R2 = 0.858. The other models that fit the data best can be found for ALDs = 47.8, 61.6 km and correspond to Gaussian and Gneiting models for the first ALD and to exponential for the last one. From the study of isotropic variograms of P and its residuals, the use of Gaussian like models for short ALDs, generally less than 70 km, proved important. In addition, the best validation results were obtained from raw data. From all the anisotropic variogram models tested for P data, 15469 models proved to be applicable after filtering process with all ALD values from 26.8 km. The most frequent ALD and azimuth are 127 km and 125°. Circular, cubic and spherical models are present for all these ALDs except 35.6 km, while Cauchy, Gaussian and Gneiting models can only be found until 2445 km and powered exponential models for median ALDs (110-912 km). The variograms with the best R2 values correspond to Cauchy models at 388 and 912 km with an azimuth of about 98°. The proportion of structured variance is also good with values greater than 0.86. The circular model results in the lowest validation MSE (257.1) and corresponds to an ALD of 912 km and 95°. The proportion of structured variance is 0.87. The sill is reached at 95.1% of the ALD. The cross validation R2 is 0.82. The number of available anisotropic variogram models tested for P residuals is 10232. These correspond to ALDs between 15.3 and 17.3 km and between 32.7 and 1665 km. The azimuth mode differs little from that previously cited with 131° and the ALD mode is equal to that noted for P data with 131.8 km. Gaussian and Gneiting models are the most common followed by the circular model. Cauchy, Gaussian, Gneiting and wave are the only models applicable to 17.6 km with an azimuth of 80°. The model with the lowest validation MSE (225.0) is an Exponential model, which corresponds to an ALD of 131.8 km and an azimuth of 124°. This model has a low nugget (10026) compared to the sill and a good proportion of structured variance 0.93. The sill is reached at 48.2% of the ALD. The adjusted R2 between the model of the residuals and the original data is equal to 0.86. The other models that best fit the data are all exponential, with ALD between 69.9 and 282.1 km, tested for azimuth values between 113 and 133°; also, the MSE is lower than 230 for 98 of these models. Other exponential model for a shorter distance 61.6 km and a different angle 158°, provides a low MSE value too (225.5). 26 In contrast to our findings for isotropic variograms, the validation of an anisotropic variogram applied to P residuals gives better results than for raw data, particularly when using an exponential model for 131.8 km. We found 179 available isotropic variogram models for T data. These are related to ALDs between 20 and 1602 km. Of these models, 27 are circular for ALDs = 26.8-47.2, 62.6-1050 and 1391 km. This type of model also gives the best R2 (0.998) and low WSS (33.5) for 146 km. Cauchy and Gneiting are the only models present for the minimum ALD, and Gaussian-like models can be applied up to 687 km. With regards to the spherical model, this can mainly be applied to long distances (193-1391 km). A better validation is obtained using this model when applied to 391 km, with an MSE = 1.61 and R2 = 0.86. The spherical model for an ALD = 340 km, Cauchy for ALDs = 41, 72, 83, 110, 126 or 145 km and circular model for 295 km also result in a MSE ≤ 1.65. We considered 36 isotropic variogram models for T residuals. The ALDs applied to these models are between 69.9 and 248.5 km. Spherical models are the most frequent within our results followed by Circular, and Cubic. For the shorter ALDs, Gaussian like models can also be applied. Six models correspond to the lowest validation MSE (between 1.28 and 1.30). These are three powered exponential models, which corresponds to ALDs of 69.9, 79.4 and 149.6 km. The nugget values for these models vary from 0.986 to 1.456, with a low proportion of structured variance; it is only superior to 0.5 for the two best models. Also, the sill is reached at 77.3% of the ALD except for 149.6 km with 61.9%. The final adjusted R2 value, considering the linear regression, is equal to 0.91. The differences in the considered models MSE andR2 are low, 0.056 and 0.008 respectively. The best validation results for isotropic variograms of T are obtained with a powered exponential model and a short range of ALDs applied to T residuals. In addition, in contrast to our findings for P, the most applicable models and distances for T are not identical to those identified for T residuals, with a reduction in the available distances and the absence of Gaussian like models while working with residuals. From the anisotropic variogram models tested for T data, 26403 can be applied. These correspond to ALDs between 17.6 and 1844 km. The most frequent ALD is longer than previously encountered for P (391 km); the azimuth mode is 118°. The circular model is available for all these distances while the spherical model cannot be used at 20.3 km. Cubic, exponential, Cauchy, Gaussian and Gneiting models are also of the most frequent. In particular, Gaussian-like models can be applied to every angle for ALDs between 54 and 450 km. The highest R2 values are obtained for Cauchy models for azimuth values of 40 or 73° and distances of about 256 and 450 km. The application of the exponential model for 340 km and 14° gives the best MSE (1.63); the sill is reached at 158 km and the proportion of structured variance is 0.99. The best models are Cauchy for 146 km and 173°, and for 256 km and 40°. For all of these, the MSE is less than 1.75. We found 852 available anisotropic variogram models for T residuals. These models are linked with ALDs values between 61.6 and 1292 km. The mode values for ALD and the 27 azimuth are 248.5 km and 144°. Exponential and powered exponential are the only models that can be applied to the shortest ALD and to 320.3 km; however, circular and spherical models are the most frequent. Also, Gaussian like models cannot be applied to distances superior to 250 km. The most elevated cross validation adjusted R2 and least MSE values are found for Exponential and powered exponential variograms for 248.5 and 282.1 km and about 40°. These are combined with a notable nugget effect (1.8) that reduces the proportion of structured variance (0.6). Widely, 65 models provide a validation MSE inferior to 1.28 for angles about 0 and 40° and ALDs of 116.1 km and from 218.9 to 282.1 km. It is worth noting that the larger distance only corresponds to 7.9% of the maximum ALD. These models have an average nugget (1.20) but a low sill (2.51), resulting in C / (C0+C) = 0.52. The sill is reached at only 57.1% of the ALD. Finally, with regards to anisotropic variograms of T and its residuals, the treatment of residuals makes it possible to obtain a better validation. However, the lowest MSE is similar to the best one obtained from an isotropic variogram. Concerning the regionalized variables tested, the variograms and their corresponding model attributes resulting from the application of geostatistical methods to the same datasets with UTM coordinates are similar to values obtained with LCC coordinates. Relative variations for WSS, R2, C0, C and A were less than 0.5%. 3.2.2 Yucatan Peninsula variograms We counted 4049 experimental variograms with 15 points and a minimum of 225 pairs for the Yucatan Peninsula from 103 to 515 km; however, the number of applicable models is low and varies according to the variable being studied. In fact, for temperature, there are only 8 valid models. For precipitation, we found 13 isotropic variogram models and 835 anisotropic variogram models. With regards to the residuals, we encountered 9 and 707 models respectively. The isotropic variograms for P data correspond to the ALDs between 205 and 325 km and to 515 km. At 205 and 325 km, data can be modelled using wave or Gaussian-like models. The wave model is also the only model valid for 515 km. For 258 km, Cauchy, circular, exponential, Gaussian, Gneiting and spherical models are applicable. The best model is the Gaussian model (MSE = 156.6) applied at 205 km and reaching the sill at 125 km. This model has an elevated nugget effect (19108), which suggests a moderate proportion of structured variance (57%). Although the model fits the variogram (R2 = 0.93), it fits the data poorly (R2 = 0.46). Concerning P residuals, similar ALDs proved to be applicable (205, 258, 352 and 515 km). Only the wave model can be applied at a distance of 515 km, whereas the wave and the Gaussian-like models can be applied at 205 and 325 km. Only the powered exponential model can be applied at 258 and 325 km and only the circular model at 515 km. For an ALD of 256 km, Cauchy, circular, exponential, Gaussian, Gneiting and spherical models are valid. The Cauchy model for 409 km results in a better validation (MSE = 151.4; R2 = 0.48). Its nugget 28 and sill are almost identical to Gaussian models found for P data. The second-best model is the circular model at 515 km, which primarily differs from the previous model by its range of 407 km instead of 140 km (MSE = 152.9; R2 = 0.49). Anisotropic variogram models are applicable to P data from 130 to 515 km. However, only the cubic model is available for the minimum distances and only the circular or Gaussian-like models for 515 km. The distances with the greatest number of available models are 205 and 325 km. The azimuth mode is 96°. Nevertheless, the lowest MSE (150.5) is obtained for 23° at 325 km. We can particularly note that this azimuth corresponds to the longitudinal orientation of the climatic zones defined for the Peninsula (García, 2004a). It is also interesting to see that the nugget effect (13287) is lower than for isotropic models, despite the fact that the range is much larger (237 km) and the model similar (Cauchy). Nine models produce MSE < 151; all Gaussian-like models for 21 to 26° and for 258 and 325 km. The valid anisotropic variograms for P residuals correspond to the ALDs between 163 and 515 km, the majority of the models tested proved to be applicable at 205, 325, 409 and 515 km while only the cubic model could be applied at 163 km and only the circular at 258 km. The distance with the greatest number of available models is 409 km and the azimuth mode is 27°. The best models are Gaussian, both for 409 km and 77° and for 325 km and 23° (MSE = 149.4; R2 = 0.49). We found 67 valid models resulting in MSE < 150, all of them were Cauchy, Gaussian or Gneiting models and correspond to azimuth ranges of 21-28° and 73-94°. We particularly note that for 515 km only the first angle range gives low MSE. The nugget effect varies between 13800 and 19200 and the range between 174 and 247 km. The valid models for temperature correspond to anisotropic variograms of T data. Seven of them are applied at 258 km with an angle between 32 and 38°. The best MSE are obtained using Gneiting, Gaussian and Cauchy models (0.68), however, this value is greater than the linear regression standard error (Table 1); the validation R2 is also very low at 0.1. 4 Discussion and Conclusions We applied variographic analysis to mean yearly precipitation and temperature, corresponding to a normal period for Mexico and the Yucatan Peninsula. Variograms were calculated using distances from several kilometres to the zone’s diameter and by applying 12 models. The minimum number of pairs per variogram point used in this study (15) was low compared to the value (30) recommended by (Cressie, 1991; Pierre Goovaerts, 1997). Nevertheless, it highlighted data structure without being used for interpolation. The structures highlighted were definitely dependent on the data and the study scale but could not be related to the geographical transformation of the coordinates and projection used. The objective of this paper was to identify an adequate observational scale for studying spatial variations in the average annual precipitation and temperature in Mexico and in particular the Yucatan Peninsula.This was accomplished using available data. A geostatistical overview of climatic data is useful for the analysis of the spatial behaviour of the variables 29 and allows us to see that these data could be studied using different scale models and orientation. In addition, different scales are applicable to T and P. The most suitable distance for highlighting precipitation structure in Mexico was approximately 41 km. Therefore, we can consider the spatial evolution of precipitation to be a meso- phenomenon. Relief influenced P by generating variation in elements such as local orientation. Although linear regression can explain 50.0% of the data variation, variographic analysis of the residuals highlighted using isotropic variograms give a similar variance structure but presented a lower quality in the structure definition. At the contrary, for anisotropic analysis, the scale changes between the application to data and to residuals, the first being more suitable for meso-α scale and the second for meso-. In addition, variographic analysis show a preferable direction in spatial data structure (about 120°); which is frequent for raw data but more clearly identifiable for P residuals, despite of the fact that anisotropical variograms were lower rated than isotropic variograms from raw data. Although geostatistical techniques are appropriate for the spatial analysis of precipitation (Grimes & Pardo-Igúzquiza, 2010), we detected a high error with an MSE of at least 217 mm, i.e. 25.6% of the stations’ average P value, as reported by (Feidas et al., 2013). This is due to the previously mentioned micro-scale variation. Fixing this problem requires the application of methods used for physiographically sensitive mapping (Daly et al., 2008). Due to the flat relief of the Yucatan Peninsula, the data structure and associated scales are different. For data as for residuals, isotropic or anisotropic variograms, the distances that best define the structure of the data correspond to meso-α scale distances; however, optimum ALDs tend to be larger for residuals than for raw data, as we could have expected. In addition, contrary to what we find for Mexico, the best validation resulted from the application of anisotropic variograms to residuals for distances between 325 and 409 km. This result is in accordance with those found by other authors for China and the application of a local regression-kriging technique and for a region in Spain (Moral, 2009; W. Sun, Zhu, Huang, & Guo, 2014). As reported by Webb et al. (Webb, Hall, Kidd, & Minansy, 2015), linear regression was particularly adapted to T with a 20.6% gain in the MSE. While the distance that highlights T data structure is about 350 km for isotropic or anisotropic variograms, the structure of residuals can be discovered at other scales (248.5-282.1 km for anisotropic variograms, a meso-α scale distance; and from 69.9 to 149.6 km for isotropic variograms, meso- scale distances). It is also interesting to note that the most effective models were exponential and powered exponential models, i.e. those that most demonstrate a strong local dependence since the variance increases near the origin. Concerning the Yucatan Peninsula, for precipitation, anisotropic variogram models of the residuals are preferable, while geostatistical methods do not appear to be efficient for T. With regards to the models, for the shortest ALDs (i.e. ≤ 15 km), Cauchy or other Gaussian- like models best fitted the variograms of the residuals and of the original data. Associated with a nugget effect, the slow increase in variance near the origin delimits the maximum scale for microclimate variations at the base of theses variograms i.e. about 3 km. The slight 30 increase in variation near the origin of the Gaussian-type models found for the large-scale study of precipitation may correspond to the sills of other models, which, if the density of stations allowed, could express the meso-γ and micro-α scale climatic variation. We also found these models for other scales demonstrating the presence of a complex structure. For long distances (e.g. 515 km for P residuals), wave models testify the presence of cyclicality. At this scale, this can correspond to the existence of the two “Cordilleras Madres” and their influence on the climate, particularly precipitation. Anisotropic variogram analysis of the four variables tested demonstrated a zonal anisotropy (Wackernagel, 2003). The values of the models’ sills and range varied with the azimuth, however, maximum and a minimum variance directions are not perpendicular as they should be (Deutsch & Journel, 1997). Therefore, we can infer that the irregularity of the topography in Mexico causes the directional variations of the variance sill and range. The conclusion of this study is that the variance structure of climatological data varies as a function of the element studied, scale and relief because spatial evolution of the phenomena is complex. Various mesoscale distances highlight different structures that have physical meanings. In addition, the ALDs related to structures that best fit the data rarely correspond to study zone radius. 5 References Aalto, J., Pirinen, P., Heikkinen, J., & Venäläinen, A. (2012). Spatial interpolation of monthly climate data for Finland: comparing the performance of kriging and generalized additive models. Theoretical and Applied Climatology, 112(1–2), 99–111. https://doi.org/10.1007/s00704-012- 0716-9 Akaike, H. (1974). A new look at the statistical model identification. IEEE Transactions on Automatic Control, 19(6), 716–723. https://doi.org/10.1109/TAC.1974.1100705 Anselin, L. (1996). The Moran scatterplot as an ESDA tool to assess local instability in spatial association. Spatial analytical perspectives on GIS, 111, 111–125. Arguez, A., & Vose, R. S. (2011). The Definition of the Standard WMO Climate Normal: The Key to Deriving Alternative Climate Normals. Bulletin of the American Meteorological Society, 92(6), 699–704. https://doi.org/10.1175/2010BAMS2955.1 Arnaud, M., & Emery, X. (2000). Estimation et interpolation spatiale : methodes deterministes et methodes geostatistiques. Paris: Hermes. Recuperado de http://publications.cirad.fr/une_notice.php?dk=487412 31 Arsenault, R., & Brissette, F. (2013). Determining the Optimal Spatial Distribution of Weather Station Networks for Hydrological Modeling Purposes Using RCM Datasets: An Experimental Approach. Journal of Hydrometeorology, 15(1), 517–526. https://doi.org/10.1175/JHM-D-13-088.1 Baccour, H., Slimani, M., & Cudennec, C. (2012). Structures spatiales de l’évapotranspiration de référence et des variables climatiques corrélées en Tunisie. Hydrological Sciences Journal, 57(4), 818–829. https://doi.org/10.1080/02626667.2012.672986 Bajat, B., Pejović, M., Luković, J., Manojlović, P., Ducić, V., & Mustafić, S. (2012). Mapping average annual precipitation in Serbia (1961–1990) by using regression kriging. Theoretical and Applied Climatology, 112(1–2), 1–13. https://doi.org/10.1007/s00704-012-0702-2 Barger, G., & Thom, H. (1949). Evaluation of drought hazard. Agronomy Journal, 519–526. Bivand, R., Hauke, J., & Kossowski, T. (2013). Computing the Jacobian in Gaussian Spatial Autoregressive Models: An Illustrated Comparison of Available Methods. Geographical Analysis, 45(2), 150–179. https://doi.org/10.1111/gean.12008 Bivand, R., Keitt, T., & Rowlingson, B. (2015). rgdal: Bindings for the Geospatial Data Abstraction Library. Recuperado de http://CRAN.R-project.org/package=rgdal Bivand, R., & Piras, G. (2015). Comparing Implementations of Estimation Methods for Spatial Econometrics. Journal of Statistical Software, 63(18), 1–36. Bivand, R. S., Pebesma, E., & Gómez-Rubio, V. (2013). Applied Spatial Data Analysis with R. New York, NY: Springer New York. Recuperado de http://link.springer.com/10.1007/978-1-4614-7618-
Compartir