In this paper, we consider the clustering of bivariate functional data, where each random surface consists of a set of curves recorded repeatedly for each subject. A k-centres surface clustering method based on marginal functional principal component analysis is proposed for such data, together with a novel clustering criterion that accounts for both the random surface and its partial derivatives in the two directions. In addition, we consider two other k-centres surface clustering methods, based on product functional principal component analysis and on double functional principal component analysis, respectively. Simulation results indicate that the proposed methods perform well in terms of both the correct classification rate and the adjusted Rand index. The approaches are further illustrated through an empirical analysis of human mortality data.
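As an illustration of one of the evaluation measures named above, the adjusted Rand index can be computed from the contingency table of true and estimated cluster labels. The following is a minimal sketch of the standard pair-counting form of the index, not the authors' code; the function name and the toy labels are assumptions made for illustration.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index (pair-counting form) between two partitions."""
    n = len(labels_true)
    if n != len(labels_pred):
        raise ValueError("label sequences must have the same length")
    total_pairs = comb(n, 2)
    if total_pairs == 0:
        return 1.0  # fewer than two items: the partitions agree trivially

    # Contingency counts n_ij and their row/column margins.
    cells = Counter(zip(labels_true, labels_pred))
    rows = Counter(labels_true)
    cols = Counter(labels_pred)

    sum_cells = sum(comb(c, 2) for c in cells.values())
    sum_rows = sum(comb(c, 2) for c in rows.values())
    sum_cols = sum(comb(c, 2) for c in cols.values())

    expected = sum_rows * sum_cols / total_pairs
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions have one cluster
    return (sum_cells - expected) / (max_index - expected)


# Toy labels (hypothetical): the same partition under different label names.
same_partition = adjusted_rand_index([0, 0, 1, 1], [1, 1, 0, 0])
# Toy labels (hypothetical): a partition that splits every true cluster.
crossed_partition = adjusted_rand_index([0, 0, 1, 1], [0, 1, 0, 1])
```

The index is invariant to relabelling of the clusters, which is why it is convenient for comparing a clustering result with known group membership in simulations.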
The extraordinarily high temperatures experienced during the summer of 2022 on the Tibetan Plateau (TP) demand attention when compared with its typical climatic conditions. The absence of precipitation alongside the elevated temperatures made 2022 the hottest and driest summer on record on the TP since at least 1961. Recognizing the susceptibility of the TP to climate change, this study employed large-ensemble simulations from the HadGEM3-A-N216 attribution system, together with a copula-based joint probability distribution, to investigate the influence of anthropogenic forcing, primarily global greenhouse gas emissions, on this unprecedented compound hot and dry event (CHDE). Findings revealed that the return period for the 2022 CHDE on the TP exceeds 4000 years, as determined from the fitted joint distributions derived using observational data spanning 1961-2022. This CHDE was directly linked to large-scale circulation anomalies, including the control of equivalent-barotropic high-pressure anomalies and the northward displacement of the subtropical westerly jet stream. Moreover, anthropogenic forcing has, to some extent, promoted surface warming and increased the variability of summer precipitation on the TP, establishing conditions conducive to the 2022 CHDE from a long-term climate change perspective. The return period for a 2022-like CHDE on the TP was estimated to be approximately 283 years (142-613 years) by the large ensemble forced by both anthropogenic activities and natural factors. In contrast, ensemble simulations driven solely by natural forcing indicated that the likelihood of occurrence of a 2022-like CHDE was almost negligible. These outcomes underscore that the contribution of anthropogenic forcing to the probability of a 2022-like CHDE was 100%, implying that without anthropogenically induced global warming, a CHDE comparable to that observed in 2022 on the TP would not be possible.
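The quantities in this abstract can be related by a few short formulas. The sketch below assumes annual events, so that the return period is the reciprocal of the annual joint probability, and it measures the anthropogenic contribution by the fraction of attributable risk, 1 - P_nat / P_all; the abstract does not state which measure or which copula family was used. The Frank copula, its parameter value, and the marginal probabilities 0.9 and 0.1 are purely illustrative assumptions.

```python
import math


def frank_copula(u, v, theta):
    """Frank copula C(u, v); theta != 0, and theta < 0 gives negative dependence.

    Used here only as an illustrative family (an assumption).
    """
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def hot_dry_probability(u, v, copula):
    """P(temperature > t, precipitation <= p), with u = F_T(t) and v = F_P(p)."""
    return v - copula(u, v)


def return_period(annual_probability):
    """Return period in years for an event with the given annual probability."""
    if annual_probability <= 0:
        return math.inf
    return 1.0 / annual_probability


def fraction_attributable_risk(p_natural, p_all_forcings):
    """FAR = 1 - P_nat / P_all (an assumed attribution measure)."""
    return 1.0 - p_natural / p_all_forcings


# Hypothetical marginal probabilities: 90th percentile heat, 10th percentile rain.
u, v = 0.9, 0.1
p_dependent = hot_dry_probability(u, v, lambda a, b: frank_copula(a, b, -2.0))
p_independent = v * (1.0 - u)

# The 283-year return period is the all-forcings estimate quoted in the abstract.
p_all = 1.0 / 283.0
far = fraction_attributable_risk(0.0, p_all)
```

With negative dependence between temperature and precipitation, the joint hot-and-dry probability exceeds the value obtained under independence, which is why a copula rather than a product of marginals is needed. When the natural-forcing probability is effectively zero, the fraction of attributable risk equals 1, which corresponds to the 100% contribution reported above.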
Background: Bivariate count data are commonly encountered in medicine, biology, engineering, epidemiology and many other applications. The Poisson distribution has been the model of choice for analyzing such data. In most cases mutual independence among the variables is assumed; however, this fails to take into account the correlation between the outcomes of interest. A special bivariate form of the multivariate Lagrange family of distributions, named the Generalized Bivariate Poisson Distribution, is considered in this paper. Objectives: We estimate the model parameters using the method of maximum likelihood and show that the model fits the count variables representing components of metabolic syndrome in spousal pairs. We use the local score test derived from the likelihood to test the significance of the correlation between the counts. We also construct confidence intervals for the ratio of the two correlated Poisson means. Methods: Based on a random sample of pairs of count data, we show that the score test of independence is locally most powerful. We also provide a formula for sample size estimation for a given level of significance and a given power. The confidence intervals for the ratio of correlated Poisson means are constructed using the delta method, Fieller's theorem, and the nonparametric bootstrap. We illustrate the methodologies on metabolic syndrome data collected from 4000 spousal pairs. Results: The bivariate Poisson model fitted the metabolic syndrome data quite satisfactorily. Moreover, the three methods of confidence interval estimation gave almost identical results, with essentially the same interval width.
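Two of the three interval methods mentioned above can be sketched in a few lines. The example below assumes paired counts (x_i, y_i) and estimates the ratio of means by the ratio of sample means; the delta-method variance uses the sample variances and covariance, and the bootstrap resamples whole pairs so that the correlation is preserved. The data are simulated with a common-shock construction (X = A + C, Y = B + C), which is an illustrative assumption and not the Generalized Bivariate Poisson Distribution of the paper; all function names and parameter values are hypothetical.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Knuth's multiplication method for a single Poisson(lam) draw."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def delta_ci_ratio(x, y, z=1.96):
    """Delta-method interval for mean(x) / mean(y) from paired data."""
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    vx, vy = statistics.variance(x), statistics.variance(y)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    r = mx / my
    rel_var = vx / mx**2 + vy / my**2 - 2.0 * cxy / (mx * my)
    se = r * math.sqrt(max(rel_var, 0.0) / n)
    return r - z * se, r + z * se


def bootstrap_ci_ratio(x, y, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling (x_i, y_i) pairs together."""
    rng = random.Random(seed)
    pairs = list(zip(x, y))
    n = len(pairs)
    ratios = []
    for _ in range(n_boot):
        sample = [pairs[rng.randrange(n)] for _ in range(n)]
        sx = sum(a for a, _ in sample)
        sy = sum(b for _, b in sample)
        if sy > 0:  # skip resamples where the ratio is undefined
            ratios.append(sx / sy)
    ratios.sort()
    lower = ratios[int((alpha / 2) * len(ratios))]
    upper = ratios[int((1 - alpha / 2) * len(ratios)) - 1]
    return lower, upper


# Simulated correlated counts (hypothetical rates; common shock C induces correlation).
rng = random.Random(42)
shock = [poisson_draw(rng, 1.0) for _ in range(300)]
x = [poisson_draw(rng, 2.0) + c for c in shock]
y = [poisson_draw(rng, 3.0) + c for c in shock]

ratio = statistics.fmean(x) / statistics.fmean(y)
delta_lo, delta_hi = delta_ci_ratio(x, y)
boot_lo, boot_hi = bootstrap_ci_ratio(x, y)
```

Resampling pairs rather than each margin separately is the design choice that matters here: it keeps the within-pair correlation that the independence assumption would discard.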
In this paper, we consider a system that has k statistically independent and identically distributed strength components, where each component is constructed from a pair of statistically dependent elements observed under a doubly type-II censoring scheme. These elements (X1, Y1), (X2, Y2), …, (Xk, Yk) follow a bivariate Kumaraswamy distribution, and each element is exposed to a common random stress T that follows a Kumaraswamy distribution. The system is regarded as operating only if at least s out of k (1 ≤ s ≤ k) strength variables exceed the random stress. The multicomponent reliability of the system is given by Rs,k = P(at least s of the (Z1, …, Zk) exceed T), where Zi = min(Xi, Yi), i = 1, …, k. The Bayes estimates of Rs,k are developed using Markov chain Monte Carlo methods because explicit forms are not available. The uniformly minimum variance unbiased and exact Bayes estimates of Rs,k are obtained analytically when the common second shape parameter is known. The asymptotic confidence interval and the highest probability density credible interval are constructed for Rs,k. The reliability estimators are compared in terms of their estimated risks through Monte Carlo simulations.
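The definition of Rs,k above can be evaluated numerically once the distributions are fixed. Because the Zi are independent and identically distributed and, by assumption here, independent of T, conditioning on T = t gives a binomial tail probability with success probability P(Z > t), which is then averaged over the stress distribution. The sketch below does this by midpoint integration over the stress quantiles. It treats Z as a univariate Kumaraswamy variable with cdf 1 - (1 - x^a)^b, which is a simplifying assumption for illustration and not the paper's bivariate model or its estimators; the parameter values are hypothetical.

```python
from math import comb


def kw_quantile(u, a, b):
    """Quantile function of the Kumaraswamy(a, b) distribution on (0, 1)."""
    return (1.0 - (1.0 - u) ** (1.0 / b)) ** (1.0 / a)


def kw_survival(x, a, b):
    """Survival function P(X > x) of the Kumaraswamy(a, b) distribution."""
    return (1.0 - x**a) ** b


def multicomponent_reliability(s, k, strength_survival, stress_quantile,
                               n_grid=20000):
    """R_{s,k} = E_T[ P(at least s of k strengths exceed T | T) ].

    The expectation over T is a midpoint rule on the uniform scale u = G(t).
    """
    total = 0.0
    for j in range(n_grid):
        u = (j + 0.5) / n_grid
        p = strength_survival(stress_quantile(u))  # P(Z > t) at t = G^{-1}(u)
        total += sum(comb(k, i) * p**i * (1.0 - p) ** (k - i)
                     for i in range(s, k + 1))
    return total / n_grid


# Hypothetical parameters: common first shape a, different second shapes.
a, b_strength, b_stress = 2.0, 2.0, 3.0


def survival(t):
    return kw_survival(t, a, b_strength)


def quantile(u):
    return kw_quantile(u, a, b_stress)


r_1_1 = multicomponent_reliability(1, 1, survival, quantile)
r_1_3 = multicomponent_reliability(1, 3, survival, quantile)
r_2_3 = multicomponent_reliability(2, 3, survival, quantile)
r_3_3 = multicomponent_reliability(3, 3, survival, quantile)
```

Under these assumptions the single-component case has the closed form b_stress / (b_stress + b_strength) = 0.6, which gives a quick check on the numerical integration. For fixed k the reliability decreases as s increases, since requiring more components to survive is a stricter condition.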
In the present work, we study the joint distributions of pairs of the monthly maxima of the pollutants used by the environmental authorities in Mexico City to classify the air quality in the metropolitan area. A copula is used to obtain the joint distributions. Since we are analyzing monthly maxima, the Weibull and Fréchet extreme value distributions are taken into account. Using these two distributions as marginal distributions in the copula, Bayesian inference is carried out to estimate the parameters of both distributions as well as the association parameters appearing in the copula model. The pollutants considered are ozone, nitrogen dioxide, sulphur dioxide, carbon monoxide, and particulate matter with diameters smaller than 10 and 2.5 microns, measured by the Mexico City monitoring network. The estimation is performed using samples of the parameters generated by a Markov chain Monte Carlo algorithm implemented in the software OpenBUGS. The algorithm is applied to pairs of pollutants in which one coordinate is ozone and the other ranges over the remaining pollutants. Different results were obtained depending on the pollutant and the region where the measurements were collected. In some cases the best model uses a Fréchet distribution as the marginal distribution for the measurements of both pollutants, and in others the most suitable model assumes a Fréchet distribution for ozone and a Weibull distribution for the other pollutant. Results show that, in the present case, the estimated association parameter is a good representation of the correlation parameter between the pair of pollutants analyzed. Additionally, it is straightforward to obtain these correlation parameters from the corresponding association parameters.
Juan A. Vazquez-Morales, Eliane R. Rodrigues, Hortensia J. Reyes-Cervantes
Background: The signal-to-noise ratio (SNR) is recognized as an index of measurement reproducibility. We derive the maximum likelihood estimators of the SNR and discuss the construction of confidence intervals for the difference between two correlated SNRs when the readings are from bivariate normal and bivariate lognormal distributions. We use the Pearson system of curves to approximate the distribution of the difference between the two estimates and use bootstrap methods to validate the approximate distributions of the statistic of interest. Methods: The paper uses the delta method to find the first four central moments, and hence the skewness and kurtosis, which are important in determining the parameters of the Pearson distribution. Results: The approach is illustrated with two examples: one using veterinary microbiology and food safety data and the other using data from clinical medicine. We derived the four central moments of the target statistics and used them, together with the bootstrap method, to evaluate the parameters of the Pearson distribution. The fitted Pearson curves of Types I and II were recommended based on the available data. R code is also provided for readers to use.
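The role of skewness and kurtosis in choosing a Pearson curve can be made concrete with the classical selection criterion. The sketch below takes beta1 (squared skewness) and beta2 (kurtosis) and returns the Pearson type using the standard criterion kappa = beta1 (beta2 + 3)^2 / [4 (4 beta2 - 3 beta1)(2 beta2 - 3 beta1 - 6)]. It is a generic illustration rather than the authors' R code; the function names and the tolerance are assumptions, and the moments are computed here from raw samples rather than by the delta method used in the paper.

```python
def sample_beta(data):
    """Return (beta1, beta2): squared skewness and kurtosis from central moments."""
    n = len(data)
    mean = sum(data) / n
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    m4 = sum((x - mean) ** 4 for x in data) / n
    return m3**2 / m2**3, m4 / m2**2


def pearson_type(beta1, beta2, tol=1e-9):
    """Classify a (beta1, beta2) pair into a Pearson curve type."""
    if abs(beta1) < tol:  # symmetric cases
        if abs(beta2 - 3.0) < tol:
            return "normal"
        return "II" if beta2 < 3.0 else "VII"
    d = 2.0 * beta2 - 3.0 * beta1 - 6.0
    if abs(d) < tol:  # kappa is infinite on this line
        return "III"
    kappa = beta1 * (beta2 + 3.0) ** 2 / (4.0 * (4.0 * beta2 - 3.0 * beta1) * d)
    if kappa < 0.0:
        return "I"
    if abs(kappa - 1.0) < tol:
        return "V"
    return "IV" if kappa < 1.0 else "VI"


# Theoretical moment pairs for familiar distributions.
uniform_type = pearson_type(0.0, 1.8)      # uniform: beta1 = 0, beta2 = 1.8
exponential_type = pearson_type(4.0, 9.0)  # exponential: beta1 = 4, beta2 = 9
skewed_bounded_type = pearson_type(0.5, 2.5)
```

Types I and II, the ones recommended in the abstract, are the beta-type curves with bounded support: Type II is the symmetric case (beta1 = 0 with beta2 below 3) and Type I the skewed case with negative kappa.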