Browsing by Author "Zhou, X"
Item Open Access
A note on Bayesian inference after multiple imputation (American Statistician, 2010-05-01) Zhou, X; Reiter, JP
This article is aimed at practitioners who plan to use Bayesian inference on multiply-imputed datasets in settings where posterior distributions of the parameters of interest are not approximately Gaussian. We seek to steer practitioners away from a naive approach to Bayesian inference, namely estimating the posterior distribution in each completed dataset and averaging functionals of these distributions. We demonstrate that this approach results in unreliable inferences. A better approach is to mix draws from the posterior distributions from each completed dataset, and use the mixed draws to summarize the posterior distribution. Using simulations, we show that for this second approach to work well, the number of imputed datasets should be large. In particular, five to ten imputed datasets (the standard recommendation for multiple imputation) is generally not enough to result in reliable Bayesian inferences. © 2010 American Statistical Association.

Item Open Access
Analysis of main risk factors causing stroke in Shanxi Province based on machine learning models (Informatics in Medicine Unlocked, 2021-01-01) Liu, J; Sun, Y; Ma, J; Tu, J; Deng, Y; He, P; Li, R; Hu, F; Huang, H; Zhou, X; Xu, S
Background: In China, stroke has been the leading cause of death in recent years. It is a major cause of long-term physical and cognitive impairment, which places great pressure on the National Public Health System. China is also a large country, so evaluating the risk of stroke is important for the prevention and treatment of stroke in China. Methods: A data set with 2000 hospitalized stroke patients in 2018 and 27583 residents from 2017 to 2020 is analyzed in this study. With the cleaned data, three models of stroke risk levels are built using machine learning methods.
The importance of the “8+2” factors from the China National Stroke Prevention Project (CSPP) is evaluated via decision tree and random forest models. The importance of more detailed features and their SHAP values are evaluated and ranked via a random forest model. Furthermore, a logistic regression model is applied to evaluate the probability of stroke for different risk levels. Results: Among all “8+2” stroke risk factors, the decision tree model reveals that the top three factors are Hypertension (0.4995), Physical Inactivity (0.08486) and Diabetes Mellitus (0.07889), and the random forest model shows that the top three factors are Hypertension (0.3966), Hyperlipidemia (0.1229) and Physical Inactivity (0.1146). In addition to the “8+2” factors, the importance of features for lifestyle information, demographic information and medical measurement is evaluated via a random forest model. It shows that the top five features are Systolic Blood Pressure (SBP) (0.3670), Diastolic Blood Pressure (DBP) (0.1541), Physical Inactivity (0.0904), Body Mass Index (BMI) (0.0721) and Fasting Blood Glucose (FBG) (0.0531). SHAP values show that DBP, Physical Inactivity, SBP, BMI, Smoking, FBG, and Triglyceride (TG) are positively correlated with the risk of stroke. High-density Lipoprotein (HDL) is negatively correlated with the risk of stroke. Combined with the data of 2000 hospitalized stroke patients, the logistic regression model shows that the average probabilities of stroke are 7.20%±0.55% for low-risk patients, 19.02%±0.94% for medium-risk patients and 83.89%±0.97% for high-risk patients. Conclusion: Based on the census data from Shanxi Province, we investigate stroke risk factors and their ranking. The results show that Hypertension, Physical Inactivity, and Overweight are ranked as the top three stroke risk factors in Shanxi.
The probability of stroke is also estimated through our interpretable machine learning methods.

Item Open Access
Efficient rare event simulation for failure problems in random media (SIAM Journal on Scientific Computing, 2015-01-01) Liu, J; Lu, J; Zhou, X
© 2015 Society for Industrial and Applied Mathematics. In this paper we study rare events associated with the solutions of an elliptic partial differential equation with a spatially varying random coefficient. The random coefficient follows the lognormal distribution, which is determined by a Gaussian process. This model is employed to study the failure problem of elastic materials in random media in which the failure is characterized by the criterion that the strain field exceeds a high threshold. We propose an efficient importance sampling scheme to compute the small failure probability in the high threshold limit. The change of measure in our scheme is parametrized by two density functions. The efficiency of the importance sampling scheme is validated by numerical examples.

Item Open Access
Moderate Deviation for Random Elliptic PDEs with Small Noise (2017-04-23) Li, X; Liu, J; Lu, J; Zhou, X
Partial differential equations with random inputs have become popular models to characterize physical systems with uncertainty coming from, e.g., imprecise measurement and intrinsic randomness. In this paper, we perform asymptotic rare event analysis for such elliptic PDEs with random inputs. In particular, we consider the asymptotic regime in which the noise level converges to zero, suggesting that the system uncertainty is low but does exist. We develop sharp approximations of the probability of a large class of rare events.