Objective To understand how the number of primary sampling units (PSUs) and the PSU sampling fraction affect the approximate estimation of sampling error and the resulting statistical inference, and to provide a reference for the sampling design of future surveys. Methods A total of 98 587 systolic blood pressure measurements from the 2010 China Chronic Disease and Risk Factor Surveillance were used for two-stage simulated sampling. For each sample, the Taylor series linearization method was used to estimate the mean, standard error and 95% confidence interval, both with and without the finite population correction (FPC). The estimated standard errors were compared with the true standard error, and the probability that the 95% confidence interval contained the population mean was analyzed under the different designs. Results When the number of PSUs increased to 10, the sampling error fell rapidly from 4.13 mm Hg to 1.91 mm Hg, a decrease of 53.8%; once the number of PSUs reached 20 or more, estimation precision showed no further marked improvement. With the FPC, the probability that the 95% confidence interval covered the true value fluctuated considerably as the PSU sampling fraction increased: for sampling fractions below 30%, coverage fluctuated around 94.0%; for sampling fractions above 30%, coverage showed an oscillating decline, reaching a minimum of 88.2%, so the statistical inference was comparatively sensitive. Without the FPC, coverage of the 95% confidence interval was higher than with the FPC in every case, and it showed a small jump for PSU sampling fractions above 20% compared with those below 20%, so the statistical inference was comparatively conservative. Conclusion The number of PSUs should be determined by weighing estimation precision against survey feasibility. When the PSU sampling fraction is too large, statistical inference based on approximate error estimation should be used with caution.
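The kind of simulation described in the Methods can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' procedure: the population is synthetic (the 2010 surveillance data are not reproduced here), all PSUs are assumed to be the same size with the same number of units drawn from each, and the function names `make_population`, `estimate` and `coverage` are hypothetical. Under those assumptions the linearized variance of the mean reduces to the between-PSU variance of the PSU sample means divided by the number of sampled PSUs, optionally multiplied by the first-stage FPC factor (1 - n/N). The interval uses the normal quantile 1.96 for simplicity, whereas survey software typically uses a t quantile.

```python
import math
import random
import statistics


def make_population(n_psu=50, psu_size=200, seed=1):
    """Build a synthetic finite population (hypothetical values, in mm Hg)."""
    rng = random.Random(seed)
    pop = []
    for _ in range(n_psu):
        psu_mean = rng.gauss(130.0, 8.0)  # assumed between-PSU spread
        pop.append([rng.gauss(psu_mean, 18.0) for _ in range(psu_size)])
    return pop


def estimate(sample_psus, n_pop_psu, use_fpc):
    """Return (mean, SE, 95% CI) from the PSU-level sample means.

    Assumes equal-sized PSUs and equal second-stage sample sizes, so the
    variance is the between-PSU variance of PSU means divided by n.
    """
    psu_means = [statistics.fmean(units) for units in sample_psus]
    n = len(psu_means)
    mean = statistics.fmean(psu_means)
    var = statistics.variance(psu_means) / n
    if use_fpc:
        var *= 1.0 - n / n_pop_psu  # first-stage finite population correction
    se = math.sqrt(var)
    half = 1.96 * se  # normal approximation (simplifying assumption)
    return mean, se, (mean - half, mean + half)


def coverage(pop, n_psu_sample, m_units, use_fpc, reps=500, seed=2):
    """Share of simulated two-stage samples whose 95% CI covers the true mean."""
    rng = random.Random(seed)
    true_mean = statistics.fmean(v for psu in pop for v in psu)
    hits = 0
    for _ in range(reps):
        psus = rng.sample(pop, n_psu_sample)                    # stage 1: PSUs
        sample = [rng.sample(units, m_units) for units in psus]  # stage 2: units
        _, _, (lo, hi) = estimate(sample, len(pop), use_fpc)
        hits += lo <= true_mean <= hi
    return hits / reps


if __name__ == "__main__":
    population = make_population()
    for n in (5, 10, 30):  # PSU sampling fractions of 10%, 20%, 60%
        with_fpc = coverage(population, n, 20, use_fpc=True)
        without_fpc = coverage(population, n, 20, use_fpc=False)
        print(f"n={n:2d}  with FPC: {with_fpc:.3f}  without FPC: {without_fpc:.3f}")
```

Because both calls reuse the same seed, the same samples are drawn with and without the FPC, so the interval with the FPC is never wider and its coverage can never exceed the coverage without it. The first-stage FPC ignores within-PSU sampling variability, which is one plausible reason coverage can fall as the PSU sampling fraction grows.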