Data collection and preprocessing
This study adopted the 2020 national resident health literacy monitoring questionnaire issued by the National Health Commission of China to the Centers for Disease Control and Prevention at all levels. The questionnaire consisted of two parts. The first was the annually updated general health literacy questionnaire for residents, which this study used to measure GHL. The second was the special questionnaire on COVID-19, newly added in 2020 in response to the outbreak, which this study used to measure CRHL, CRIRA, and CRTPR. Respondents whose score, based on the correct answers, reached 80.0% or more of the total score were judged to be health literate, and the regional level of health literacy was defined as the proportion of health-literate people in the total population. The definitions of the four dimensions (CRHL, CRTPR, CRIRA, and GHL), as well as other details, are listed in Table 1.
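The scoring rule above can be sketched as follows. This is a minimal illustration, not the official scoring procedure: the function names, the total score of 60, and the example scores are hypothetical and do not come from the questionnaire.

```python
def is_health_literate(score, total_score, threshold_pct=80):
    """Return True when the score reaches threshold_pct percent of the total.

    Integer arithmetic avoids floating-point issues at the boundary.
    """
    return score * 100 >= threshold_pct * total_score


def regional_literacy_level(scores, total_score):
    """Proportion of respondents judged health literate."""
    if not scores:
        raise ValueError("at least one respondent is required")
    literate = sum(is_health_literate(s, total_score) for s in scores)
    return literate / len(scores)


# Hypothetical scores out of a hypothetical total of 60 points.
example_scores = [40, 52, 33, 60, 48]
level = regional_literacy_level(example_scores, total_score=60)
print(f"Regional health literacy level: {level:.1%}")
```

A respondent scoring exactly 80% of the total (48 of 60 here) counts as health literate, matching the "80.0% or more" wording.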
This study carried out a cross-sectional survey from August 14, 2020 to December 11, 2020 in Liangshan Yi Autonomous Prefecture, as part of the national resident health literacy continuous monitoring survey conducted annually since 2012. A multi-stage sampling framework was used to select individual respondents. First, according to economic and social development status and ethnic distribution, five representative regions of Liangshan Yi Autonomous Prefecture were selected, namely Huili County, Muli County, Xichang City, Yuexi County, and Zhaojue County. Then the probability proportionate to size (PPS) sampling method was used to draw four towns or streets from each region, and three villages or communities from each town or street. Thirty-five households were randomly chosen from each village or community, and one permanent resident aged 15 to 69 was selected from each household using the Kish table. In total, 2100 individuals were sampled (5 regions × 4 towns or streets × 3 villages or communities × 35 households).
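The sampling arithmetic and one common way of drawing a PPS sample can be sketched as below. The text does not state which PPS variant was used. The systematic variant shown here is an assumption, and the town names and population sizes are hypothetical.

```python
import random
from bisect import bisect_right


def pps_systematic(sizes, n_draws, rng):
    """Systematic PPS draw: units are hit in proportion to their size.

    A unit larger than the sampling interval can be selected more than once.
    """
    names = list(sizes)
    cumulative = []
    running = 0
    for name in names:
        running += sizes[name]
        cumulative.append(running)
    interval = running / n_draws
    start = rng.random() * interval  # random start in [0, interval)
    selected = []
    for k in range(n_draws):
        point = start + k * interval
        # First unit whose cumulative size exceeds the selection point.
        index = min(bisect_right(cumulative, point), len(names) - 1)
        selected.append(names[index])
    return selected


# Planned sample size implied by the design described above.
planned_sample = 5 * 4 * 3 * 35

# Hypothetical towns in one region with hypothetical population sizes.
towns = {"T1": 1200, "T2": 800, "T3": 1500, "T4": 600,
         "T5": 900, "T6": 1100, "T7": 700, "T8": 1200}
chosen = pps_systematic(towns, n_draws=4, rng=random.Random(2020))
print(planned_sample, chosen)
```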
Since some respondents' literacy was too low for them to complete the questionnaire by themselves, this study conducted the household survey through face-to-face interviews to ensure survey quality. To control the survey quality effectively and facilitate follow-up, providing the respondent's name was mandatory. To protect privacy, both respondents and investigators signed informed consent forms, which emphasized the confidentiality of the respondents’ information and explained the purpose, importance, and main content of the investigation to the respondents. The ethics committee at Liangshan Prefecture Center for Disease Control and Prevention approved this study (LS2020005).
Quality control was carried out at the pre-investigation, investigation, and data processing stages. At the pre-investigation stage, all members of the study team received strict professional training followed by a preliminary investigation. At the investigation stage, the research associates checked the questionnaires every day and called for amendments if any violation of the study protocol was detected. Finally, at the data processing stage, double-entry comparison was used for data input, and complete case analysis was adopted to handle logical errors, outliers, and missing data.
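The two data processing checks can be illustrated with a minimal sketch. The record identifiers, field names, and values are hypothetical, and the actual data entry software used by the study is not described in the text.

```python
def double_entry_mismatches(first_entry, second_entry):
    """List (record_id, field) pairs where two independent entries disagree."""
    mismatches = []
    for record_id, row in first_entry.items():
        other = second_entry.get(record_id, {})
        for field, value in row.items():
            if other.get(field) != value:
                mismatches.append((record_id, field))
    return mismatches


def complete_cases(records, required_fields):
    """Keep only records with no missing value in the required fields."""
    return {
        record_id: row
        for record_id, row in records.items()
        if all(row.get(field) is not None for field in required_fields)
    }


# Hypothetical entries typed in twice by different staff members.
first = {"R001": {"age": 34, "q1": 1}, "R002": {"age": 51, "q1": None}}
second = {"R001": {"age": 43, "q1": 1}, "R002": {"age": 51, "q1": None}}

to_recheck = double_entry_mismatches(first, second)
analysable = complete_cases(first, ["age", "q1"])
print(to_recheck, sorted(analysable))
```

Mismatched fields would be resolved against the paper questionnaire before the complete case filter is applied.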
Analyzing the influence of demographic characteristics on each dimension
There may be differences in CRHL, CRTPR, CRIRA, and GHL among people with different demographic characteristics. Some studies have reported that education, race/ethnicity, age, income, occupation, gender, etc. were associated with health literacy or psychological distress related to COVID-19 [15,16,17]. Therefore, it was necessary to explore whether different characteristics of people in impoverished regions have statistically significant impacts on the levels of the above four dimensions, so as to make policies, strategies, and measures more targeted and accurate.
To explore whether the levels of CRHL, CRTPR, CRIRA, and GHL differed among people with different demographic characteristics, we conducted multivariate logistic regression analyses. Demographic characteristics were taken as independent variables. With GHL, CRHL, CRIRA, and CRTPR as the respective study outcomes, the medians of their measures were taken as the cut-off points. In line with the hypotheses below, GHL was a covariate in the model with CRHL as the study outcome. Similarly, GHL and CRHL were covariates in the CRIRA model, and GHL, CRHL, and CRIRA were covariates in the CRTPR model. Stepwise regression was used to identify the demographic characteristics that were statistically significantly associated with each dimension.
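The model set-up described above (median cut-offs and a growing covariate set per outcome) can be sketched as follows. The text does not say whether scores equal to the median fall in the high or low group. Treating only scores strictly above the median as "high" is an assumption, and the demographic variable names and scores are hypothetical. Fitting the logistic regressions themselves is left to a statistics package.

```python
from statistics import median

# Dimension covariates for each outcome model, as stated in the text.
COVARIATES = {
    "GHL": [],
    "CRHL": ["GHL"],
    "CRIRA": ["GHL", "CRHL"],
    "CRTPR": ["GHL", "CRHL", "CRIRA"],
}


def dichotomize(scores):
    """Code each score as 1 if strictly above the median, else 0 (assumed rule)."""
    cut = median(scores)
    return [1 if s > cut else 0 for s in scores]


def predictors_for(outcome, demographics):
    """Demographic variables plus the dimension covariates for this outcome."""
    return list(demographics) + COVARIATES[outcome]


ghl_scores = [10, 25, 31, 18, 40, 22]  # hypothetical GHL scores
ghl_high = dichotomize(ghl_scores)
crtpr_predictors = predictors_for("CRTPR", ["age", "education"])
print(ghl_high, crtpr_predictors)
```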
Exploring relationships among different dimensions
To improve the COVID-19 health education strategy, one important step was to reveal the quantitative relationships among the different dimensions under the COVID-19 health education framework. Based on the literature review and the hypotheses described below, this study proposed a theoretical framework of the relationships among the four dimensions (CRHL, CRTPR, CRIRA, and GHL), as shown in Fig. 1. In line with the aim of this study, these relationships were classified into three parts: direct effects involving CRHL, direct effects not involving CRHL, and indirect effects among dimensions.
Hypotheses on direct effects involving CRHL
Figure 1 shows that an increase in CRHL may reduce CRTPR, that is:
H1: CRHL negatively affects CRTPR.
People with higher CRHL were less likely to experience excessive tension because they had a better understanding of COVID-19 related health information. Researchers have reported that knowledge of, or confidence about, the COVID-19 epidemic was inversely correlated with psychological distress [19, 20]. On this basis, this study put forward the first hypothesis above.
In addition, Fig. 1 indicates that two other dimensions may be directly associated with CRHL. First, the high uncertainty of an unknown disease may prompt people to reduce that uncertainty by obtaining related information in order to make sound decisions. People with high CRHL may actively and intensively search for disease-related information. Studies have indicated a clear association between belief in misinformation and poorer COVID-19 knowledge [21, 22]. Therefore, we proposed the following hypothesis:
H2: CRHL positively affects CRIRA.
Another factor associated with CRHL was GHL. Generally, individuals with a higher level of GHL had more knowledge about infectious diseases. Research has shown that confusion about COVID-19 was significantly higher among those with lower health literacy. Hence, the following hypothesis was proposed:
H3: GHL positively affects CRHL.
Hypotheses on direct effects not involving CRHL
Furthermore, it has been highlighted that health education and the improvement of health literacy were critical prevention and health promotion measures for mitigating the adverse effects of the so-called infodemic [23, 24]. On these grounds, we proposed that:
H4: GHL positively affects CRIRA.
In addition, groups with good GHL may have more confidence in their own health, deal actively with public health emergencies, and carry less psychological burden. Previous studies found that people with sufficient health literacy were less likely to suffer from psychological problems, such as anxiety, depression, and sleep disorders [25, 26]. As a result, this study assumed that:
H5: GHL negatively affects CRTPR.
Another consideration was that the more sufficient CRIRA was, the more likely people were to obtain positive information, and the lower their psychological burden in the face of the epidemic was likely to be. Researchers also reported that high satisfaction with health information, and access to specific, up-to-date, and accurate health information, were significantly associated with lower stress, anxiety, and depression [27, 28]. Accordingly, we supposed that:
H6: CRIRA negatively affects CRTPR.
Hypotheses on indirect effects among different dimensions
In addition to the above direct effects, there may also be some indirect effects among the four dimensions. Therefore, five more hypotheses were proposed as follows:
H7: CRHL mediates the relationship between GHL and CRIRA.
H8: CRIRA mediates the relationship between CRHL and CRTPR.
H9: CRHL mediates the relationship between GHL and CRTPR.
H10: CRIRA mediates the relationship between GHL and CRTPR.
H11: CRHL and CRIRA mediate the relationship between GHL and CRTPR.
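The five mediation hypotheses correspond exactly to the indirect paths implied by the six direct effects H1 to H6. The following sketch, in which the graph encoding and function name are illustrative choices rather than part of the study, enumerates every path with at least one intermediate dimension.

```python
# Direct effects hypothesized in H1-H6, written as source -> targets.
DIRECT_EFFECTS = {
    "GHL": ["CRHL", "CRIRA", "CRTPR"],   # H3, H4, H5
    "CRHL": ["CRIRA", "CRTPR"],          # H2, H1
    "CRIRA": ["CRTPR"],                  # H6
    "CRTPR": [],
}


def indirect_paths(graph):
    """Return every path that passes through at least one mediator."""
    paths = []

    def walk(path):
        for nxt in graph[path[-1]]:
            extended = path + [nxt]
            if len(extended) >= 3:  # start, mediator(s), end
                paths.append(extended)
            walk(extended)

    for start in graph:
        walk([start])
    return paths


mediation_paths = indirect_paths(DIRECT_EFFECTS)
for p in mediation_paths:
    print(" -> ".join(p))
```

The enumeration yields five paths, one per hypothesis from H7 to H11, including the serial path GHL -> CRHL -> CRIRA -> CRTPR of H11.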
Validation of hypotheses
This study used structural equation modeling (SEM) to quantify the aforementioned relationships among the different dimensions. Estimation of the SEM relied on multivariate normality of the data. Item parceling could be used to convert the categorical questionnaire items into continuous variables and bring their distributions closer to normal. Parcels were formed by summing or averaging scores on two or more indicators, and have been shown to be more continuous and more normally distributed than individual items [29,30,31,32,33,34,35]. Therefore, this study adopted the parceling method and procedure proposed by Cattell et al. [29, 30].
The quality of the measurement models was assessed in terms of reliability and validity. Cronbach’s alpha and composite reliability were used to evaluate reliability. Validity testing covered convergent validity and discriminant validity. The results showed that the design of the measurement models was effective and reasonable, indicating that further structural model fitting could be carried out. More details are shown in Additional file 1.
Subsequently, we incorporated the statistically significant demographic variables obtained from the logistic models into the theoretical SEM as control variables, and then deleted those demographic variables with insignificant factor loadings in the SEM. In addition, the correlations between the residual terms of the SEM were adjusted to improve model fit.
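The basic idea of forming parcels by summing or averaging item scores can be sketched as below. This does not reproduce the item assignment procedure of Cattell et al. The item names, the parcel assignment, and the scores are hypothetical.

```python
def build_parcels(item_scores, parcel_map, average=True):
    """Combine item scores into parcels by averaging (default) or summing.

    item_scores: dict mapping item name -> score for one respondent.
    parcel_map:  dict mapping parcel name -> list of item names.
    """
    parcels = {}
    for parcel_name, items in parcel_map.items():
        total = sum(item_scores[item] for item in items)
        parcels[parcel_name] = total / len(items) if average else total
    return parcels


# Hypothetical respondent with four binary (correct/incorrect) items.
respondent = {"q1": 1, "q2": 0, "q3": 1, "q4": 1}
assignment = {"parcel_1": ["q1", "q2"], "parcel_2": ["q3", "q4"]}

averaged = build_parcels(respondent, assignment)
summed = build_parcels(respondent, assignment, average=False)
print(averaged, summed)
```

Even with binary items, a two-item averaged parcel already takes three possible values (0, 0.5, 1), which illustrates why parcels behave more like continuous variables than single items.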
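For reference, the two reliability measures can be computed with the standard textbook formulas shown in this sketch. The data and loadings are hypothetical. The study's own values are reported in Additional file 1 and are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for a list of per-item score lists.

    Each inner list holds one item's scores across the same respondents.
    """
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]  # per-respondent totals
    item_variance_sum = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


def composite_reliability(loadings):
    """Composite reliability from standardized factor loadings."""
    squared_sum = sum(loadings) ** 2
    error_sum = sum(1 - loading ** 2 for loading in loadings)
    return squared_sum / (squared_sum + error_sum)


# Hypothetical data: three parcels answered identically by four respondents.
identical_items = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]
alpha = cronbach_alpha(identical_items)

# Hypothetical standardized loadings for one latent dimension.
cr = composite_reliability([0.7, 0.7, 0.7])
print(round(alpha, 3), round(cr, 3))
```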
The maximum likelihood method was used to estimate the parameters of the SEM. Univariate normality (−1.438 ≤ skewness ≤ 1.480, −0.919 ≤ kurtosis ≤ 3.047) and multivariate normality (Mardia’s kurtosis = 45.622) were tested separately. Due to the lack of multivariate normality, the Bollen-Stine bootstrap procedure (5000 resamples) was used to correct for bias in the fit statistics [36,37,38,39,40]. The following indices were used to assess model fit: relative chi-square (χ²/df), goodness of fit index (GFI), adjusted goodness of fit index (AGFI), normed fit index (NFI), Tucker-Lewis index (TLI), incremental fit index (IFI), relative fit index (RFI), comparative fit index (CFI), and root mean square error of approximation (RMSEA).
Since bootstrapping is more powerful than the classical Sobel test and the causal steps approach for testing mediation effects [41, 42], we adopted the bootstrapping method (5000 resamples) to analyze the mediating effects of CRHL and CRIRA.
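Two of the fit indices listed above have simple closed forms, sketched here with the commonly used single-group formulas. The chi-square value, degrees of freedom, and sample size in the example are hypothetical and are not the study's results.

```python
import math


def relative_chi_square(chi_square, df):
    """Relative chi-square: model chi-square divided by its degrees of freedom."""
    return chi_square / df


def rmsea(chi_square, df, n):
    """Root mean square error of approximation for a single-group model.

    Uses the common form sqrt(max(chi2 - df, 0) / (df * (n - 1))).
    """
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))


# Hypothetical fit statistics.
ratio = relative_chi_square(150.0, 50)
approx_error = rmsea(150.0, 50, n=2001)
print(ratio, round(approx_error, 4))
```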
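The logic of a bootstrapped mediation test can be sketched for a single mediator as below. This is a simplified illustration on simulated observed variables using ordinary least squares, not the latent-variable SEM bootstrap run in Amos. All names, coefficients, and data are hypothetical, and the percentile interval is one of several possible interval types.

```python
import random


def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)


def indirect_effect(x, m, y):
    """a*b, where a is the slope of M on X and b the slope of Y on M given X."""
    sxx, smm, sxm = _cov(x, x), _cov(m, m), _cov(x, m)
    sxy, smy = _cov(x, y), _cov(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b


def bootstrap_ci(x, m, y, n_boot=5000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the indirect effect."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lower = estimates[round(alpha / 2 * n_boot)]
    upper = estimates[round((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Simulated data with a true indirect effect of 0.5 * 0.4 = 0.2.
gen = random.Random(42)
x = [gen.gauss(0, 1) for _ in range(400)]
m = [0.5 * xi + gen.gauss(0, 0.5) for xi in x]
y = [0.4 * mi + 0.1 * xi + gen.gauss(0, 0.3) for xi, mi in zip(x, m)]

point = indirect_effect(x, m, y)
lo, hi = bootstrap_ci(x, m, y, n_boot=1000)  # fewer resamples to keep the demo fast
print(round(point, 3), round(lo, 3), round(hi, 3))
```

The mediation effect is judged significant when the interval excludes zero.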
Estimating the moderating effects of the regional characteristics
It was also necessary to examine whether some important regional characteristics played significant moderating roles in the relationships among the different dimensions. The regional characteristic variables considered in this study, namely the moderating variables, were the effectiveness of government prevention and control and ethnicity. The effectiveness of government prevention and control of the epidemic may affect people’s use of the media, their acquisition of COVID-19 related information, and even the level of residents’ CRHL. The public’s evaluation of this effectiveness reflects the degree of recognition of government work from a personal perspective. In addition, the epidemiological data on COVID-19 available at the time suggested that minority groups may be more susceptible to COVID-19 infection. To this end, we proposed the last two hypotheses:
H12: The moderating variables moderate the relationships among GHL, CRHL, CRIRA, and CRTPR.
H13: The moderating variables moderate the relationships among GHL, CRHL, CRIRA, and CRTPR via CRHL and CRIRA.
To be specific, we constructed moderation models and moderated mediation models to explore the moderating roles of regional characteristic variables.
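For a binary moderator, the interaction coefficient of a moderation model Y = b0 + b1·X + b2·W + b3·X·W equals the difference between the slopes of Y on X in the two groups. The sketch below uses that identity. The 0/1 coding of the moderator and the data are hypothetical, and it is not the moderated mediation model fitted in the study.

```python
def slope(x, y):
    """Ordinary least squares slope of y on x."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def interaction_coefficient(x, y, w):
    """b3 for a binary moderator w: slope in group 1 minus slope in group 0."""
    x0 = [xi for xi, wi in zip(x, w) if wi == 0]
    y0 = [yi for yi, wi in zip(y, w) if wi == 0]
    x1 = [xi for xi, wi in zip(x, w) if wi == 1]
    y1 = [yi for yi, wi in zip(y, w) if wi == 1]
    return slope(x1, y1) - slope(x0, y0)


# Hypothetical data: the effect of X on Y is 1 in group 0 and 2 in group 1.
x = [1, 2, 3, 4, 1, 2, 3, 4]
w = [0, 0, 0, 0, 1, 1, 1, 1]
y = [1, 2, 3, 4, 2, 4, 6, 8]

b3 = interaction_coefficient(x, y, w)
print(b3)
```

A non-zero b3 that is statistically distinguishable from zero would indicate that the moderator changes the strength of the X to Y relationship.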
All the above statistical tests were conducted at a significance level of 0.05, and the statistical analyses were performed using SPSS 22.0 (IBM Corp, Armonk, New York, USA) and Amos 26.0 (IBM Corp, Armonk, New York, USA).