ORIGINAL RESEARCH COMMUNICATION
1 From the Jean Mayer US Department of Agriculture Human Nutrition Research Center on Aging at Tufts University, Boston (PKN and KLT), and the National Institute on Aging, National Institutes of Health, Baltimore (DM)
2 Any opinions, findings, conclusions, or recommendations expressed in this publication are those of the authors and do not necessarily reflect the view of the US Department of Agriculture.
3 Supported by the US Department of Agriculture under agreement no. 58-1950-4-401; the US Department of Agriculture, Agricultural Research Service (contract number 53-3K06-01); the National Institutes of Health, National Institute on Aging Intramural Program; and the General Mills Bell Institute of Health and Nutrition.
4 Reprints not available. Address correspondence to PK Newby, Jean Mayer USDA Human Nutrition Research Center on Aging at Tufts University, 711 Washington Street, 9th floor, Boston, MA 02111. E-mail: pknewby{at}post.harvard.edu.
| ABSTRACT |
|---|
Objective: Our main objective was to compare patterns derived from the cluster and factor analysis procedures with measures of plasma lipids.
Design: This cross-sectional study included 459 healthy subjects who participated in the Baltimore Longitudinal Study of Aging and had measures of diet and plasma lipids. Eating patterns were derived by using both factor and cluster analysis methods.
Results: In separate multivariate-adjusted regression models, subjects in the healthy cluster had lower plasma triacylglycerols than did those not in the healthy cluster (β = −15.97; 95% CI: −29.51, −2.43; P < 0.05), and factor 1 (reduced-fat dairy products, fruit, and fiber) was inversely related to plasma triacylglycerols (β = −7.02 mg/dL for a one-unit increase in z score; 95% CI: −12.92, −1.12; P < 0.05). Those in the alcohol cluster had higher total cholesterol concentrations than did those not in the alcohol cluster (β = 12.81; 95% CI: 2.74, 22.88; P < 0.05), and factor 2 (protein and alcohol) was also directly associated with total cholesterol (β = 1.59 for a one-unit increase in z score; 95% CI: 0.55, 2.63; P < 0.05). The multivariate model containing all of the clusters was not significantly different from the model containing all of the factors in predicting each lipid outcome.
Conclusion: Our study provides evidence of comparability between cluster and factor analysis methods in relation to plasma lipid biomarkers.
Key Words: Dietary patterns; food patterns; cluster analysis; factor analysis; dietary assessment; cholesterol; triacylglycerols; HDL; LDL
| INTRODUCTION |
|---|
Despite the growing use of patterning methods in nutritional epidemiology, to our knowledge, a direct comparison of the factor and cluster analysis procedures has not been performed. Given that both methods are currently used in nutritional epidemiology to derive eating patterns, such a study would be useful in moving the field forward. Furthermore, examining the associations between eating patterns and plasma lipids will provide additional evidence as to whether empirically derived patterns are biologically meaningful. Our main objective was to compare patterns derived from cluster and factor analyses with plasma total cholesterol, LDL cholesterol, HDL cholesterol, the ratio of total to HDL cholesterol, and triacylglycerols by using the patterns we derived from cluster and factor analyses in our previous studies (9, 10). Our secondary objective was to compare factor and cluster solutions directly to illustrate the similarities and differences between the methods.
| SUBJECTS AND METHODS |
|---|
4 d of diet records at the same visit at which they had blood drawn for the measurement of plasma total cholesterol, HDL cholesterol, LDL cholesterol, and triacylglycerols. Methods for dietary assessment and eating pattern derivation are described briefly below; the reader is referred to the original studies (9, 10) for more detailed information. The Institutional Review Boards of the Johns Hopkins Bayview Medical Research Center and the Gerontology Center approved the BLSA protocol, and all subjects gave written informed consent to participate.
Dietary assessment and pattern analysis
Dietary intake was assessed by the use of 7-d dietary records. The subjects were instructed by trained dietitians how to assess portion size, weigh foods, and complete the records; reports detailing dietary collection methods and dietary intake in the BLSA population were published previously (12–14). Individual foods and food ingredients from the dietary records were first aggregated into groups. We formed 40 food groups according to macronutrient composition (eg, fat or fiber content) and culinary use; several foods (eg, pizza and eggs) comprised their own groups. When possible, foods were separated into full- and reduced-fat groups (eg, high-fat and reduced-fat dairy products). Because of the small variation in and low intakes of low-fiber cereals we initially observed (9), we collapsed the high- and low-fiber cereal groups into one food group (ready-to-eat cereals) in our factor analysis study (10) and in the present study. To derive patterns proportional to energy intake, food group intakes were converted to percentages of daily energy intake for entry into the cluster and factor analysis procedures. In this study, we use the 5-cluster and 6-factor solutions we derived previously to compare the 2 solutions.
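The conversion of food group intakes to percentages of daily energy can be sketched as follows. This is a minimal illustration, not the authors' SAS code; the function name, the food group labels, and the energy values are hypothetical.

```python
def percent_of_energy(group_kcal):
    """Convert per-group energy intakes (kcal/d) to percentages of total daily energy."""
    total = sum(group_kcal.values())
    if total <= 0:
        raise ValueError("total energy intake must be positive")
    return {group: 100.0 * kcal / total for group, kcal in group_kcal.items()}


# Hypothetical subject; values are illustrative and are not BLSA data.
example = {"fruit": 200.0, "reduced_fat_dairy": 300.0, "alcohol": 100.0, "other": 1400.0}
shares = percent_of_energy(example)
```

Expressing intakes this way means the resulting patterns describe the composition of the diet and not the total amount eaten.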
To derive dietary patterns by cluster analysis, we used the FASTCLUS procedure in SAS (version 8.2; SAS Institute Inc, Cary, NC), which applies the K-means method, to classify subjects into 5 nonoverlapping groups. The clusters were named according to which foods contributed the highest and lowest percentages of energy per day relative to the other clusters. To derive food patterns by factor analysis, we used the FACTOR procedure in SAS (version 8.2; SAS Institute Inc) with principal components analysis and orthogonal rotation (varimax option in SAS) to derive 6 uncorrelated factors. A factor score was then calculated for each subject for each of the 6 factors, in which the standardized intakes of each of the 40 food groups were weighted by their factor loadings and summed; the sum was then standardized (mean = 0, SD = 1) (15). The food patterns derived from factor analysis were named according to both the foods that loaded most positively on the factor and how the factors correlated with nutrients (10).
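The factor score calculation described above (standardize intakes, weight by loadings, sum, and standardize the sum) can be sketched for a single factor as follows. This is an illustration under stated assumptions: the intakes and loadings are made up, only 3 food groups are used instead of 40, and the sample SD is assumed for standardization, which the text does not specify.

```python
from statistics import mean, stdev


def zscores(values):
    """Standardize a list of values to mean 0 and (sample) SD 1."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def factor_scores(intakes, loadings):
    """Score one factor.

    intakes: one row per subject, one value per food group.
    loadings: one loading per food group for the factor being scored.
    """
    n_groups = len(loadings)
    # Standardize each food group across subjects.
    columns = [zscores([row[j] for row in intakes]) for j in range(n_groups)]
    # Weight standardized intakes by the loadings and sum within each subject.
    raw = [sum(loadings[j] * columns[j][i] for j in range(n_groups))
           for i in range(len(intakes))]
    # Standardize the weighted sum so that scores have mean 0 and SD 1.
    return zscores(raw)


# Hypothetical data: 4 subjects x 3 food groups (% of energy), illustrative loadings.
intakes = [[10, 5, 2], [12, 3, 8], [6, 9, 4], [8, 7, 6]]
loadings = [0.5, 0.4, -0.3]
scores = factor_scores(intakes, loadings)
```

Because every subject receives a score on every factor, the scores are continuous exposure variables, in contrast to the single cluster membership that K-means assigns.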
Plasma lipid measurements
An antecubital venous blood sample was drawn from the study subjects after they had fasted overnight. As previously described (16, 17), concentrations of triacylglycerols and total cholesterol were measured by the use of an enzymatic method (ABA-200 ATC Biochromatic Analyzer; Abbott Laboratories, Irving, TX). HDL cholesterol was measured by a dextran sulfate–magnesium precipitation procedure (18), and LDL cholesterol was estimated by the Friedewald formula (19).
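For reference, the Friedewald estimate for concentrations in mg/dL is LDL cholesterol = total cholesterol − HDL cholesterol − triacylglycerols/5. A minimal sketch follows; the function name is hypothetical, and the check on triacylglycerols above 400 mg/dL reflects the conventional limit of the formula, not a rule stated in this report.

```python
def friedewald_ldl(total_chol, hdl_chol, triacylglycerols):
    """Estimate LDL cholesterol (mg/dL) from fasting lipid values in mg/dL."""
    # Assumption: apply the conventional validity limit of the formula.
    if triacylglycerols > 400:
        raise ValueError("Friedewald estimate is unreliable above 400 mg/dL")
    return total_chol - hdl_chol - triacylglycerols / 5.0


# Illustrative values only.
ldl = friedewald_ldl(200.0, 50.0, 150.0)
```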
Covariate assessment
Demographic data were collected from each study participant at the first visit and were used to adjust for potential confounding in regression analyses. Race/ethnicity, physical activity, smoking habit, education, lipid medication use, and vitamin supplement use were determined by questionnaire at the time dietary records were collected. Physical activity was measured by the use of an adapted version of the Harvard Alumni questionnaire, which asked participants about all daily activities (eg, activities at home and at work and during recreation or sports). The amount of time spent on each activity was summed across all activities to determine the daily energy output per kilogram body weight (kJ/kg), as described previously (12, 20). Anthropometric measurements were made by following standardized procedures (21) and are fully described elsewhere (22). In summary, weight and height were measured for each subject at each visit, from which body mass index (BMI; in kg/m²) was calculated.
Statistical analyses
Because cluster analysis results in mutually exclusive dietary patterns, mean (±SD) intakes were calculated for each of the 40 food groups. Because food patterns derived from factor analysis are not mutually exclusive, these patterns are described by presenting the factor loadings for each of the 40 food groups. For each cluster, means (±SEMs) were calculated for total cholesterol, LDL cholesterol, HDL cholesterol, the ratio of total to HDL cholesterol, and triacylglycerols. Factor scores were first divided into quintiles, and mean (±SEM) lipid measurements were calculated for each quintile for each factor.
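The quintile summary for factor scores can be sketched as follows: subjects are ranked by factor score, split into 5 equal-sized groups, and the mean and SEM of a lipid measure are computed in each group. The function name and the data are hypothetical, and rank-based equal splits are an assumption about how the quintiles were formed.

```python
from math import sqrt
from statistics import mean, stdev


def quintile_summary(scores, lipid):
    """Return a (mean, SEM) pair of the lipid measure for each quintile of score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    n = len(order)
    summary = []
    for q in range(5):
        # Subjects whose rank falls in the q-th fifth of the distribution.
        idx = order[q * n // 5:(q + 1) * n // 5]
        vals = [lipid[i] for i in idx]
        summary.append((mean(vals), stdev(vals) / sqrt(len(vals))))
    return summary


# Hypothetical data for 10 subjects; each quintile needs at least 2 subjects.
scores = [float(i) for i in range(10)]
lipid = [100.0 + 10.0 * i for i in range(10)]
summary = quintile_summary(scores, lipid)
```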
To facilitate comparison between the clusters and factors, we used linear regression analysis to examine individual associations between each cluster and each factor with each of our outcome variables (total cholesterol, LDL cholesterol, HDL cholesterol, total:HDL cholesterol, and triacylglycerols). Indicator variables were created for each cluster, whereas factors remained as continuous variables (z scores). A separate regression model was built for each individual cluster or factor for each outcome. For example, the healthy dietary pattern (cluster 1) was tested in 5 separate regression models to examine the associations between the healthy cluster and total cholesterol, LDL cholesterol, HDL cholesterol, total:HDL cholesterol, and triacylglycerols, respectively. Although factors can be tested together in the same model because they are not correlated, we tested each factor in its own model, as we did with the clusters, to maintain the same number of independent variables in each model. The regression analysis of each pattern was performed twice. The first analysis was adjusted for energy, age, sex, and BMI, and the second analysis was further adjusted for ethnicity, physical activity, smoking, vitamin supplement use, lipid medication use, and education. We created cross-product terms for each pattern and sex and added them to the multivariate model to check for interactions.
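The single-pattern models described above can be sketched with ordinary least squares: one model per pattern, with a 0/1 indicator for cluster membership (or a z score for a factor) plus covariates. The sketch below is illustrative only. It solves the normal equations in plain Python, uses age as the only covariate, and generates the outcome from known coefficients so that the fit can be checked; none of the values are study data.

```python
def ols(X, y):
    """Least-squares coefficients for y = X b, via the normal equations (X'X) b = X'y."""
    p = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(p)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(p)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("design matrix is rank deficient")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]


# Hypothetical subjects: indicator for one cluster and age in years.
in_cluster = [1, 0, 1, 0, 1, 0]
age = [50, 60, 70, 55, 65, 75]
# Outcome built from known coefficients (intercept 150, cluster -16, age 0.5).
tg = [150.0 - 16.0 * c + 0.5 * a for c, a in zip(in_cluster, age)]

X = [[1.0, float(c), float(a)] for c, a in zip(in_cluster, age)]
beta = ols(X, tg)  # beta[1] is the adjusted difference for cluster members
```

The coefficient on the indicator is read as the adjusted mean difference between subjects in the cluster and all other subjects, whereas the coefficient on a factor is the change in the outcome per one-unit increase in z score.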
In a final set of regression analyses, all clusters were included in one model (omitting one cluster as the reference group because clusters are categorical variables) and all factors were included in another model to see which model better predicted each of the plasma lipid outcomes. Pitman's test was used to examine which solution better predicted each of the lipid outcomes. To further compare the clusters and factors, we computed mean (±SD) factor scores for each cluster. In secondary analyses, we repeated the factor and cluster analysis by using split-half samples to assess the reproducibility of each solution. All analyses were performed by using SAS for WINDOWS (version 8.2; SAS Institute Inc).
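Entering all clusters in one model requires indicator coding with one cluster omitted as the reference, which can be sketched as follows. The function name is hypothetical, and the cluster labels are used only for illustration.

```python
def dummy_code(labels, reference):
    """Return the non-reference levels and a 0/1 indicator row for each subject."""
    levels = sorted(set(labels) - {reference})
    rows = [[1 if lab == lev else 0 for lev in levels] for lab in labels]
    return levels, rows


# Illustrative cluster memberships for 4 subjects.
labels = ["healthy", "alcohol", "sweets", "healthy"]
levels, rows = dummy_code(labels, reference="healthy")
```

Subjects in the reference cluster receive zeros on every indicator, so each coefficient in the model is a contrast with the reference cluster.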
| RESULTS |
|---|
When we repeated the factor analysis procedure by using split-half samples, we observed both similarities and differences between the factor solutions. For example, whereas factor 1 in each solution contained many of the same healthful elements, including reduced-fat dairy products, cereal, nonwhite bread, and fruit, the strength of the factor loadings differed across solutions. Specifically, cereal had a factor loading of 0.13 in the first half and 0.39 in the second half, compared with 0.53 in the total sample. For factor 2, the strongest factor loadings in the first half were for cereal (0.68), fruit juice (0.61), and reduced-fat dairy products (0.46), suggesting a breakfast pattern, compared with the highest factor loadings for poultry (0.59), whole grains (0.47), and soups (0.37) in the second half. The factor loading for alcohol in factor 2 in each of the split-half samples was <0.10, compared with 0.33 in the total sample. For factor 3, alcohol had the strongest factor loading in the first half (0.63), compared with sweetened juices (0.57) in the second half (data not shown).
| DISCUSSION |
|---|
Differences in the effects observed between the 2 methods are likely explained by methodologic differences. First, the cluster and factor solutions produce patterns of differing food composition because they are statistically different procedures. Whereas the similar foods between cluster 1 and factor 1 likely explain the similarity of effects, the differences we observed between the remaining clusters and factors are not surprising. For example, whereas alcohol clearly dominated the alcohol cluster, seafood, poultry, vegetables, and alcohol together were the largest contributors to the protein and alcohol factor. These differences in food composition may explain why the protein and alcohol factor was more strongly related to HDL cholesterol than was the alcohol cluster. Because alcohol (23, 24) and fish (25, 26) consumption are independently related to HDL cholesterol, a pattern high in both may show a stronger relation to HDL cholesterol than a pattern dominated by alcohol. Furthermore, the protein and alcohol factor was inversely related to carbohydrate (10), and substituting protein for carbohydrate can increase HDL cholesterol (27). Similarly, the sweets factor and sweets cluster had notable differences in food intakes, which could explain the differences in effects on plasma lipids.
A second methodologic difference between the cluster and factor analysis procedures is the types of patterns, and hence exposure variables, the 2 procedures create. Factor analysis creates patterns of food intake based on the way foods correlate with each other, for which every individual receives a standardized factor score for each pattern derived. Because the factors are linear, continuous variables that are not mutually exclusive, an individual's dietary pattern is represented by looking at his or her scores for all derived factors (10). On the other hand, cluster analysis creates patterns that are mutually exclusive (ie, categorical variables) and that are defined by maximizing differences in mean intakes of food (groups). Differences in findings between the healthy cluster and the reduced-fat dairy products, fruit, and fiber factor (factor 1) may therefore be explained in part by the fact that individuals who scored high on factor 1 may vary considerably in their other factor scores, therefore reflecting differing overall dietary patterns. Overlap in factor scores may explain inconsistent results whether comparing factors with clusters, as we do here, or comparing factors across studies. Findings from cluster analysis are easier to interpret because an individual is in one cluster only, outcomes are specific to individuals within each cluster, and each cluster has a specific food and nutrient composition. Despite these methodologic differences, however, we found no significant difference in predicting lipid outcomes when we compared a model containing all clusters with a model containing all factors.
Several studies using factor analysis have examined the relation between food patterns and various plasma lipids (1, 3, 4, 28–30). A cosmopolitan pattern (high in vegetable oils, garlic, vegetables, rice, pasta, chicken, fish, and wine) was inversely related to total cholesterol in women, and traditional (high in meat, potatoes, saturated fat, and beer) and refined-foods (high in French fries, sugary beverages, mayonnaise, and salty snacks) patterns were positively associated with total cholesterol among both men and women; the cosmopolitan and traditional patterns were directly related to HDL cholesterol, and the refined-foods pattern was inversely related to HDL cholesterol in both sexes (28). In another study (31), a diet high in fruit, vegetables, rice and pasta, egg and cheese dishes, cereals, and fish was inversely related to total cholesterol and positively related to HDL cholesterol, although a convenience diet high in snacks (such as beer, chips, soda, nuts, and cheese) showed similar relations, and the meat and vegetable pattern was most strongly related to HDL cholesterol (r = 0.19). Williams et al (3) also found marginally significant relations (P < 0.01) between a diet high in fruit, salad, fish, poultry, pasta, and rice and HDL cholesterol (r = 0.19) and triacylglycerols (r = 0.18).
Several reports from the Health Professionals Follow-Up Study have examined associations between food patterns and plasma lipids (4, 29, 30). A validation study (30) showed inverse correlations between a prudent pattern (high in vegetables, legumes, whole grains, fruit, oils, salad dressing, and fish) and total cholesterol (r = −0.25) and triacylglycerols (r = −0.17) and positive correlations between a Western pattern (high in meat, butter, high-fat dairy products, refined grains, eggs, and French fries) and total cholesterol (r = 0.18) and triacylglycerols (r = 0.10), although significance levels for these associations were not reported. However, a study using the larger cohort found that the percentage of men with hypercholesterolemia increased with prudent pattern score and decreased with increasing Western pattern score (4). Fung et al (29) found that the Western pattern was positively associated with HDL cholesterol (r = 0.17) and was inversely associated with the ratio of total to HDL cholesterol (r = −0.13), but no significant trends were observed in the association of the prudent or Western pattern with total cholesterol, HDL cholesterol, the ratio of total to HDL cholesterol, or triacylglycerols.
We are aware of only one study that examined the association between clusters and plasma lipids (1). That study found that the alcohol pattern was associated with significantly higher HDL-cholesterol concentrations than were the meat, healthy, and refined sugars clusters, although no significant differences between clusters were found for total or non-HDL cholesterol. The authors (1) also saw that mean HDL cholesterol was lowest in the sweets cluster, which is consistent with our findings and with the results of other studies showing an inverse relation between HDL cholesterol and total carbohydrate, sucrose, and starch (23). High-carbohydrate, low-fat diets are also related to lower total cholesterol and lower HDL cholesterol than are diets high in saturated fat (32), and very-high-carbohydrate diets can also reduce LDL cholesterol (33).
In summary, certain eating patterns may be related to lower total cholesterol, higher HDL cholesterol, and lower triacylglycerols; findings between patterns and LDL cholesterol are sparse. The results are difficult to synthesize, however, given that several patterns appear to be related to plasma lipids and unexpected findings may indicate reverse causality if individuals with high cholesterol changed their diet in response. Although we adjusted for BMI in our models, residual confounding by obesity may be responsible for the observed effects because obesity is related to both diet and plasma lipids. Like our study, several studies in humans (28, 34) and animals (35, 36) have observed different effects of diet on cholesterol in men and women, possibly as the result of hormonal and sex differences in cholesterol metabolism (34, 36).
Few reports have assessed the reproducibility of eating patterns over time (30) and across populations (37). In secondary analyses, we found that clusters showed better reproducibility than did factors in randomly generated split-half samples. However, our reproducibility analyses were limited by the small sample size of the present study. Additional research is needed to further examine the reproducibility of both clusters and factors both within samples and between populations.
Our study is strengthened by the use of dietary records, which is considered the gold standard of dietary assessment. In addition, although both patterning methods are empirically derived and involve subjective decisions on the part of the investigator, as previously discussed (9, 10), our study design allowed us to compare these 2 methods in a way that has not previously been done. Our study is limited because of the lack of variation in ethnicity and education in this highly educated and mainly white population, a limitation that may decrease the generalizability of our findings because eating patterns differ according to these variables (38, 39). However, it is unlikely that socioeconomic status or demographic variables would modify the biological effect of diet on plasma lipids. This work should be reproduced in populations with greater variation in sociodemographic characteristics and eating patterns. We were also unable to adjust for the role of genetics in our models, which may modify the relation between diet and plasma lipids as the result of genetic variation in lipoprotein metabolism (40). Our study did not have adequate power to examine pattern-disease associations, and additional research should compare these methods with disease outcomes.
In conclusion, our study provides evidence of comparability between empirically derived eating patterns using cluster and factor analysis procedures in relation to plasma lipid biomarkers. The healthy cluster and reduced-fat dairy products, fruit, and fiber factor showed similar associations with plasma triacylglycerols, whereas the alcohol cluster and protein and alcohol factor showed similar associations with total cholesterol and its fractions. Comparing factor and cluster analysis methods in relation to other biomarkers or disease outcomes is needed to better understand the utility of these methods in nutritional epidemiologic research.
| ACKNOWLEDGMENTS |
|---|
PKN was responsible for the design and analysis of this report and drafted the manuscript. DM contributed to data collection and data preparation. KLT contributed to the analysis and oversees the Tufts collaboration with the Baltimore Longitudinal Study of Aging. All authors made critical comments during the preparation of the manuscript and fully accept responsibility for the work. No author had a financial interest or professional or personal affiliation that compromised the scientific integrity of this work.