Copyright 2003 Royal Meteorological Society. International Journal of Climatology. 23: 1725–1741 (2003). Published online in Wiley InterScience. DOI: 10.1002/joc.969


We evaluate the ability of the National Centers for Environmental Prediction (NCEP)–National Center for Atmospheric Research (NCAR) reanalysis to represent the synoptic-scale climate of the Midwestern USA relative to radiosonde data. Independent, automated synoptic classifications, based on rotated principal component analysis (PCA) of 500 hPa geopotential heights, 850 hPa air temperatures, and 200 hPa wind speeds and a two-step clustering algorithm, result in a 15-type NCEP–NCAR synoptic classification and a 14-type radiosonde classification. The classifications are examined in terms of similarities and differences in the modes of variance manifest in the PCA solutions, the spatial patterns and variability of input variables within each weather type, and the temporal variability of the occurrence of each weather type. The classifications are then compared in terms of these characteristics and the degree of mutual class occupancy. Although the classifications identify a number of the same weather types (in terms of the input data, PCA solution, and mutual occupancy), the correspondence is imperfect. To assess whether the differences in the classifications are due to errant assignment of data to clusters or to differences in the fundamental modes present in the data sets as represented by the PC loadings and scores, a third targeted classification is undertaken that categorizes the NCEP–NCAR reanalysis data according to the radiosonde PCA solution. This classification exhibits a higher degree of similarity to that derived using the radiosonde data (in terms of both interpretability and mutual class occupancy), but the solutions still exhibit considerable differences. It is probable that the discrepancies are partly a function of the differing data structures and densities, but they may also reflect differences in the intensity of synoptic-scale phenomena as manifest in the data sets.
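The comparison by mutual class occupancy can be illustrated with a cross-tabulation of the two sets of daily weather-type labels. The following is a minimal sketch, not the procedure used in the study: the function name `mutual_occupancy`, the row-share normalization, and the toy labels are assumptions made for illustration only.

```python
import numpy as np


def mutual_occupancy(labels_a, labels_b):
    """Cross-tabulate two classifications of the same set of days.

    labels_a, labels_b: equal-length sequences of weather-type labels
    (hypothetical inputs, e.g. reanalysis types and radiosonde types).
    Returns the sorted unique types of each classification, the table of
    joint counts, and each row expressed as a share of that row's total.
    """
    a = np.asarray(labels_a).ravel()
    b = np.asarray(labels_b).ravel()
    if a.shape != b.shape:
        raise ValueError("both classifications must label the same days")

    # Map arbitrary labels to 0-based row and column indices.
    types_a, idx_a = np.unique(a, return_inverse=True)
    types_b, idx_b = np.unique(b, return_inverse=True)

    # counts[i, j] = days assigned type i in A and type j in B.
    counts = np.zeros((types_a.size, types_b.size), dtype=int)
    np.add.at(counts, (idx_a, idx_b), 1)

    # Share of each A type's days that falls in each B type.
    row_share = counts / counts.sum(axis=1, keepdims=True)
    return types_a, types_b, counts, row_share


# Toy labels for five days (invented, not data from the study).
types_a, types_b, counts, row_share = mutual_occupancy(
    [1, 1, 2, 2, 3], [10, 10, 20, 30, 30]
)
```

A row of `row_share` concentrated in a single column would indicate that one classification's weather type maps almost entirely onto one type in the other, while a row spread over several columns would indicate the imperfect correspondence described above.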