Explanation of Star Ratings | STS Public Reporting

For STS General Thoracic Surgery Database (GTSD) participants, the star rating is derived by testing whether the participant’s score in a composite domain is significantly different from the overall STS average.

Example
For each of the two composite domains (absence of morbidity and absence of mortality), suppose a participant’s estimated score is lower than the overall STS average, but the difference between the two scores is not statistically significant. The participant would then receive two stars in each domain. For the overall composite score, however, if the participant’s estimated score is lower than the STS average and the difference is statistically significant, the participant’s overall rating is one star.

The fact that statistical significance was achieved for the composite score but not the individual domains reflects the greater precision of the composite score compared to individual endpoints. This precision is achieved by aggregating information across multiple endpoints instead of a single endpoint.

Because the star rating depends on how the database participant compares with the STS average for a given time period, and the STS average can change each time the analysis is performed, there is no a priori morbidity or mortality level that a participant must attain to become a three-star institution. This is also true because the volume of cases at a given institution affects the comparison of its performance with the STS average.

Statistical significance is based on a 95% Bayesian certainty criterion. A participant receives three stars if there is at least 97.5% Bayesian probability that the participant’s score exceeds the STS mean score (95% credible interval plus the 2.5% upper tail). A participant receives one star if there is at least 97.5% Bayesian probability that the participant’s score is less than the STS mean score. Otherwise, the participant receives two stars.
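
The three-way rule above can be sketched in a few lines of Python. This is only an illustration, not the STS implementation: it assumes the Bayesian probability is approximated by the fraction of posterior draws of the participant’s score that fall above or below the STS mean. The function name star_rating, its parameters, and the sample values are all hypothetical.

```python
from typing import Sequence


def star_rating(posterior_draws: Sequence[float],
                sts_mean: float,
                threshold: float = 0.975) -> int:
    """Return 1, 2, or 3 stars from posterior draws of a participant's score.

    Assumption: the Bayesian probability is approximated by the share of
    draws on each side of the STS mean. The name and signature are hypothetical.
    """
    n = len(posterior_draws)
    if n == 0:
        raise ValueError("need at least one posterior draw")

    # Approximate P(score > STS mean) and P(score < STS mean).
    p_above = sum(d > sts_mean for d in posterior_draws) / n
    p_below = sum(d < sts_mean for d in posterior_draws) / n

    if p_above >= threshold:
        return 3  # at least 97.5% probability of exceeding the STS mean
    if p_below >= threshold:
        return 1  # at least 97.5% probability of falling below the STS mean
    return 2      # neither probability reaches the threshold


# Made-up draws: 100 evenly spaced scores from 0.900 to 0.999.
draws = [i / 1000 for i in range(900, 1000)]

rating_high = star_rating(draws, sts_mean=0.9015)  # 98 of 100 draws above
rating_mid = star_rating(draws, sts_mean=0.9025)   # only 97 of 100 draws above
rating_low = star_rating(draws, sts_mean=0.9975)   # 98 of 100 draws below
```

With these made-up draws, the first call clears the 97.5% threshold from above, the second falls just short of it, and the third clears it from below. The second case shows how a participant whose score is very likely above the STS mean can still receive two stars.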