Procedure-Type Risk Categories for Pediatric and Congenital Cardiac Catheterization
Background— The Congenital Cardiac Catheterization Project on Outcomes (C3PO) was established to develop outcome assessment methods for pediatric catheterization.
Methods and Results— Six sites have been recording demographic, procedural, and immediate outcome data on all cases, using a web-based system, since February 2007. A sample of data was independently audited for validity and completeness. In 2006, participants categorized 84 procedure types into 6 categories by anticipated risk of an adverse event (AE). Consensus and empirical methods were used to determine final procedure risk categories, based on the outcomes: any AE (level 1 to 5); AE level 3, 4, or 5; and death or life-threatening event (level 4 or 5). The final models were then evaluated for validity in a data set collected prospectively between May 2008 and December 31, 2009. Between February 2007 and April 2008, 3756 cases were recorded, 558 (14.9%) with any AE; 226 (6.0%) level 3, 4, or 5; and 73 (1.9%) level 4 or 5. Generalized estimating equations models using 6 consensus-based risk categories were moderately predictive of AE occurrence (c-statistics: 0.644, 0.664, and 0.707). The participant panel made adjustments based on the collected empirical data supported by clinical judgment. These decisions yielded 4 procedure risk categories; the final models had improved discrimination, with c-statistics of 0.699, 0.725, and 0.765. Similar discrimination was observed in the performance data set (n=7043), with c-statistics of 0.672, 0.708, and 0.721.
Conclusions— Procedure-type risk categories are associated with different complication rates in our data set and could be an important variable in risk adjustment models for pediatric catheterization.
The primary goal of the Congenital Cardiac Catheterization Project on Outcomes (C3PO) is to develop outcome assessment tools for congenital cardiac catheterization procedures. To accomplish this aim, 6 institutions agreed in 2006 to prospectively collect data on all catheterization procedures performed at each center, including the occurrence of adverse events.1 Prospective data collection commenced in February 2007 and has resulted in the creation of a large multicenter data set for congenital cardiac catheterization procedures, which continues to accrue case information at the participating centers. We believe the data accrued so far are sufficient to begin developing outcome assessment tools.
Catheterization procedures in pediatrics and for congenital heart disease encompass a broad range of procedures, some of which occur infrequently, precluding the ability to assess risk for individual procedure types. Further, there is great variation in the frequency of different procedures between centers and practitioners, with interventions having a large spectrum of potential adverse outcomes. Therefore, to account for procedural diversity, we followed a previously established method used to develop groups of surgical procedures with similar risk.2 In fact, previous work from a single institution demonstrated a strong relationship between consensus-based procedure-type categories and the occurrence of adverse events.3,4 Thus, our goal was to develop procedure-type risk categories, based on the occurrence of adverse event (AE) outcomes using both consensus and empirical methods in a multicenter data set.
Preliminary Consensus Opinion-Based Categories
In 2004, we developed a procedure pick list to encompass the wide range of diagnostic and therapeutic interventions performed at Children's Hospital Boston. From this list, we defined 84 different procedure types. Some procedures, such as aortic valvuloplasty, were given age qualifiers to separate distinctly different populations, such as critical aortic stenosis from a generally more elective aortic valvuloplasty. Other qualifiers included the number of pulmonary artery or vein dilations performed, the timing of the catheterization procedure relative to surgery, and techniques such as the pressure used for angioplasty.
The panel for this project consisted of 11 physicians with expertise in diagnostic and interventional catheterization from the original 6 C3PO participating centers (online-only Data Supplement 1). The panelists were asked to assign each of 84 procedure types to a risk category, based on their judgment of anticipated risk for an AE. The risk assignments were facilitated through a custom software application and responses were aggregated electronically. The average score (mean for the 11 panel members) was calculated for each procedure type. The procedure types were then sorted by the average score, from the one with the lowest score to the highest. To create preliminary risk categories, threshold or cut-points were designated in the list, based on the mean score for each procedure-type category.
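The scoring step described above can be sketched in a few lines. This is a minimal illustration only: the procedure names, the individual panel scores, the 1-to-6 scoring scale, and the cut-point values are all hypothetical, since the text does not report them.

```python
from statistics import mean

# Hypothetical scores from 11 panelists (1 = lowest anticipated risk,
# 6 = highest). Names and values are illustrative, not study data.
panel_scores = {
    "aortic valvuloplasty": [4, 5, 4, 4, 5, 4, 4, 5, 4, 4, 4],
    "endomyocardial biopsy": [1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 1],
    "diagnostic catheterization": [2, 2, 3, 2, 2, 3, 2, 2, 2, 3, 2],
}

# Hypothetical upper bounds on the mean score for categories 1..5;
# a mean above the last bound falls in category 6.
CUT_POINTS = [1.5, 2.5, 3.5, 4.5, 5.5]


def preliminary_category(mean_score, cut_points=CUT_POINTS):
    """Map a mean panel score to a preliminary risk category."""
    for category, upper_bound in enumerate(cut_points, start=1):
        if mean_score <= upper_bound:
            return category
    return len(cut_points) + 1


def rank_procedures(scores):
    """Average each procedure's scores, sort ascending, attach a category."""
    averaged = [(name, mean(values)) for name, values in scores.items()]
    averaged.sort(key=lambda item: item[1])
    return [(name, avg, preliminary_category(avg)) for name, avg in averaged]


ranked = rank_procedures(panel_scores)
```

The cut-points only produce a starting list; the categories remain open to adjustment by panel consensus afterward.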
A meeting was convened in May 2006 to review and discuss the procedure type list and preliminary category designations. The panel members were given the option to move procedure types between categories based on expert opinion and panel consensus. Special consideration was given to procedure types positioned near the cutoff points to decide whether they ended up in the appropriate group. The panel chose to move 6 procedure types that were near the transition zones into adjacent categories. They also collapsed 2 procedure types into 1 by removing an age qualifier and expanded 1 category to 2, based on a qualifier for extracorporeal membrane oxygenation (ECMO) status. By the end of the meeting, the panel achieved consensus on the 6 procedure-type risk group categories (online-only Data Supplement 2).
After institutional review board approval was obtained, 6 participating centers started collecting patient and procedural information and the occurrence of adverse events on diagnostic and interventional catheterization procedures starting February 2007. One of the 6 institutions did not start collecting data on biopsy cases until March 2009. Two additional sites became participants in April 2008 and June 2009, respectively. Thus, currently 8 sites are contributing data to the registry. Primary electrophysiology cases are excluded. Data collected, validated, and audited through April 30, 2008, were used to refine the categories via both consensus and empirical methods (Derivation Data Set). Final model performance was assessed in a second data set prospectively gathered between May 2008 and December 2009, which was not used to inform the development of the risk categories (Performance Data Set) (Figure 1).
The methods for data collection and validation were previously published.1 The sponsor provided an exception report to a designated person at each site every month to facilitate review of missing data or out-of-range data requiring validation. To ensure complete data capture and entry, all sites received a list of the cases entered in the database to check against institutional records and were required to confirm complete case capture. To prevent coding variations in the primary outcome, all adverse events were reviewed by the principal investigator and a designee for proper application of the seriousness and preventability definitions. Any misapplication of definitions was reported to the participant, and disagreements were resolved.
In May 2008 and August 2009, the sponsor performed an independent audit of a random 10% of cases at each site. The accuracy and completeness of data entry were assessed by comparing information recorded in the database with the medical record, including the postcatheterization period and, when present, the next hospital admission, to screen for events identified after the case. Complete case capture was confirmed for all sites, including the one site that required consent. Missing data were rare but occurred in some cases in the documentation of precase hemoglobin or the use of ultrasound modalities such as transthoracic or transesophageal echo. All interventions performed were recorded correctly.
Among the 784 cases audited, 149 adverse events were identified on record review. Results of the 2 audits are combined because the findings were similar. Eighty-five percent of the events were recorded in the database. All 8 level 4 events were captured. Three level 3 events were not recorded: 2 related to sedation and airway management (laryngospasm and hypotension with induction) and 1 late identification of a groin fistula requiring surgical repair. The remaining 31 level 3 events were captured in the database. A 91% event capture rate was observed among high severity (levels 3, 4, and 5) events. Low severity (levels 1 and 2) events were reported less reliably, with a capture rate of 78% (83 of 107). These lower severity events included transient hypotension, metabolic acidosis, rebleed, pulse loss, stridor, emesis, hypoglycemia, and pulmonary edema.
Adverse events were defined as any anticipated or unanticipated event for which injury could have occurred, or did occur, potentially or definitely as a consequence of performing the catheterization. Events were recorded at the time of identification, either at the time of the case or later if determined to be related to the procedure. We used previously established and tested definitions for adverse event severity ranging from severity level 1 to 5 (Table 1).1,3 Any misapplication of definitions was reported to the participant and disagreements resolved.
A 15-month data set with cases performed from February 1, 2007, to April 30, 2008, was used to refine the original consensus-based procedure-type risk categories based on 3 types of AE outcomes. Rates of having any AE, any level 3, 4, or 5 severity AE, and any level 4 or 5 severity AE were estimated for cases assigned to a particular procedure type; 95% confidence intervals were calculated using the exact binomial method. Generalized estimating equations (GEE) models were used to explore the predictive ability of the risk categories for each of the 3 AE outcomes; risk category 1 was used as the reference group, and binary covariates for each other category were included in the models. GEE models were used to account for the correlation among patients within the same site. Initially, rates of AE were calculated for cases undergoing a single procedure type, and GEE models were used to explore optimum categorization. Cases with multiple procedure types were then evaluated according to the highest recorded category or the highest category plus 1 and incorporated using the option that maximized the c-statistic. Odds ratios and 95% confidence intervals are reported for the final risk categories. The discrimination of the risk categories for predicting outcomes was assessed by the area under the receiver operating characteristic curve (c-statistic). The validity of the final categories was tested in a prospective data set collected from May 2008 to December 2009. Odds ratios, 95% confidence intervals, and areas under the receiver operating characteristic curve are reported for each of the 3 models.
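To make the discrimination measure concrete, the sketch below computes a c-statistic when the risk category is the only predictor. It is an illustration under simplifying assumptions, not the study's analysis: it uses the category number directly as the risk score, ignores the within-site correlation that the GEE models account for, and runs on a hypothetical toy data set.

```python
def c_statistic(categories, events):
    """Area under the ROC curve using risk category as the risk score.

    categories: risk category per case (higher = higher predicted risk)
    events: 1 if the case had the adverse event outcome, else 0

    The value is the share of (event, non-event) case pairs in which the
    event case sits in the higher category; ties count as half.
    """
    with_event = [c for c, e in zip(categories, events) if e]
    without_event = [c for c, e in zip(categories, events) if not e]
    if not with_event or not without_event:
        raise ValueError("need at least one case with and one without the event")
    concordant = 0.0
    for c_event in with_event:
        for c_none in without_event:
            if c_event > c_none:
                concordant += 1.0
            elif c_event == c_none:
                concordant += 0.5
    return concordant / (len(with_event) * len(without_event))


# Hypothetical toy data: 8 cases in categories 1-4, 4 of them with an AE.
toy_categories = [1, 1, 2, 2, 3, 3, 4, 4]
toy_events = [0, 0, 0, 1, 0, 1, 1, 1]
toy_c = c_statistic(toy_categories, toy_events)
```

A value of 0.5 means the categories rank cases no better than chance, and 1.0 means every case with an event sits in a higher category than every case without one.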
Single or multiple procedure types were assigned to 3756 cases in the derivation data set and 7043 cases in the performance data set. There were some notable differences in the patient and procedural characteristics (Table 2). In the performance data set, the population included more patients >1 year of age (77% versus 71%, P<0.001) and more patients with no structural heart disease (29% versus 22%, P<0.001), many of whom had undergone heart transplantation. There was a higher percentage of biopsy cases (25% versus 18%, P<0.001) and of electively performed procedures (82% versus 78%, P<0.001), and fewer cases were performed with the patient breathing spontaneously (29% versus 36%, P<0.001). The overall adverse event rate was lower in the performance data set (12.2% compared with 14.9%, P<0.001). The incidence of higher severity (level 3, 4, or 5) events and of life-threatening events was also lower in the performance data set (Table 3).
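As a rough check on the comparison of overall AE rates, the sketch below runs a two-proportion z-test. The text does not say which test produced the reported P values, so the choice of test is an assumption. The performance-set event count is also an approximation, reconstructed from the reported 12.2% because only the percentage is given.

```python
from math import erfc, sqrt


def two_proportion_z_test(events_a, n_a, events_b, n_b):
    """Return (z, two-sided p) for H0: both groups share one event rate."""
    p_a, p_b = events_a / n_a, events_b / n_b
    pooled = (events_a + events_b) / (n_a + n_b)
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / standard_error
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return z, erfc(abs(z) / sqrt(2))


# Derivation set: 558 of 3756 cases with any AE (counts from the text).
# Performance set: only 12.2% of 7043 is reported, so this count is an
# approximation reconstructed from the percentage.
performance_events = round(0.122 * 7043)
z, p_value = two_proportion_z_test(558, 3756, performance_events, 7043)
```

Under these assumptions the two-sided p-value falls below 0.001, in line with the reported result.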
Consensus-Based Risk Categories
There were 3855 cases in the derivation data set, from which we excluded 62 with no procedure type designation and 37 in which a procedure-type risk category was not assigned. Procedures without a risk category included interventions that the group considered too rare for the risk to be predictable, such as coronary stenting in children, and procedures that were novel during the study, such as placement of percutaneous pulmonary valves. Of the remaining 3756 cases, 3341 (89%) involved a single procedure type. GEE models using the 6 consensus-based risk categories as covariates displayed moderate discrimination of AE occurrence; the c-statistic was 0.644 for any AE, 0.664 for any level 3, 4, or 5 AE, and 0.707 for any level 4 or 5 AE.
Empirical Assessment and Risk Group Refinement
In July 2009, the panel was convened to further refine the groupings, based on the already collected data and expert judgment; AE rates including 95% confidence intervals for each individual procedure type were examined. The panel noted that in the initial descriptive assessment of outcomes previously reported by case type, biopsy cases had a much lower event rate than either diagnostic or interventional cases.1 Thus, the first step was to separate biopsies into a 7th category, which was considered to have the lowest risk. Next, the panel noted that diagnostic cases represented a large proportion of the procedure types (n=987) and were a potentially heterogeneous group in terms of outcome by age. Rates for the AE outcomes for diagnostic cases were calculated for patients <1 month, ≥1 month but <1 year, and ≥1 year, and the age groups were then placed into categories according to similar risk. Model discrimination for all 3 AE outcomes improved with these changes. Categories were then explored for similar AE rates, and 2 categories were collapsed according to the panel's judgment. The panel was then allowed to move individual procedure types, based on judgment and empirical AE rates; 7 procedure types were moved to a higher category and 1 was moved to a lower category. Next, cases with multiple procedure types were added, based first on the highest procedure type recorded and then on the highest recorded plus 1. Model discrimination was not significantly different for these 2 options (c-statistic: 0.708 and 0.700 for any AE, both 0.728 for level 3, 4, and 5 AE, and 0.765 and 0.757 for levels 4 and 5). Thus, the panel chose to include the multiple procedure types by highest recorded procedure type. Finally, different possibilities for collapsing categories were explored, and 4 final categories received consensus approval from the panel (Table 4).
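The per-procedure-type rates examined by the panel carry exact binomial confidence intervals. One common exact method is the Clopper-Pearson interval; the text does not name the variant, so that choice is an assumption. The sketch below finds the interval by bisection on the binomial tail probabilities using only the standard library, and the counts in the example call are hypothetical.

```python
from math import exp, lgamma, log


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), with 0 < p < 1."""
    total = 0.0
    for i in range(k + 1):
        log_pmf = (lgamma(n + 1) - lgamma(i + 1) - lgamma(n - i + 1)
                   + i * log(p) + (n - i) * log(1 - p))
        total += exp(log_pmf)
    return total


def _bisect_increasing(func, target, iterations=60):
    """Solve func(p) = target on (0, 1) for a func that increases with p."""
    low, high = 0.0, 1.0
    for _ in range(iterations):
        mid = (low + high) / 2
        if func(mid) < target:
            low = mid
        else:
            high = mid
    return (low + high) / 2


def exact_binomial_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k events observed in n cases."""
    # Lower bound: the p at which P(X >= k) equals alpha / 2.
    lower = 0.0 if k == 0 else _bisect_increasing(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= k) equals alpha / 2.
    upper = 1.0 if k == n else _bisect_increasing(
        lambda p: 1 - binom_cdf(k, n, p), 1 - alpha / 2)
    return lower, upper


# Hypothetical procedure type: 3 adverse events in 20 cases.
low_3_20, high_3_20 = exact_binomial_ci(3, 20)
```

For procedure types with few cases, these intervals are wide. That is one reason the observed rates were weighed together with panel judgment rather than used alone.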
Final 4 Risk Categories
Procedure types in the 4 final procedure-type risk categories are shown in Table 4. The number of procedure types was reduced from 84 to 57 because some qualifiers that the panel originally expected to mark distinct differences in risk, such as age, time from surgery, or ECMO status, did not differentiate risk in the data. The rate of any AE increases from 6% in risk category 1 to 33% in risk category 4 (Figure 2). Life-threatening (severity level 4 or 5) events were observed in <1% of cases categorized in group 1 or 2 and occurred at a much higher frequency in risk categories 3 and 4 (4% and 7%, respectively). The odds of any AE for risk category 4 were 7-fold higher than for risk category 1, and any life-threatening event was nearly 16 times more likely (Table 5 and Figure 3). The final models had improved discrimination for predicting AEs compared with the original consensus-based categories (c-statistics: 0.699, 0.725, and 0.765).
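The relationship between the quoted rates and the quoted odds ratio can be checked with a short calculation. This is a crude, unadjusted odds ratio computed from the rounded percentages in the text. The reported odds ratios come from the GEE models, so only approximate agreement should be expected.

```python
def odds(rate):
    """Convert an event rate (a proportion between 0 and 1) to odds."""
    return rate / (1.0 - rate)


def odds_ratio(rate_group, rate_reference):
    """Crude odds ratio of one group relative to a reference group."""
    return odds(rate_group) / odds(rate_reference)


# Rounded any-AE rates quoted for the derivation data set:
# 33% in risk category 4 and 6% in risk category 1 (the reference group).
crude_or_any_ae = odds_ratio(0.33, 0.06)
```

With these rounded inputs the crude odds ratio comes out between 7 and 8, consistent with the roughly 7-fold figure. The rate ratio (33% / 6%, about 5.5) is noticeably smaller, which shows why odds ratios and rate ratios should not be read interchangeably when events are this common.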
Performance Data Set
Between May 1, 2008, and December 31, 2009, an additional 7224 cases were recorded in the database, of which 181 could not be assigned to a risk group and were excluded. Of the remaining 7043 cases, 655 (9%) had multiple procedure risk group designations and were assigned to the highest risk group. We observed a higher percentage of category 1 cases in the performance data set than in the derivation data set (44% versus 36%). Also, AE rates for category 3 and 4 cases were lower in the performance data set than in the derivation data set (Figure 2). For example, for procedure-type risk group 4, performance and derivation set AE rates were 25% versus 33% for any AE, 12% versus 12% for highest severity level 3/4/5 AE, and 5% versus 7% for highest severity level 4/5 AE. The models had slightly lower discrimination in the performance data set (c-statistics: 0.672, 0.708, and 0.721) (Table 5 and Figure 3).
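The case-level assignment rule and the case counts above can be expressed as a short sketch. The function name and the handling of a case with no categorized procedure type (returning None so that the case can be excluded) are assumptions made for illustration; the text states only that such cases were excluded.

```python
def case_risk_category(procedure_categories):
    """Assign a case the highest risk category among its procedure types.

    procedure_categories: list of category numbers (1-4), one per procedure
    type recorded for the case. An empty list returns None, marking a case
    that cannot be assigned to a risk group (hypothetical convention).
    """
    if not procedure_categories:
        return None
    return max(procedure_categories)


# Case counts reported for the performance data set.
recorded_cases = 7224
unassignable_cases = 181
analyzed_cases = recorded_cases - unassignable_cases
multiple_designation_cases = 655
multiple_share = multiple_designation_cases / analyzed_cases
```

For example, a case combining a category 1 and a category 3 procedure type would be analyzed as category 3, and the share of cases with multiple designations works out to about 9% of the 7043 analyzed.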
Cardiac catheterization in pediatrics and for congenital heart disease encompasses a wide range of both diagnostic and therapeutic procedure types. However, assessing outcomes for a provider or institution when the frequencies of many procedure types are quite low is statistically problematic. Nevertheless, this becomes feasible when the entire population can be categorized into groups of similar risk. We used both consensus and empirical methods to develop 4 procedure-type risk categories that have good discrimination for the outcomes any AE, any clinically important AE (severity levels 3, 4, and 5), and life-threatening events (severity levels 4 and 5).
In previous work, we applied the consensus-based risk groups to a data set from a single institution to better understand risk associated with different procedure types.3 There was overlap in the groups, and the 6 categories were collapsed to 3 distinct groups of similar risk. Although these groups showed similar discrimination (c-statistic: 0.651 for any AE and 0.681 for high severity AE) to the consensus models in this data set (c-statistic: 0.644 and 0.707), they were considered preliminary, awaiting empirical refinement, and had limited generalizability since they were assessed using a single institutional experience. The C3PO was created to overcome these limitations.1 In this study, we used a prospectively collected multicenter database designed to facilitate the assessment of outcomes in pediatric and congenital cardiac catheterization. In doing so, we believe the risk categories can be applied more broadly in the field of cardiac catheterization for congenital heart disease.
The large C3PO database, now with information on >15 000 cases and with >3000 adverse events recorded, has allowed us to assess and improve the original consensus-based categories. The participant panel made some important decisions based on the derivation data set, such as considering biopsy as a separate lowest category of risk, separating diagnostic cases by age, collapsing groups, and moving certain procedure types among categories. The model for the derivation data performed well, but most importantly, when the categories were assessed in a separate prospectively collected cohort (Figure 3), the procedure-type categories showed good discrimination for the 3 outcomes assessed: any AE, any clinically important AE (severity levels 3, 4, and 5), and any life-threatening AE (severity levels 4 and 5) (c-statistics: 0.672, 0.708, and 0.721).
There are 2 important differences between the 2 data sets. First, the distribution of cases among categories is different, with category 1 cases comprising a higher proportion of the total cases in the performance data set. This may be due to the addition of 2 sites that did not contribute data to the derivation data set and the addition of biopsy cases in 2009 by 1 of the first 6 sites. Second, AE rates were lower in the category 3 and 4 groups. The reasons for this are difficult to know for certain; it may be due either to a shift toward lower-risk case types within the categories or to a reduction in AE rates from improved performance. Rates among the lower category 1 and 2 cases did not change significantly, and our most recent audit (June 2009) would not support a change in the reliability of event capture, particularly among the higher severity events used in this analysis. We believe that these differences highlight the strengths of the risk categories, as the second data set used to evaluate their performance is distinctly different in contributing sites, distribution of cases, and rates of events.
As previously reported, we observed higher AE rates among interventional cases than among either diagnostic or biopsy cases, similar to the findings of other reports.1,5–9 By creating procedure-type risk categories, we have further refined our understanding of risk by defining high- and lower-risk interventional procedure types. Our investigators believe that the final categories have face validity. As an example, high-risk ventricular septal defect and perivalvar leak closures are in category 4, in contrast with closure of venous collaterals in category 1 (Table 4). We have also made distinctions within certain procedures such as angioplasty, in which dilation of multiple vessels is in the higher risk category 4 compared with angioplasty of fewer than 4 vessels or all dilations at low pressure in category 3. To differentiate critical aortic stenosis from elective aortic valvuloplasty, the procedure type of aortic valve dilation is qualified by age: <1 month in category 4 and >1 month in category 3. As supported by the empirical data, it is also the judgment of our panel that procedure types within categories represent procedures of similar risk.
We anticipate a broad range of uses for these procedure-type risk categories. First, they can be used to understand variations in case mix complexity both among providers and institutions. Practically, they can be used to track trends in provider, hospital, or national populations of patients undergoing catheterization for congenital heart disease. Second, the categories may be useful as a variable in research studies on catheterization for congenital heart disease to understand population characteristics and adjust for risk. Third, the categories may allow a better estimate of risk when discussing and consenting patients and families for a procedure. Further, when planning for a case, higher-risk procedures may necessitate special resources such as rapid ECMO capability or available surgeons. Finally, procedure-type risk category will ultimately be an important variable to consider in risk adjustment methods to standardize outcomes among institutions and providers to allow equitable comparisons.
One important limitation of these groups is the assumption that procedural risk is intrinsic to the procedure and not operator experience. However, if the groups are applied in performance assessment methods, this will become evident in lower standardized rates for more experienced operators. Also, although it is clear from these models that the type of procedure performed is an important factor to consider when assessing risk for adverse outcomes, incorporating other aspects of the procedure as well as intrinsic patient characteristics in a multivariable model may be more predictive. Finally, the categories should be tested in other national or international data sets.
Case mix complexity must be accounted for when evaluating outcomes in pediatric and congenital cardiac catheterization. Because the field encompasses such a wide range of procedures, it is not feasible to adjust for isolated procedure types, and instead we elected to designate 4 categories of procedures with similar risk for adverse outcomes. We used consensus-based methods and then refined our categories based on empirical data. Our models show good discrimination between risk categories. We anticipate that procedure-type risk categories will be an important variable in risk adjustment models that incorporate additional patient and procedural characteristics for the purpose of evaluating and comparing outcomes in congenital cardiac catheterization.
Sources of Funding
A web-based application for data entry was developed in 2006 with funding support from the Children's Heart Foundation (Chicago, IL). The application was deployed on a Microsoft Internet Information Server (IIS) obtained with funding support from the American Heart Association. The American Heart Association Physicians Roundtable Award (AHA-PRA) provides support for the project and career development plan for Dr Bergersen (2006 to 2010).
Dr Bergersen is PI for the C3PO project, which receives grant support from the AHA-PRA.
We thank the study coordinators and personnel who made this project feasible: Gary Piercey, BS, Sephanie Valcourt, BA, Anniece Woods-Brown, RN, BSN, Joanne Chisolm, RN, Sharon Hill, MSN, ACNP-BC, Terri Mclees-Palinkas, MS, CCRC, and Cyndi Murphy, RN, BSN. Also, we thank the Keane Operating Fund at Children's Hospital Boston for providing the resources necessary to perform site visits and independent audits.
The online-only Data Supplement is available at http://circinterventions.ahajournals.org/cgi/content/full/CIRCINTERVENTIONS.110.959262/DC1.
- Received August 23, 2010.
- Accepted January 12, 2011.
- © 2011 American Heart Association, Inc.
- Bergersen L, Marshall A, Gauvreau K, Beekman R, Hirsch R, Foerster S, Balzer D, Vincent J, Hellenbrand W, Holzer R, Cheatham J, Moore J, Lock J, Jenkins K
This Congenital Cardiac Catheterization Project on Outcomes (C3PO) currently includes 8 sites collecting data on all cardiac catheterization cases performed at the pediatric institutions. For this project, the participants categorized 84 procedure types into 6 categories using consensus methods based on expert opinion regarding the anticipated risk of an adverse event. After data were collected on nearly 4000 cases, the group used empirical methods to refine the categories and derived 4 categories of procedures with distinct differences in risk of having a clinically important adverse event. We anticipate a broad range of uses for these procedure-type risk categories. First, they can be used to understand variations in case mix complexity both among providers and institutions. Practically, they can be used to track trends in provider, hospital, or national populations of patients undergoing catheterization for congenital heart disease. Second, the categories may be useful as a variable in research studies on catheterization for congenital heart disease to understand population characteristics and adjust for risk. Third, the categories may allow a better estimate of risk when discussing and consenting patients and families for a procedure. Further, when planning for a case, higher-risk procedures may necessitate special resources such as rapid extracorporeal membrane oxygenation capability or available surgeons. Finally, procedure-type risk category will ultimately be an important variable to consider in risk adjustment methods to standardize outcomes among institutions and providers to allow equitable comparisons.