What is the relationship of our Quality Adjustment to GRADE?


Applying the GRADE procedures identifies four levels of ‘evidential quality’ (Balshem et al., 2011). These are given labels and associated numerical icons (stars): High (4); Moderate (3); Low (2); Very Low (1). The numbers appear to have only ordinal status. In GRADE, only interventional studies can be graded High or Moderate; observational studies, such as case-control or cohort studies, qualify as at best Low and possibly Very Low. Other types of study do not constitute even Very Low ‘evidence’.

It is important to note immediately that we are not here concerned with the supplementary GRADE procedures insofar as they relate to the use of evidence to produce graded recommendations or guidelines (Guyatt et al., 2011). Multi-Criteria Decision Analysis (MCDA) is our technique for combining evidence for option performance on specified criteria with importance weights for those criteria to produce preference-based opinions.

Problems with GRADE

There are two major problems with the GRADE approach, and the resulting evidence quality grades, from our perspective (or indeed from any MCDA perspective).

First, to be useful as quality modifiers, the verbal characterisations of evidential quality above need to be translated into ratio measures on a numerical 0 to 1 scale.
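One possible such translation can be sketched as a simple lookup. The numerical values below are invented placeholders purely for illustration; the text does not endorse any particular mapping from GRADE's ordinal labels to the ratio scale.

```python
# Hypothetical sketch only: a translation of GRADE's ordinal labels into
# ratio measures on a 0-to-1 scale. The specific values are invented
# placeholders, not figures given or endorsed in the text.
GRADE_TO_RATIO = {
    "High": 0.9,
    "Moderate": 0.6,
    "Low": 0.3,
    "Very Low": 0.1,
}

# Any such mapping must at least preserve the ordinal ranking of the labels.
labels = ["High", "Moderate", "Low", "Very Low"]
values = [GRADE_TO_RATIO[label] for label in labels]
assert values == sorted(values, reverse=True)
```

Whatever values are chosen, the point of the translation is that the resulting numbers have ratio (not merely ordinal) status, so they can legitimately serve as multiplicative quality modifiers.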

Second, most decisions at both clinical and policy levels involve person/patient-important criteria for which there is no evidence of even Very Low standard according to the GRADE definition of evidence. The unavoidable assessment of option performance on these criteria will therefore be made largely, or at least significantly, via some form and quality of expert opinion. Note that ‘expert opinion’ is not evidence for GRADE; it functions only as a potential modifier of the grading of studies in the context of recommendation assessment or guideline production, and only in that context.

We have therefore developed what can be seen either as a highly modified GRADE procedure for assessing evidential quality or, perhaps more diplomatically, as a completely distinct, five-dimensional scale for the Quality of Evidence: QE5D.


As with GRADE, the QE5D score applies to the set of ratings for a criterion across the options, not the set of ratings for an option across the criteria. On this we are in complete concordance.

We do not provide verbal descriptors for the QE5D scale, which simply runs from 0 (worthless as a central point estimate of the performance rating of the set of options on a criterion) to 1 (a perfect central point estimate of that rating). The number assigned on the scale is best interpreted as the assessor’s or assessors’ probability that the set of BEANs (N = Now) for the criterion would turn out to be the BEATs (T = Then), where the latter are the results of a hypothesised future best possible study or studies, with the precision of the ratings set to two decimal places (i.e. as a percentage). A quality rating of .5 accordingly indicates a 50% probability that the BEANs for the criterion being quality graded would equal the BEATs. Intermediate verbal labels on the scale serve no useful function and can only distract from the required numerical probability expression.

The criteria for evidential quality rating in QE5D are the five dimensions in GRADE, with the weights assigned to them (our value judgments regarding their relative importance) in brackets:

·         Bias (40%)

·         Indirectness (20%)

·         Inconsistency (20%)

·         Imprecision (10%)

·         Publication bias (10%).
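Given scores on these five dimensions, the natural MCDA aggregation into a single QE5D quality rating is a weighted average. The weights below are those stated above; the per-dimension scores in the example are invented for illustration, and the linear-weighted-average form is our sketch of the standard MCDA approach rather than a procedure spelled out in the text.

```python
# Weights for the five QE5D dimensions, as stated in the text.
QE5D_WEIGHTS = {
    "bias": 0.40,
    "indirectness": 0.20,
    "inconsistency": 0.20,
    "imprecision": 0.10,
    "publication_bias": 0.10,
}

def qe5d_score(dimension_scores: dict) -> float:
    """Weighted average of per-dimension quality scores (each on 0 to 1).

    A linear weighted average is the standard MCDA aggregation; it is a
    sketch here, not a procedure prescribed by the text.
    """
    assert abs(sum(QE5D_WEIGHTS.values()) - 1.0) < 1e-9
    return sum(QE5D_WEIGHTS[d] * dimension_scores[d] for d in QE5D_WEIGHTS)

# Invented example scores for a single criterion's ratings set.
example = {
    "bias": 0.8,
    "indirectness": 0.6,
    "inconsistency": 0.7,
    "imprecision": 0.9,
    "publication_bias": 0.5,
}
print(round(qe5d_score(example), 2))  # prints 0.72
```

Because the weights sum to 1 and each dimension score lies on the 0 to 1 scale, the aggregate rating also lies on 0 to 1 and so can be read directly as the probability interpretation described above.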