Comments on the EMA’s 'Reflection paper on use of real-world data in non-interventional studies to generate real-world evidence'
1. General comments
- IQWiG appreciates the opportunity to provide comments on the reflection paper.
- IQWiG strongly supports the reflection paper's approach of discussing the design of non-interventional studies (NIS) based on the research questions that a given study should address.
In our experience, the relevance of NIS is often discussed without a clear specification of the research question under investigation. In particular, no distinction is made between NIS for descriptive study objectives and those aimed at investigating causal effects. As the latter in particular require a number of study design elements that may not be easy to implement, the feasibility of conducting NIS to estimate causal effects is often overestimated.
- The reflection paper describes the most relevant study design elements required in NIS with causal objectives. However, it lacks a clear statement that a causal interpretation of NIS results is not possible if the biases and confounding inherent in non-randomised studies cannot be adequately prevented or controlled. The burden of proof lies with the sponsor of the study who wishes to demonstrate causal treatment effects based on a NIS.
2. Specific comments on text
Line number(s) of the relevant text | Comment and rationale; proposed changes |
---|---|
71-72 | Comment: The example given in this bullet point is unclear with regard to how (at which level of evidence) the validation of outcomes was performed. It seems to suggest that surrogate validation can (as a standard) be achieved by NIS. The standard methods for surrogate validation include not only a correlation of surrogate and clinical outcomes at the individual patient level but also a correlation of treatment effects. Therefore, surrogate validation is basically a scientific question about causal effects. Given the inherent difficulty of estimating causal effects from NIS, which is also addressed in the reflection paper (e.g. in lines 59-61), surrogate validation from NIS at the study level seems difficult. Proposed change (if any): If this example is kept in the list, describe more clearly how surrogate validation from NIS using RWD was performed, so that the level of evidence achieved can be understood. |
110-111 | Comment: We support the clear distinction between NIS with descriptive objectives and those with causal objectives because this has major consequences for the study design, data collection and analysis. Therefore, the description of study objectives in the study protocol should clearly state whether the objectives are descriptive or causal. Proposed change (if any): Please see comment on lines 154-156 |
150 | Comment: In the list of documents for regulatory requirements, the ICH E9 (R1) Addendum is referenced. It is unclear why the ICH E9 Guideline itself is not referenced. Proposed change (if any): Add the ICH E9 Guideline to the list of documents for regulatory requirements (before the Addendum). |
154-156 | Comment: Please clarify that the research questions should distinguish between descriptive and causal objectives and that the methodological consequences of causal objectives should specifically be addressed. According to our understanding, the ENCePP Checklist for Study Protocols does not require an explicit distinction. Therefore, this should be stated in this reflection paper. Proposed change (if any): The design of the NIS should be primarily driven by the need to obtain reliable evidence regarding the research question. If a research question results in a causal objective, this should be clearly stated and the corresponding methodological requirements should be addressed. It is the MAH’s … |
223-224 | Comment: We agree that the concept of target trial emulation is useful when aiming for causal interpretation of effect estimates from NIS. The reflection paper describes that the second step of this concept is to design a NIS as close as possible to the hypothetical trial. According to our understanding, this requires that the NIS can still conceptually answer a causal question. It does not mean that the NIS should only consider what is feasible (in a given data source). Therefore, this step may result in the conclusion that, given the research question and the potentially available data, no meaningful trial emulation can be provided. This should be clearly stated. We indeed see from submissions of such NIS that study designs do not try to mirror the hypothetical trial conceptually but are already adapted to what seems feasible, e.g. by limiting the confounders considered in the analyses to those available in the selected data sources. We do not consider this an appropriate trial emulation. Proposed change (if any): The second step is to design a NIS as close as possible to the hypothetical trial using epidemiological methods. This second step should be performed independently of the availability of data in the selected data source. |
251 | Comment: Confounding is described here as “difference in underlying disease risk between the treatment groups”. In line 333, however, confounders are stated to be “risk factors for the outcome of interest”, which is more appropriate, as confounding relates to future outcomes. Proposed change (if any): difference in the risk of developing outcomes between the treatment groups |
252-253 | Comment: Confounders should be systematically identified and assessed, if possible, via a literature search and a structured process. Therefore, we propose to add “systematically” to reflect this requirement. Proposed change (if any): These sources of bias should be systematically identified and clearly stated at the design stage |
321-323 | Comment: The text asks for the timing of study entry and start of treatment to be defined in order to avoid time-related biases. Most importantly, however, an index time point (i.e. start of the observation time) should be defined that applies to all included patients in all treatment groups (e.g. time of treatment decision). The start of treatment does not control sufficiently for time-related biases (e.g. when comparing CAR-T cells with chemotherapy), and the study entry date might also depend on the treatment given. Proposed change (if any): to define at the design stage an index time point (i.e. the start of the observation) that applies to all patients independent of treatment group, the timing of study entry, start of treatment |
333-334 | Comment: Confounders should be systematically identified and assessed, if possible, via a literature search and a structured process. Therefore, we propose to add “systematically” to reflect this requirement. In addition, examples of the various sources should be listed and should reflect the need to use several sources (e.g. for validation) to provide a comprehensive set of confounders. Proposed change (if any): Potential confounders (risk factors for the outcome of interest) should be systematically identified from various sources (e.g., disease knowledge and previous studies identified through systematic literature search) to plan […]. |
336-338 | Comment: It is stated that any potential confounders should be identified irrespective of the availability of measured confounders in the available RWD and that it is important to identify potentially important unmeasured confounders. However, the consequence of important unmeasured confounders remains unclear. For studies with causal objectives, the following points should be clearly stated: (1) It is required to systematically identify all important confounders, which have to be taken into account in the analysis, and (2) if an important confounder is not available, the consequence is that the considered RWD source is not fit-for-purpose. Proposed change (if any): “It is particularly important to identify potentially important unmeasured confounders. If important confounders are unmeasured, the considered RWD source is not fit-for-purpose to make causal interpretations of the estimated treatment effect. Either the corresponding study has only descriptive objectives or a different RWD source has to be found in which the important confounders are available.” |
477-478 | Comment: It is stated, with reference to Wasserstein & Lazar (2016), that for NIS the use of estimates quantifying the magnitude of the effect and of confidence intervals describing the precision of these estimates is essential to support decision making derived from the data. However, Wasserstein & Lazar (2016) do not support this statement. The ASA Statement on p-values says that p-values alone should not be used to describe study results (neither in RCTs nor in NIS nor in any other study) and that other approaches should be applied, e.g., confidence, credibility, or prediction intervals; Bayesian methods; alternative measures of evidence, such as likelihood ratios or Bayes Factors; and other approaches such as decision-theoretic modeling and false discovery rates. Nevertheless, the statement above is useful, but it applies to both RCTs and NIS. A better reference would be Gardner & Altman (1986). Proposed change (if any): “For NIS, as for RCTs, use of estimates quantifying the magnitude of the effect and of confidence intervals describing the precision of these estimates, both overall and in important subgroups, is essential to support decision making derived from the data (Gardner & Altman, 1986).” Reference: Gardner MJ, Altman DG. Confidence intervals rather than P values: estimation rather than hypothesis testing. BMJ 1986; 292: 746-750. |
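
As an illustration of the comment on lines 477-478, the following minimal Python sketch reports an effect estimate together with its confidence interval instead of a p-value alone. The function name `risk_ratio_with_ci` and the event counts are hypothetical and serve only as an example; the interval is a standard Wald interval on the log scale. The sketch yields a crude (unadjusted) risk ratio and therefore does not address confounding, which remains a prerequisite for any causal interpretation of NIS results.

```python
import math


def risk_ratio_with_ci(events_treated, n_treated, events_control, n_control, z=1.96):
    """Return (risk ratio, lower limit, upper limit).

    Uses a Wald interval on the log scale; z=1.96 gives an approximate
    95% confidence interval. The estimate is crude (unadjusted).
    """
    if events_treated <= 0 or events_control <= 0:
        raise ValueError("The log-scale Wald interval needs at least one event per group.")
    if events_treated > n_treated or events_control > n_control:
        raise ValueError("Event counts cannot exceed group sizes.")

    risk_treated = events_treated / n_treated
    risk_control = events_control / n_control
    rr = risk_treated / risk_control

    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = math.sqrt(
        1 / events_treated - 1 / n_treated + 1 / events_control - 1 / n_control
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper


# Hypothetical counts, for illustration only: 30/200 events vs. 45/200 events.
rr, lower, upper = risk_ratio_with_ci(30, 200, 45, 200)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

Reporting the interval alongside the point estimate shows both the magnitude of the effect and the precision with which it was estimated, which a p-value alone does not convey.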