The Delphi technique: FAQs

[Image: Delphi technique]
Image retrieved from: Pereira, Raphael Dias de Mello, & Alvim, Neide Aparecida Titonelli. (2015). Delphi technique in dialogue with nurses on acupuncture as a proposed nursing intervention. Escola Anna Nery, 19(1), 174-180. https://dx.doi.org/10.5935/1414-8145.20150024

Definition

Selected Readings

FAQs

Participants and recruitment

  1. What number of participants overall and per group should we strive to achieve?
  2. To what extent is representativeness important?
  3. What advice on recruitment strategies could you provide us?
  4. How important is it to include the patient perspective in the clinician-version of an instrument?
  5. Do you recommend some initial contact before sending potential participants an email invitation to engage in the Delphi?

Data collection and analysis

  6. Are two rounds appropriate in some situations?
  7. What standard of consensus would you recommend?

Ethics and project management

  8. Do you see value in inviting Delphi respondents after rounds to exchange further on the Delphi results?
  9. What are the most important lessons about project management that you can pass on to our research team regarding conducting a Delphi project efficiently and rigorously?

Definition

The Delphi method was developed after World War II as a forecasting tool for building expert consensus.

Introduction by Thomas Foth: Consensus methods are defined as a systematic means to measure and develop consensus. These methods are particularly useful when empirical evidence is lacking, limited, or contradictory. Consensus methods are based on the premise that accurate and reliable assessment can be best achieved by consulting a panel of experts and accepting group consensus. They aim to determine the extent to which experts agree about a particular issue, with the ultimate goal of providing a unified expert opinion in the absence of objective evidence. There exist important areas of inquiry that are plagued by high levels of uncertainty and limited evidence-based literature that can be leveraged to support decision making. Consequently, consensus group methods are particularly relevant and valuable to medical research because of their presumed capacity to extract the profession’s collective knowledge, which is often tacit and difficult to verbalize and formalize.

The Delphi involves six stages: (1) identifying a research problem, (2) completing a literature search, (3) developing a questionnaire of statements, (4) conducting anonymous iterative postal or email questionnaire rounds, (5) providing individual and/or group feedback between rounds, and (6) summarizing the findings (Jones et al., 1995; Murphy et al., 1998). The process is repeated until the best possible level of consensus is reached or a pre-determined number of rounds is completed. Participants never meet or interact directly. Benefits of the Delphi method include its capacity to include a large number of geographically dispersed participants, its minimal support-structure needs, which make it relatively inexpensive, and its avoidance, through anonymity, of undue dominance by particular individuals. Conversely, the number of modifications that can be made to the Delphi method has led to considerable confusion surrounding its application and outcomes.

Delphi and other consensus methods are considered structured interactions and share several foundational principles that distinguish them from informal consensus meetings. These foundational principles include anonymity, iteration, controlled feedback, statistical group response, and structured interaction.

Selected Readings

Context

The aim of our study is to develop one or two survey instruments to assess the integration of mental health services in primary care.  There are currently few surveys that comprehensively assess a range of dimensions of integration and fewer that focus on the delivery of mental health services in primary care.  The survey(s) that we design will be guided by a conceptual framework on integrated care with seven dimensions of integration, including whether mental health care is person-centered.  We will develop a version of the survey to be completed by primary care clinicians and managers, and are also considering developing a patient version to be completed by primary care patients with mental health problems.  To develop the initial survey(s), we will begin by reviewing the literature on tools and measures of integrated care in the hopes of identifying relevant items and measures.  If necessary, new items will be generated in order to ensure coverage of all dimensions of our framework. To refine this initial list of items and achieve consensus on the most important items to retain in the survey(s), we plan to use a Delphi approach.

FAQs

Participants and recruitment

1. What number of participants overall and per group should we strive to achieve?

Our initial proposal called for us to recruit 15-20 participants but we also mentioned wanting to recruit participants from different groups (e.g. clinicians, researchers, administrators, policymakers). Would it be more rigorous to increase the number of participants in order to achieve a minimal number of respondents in each group (e.g. 10 clinicians, 10 researchers, 10 administrators, etc.), thus potentially enabling us to do analyses across groups? What number of participants overall and per group should we strive to achieve?

Pierre Pluye: The difficulty is in defining the population of experts (who are the experts, or what are the eligibility criteria for being an expert?). For each type of expert, there is no "magic number", but the greater the better. The ideal number may depend on the size of the population of experts. In research, all experts are invited to participate in a Delphi exercise when consensus is needed about a new method (the rare experts being those who published or were funded with respect to this method). In your Delphi, for instance, assuming that the types of experts are GMF psychiatrists, psychologists, and family physicians (among other types): (a) only a few (between 5 and 10) psychiatrists work (part-time or full-time) in primary care settings in Quebec (GMF), and all of them can be invited to your Delphi (use personal invitations, as you know almost all of them); (b) many psychologists work in GMFs (between 100 and 300), and all of them can be invited (e.g., via GMF directors), as few may accept; and (c) numerous family physicians work in GMFs, and all can be invited (e.g., via Réseau-1), as the response rate will be very low (anticipate 5%).

Quan Nha Hong, PhD: There is no rule of thumb regarding the sample size in a Delphi. The size will depend on several factors, such as the objectives of the study, the nature of the topic, the composition of the sample (homogeneous vs. heterogeneous), the response rate, the resources available, etc. This book addresses the question: Keeney, S., Hasson, F., McKenna, H.: The Delphi technique in nursing and health research. Wiley Online Library, Chichester, UK (2011).

Thomas Foth: In our research (medical and nursing education) we realized that the definition of experts was often very narrow (and I agree with Dr Pluye's response). Most of the studies used only physicians (or only nurses); only a few used multiple groups. We did not assess the appropriateness of those choices. What is concerning, however, is that in several studies it was not possible to determine who the participants were. I believe that the decisions regarding the size of a panel and the specific criteria/characteristics that determine whether an expert will be included or excluded cannot be determined in general (see Hong's comment). I would like to emphasize that authors should provide their rationale for the type of panel they choose and keep in mind that "decisions concerning panel members are by no means as straightforward as they appear to be when represented in the literature". I believe that the definition of what an expert is and whom to include is rather a 'political' question, and authors like Sackman (1974) have questioned the assumption that the quality of expert opinion is superior to the opinions obtained from informed individuals.

Michael Shuhla: Pierre's points about response rates are important, and Quan is right about consensus. From my research proposal: "While the terms sampling and sample size have been used in the literature regarding expert panel creation, it is important to point out that the rationale for sampling is not based on the requirements of inferential statistics (Powell, 2003). In standard survey-based methods, the goal of sampling is to achieve adequate statistical power to generalize results to a larger population. Conversely, in a Delphi study, the goal of sampling is to ensure that criteria for expertise and knowledge are met, to allow more meaningful interaction around the issues under study (Okoli et al., 2004). Panel size has varied tremendously in the Delphi literature. In general, where heterogeneous samples are used the panel is much larger, depending on the number of disparate groups involved (Skulmoski et al., 2007). For homogeneous samples, the consensus within the literature is that a range of 10-18 experts is adequate (Okoli et al., 2004)."

2. To what extent is representativeness important?

We would like our survey to be applicable in all parts of Canada. In Delphi methods, selecting the right ‘experts’ is important as they have an important influence on the results. But in our case, to what extent is representativeness important? How important is it that we recruit researchers, clinicians, administrators or policymakers from all parts of the country? Too much representativeness may make recruitment and data collection difficult to manage but too little might open us up to criticisms that views of our experts (and potentially our survey) are not generalizable to some parts of Canada. How can we achieve a balance here?

Pierre: The statistical representativeness of the expert sample cannot be estimated unless the population of Canadian experts is known and well-defined. Consider inviting all Canadian experts when this is feasible (via mailing lists of federal agencies or pan-Canadian organizations). Alternative option: You are familiar with CFPC strategies for building committees; thus, consider a similar approach, e.g., recruit experts from the Maritimes, Quebec, Ontario, and western provinces (maximum variation sampling).

Quan Nha: This depends on your research question and the context of your study. You can enhance external validity by having experts from all parts of Canada. At the same time, you will probably need more experts if context is important (e.g., if your survey will include specific questions related to mental health services that can differ greatly among provinces vs. more generic questions that apply to all provinces).

Thomas: See also my response to question 1. I totally agree that the composition of the panel affects the results, and I think one great advantage of the Delphi could be bringing together different kinds of experts from different national/international contexts (something that is rarely done). I was wondering whether it would be an interesting idea to combine not only researchers, clinicians, etc., but also to include patients or 'stakeholders' from patient rights movements, nurses, etc.

Mike: This is a tough question that relates to the overall strangeness of Delphi studies. I think the paragraph I included in relation to question 1 is applicable here as well.

3. What advice on recruitment strategies could you provide us?

If we seek to recruit experts from other provinces of Canada, particularly clinicians or administrators, what advice on recruitment strategies could you provide us?

Pierre: see above (responses to Q1 and Q2).

Mike: Little to add on the mechanics of recruitment, except to remember that the best way to deal with non-response and attrition is to try to target respected decision makers at the different recruitment sites who will help endorse your study. You may consider targeting actual clinics as a starting point and trying to reach out to decision makers there. "Cold" emailing can be difficult.

4. How important is it to include the patient perspective in the clinician-version of an instrument?

It was not planned that patients be involved in the Delphi process for refining the clinician-version of the survey. Is this an oversight? If we move forward with a patient-version of the survey, is there still a need to inject the patient perspective in the clinician-version of the instrument and if so how should this be done?

Pierre: The difficulty is in defining the population of experts, specifically for patients. What are the eligibility criteria for being an expert? Age range, gender, type of health problem, type of treatment, type of patient-practitioner relationship, etc. Consider working with the 'Patient Partner' component of the Quebec SPOR SUPPORT Unit to address this question, as you are a SPOR SUPPORT fellow and have direct access to them. Patient engagement in research is important and fruitful, but developing a Delphi for patients raises at least two issues in terms of ecological content validation (validation by questionnaire users), reliability testing, and usability: (a) the construct may be similar (care delivery), but many concepts and facets are specific, as patients' and families' perspectives and experiences regarding the delivery of mental health services differ from those of practitioners (nurses, pharmacists, physicians, psychologists and social workers) and managers; (b) half of Canadian adults have a low literacy level, which has a direct impact on the wording of questions and the format of the questionnaire.

Thomas: I see that I seem to be in a minor disagreement with Pierre. Including different groups (interprofessional, users and providers, etc.) might result in interesting and 'innovative' results. I see the points raised by Pierre, but I am not sure the definition of 'patient experts' is so different from the definition of medical experts… AND I would say the Delphi is a hybrid of qualitative and quantitative methods, and therefore it cannot be considered in statistical terms (this is something that we criticized a little bit in our nursing article: it seems as if, through the Delphi, qualitative data are transformed into quantitative data, and I am uncomfortable with this, especially when the whole process is not made transparent). As long as the inclusion criteria for the expert panel are made explicit, I don't see why the inclusion of patients should be a problem (and, as I said, I believe it would rather strengthen the results).

Mike: I think Pierre answered the question. If you want to build a survey for patients, then you need to ask patients. Even more complicated, however, is that presumably your expertise criteria for the patient panel would be that they have some first-hand experience with either receiving or managing mental health services for others. Just based on the subject matter, this could be an ethics approval nightmare.

5. Do you recommend some initial contact before sending potential participants an email invitation to engage in the Delphi?

Some literature suggests that initial contact with participants before formal recruitment can improve response rates. We would likely conduct our Delphi online through email. Do you recommend some initial contact before sending potential participants an email invitation to engage in the Delphi?

Quan Nha: Yes, establishing initial contact with your potential participants can enhance your response rate. Provide clear information on the nature of your project, its purpose, why you chose them, the number of rounds, the time commitment, the consent form, etc. (see Keeney 2011).

Thomas: I am not sure what is meant by "initial contact", but equally important is the provision of background information to the participants at the beginning of the consensus-building process. This may be less relevant if the participants are truly experts in the domain, but background information may influence the participants, so a clear description of what information was provided to participants, and in what format, is important (and yes, it's Keeney 2011; sorry, I just realized that Quan Nha already replied). See also Murphy MK, Black NA, Lamping DL, et al. Consensus development methods, and their use in clinical guideline development. Health Technol Assess. 1998;2(3):1-88.

Mike: See above response to question 3.

Data collection and analysis

6. Are two rounds appropriate in some situations?

A classic Delphi has 3-4 rounds, including a first round that is more exploratory and qualitative in nature. We expect to conduct a literature review that will inform the content of the first round of our Delphi, which could be more closed and confirmatory in nature. In this scenario, would a 2-round Delphi likely be enough to achieve consensus on the survey items?

Quan Nha Hong: There are different types of Delphi techniques. Hasson & Keeney (2011) listed 10 types. What you described seems to fit the definition of a "modified Delphi technique", since you will use pre-selected items. The number of rounds depends on your pre-defined criteria for ending the project. For example, you can decide that the project will end after 2 rounds. You can also decide that you need to reach a predetermined level of consensus (thus, maybe more than 2 rounds will be needed). See Vernon 2009 for a list of reasons for concluding a Delphi, and Hasson & Keeney (2011) for enhancing rigour.

Mike: See quotes from my thesis: "As per Custer, Scarcella and Stewart (1991), a modified Delphi is advantageous in that it provides a grounding for the research in previously developed work and reduces the required number of questionnaire rounds associated with the open-ended first round of classic Delphi approaches. As Boulkedid et al. (2011) note, general guidelines are not to hold more than 2-3 rounds of iteration due to steep dropout rates and participant burnout. As well, there is evidence that the majority of change in consensus happens during the first iteration of the questionnaire (Rowe et al., 1991)."

7. What standard of consensus would you recommend for our study?

While the Delphi technique is a consensus-based method, there is little agreement in the literature as to what constitutes achieving consensus among participants, i.e. some articles mention that 51% agreement constitutes consensus whereas in others the percentage is 60%, 70% or 80%. What standard of consensus have you used or would recommend for our study?

Quan Nha Hong: There is indeed no consensus. Since your project is more about content validity, I would suggest taking a look at that literature. For example, Polit et al. (2007) suggested an item-level content validity index (I-CVI) and found that an I-CVI of 0.78 or higher could be indicative of good content validity. There are also other indexes (such as the CVR, or content validity ratio; see Lawshe 1975 and Lynn 1986).

  • Polit DF, Beck CT, Owen SV: Is the CVI an acceptable indicator of content validity? Appraisal and recommendations. Res Nurs Health 2007, 30(4):459-467.
  • Lawshe CH: A quantitative approach to content validity. Pers Psychol 1975, 28(4):563-575.
  • Lynn MR: Determination and quantification of content validity. Nurs Res 1986, 35(6):382-386.
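As an illustration of how these indexes are computed, here is a minimal Python sketch. The function names are my own, and applying Lawshe's CVR to the same 4-point relevance ratings is a simplification (his original formulation used a 3-point essentiality scale):

```python
def icvi(ratings, relevant=(3, 4)):
    """Item-level content validity index (Polit et al., 2007):
    proportion of experts rating the item as relevant
    (3 or 4 on a 4-point relevance scale)."""
    return sum(r in relevant for r in ratings) / len(ratings)


def cvr(ratings, essential=3):
    """Lawshe's (1975) content validity ratio: (n_e - N/2) / (N/2),
    where n_e is the number of experts rating the item as essential."""
    n = len(ratings)
    n_e = sum(r >= essential for r in ratings)
    return (n_e - n / 2) / (n / 2)


# Ten experts rating one candidate survey item on a 4-point scale
ratings = [4, 3, 4, 2, 4, 3, 4, 4, 3, 1]
print(icvi(ratings))  # 0.8 -> above the 0.78 threshold Polit et al. suggest
print(cvr(ratings))
```

An item with an I-CVI below the chosen threshold would be flagged for revision or removal between rounds.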

Thomas: I agree with Quan Nha's reply. The most concerning problem we identified in our reviews was that the definition of consensus was often not reported a priori.

We also examined the reporting of response rates for each round of data collection. Many of the studies we analyzed reported the number of participants solicited; more than half reported the number of participants in round 1 of data collection, and only half reported the participants for round 2. Such poor reporting has been found in other analyses of consensus methods research, with 6.6% to 39% of studies reporting response rates for all rounds. Clearly, interpretation of the study results would warrant inclusion of this information.

Anonymity, or private decision making, is considered essential to consensus group methods. Unfortunately, less than half of the studies we vetted explicitly reported that anonymity was maintained or provided enough information in the methods section to make this clear. While the authors may have assumed that readers would understand that anonymity was part of the study design, we suggest that this is a faulty assumption, especially given the variability of what can be labeled a "modified" consensus method.

Another important feature of consensus group methods is controlled feedback to participants, including statistical group responses and sometimes qualitative information. These features are felt to be central to consensus methods, allowing participants to re-rank items based on the responses of others. In our study, feedback was reported in approximately one third of studies. Other researchers have also highlighted poor reporting of anonymity and feedback to participants.

Mike: I would add that it depends on how you wish to format your Delphi in terms of response items. "Gracht (2012) outlines a range of subjective and statistical approaches that have been used to measure consensus in Delphi studies; however, it is important to note that there is no single accepted quantitative measure of consensus. One of the most common approaches is to present some measure of central tendency (mode, median, mean) in combination with response dispersion (standard deviation, interquartile range, coefficient of variation). The choice of approach is somewhat dictated by the response format of the questionnaire. As per Gracht (2012), the use of mean scores as a measure of central tendency, although common, is not a correct approach if an ordinal scale is used as the response format. As per Murphy (1998), a more robust approach is to use the median as a measure of central tendency and the interquartile range (IQR) as a measure of dispersion."
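The median-plus-IQR approach attributed to Murphy (1998) can be sketched in a few lines of Python. This is a minimal illustration; the function name and the IQR-of-at-most-1 consensus threshold are illustrative choices, not a standard:

```python
from statistics import median, quantiles


def consensus_check(ratings, max_iqr=1):
    """Median as the measure of central tendency, interquartile range
    (IQR) as the measure of dispersion; treat the panel as being in
    consensus when the IQR is small (here, <= 1 on a 5-point scale)."""
    q1, _, q3 = quantiles(ratings, n=4)  # quartiles of the ratings
    iqr = q3 - q1
    return median(ratings), iqr, iqr <= max_iqr


# Eight experts rating one survey item on a 5-point Likert scale
med, iqr, in_consensus = consensus_check([4, 4, 5, 4, 3, 4, 5, 4])
print(med, iqr, in_consensus)  # 4.0 0.75 True
```

Running this per item between rounds gives each statement a median, a dispersion, and a consensus flag to feed back to the panel.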

Ethics and project management

8. Do you see value in inviting Delphi respondents after rounds to exchange further on the Delphi results?

We have proposed to have an in-person meeting with participants after the 2nd round of the Delphi. However, this may raise ethical concerns given the anonymous nature of participating in the Delphi process. There is also a feasibility problem if we invite participants from across Canada. Do you see value in inviting Delphi respondents to participate in a teleconference after the 2nd round to exchange further on the Delphi results or will their participation in the Delphi be sufficient?

Quan Nha: What is your aim in organizing an in-person meeting after the 2nd round? If it is only to present the results, I am not sure there is a need to organize a meeting. The results can be compiled and sent via email (as in the 1st round).

Pierre: One of my PhD students' theses used a Delphi to build consensus among oncologists, family physicians, and breast cancer survivors. Focus groups were used to provide in-depth explanations of the Delphi results from all the parties (sequential explanatory mixed methods design). You may consider virtual focus groups with each type of expert when needed.

Thomas: see my response to question 7.

Mike: As per Quan, you need to have a valid research-related reason for having a focus group meeting. Is there a different or related question that your Delphi can't address? If so, then yes, focus groups may be appropriate; but if it is just to continue the Delphi… I don't think so.

9. What are the most important lessons about project management that you can pass on to our research team regarding conducting a Delphi project efficiently and rigorously?

Quan Nha: Read: Keeney S, Hasson F, McKenna H: Consulting the oracle: ten lessons from using the Delphi technique in nursing research. J Adv Nurs 2006, 53(2):205-212.

Mike: Make sure you set up your data management tools efficiently. Depending on how you manage your Delphi, it is common practice in the 2nd and 3rd rounds to present each participant's data in the context of the group's overall ratings (some type of histogram). This can be very time consuming, so don't underestimate the amount of time it will take to analyze and manage the results and then prepare the personalized second/third-round Delphi questionnaires. Also, when I was researching electronic platforms for the Delphi, I couldn't find any that would easily allow you to present this data back to the participant. You might end up needing to email separate files that summarize the participant's first/second-round answers and the group results.
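The kind of personalized feedback described above can be generated with a short script rather than assembled by hand. This is a sketch under assumptions: the function name, the text-histogram format, and the 5-point scale are illustrative, not a feature of any particular Delphi platform:

```python
from collections import Counter


def feedback_report(item, own_rating, all_ratings, scale=range(1, 6)):
    """Plain-text feedback for one participant: the group's rating
    distribution as a bar chart, with the participant's own rating
    flagged so they can compare themselves against the panel."""
    counts = Counter(all_ratings)
    lines = [f"Item: {item}"]
    for point in scale:
        bar = "#" * counts.get(point, 0)
        marker = " <- your rating" if point == own_rating else ""
        lines.append(f"  {point}: {bar}{marker}")
    return "\n".join(lines)


print(feedback_report("Care is person-centred", 3, [4, 4, 5, 3, 4, 2, 5, 4]))
```

Looping this over items and participants produces the personalized second-round summaries, which can then be pasted into emails or attached as files.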


Methodological Developments
Quebec SPOR SUPPORT Unit


Department of Family Medicine
5858, chemin de la Côte-des-Neiges
3rd floor
Montréal, Québec H3S 1Z1
Tel.: (514) 399-9134
supportunit.fammed@mcgill.ca

Quick links
Quebec SPOR SUPPORT Unit
Canadian Institutes of Health Research (CIHR)
McGill | Department of Family Medicine

© 2019 Methodological Developments of the Quebec SPOR SUPPORT Unit