“Tell me more about this…”: An examination of the efficacy of follow-up open questions following an initial account

Summary
In information-gathering interviews, follow-up questions are asked to clarify and extend initial witness accounts. Across two experiments, we examined the efficacy of open-ended questions following an account about a multi-perpetrator event. In Experiment 1, 50 mock-witnesses used the timeline technique or a free recall format to provide an initial account. Although follow-up questions elicited new information (18–22% of the total output) across conditions, the response accuracy (60%) was significantly lower than that of the initial account (83%). In Experiment 2 (N = 60), half of the participants received pre-questioning instructions to monitor accuracy when responding to follow-up questions. New information was reported (21–22% of the total output) across conditions, but despite using pre-questioning instructions, response accuracy (75%) was again lower than the spontaneously reported information (87.5%). Follow-up open-ended questions prompt additional reporting; however, practitioners should be cautious to corroborate the accuracy of newly reported details.


In both intelligence and criminal investigation contexts, interviewers commonly ask follow-up questions to elicit additional information, and to clarify reported details and inconsistencies (Evans & Fisher, 2011; Shepherd & Griffiths, 2013). Spontaneously reported information can be highly accurate but witnesses often omit critical details in their reports that may be useful in an investigation, thus interviewers may need to use follow-up questions (Hope, Gabbert, & Fraser, 2013; Roberts & Higham, 2002; Smeets, Candel, & Merckelbach, 2004). The current experiments examine the efficacy, in terms of both quantity and accuracy, of follow-up, open-ended questions that prompt interviewees for further information based on their initial account.
Follow-up questions to extend and clarify witness accounts are recommended in evidence-based interviewing protocols such as the Cognitive Interview (CI; Fisher & Geiselman, 1992). After requesting an initial free narrative about the event, interviewers can prompt for further information by using various memory-enhancing techniques, including a focused-retrieval phase where open questions are used to expand on aspects of the initial account (Fisher, 1995;Fisher & Geiselman, 1992). Building on the principles of the CI, recommendations for practice have been made about the use of appropriate prompts such as questions that start with "Tell," "Explain," and "Describe" (TED questions; for a review see Oxburgh, Myklebust, & Grant, 2010).
In this context, appropriate questions are open-ended, information-seeking questions that prompt the interviewee to elaborate in depth on what has been mentioned already (Gabbert et al., 2016). In fact, in their recent description of an effective evidence-based model of interviewing for practitioners, Brandon, Wells, and Seale (2018) discuss how interviewers might prompt the reporting of additional information using elements of the CI with broad and, if needed, more specific questions.
Even when interviewees are cooperative, they are likely to omit or provide inconsistent details, particularly when reporting complex events.
Although both omissions and inconsistencies occur naturally during retrieval, both have important implications in applied contexts. Details may be omitted due to forgetting or because further retrieval support is needed to access the encoded information. It may also be the case that interviewees are unaware of what details interviewers consider to be relevant (Fisher & Geiselman, 1992). Prompting for specific omitted details after an interviewee provides a free report to an open invitation for information can enable interviewers to elicit more details directly related to investigative objectives (Brandon et al., 2018).
Witnessing complex incidents, such as events involving multiple perpetrators, may result in the reporting of inconsistent or otherwise disjointed information. Given that both within- and between-statement inconsistencies are perceived as diagnostic of the reliability of witness accounts, interviewers might use prompts to assess the accuracy of the reported detail by giving the interviewee the opportunity to clarify an inconsistency (Berman, Narby, & Cutler, 1995; Smeets et al., 2004). In sum, the use of follow-up prompts can serve a number of functions in the interviewing process, by encouraging the interviewee to retrieve more information and to elaborate on their initial account.
The notion that follow-up questions prompt retrieval is based, broadly, on the spreading activation theory, which posits that memory is represented as a network of traces that vary in strength (Anderson, 1983). With each retrieval attempt, a trace is activated and, as a result, it spreads activation throughout the associated elements in the network. Therefore, the use of additional prompts can trigger a search through the memory network, facilitating access to additional memories which were not readily available before (see also Bower, 1967). When a memory is not accessible by a particular prompt, a different prompt might be useful (see also Anderson & Pichert, 1978). The use of open-ended, non-leading prompts that do not introduce new information but build on a free narrative should effectively encourage retrieval, since the information included in the question can act as a cue for the interviewee (Ibabe & Sporer, 2004). Thus, additional prompts following an initial retrieval may cue more memories and elicit more information.
That asking follow-up questions can lead to the elicitation of more information is neither new nor surprising. Results from meta-analyses on the effects of the CI on memory reporting show that use of the CI, which includes various mnemonics and additional prompts, results in improved reporting of correct details compared to standard interviews. However, there is also sometimes an increase in erroneous reporting as overall reporting increases (Köhnken, Milne, Memon, & Bull, 1999;Memon, Meissner, & Fraser, 2010). One likely explanation for this increase in inaccurate reporting relates to how effectively (or not) interviewees regulate their memory outputs (Koriat & Goldsmith, 1996;Memon et al., 2010). When asked to report information from memory, interviewees face competing demands to be both informative and accurate (Goldsmith, Koriat, & Weinberg-Eliezer, 2002;Koriat & Goldsmith, 1996).
To achieve a balance between the two, research suggests that they tend to strategically monitor the amount of information they report (Koriat & Goldsmith, 1996). Specifically, in a free narrative, interviewees can decide to withhold or volunteer information based on how confident they are about the accuracy of that information.
Interviewees avoid errors by metacognitively assessing how likely it is that an answer is correct: if the assessed likelihood exceeds a pre-set accuracy threshold (the satisficing model; Goldsmith et al., 2002), they volunteer the answer; otherwise, they withhold it (control of report option; Koriat & Goldsmith, 1996). Thus, by controlling their responses, interviewees can be highly accurate, even after a delay in reporting (Goldsmith, Koriat, & Pansky, 2005). However, by choosing to report only information that they are confident is correct, there is a cost to the total amount of reported information, resulting in an accuracy-informativeness trade-off (Brewer, Vagadia, Hope, & Gabbert, 2018; Goldsmith et al., 2002; Koriat & Goldsmith, 1996). Conversely, if interviewees attempt to be more informative, they risk reporting details that they are not as confident about, resulting in an increase in erroneous reporting.
Although the increased reporting of errors in the context of elaborate memory reports can be attributed to metacognitive monitoring, we do not have a clear understanding of where errors are most likely to spontaneously occur within the interviewing process, assuming recommended practice (e.g. use of open-ended questions). Research on the benefits of the CI for recall has mostly focused on the effectiveness of the different mnemonics rather than on the use of prompts following an initial narrative (e.g. Brunel, Py, & Launay, 2013;Colomb & Ginet, 2012;Memon, Wark, Bull, & Koehnken, 1997;Paulo, Albuquerque, & Bull, 2013). Similar to the use of cues, asking follow-up questions can also further prompt interviewees to search through their memory (Fisher & Geiselman, 2010). Yet, systematic investigation into witness performance when additional prompts are applied is limited or only incidentally reported across research on the development of investigative interviewing techniques. Research frequently focuses on the reporting of an initial account when testing a specific technique or, when an interviewing protocol with mnemonics and prompts is used, the results refer to the total information output across the entire interview but not within each interviewing phase (although see Memon et al., 1997;Paulo et al., 2013;Paulo, Albuquerque, Vitorino, & Bull, 2017).
Across two experiments, the current research examined the efficacy of using open-ended questions following a self-administered account, provided with either the timeline technique, which uses a physical timeline format and interactive instructions to facilitate memory for multi-perpetrator events (Hope, Mullis, & Gabbert, 2013), or a free recall format. Although the timeline technique facilitates retrieval compared to free recall (Hope, Mullis, & Gabbert, 2013), it has not been tested in conjunction with follow-up questions, which would likely be used in real settings. Specifically, we sought to examine the number of new details reported about a witnessed event in response to follow-up questions and the accuracy of any new information reported (Experiment 1). In Experiment 2, in an attempt to refine the questioning procedure, we tested the use of instructions designed to promote accuracy monitoring in responding.
The objectives of Experiment 1 were exploratory, in that we aimed to assess the quantity and the quality of additionally reported information. Given that there is not a strong rationale in the literature to inform a directional hypothesis, there were no specific expectations about the reporting of additional information in response to prompts following an initial account provided with either reporting format.
However, it was expected that the use of the timeline technique would elicit more correct details compared to the free recall format at the initial reporting phase (Hope, Mullis, & Gabbert, 2013). Open-ended questions were used as invitations to elaborate on omitted information and gaps (e.g. "Tell me more about [detail already mentioned]"; "What else can you tell me about [detail already mentioned]"; Brubacher, 2007; Gabbert et al., 2016) or inconsistencies in the written account (e.g. "You mention four perpetrators arriving at the location but three leaving, can you explain in more detail what you mean about this part?"). To ensure that the questions matched the interviewee's retrieval pattern (witness-compatible questioning; Fisher & Geiselman, 1992; Wells, Memon, & Penrod, 2006), the participant's own words were used when formulating the questions (e.g. "You mentioned there was a leader of the group. Tell me more about this leader").

| Participants
Participants who were fluent or native English speakers, and aged between 18 and 49 years old, were eligible to participate in both experiments. Participants were recruited through the department's student participation pool and through advertisements circulated across campus.

| Follow-up open-ended questions
A question protocol was composed to prompt additional information based on the initial account, in relation to omitted information, gaps, and inconsistencies/need to clarify (see Table 1).

| Coding
Coding of the interviews in both experiments was conducted by the first author following the scoring template used in Kontogianni, Hope, Taylor, Vrij, and Gabbert (2018). Each detail reported about the witnessed events was identified as a Person (P), Action (A), Object (O), or Setting (S) detail.
A detail was scored as correct if it was present in the event and described correctly. A detail was scored as incorrect if it was present in the event but described incorrectly or if it was not present in the event. Details that were subjective or vague were not coded. A secondary coding was conducted in Experiment 1 with respect to attributions of reported actions to specific actors. Person-action details were scored as correct when an action was correctly attributed to a specific actor (e.g. Male with red shirt raises the crowbar). Sequencing errors were also noted when events were reported in the wrong order. For instance, if ABCD is correct, in ACBD, C would be coded as one sequence error as it should follow B, but B would not be counted as out of sequence too.
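The sequencing rule in the ABCD/ACBD example can be illustrated with a short sketch. This is only one possible operationalisation, not a description of the scoring template itself: it assumes that the number of sequence errors equals the smallest number of reported events that would have to be moved to restore the correct order (the number of reported events minus the length of the longest correctly ordered subsequence). The function name `count_sequence_errors` is hypothetical.

```python
from bisect import bisect_left


def count_sequence_errors(reported, correct_order):
    """Count reported events that are out of order relative to correct_order.

    Assumption (not taken from the scoring template): the error count is the
    number of reported events minus the length of the longest subsequence
    that is already in the correct order.
    """
    position = {event: i for i, event in enumerate(correct_order)}
    # Rank of each reported event in the true sequence; events that are not
    # part of the true sequence are ignored here.
    ranks = [position[event] for event in reported if event in position]

    # tails[k] holds the smallest final rank of any correctly ordered
    # subsequence of length k + 1 found so far.
    tails = []
    for rank in ranks:
        k = bisect_left(tails, rank)
        if k == len(tails):
            tails.append(rank)
        else:
            tails[k] = rank
    return len(ranks) - len(tails)


# The example from the text: ABCD is correct, ACBD was reported.
print(count_sequence_errors("ACBD", "ABCD"))  # 1
```

Under this assumption, only one of the two swapped events is counted, which matches the rule that C is scored as a sequence error while B is not counted as out of sequence as well.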
To assess inter-rater reliability across categories, 15% of the interviews in each experiment were randomly selected and coded by an independent rater. Given the use of different reporting formats in Experiment 1, coding was blind to hypotheses and research questions, while coding in Experiment 2 was also blind to experimental conditions.
Inter-rater reliability was computed as an intraclass correlation coefficient (ICC) based on the mean value of two raters, using an absolute agreement definition and a two-way mixed effects model, as the raters were fixed (McGraw & Wong, 1996).

3 | EXPERIMENT 1

3.1 | Method

| Participants and design
Fifty participants (37 females, Age: M = 24.64, SD = 6.99, Range 18-47 years) were randomly allocated to a timeline (n = 25) or a free recall condition (n = 25). The dependent variables were the number of correct and incorrect details reported in each interview phase (initial report and follow-up questioning), the number of correct person-action details provided in the initial report, and the accuracy rates for both types of details. Accuracy rates were calculated by dividing the number of correct details reported by total details (correct and incorrect) reported to obtain the proportion of accurate responses.
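As a minimal sketch of the accuracy rate calculation described above, the proportion can be computed as follows. The function name `accuracy_rate` and the counts in the example are hypothetical and are not data from the experiments; returning `None` when no details were reported is likewise an assumption about how an empty response might be handled.

```python
def accuracy_rate(correct, incorrect):
    """Proportion of reported details that were correct.

    Returns None when no details were reported, because the proportion is
    undefined in that case (an assumption made for this sketch).
    """
    total = correct + incorrect
    if total == 0:
        return None
    return correct / total


# Hypothetical counts for one participant in one interview phase.
print(accuracy_rate(30, 10))  # 0.75
```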

TABLE 1 Protocol of follow-up open-ended questions to extend and clarify the initial account
1. Tell me more about (the part when/person/object/activity)…

Stimulus event
Participants witnessed a 1 min 20 s long film of a multi-perpetrator assault and robbery (see Hope, Mullis, and Gabbert [2013], and Kontogianni et al. [2018] for previous use of this stimulus). The film starts with three males talking next to a parked car. Two other males join them. A woman with a laptop bag is seen walking in their direction. As she tries to walk past, they surround her, and one male is seen threatening her with a crowbar. One male takes her bag, which is then passed between several perpetrators, while another male films the incident on his phone. The perpetrators run away with the bag.

| Procedure
Participants were asked to take part in a study investigating factors that affect people's reports for witnessed events. Participants witnessed the stimulus event on a computer screen while wearing headphones. Although there was no audible dialogue, headphones were used to ensure that participants were not distracted by incidental background noises. Participants were instructed to pay attention because they would later be asked about the event. After watching the event, participants completed a filler task for 10 min.
In another room, the researcher then presented participants with either a physical timeline format or a free recall format to provide their account. In both conditions, participants were asked to report all the details they remembered about the event and the people involved in order to provide a complete and accurate account. All participants were instructed to not make guesses about things they did not remember. Participants in the timeline condition were instructed to use the person description cards to provide descriptive details about the people involved in the event, and the action cards to report any actions and sequence information and to show "who did what and when." After reporting their initial account, participants were asked follow-up open-ended questions based on that account, such as "Describe X part in more detail." For example, "You mentioned there was a man in a red jumper. Tell me more about this man in the red jumper" or "Explain in more detail what you mean about this part where they threatened her". This procedure allowed the interviewer to maintain the same phrasing of questions but avoid using a scripted list of cued-recall questions that did not relate to the initial account. Although not explicitly stated, participants were not forced to respond; if they answered by saying "I don't know" or "I don't remember", the interviewer asked the next question. Similarly, if participants repeated the information they had already reported, and/or responded by saying that they had nothing else to report, the interviewer asked the next question. As a final question, all participants were asked, "Is there anything else you would like to report?" During the questioning phase in both conditions, the participant's written account remained on the table and the interviewer pointed to the specific part to which the prompt referred when asking each question. The follow-up questioning phase was audio- and video-recorded, with the camera focusing on the written account placed in front of the participant.
For a visual description of the interview stages, see Figure 1 in Data S1.

An independent t test analysis showed that participants who used …

A paired samples t test showed that the accuracy rate of the reported information in the follow-up questioning phase was significantly lower than in the initial reporting phase both in the Timeline condition, t(24) = 7.34, p < .001, d = 1.89, and in the Free recall condition, t(24) = 5.98, p < .001, d = 1.64 (see Table 2).

| Discussion
The results of Experiment 1 show that a sizeable amount of additional information was elicited through follow-up questions, representing 18 and 22% of the total information reported in the timeline and free recall conditions, respectively. It is likely that the use of follow-up questions, used here as open prompts rather than directive cued-recall questions, led to further retrieval attempts focusing on different components of the witnessed event. Therefore, in line with the spreading activation theory of memory (Anderson, 1983), the use of open-ended prompts further cued participants' memory for the event.
Despite the opportunity to provide more information in response to follow-up prompts, participants in the free recall condition still reported fewer correct details overall compared to those in the timeline condition … Koriat & Goldsmith, 1996). Although participants in the current study were not required to answer all the questions, the use of follow-up prompts in the context of an interview may have implicitly suggested an increased expectation to be informative (Grice, 1975). As a result, participants may have adopted a more liberal criterion for accuracy to still provide informative answers (Goldsmith et al., 2002). Therefore, the finding that the information provided in response to follow-up questions was not as accurate as the initially reported information may have been due to an accuracy-informativeness trade-off, in that participants were able to report new information, but were not as confident in its accuracy relative to their initial account. In order to satisfy an informativeness criterion, the interviewees likely volunteered more details while risking accuracy.
The current experiment served as a first step to examine the efficacy of follow-up open-ended questions based on a free narrative.
Given that the new information was not as accurate as the spontaneously reported information and considering the potential implications for applied contexts, a second experiment was conducted to examine whether the follow-up questioning phase could be refined using pre-questioning instructions designed to encourage accuracy by emphasising the use of metacognitive processes in reporting.

TABLE 2 Experiment 1: Means and SDs of correct and incorrect details (and accuracy rates) provided in the initial reporting phase and in response to follow-up questions

| EXPERIMENT 2
Research on decision-making mechanisms that are involved when reporting information from memory suggests that rememberers try to achieve a balance between informativeness and accuracy (Goldsmith et al., 2002;Koriat & Goldsmith, 1996). To this end, rememberers control how much information they report based on how confident they are about the accuracy of their recollection (Ackerman & Goldsmith, 2008;Koriat & Goldsmith, 1996).
Rememberers also regulate their answers by adjusting the precision of the reported information (control over grain size; Ackerman & Goldsmith, 2008;Goldsmith et al., 2002). For instance, if asked to provide quantitative information, they may offer a coarse-grain answer (i.e. broad), instead of a fine-grain answer (i.e. specific), such as reporting that an event occurred "between 17.00 to 18.00" instead of "at 17.15". According to the satisficing model of the minimum-confidence criterion (Goldsmith et al., 2002), rememberers start by retrieving a fine-grain answer, which they will volunteer if it is likely to be correct, otherwise they will provide a coarse-grain answer to preserve accuracy. Further to the satisficing model (Goldsmith et al., 2002), the dual-criterion model suggests that informativeness also mediates reporting, in that even if coarse-grain responses are more likely to be correct, they may be withheld if assessed as not sufficiently informative (Ackerman & Goldsmith, 2008;Yaniv & Foster, 1995). In an investigative context, an interviewee could report coarse details to maximise accuracy. However, if the reported details are thought of as too broad to progress the investigation, the interviewee might choose to offer more specific information, thus using both a confidence and an informativeness criterion to regulate reporting. Given the pattern of findings observed in Experiment 1, it may be that interviewees initially reported information that they assessed as probably correct but in response to follow-up questions they were more willing to risk accuracy to satisfy a demand for informativeness.
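The decision rules described above can be summarised in a simplified sketch. It is an illustration only and not an implementation of the published models: the `Candidate` class, the function `choose_response`, the 0-1 informativeness rating, and both threshold values are hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    answer: str
    confidence: float       # assessed probability that the answer is correct (0-1)
    informativeness: float  # hypothetical 0-1 rating of how useful the answer is


def choose_response(fine, coarse, min_confidence=0.8, min_informativeness=0.5):
    """Pick a response using a confidence and an informativeness criterion.

    Simplifying assumptions: both criteria are fixed thresholds, and the
    rememberer is free to withhold an answer.
    """
    # Satisficing step: volunteer the fine-grain answer if it is likely correct.
    if fine.confidence >= min_confidence:
        return fine.answer
    # Otherwise fall back to the coarse-grain answer, but only if it is both
    # likely correct and informative enough (dual-criterion step).
    if (coarse.confidence >= min_confidence
            and coarse.informativeness >= min_informativeness):
        return coarse.answer
    # Neither candidate satisfies the criteria: withhold the answer.
    return "I don't know"


# The time-of-event example from the text, with hypothetical ratings.
fine = Candidate("at 17.15", confidence=0.55, informativeness=0.9)
coarse = Candidate("between 17.00 and 18.00", confidence=0.9, informativeness=0.6)
print(choose_response(fine, coarse))  # between 17.00 and 18.00
```

Lowering `min_confidence` in this sketch makes the fine-grain answer more likely to be volunteered, which mirrors the idea that a demand for informativeness can lead rememberers to accept a greater risk of error.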
Experiment 2 examined whether instructions designed to promote the exercise of metacognitive monitoring would improve accuracy rates for reporting in response to prompting. Half of the participants were instructed that they could withhold from providing an answer (i.e. say "I don't know") and that they could regulate the precision of their answers by providing coarse-grain (e.g. he wore dark clothes) or fine-grain information (e.g. he wore a grey jumper and black jeans). Previous research applying the metacognitive monitoring framework to a forensic context has shown that by using conservative criteria, mock-witnesses can successfully maintain the accuracy of their reporting after a delay (Goldsmith et al., 2005), and after being exposed to misinformation by a co-witness (Wright, Gabbert, Memon, & London, 2008), and that they can balance informativeness and accuracy when answering cued-recall questions (Weber & Brewer, 2008). Other research examining how interviewees regulate the output and precision of their reporting in various contexts (e.g. reporting in private vs. with an audience) suggests that interviewees would often rather provide informative (i.e. fine) details. However, this tendency is reduced in the presence of an evaluative audience or when they receive penalties for inaccurate responses, in which case they report more coarse details, which are more likely to be accurate (McCallum, Brewer, & Weber, 2016). More recently, Brewer et al. (2018) showed that interviewees can use coarse-grain responses to report on a wide range of topics, from a person's appearance (e.g. hair length and hair colour) to the description of objects and locations, and that they can be provided in response to cued-recall questions even if they were not initially volunteered in a free narrative. 
Therefore, based on previous research, interviewees should be able to maintain accuracy in reporting by following the instructions that promote monitoring of their memory output and of the type of details they report.
Participants were also reminded that they should not guess. The use of warnings to interviewees to not guess and to reply "I don't know" or "I don't remember" throughout the interview is recommended in the use of the CI to avoid erroneous reporting (Memon et al., 2010). Similar warnings to avoid guessing are also included in other interviewing tools, to encourage interviewees to only volunteer information they are certain about (e.g. Self-Administered Interview; Gabbert, Hope, & Fisher, 2009).
There is also evidence that warnings can contribute to the interviewees controlling their reporting more carefully over time (Gawrylowicz, Memon, & Scoboria, 2014). Research by Koriat and Goldsmith (1996) also suggests that participants are more likely to maintain accurate reporting when instructed to not guess if they are uncertain about any details. Related research on metacognitive monitoring indicates that allowing "I don't know" responses and not forcing interviewees to respond to prompts reduces guessing and increases accuracy when both answerable and unanswerable questions are asked (Scoboria & Fisico, 2013; Scoboria, Mazzoni, & Kirsch, 2008). Therefore, there is evidence that the use of warnings and instructions to control monitoring of memory output can lead to increased accuracy.
To determine whether the results regarding the accuracy of the information reported in the follow-up questioning phase of the first experiment would replicate, the procedure largely remained the same.
A different stimulus was used to increase the generalizability and the relevance of our findings for different interviewing contexts. In Experiment 2, participants witnessed a stimulus event that initially depicted a meeting of a terrorist group who then progressed to placing explosives in a target location. Given the promising results on using the self-generated cues in conjunction with the timeline technique in previous research (Kontogianni et al., 2018), a modified version of the timeline was used here to include use of the mnemonic. Self-generated cues are salient details of the witnessed event that are produced by the interviewees themselves and facilitate recall compared to interviewer-generated cues and no cues (Kontogianni et al., 2018;Wheeler & Gabbert, 2017). In keeping with the procedure of the previous experiment, the same follow-up open-ended questions were used, with the addition of specific pre-questioning instructions to encourage accurate reporting.
Confidence plays a key role in monitoring and controlling reporting (Koriat & Goldsmith, 1996) as well as in the regulation of precision in reporting (Goldsmith et al., 2002). For instance, mock-witnesses are more confident about accurate than inaccurate reported details (Fisher, 1995; Roberts & Higham, 2002), and are more likely to volunteer responses in which they are highly confident (Weber & Brewer, 2008), and withhold responses when they are not confident (Evans & Fisher, 2011). To explore whether retrospective confidence judgments correspond to the pattern of the accuracy rates for the reported information, at the end of the session, all participants were asked to rate how confident they felt about their written and spoken accounts. Unlike related research on the confidence-accuracy relationship, we only used two measures regarding the total output for each reporting phase. This is because we were interested in the trajectory of interviewees' confidence ratings relative to that of the accuracy rates for the reported information. For instance, if accuracy for the information provided in response to follow-up questions was lower than the accuracy of the initial account, we were interested to explore if confidence was also lower in the follow-up questioning phase relative to the initial reporting phase.
We predicted that, when interviewees received instructions to monitor the accuracy of their responses to follow-up questions, the accuracy rate of their responses would be higher than when interviewees received no additional instructions. As the current experiment focused on the efficacy of the instructions to support accurate reporting in the follow-up questioning phase, all participants used the timeline technique to provide their initial account.

| Participants and design
An a priori G*Power statistical analysis (Faul, Erdfelder, Lang, & Buchner, 2007) showed that a sample of 60 participants was required for an 80% chance of detecting a large effect size (Cohen, 1992) for the finding of improved accuracy after receiving instructions to monitor reporting based on previous related findings (e.g. Goldsmith et al., 2002;Koriat & Goldsmith, 1996;Scoboria & Fisico, 2013;Weber & Brewer, 2008). The dependent variables were the number of correct and incorrect details, and accuracy rates for both reporting phases, as well as the confidence ratings pro-

Stimulus event
Participants witnessed a 4.28 min long scripted film that depicted a meeting between four perpetrators (three males, one female) who plot a terrorist attack and then carry out the plan. At the outset, three of the perpetrators are seen waiting in a room. The film is shot from a first-person perspective to give the impression of the viewer being in the room.
Another individual, acting as the group leader, enters and delivers information about the target of the attack. The leader assigns roles to each member: overseeing the operation, placing the explosives, acting as a look out, and being the getaway driver. The perpetrators discuss the explosives to be used and how they are to be detonated and when. Next, three of the perpetrators visit the selected target, a park, and are walking down a pathway. One of the males walks around a café with a briefcase which allegedly contains the explosives. The other male takes photos of the park while the female looks at a map. After the first male returns without the briefcase, the female hands him a mobile phone in a covert interaction. All three are seen exiting the park. There is a brief dialogue from inside the car confirming that the explosives have been placed.

Accuracy monitoring instructions
Based on previous research, the instructions reminded participants to refrain from guessing (Gabbert et al., 2009;Gawrylowicz et al., 2014;Memon et al., 2010), to feel free to withhold an answer (Scoboria et al., 2008;Scoboria & Fisico, 2013), and to consider the level of detail they felt they could accurately report (Goldsmith et al., 2002;Koriat & Goldsmith, 1996;Weber & Brewer, 2008; see Data S1 for verbatim instructions). With respect to the level of detail in reporting, participants were asked to provide all the information they believed to be accurate from the event, regardless of whether it was fine or coarse in nature. Participants were provided with examples of fine-grain and coarse-grain details, such as describing a car as "small and dark coloured" (coarse), or as "a Volkswagen Golf, British Racing Green, 5-door hatchback, with tinted windows, and a registration number" (fine). To make sure that the instructions were clear, participants were asked to answer the practice question "what can you remember about what footwear the researcher in the room with you is wearing?", by reporting coarse and/or fine details about what they remembered.

Procedure
Participants were invited to take part in research investigating factors that affect people's memory reports for witnessed events. Participants viewed the stimulus event on a computer screen using headphones.
Participants were instructed to imagine that they were an undercover agent who had infiltrated a terrorist group and to pay attention because they would later have to provide a report on the group's activities that would be passed on to intelligence analysts. After watching the event, participants completed a filler task for 10 minutes. In another room, the researcher then presented the participants with a physical timeline reporting format on which to provide their account. Following Kontogianni et al. (2018), participants were given a self-generated cues instruction: to write down the first six things that they remembered from the event without thinking too hard, and then to consider each item they had listed and whether that memory helped them remember other things about the event. All participants received the same timeline instructions as in the first experiment. After completing their account, half of the participants were provided with the Accuracy Monitoring Instructions. These instructions were presented in written format after participants provided their initial account and prior to being asked any follow-up questions. After participants had the chance to ask any questions about the instructions, the instructions were removed and the follow-up questioning phase began.

Confidence ratings
An independent t test analysis showed that there was no significant difference between conditions with respect to confidence ratings for the information provided in the initial account, t (57)  confidence ratings for their initial account and for their responses to follow-up questions across conditions, t(58) = 0.14, p = .888, d = 0.02, 95% CI [−0.24, 0.27]. Table 4 shows the mean confidence ratings with standard deviations across conditions. A separate exploratory analysis of the confidence results was conducted to examine more closely how the mean accuracy rates across reporting phases were distributed at each level of confidence, as in Brewer et al. (2018). The means and SDs are shown in Table 5. The results show that most participants expressed between 60 and 80% confidence in the accuracy of their accounts, although some participants appear overconfident and others underconfident, given the actual accuracy rates reported.
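For readers who wish to reproduce this kind of between-condition comparison, the following is a minimal sketch of an independent-samples t statistic and Cohen's d. It assumes the pooled-variance (Student) form; the source does not state which variant or software was used. The function names and the confidence ratings are hypothetical and are not data from this study.

```python
from math import sqrt
from statistics import mean, variance


def pooled_sd(a, b):
    """Pooled standard deviation of two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return sqrt(pooled_var)


def independent_t(a, b):
    """Return (t, df) for a pooled-variance independent-samples t test."""
    standard_error = pooled_sd(a, b) * sqrt(1 / len(a) + 1 / len(b))
    t = (mean(a) - mean(b)) / standard_error
    df = len(a) + len(b) - 2
    return t, df


def cohens_d(a, b):
    """Standardised mean difference using the pooled standard deviation."""
    return (mean(a) - mean(b)) / pooled_sd(a, b)


# Hypothetical confidence ratings (0-100) for two conditions; illustration only.
instructions = [70, 80, 60, 75]
no_instructions = [65, 85, 70, 60]

t_value, df = independent_t(instructions, no_instructions)
d_value = cohens_d(instructions, no_instructions)
print(f"t({df}) = {t_value:.2f}, d = {d_value:.2f}")
```

A p value and a confidence interval for d would additionally require the t distribution, which is available in standard statistics packages and is omitted here.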

Discussion
Contrary to our hypothesis, providing participants with instructions designed to encourage accurate reporting did not significantly increase the accuracy of the information provided in response to follow-up questions, relative to participants who received no instructions. With respect to the efficacy of follow-up questioning, the current results follow the same pattern observed in Experiment 1.
Participants reported additional information in the follow-up questioning phase: specifically, 22% (accuracy monitoring instructions condition) and 21% (no instructions condition) of the total information reported was provided in response to open-ended questions. In terms of overall accuracy, accuracy rates for the initial account were high (87.5%) and consistent with previous research (e.g. Colomb & Ginet, 2012; Evans & Fisher, 2011; Gabbert et al., 2009) but the accuracy rate observed in the questioning phase was lower (75%). judgments might be indicative of their regulation in reporting. For instance, although the administration order of the confidence ratings was used to match the way that information was reported through the session and to indirectly encourage participants to compare their reports, it may have contributed to an anchoring effect, whereby confidence estimates for responses to follow-up questions were biased towards the initial report ratings (Tversky & Kahneman, 1974). However, it is also likely that participants appeared underconfident in the accuracy of their initial reports but overconfident in their responses to follow-up questioning due to accuracy rates declining from the initial report to the follow-up questioning phase while confidence remained stable. Furthermore, retrospective ratings may not be as useful in assessing accuracy for such elaborate free reports compared to cued-recall (e.g. Gwyer & Clifford, 1997; Ibabe & Sporer, 2004). Further research could assess confidence ratings for each response provided to an open prompt, to more closely examine how interviewees consider the accuracy of their reporting.
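The two quantities discussed here, the accuracy rate within a reporting phase and the share of total output contributed by that phase, can be illustrated with a short sketch. It assumes that accuracy rate is the number of correct details divided by all details (correct plus incorrect) reported in a phase, and that the share of output is a phase's details divided by the details reported across both phases. The class name, function name, and counts are hypothetical and are not the study's data.

```python
from dataclasses import dataclass


@dataclass
class PhaseCounts:
    """Coded detail counts for one reporting phase (hypothetical structure)."""
    correct: int
    incorrect: int

    @property
    def total(self) -> int:
        return self.correct + self.incorrect

    @property
    def accuracy_rate(self) -> float:
        # Proportion of reported details in this phase that were correct.
        return self.correct / self.total if self.total else 0.0


def share_of_total_output(phase: PhaseCounts, all_phases: list[PhaseCounts]) -> float:
    """Proportion of all reported details that came from the given phase."""
    grand_total = sum(p.total for p in all_phases)
    return phase.total / grand_total if grand_total else 0.0


# Made-up counts for one participant; illustration only.
initial = PhaseCounts(correct=45, incorrect=5)
follow_up = PhaseCounts(correct=7, incorrect=3)

follow_up_share = share_of_total_output(follow_up, [initial, follow_up])
print(f"Initial accuracy: {initial.accuracy_rate:.1%}")
print(f"Follow-up accuracy: {follow_up.accuracy_rate:.1%}")
print(f"Follow-up share of total output: {follow_up_share:.1%}")
```

Keeping the two measures separate makes the pattern described above visible: a phase can add a meaningful share of new details while showing a lower accuracy rate than the initial account.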

GENERAL DISCUSSION
Across two experiments, the results showed that follow-up, open-ended questions are effective for eliciting new details after an initial free report. However, the accuracy rate for responses to follow-up questions was significantly lower than the accuracy rate for spontaneously reported information. This general pattern of results was replicated across two experiments, using different stimuli which depicted multi-perpetrator events.
The results of both experiments highlight the need to better understand how interviewees' reporting might differ when asked follow-up questions, compared to when they spontaneously report information. Previous research shows that the use of various retrieval attempts, such as techniques included in the CI, can produce increased reporting of correct details but can also result in a slight increase in the reporting of incorrect details (cf. standard interviews; Memon et al., 2010). specific "what?", "when?", "where?", "who?", "why?", and "how?" probes (Oxburgh et al., 2010), to closely examine how interviewees assess their responses and to what extent that is reflected in the actual reported accuracy.
We already know that asking multiple-choice or repeated questions will likely increase the amount of erroneous reporting (Fisher, 1995; Fisher & Geiselman, 1992), and that open-ended questions are preferable and more efficient (Fisher, Milne, & Bull, 2011; Oxburgh et al., 2010) as they allow interviewees to strategically monitor their reports (Evans & Fisher, 2011). The current findings confirm that follow-up open-ended questions are efficient in gaining new information. However, they also suggest that such information might not be as accurate as an initial spontaneous report. Thus, practitioners should be cautious about the reliability of new information provided in response to follow-up questions and seek further corroboration. It is crucial that future research extends our understanding of the limitations of memory reporting, as there is a limited pool of accurate details that interviewees can recall but an unlimited pool of inaccurate details to report.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.