The perceptual evaluation of spatial and immersive audio systems commonly relies on listening tests, in which listener-related factors are often treated as secondary. While previous studies have shown that listener expertise can influence performance in virtual audio tasks, this influence has not been systematically investigated in more complex, dynamic scenarios that mix real and virtual sources. This study examines the role of listener background in a six-degrees-of-freedom (6DoF) spatial detection task involving virtual and real sound sources. Eighteen participants identified the presence of a virtual speech source among concurrent targets and distractors while freely navigating a loudspeaker-based scene. Listener background was characterised by years of musical training and self-reported experience with spatial audio technologies, which were used to categorise participants as expert or naïve. Results show above-chance performance, with reduced accuracy in spatially adjacent conditions. Listeners with greater musical training and spatial audio experience achieved higher percent-correct scores. These findings are consistent with prior work on listener-dependent localisation performance and extend it to a 6DoF mixed real–virtual context. The results highlight the importance of explicitly considering and reporting participant expertise in the design, analysis, and interpretation of spatial audio perception studies.