Spatial audio in extended reality (XR) has traditionally been framed as a localization tool that guides users toward discrete virtual objects or events. This paper reframes that object-centered paradigm by presenting audio formgiving, an approach in which sound defines continuous zones demarcated by boundaries that users encounter through embodied movement. We present a mixed-reality study that investigates how participants perceive, reconstruct, and navigate such sound zones. We report findings on reconstruction accuracy and boundary ambiguities across sound zone shapes and sizes, on how movement trajectories relate to zone recognition, and on participants’ strategies for navigating and identifying different types of sound zones.