This study introduces needlets, a specific class of spherical wavelets, for spatial audio applications. Needlets are constructed in the spherical harmonic domain, are mathematically well defined, possess good localisation properties, and facilitate multiresolution analysis. However, because they form a tight frame rather than a basis, they are redundant and therefore require sparsification for practical applications. We propose a comprehensive spatial audio framework based on needlets, spanning encoding through to head-tracking-enabled binaural rendering. In this framework, a sound scene is encoded into a redundant needlet dictionary, which is subsequently sparsified using a novel algorithm. The resulting sparse representation is then decoded for headphone reproduction. Scene rotation is achieved by applying SO(3) rotation matrices to the sparse representation. The perceptual implications of the framework’s design parameters were evaluated using objective metrics and compared with those of Ambisonics. Initial results show that the proposed framework can achieve better tonal and spatial fidelity than third- and fourth-order Ambisonics Magnitude Least-Squares decoding while using a similar number of channels. The results also indicate that users can tune the reproduced sound scene while maintaining fidelity.