The multi-stimulus test with hidden reference and anchor (MUSHRA) is a prevalent method for subjective audio quality evaluation. Despite its popularity, the technique is not immune to biases. Empirical evidence indicates that labels (quality descriptors) distributed equidistantly along the rating scale may cause non-linear warping of that scale; however, other factors could evoke even stronger non-linear effects. This study investigates the hypothesis that stimulus spacing bias may induce greater non-linear warping of the quality scale than the presence of labels does. To this end, more than 120 naïve listeners participated in MUSHRA-compliant listening tests using labeled and unlabeled graphic scales. The audio excerpts, representing two highly skewed distributions of quality levels, were reproduced over headphones in an acoustically treated room. The findings confirm the hypothesis and shed new light on the mechanisms biasing the results of MUSHRA-conformant listening tests.