HarmoniQ recently launched a free online perfect pitch test built on measurement principles from the most recent research on absolute pitch.
If you've ever taken an online perfect pitch test, you probably received a raw accuracy score, something like "73% correct." Sometimes the results are even as simple as a "yes" or "no." But is that enough? While framing absolute pitch as binary is very common, it is also woefully incomplete. You might wonder if there is really any meaningful difference between someone who hears D and thinks it's E♭ and someone else who hears the same D but thinks it's F♯. One is a semitone away and the other is a major third, but they're both just "wrong," aren't they? This line of reasoning dismisses mistakes as random, effectively burying any pattern in the incorrect responses.
But absolute pitch training studies don't use raw accuracy as a primary measure. In Van Hedger's landmark 2019 study, researchers tracked participant progress using mean absolute deviation in semitones alongside response time. The 2019 study by Wong et al. was equally explicit:
"We used the average error instead of the general naming accuracy because measuring the size of judgment errors additionally informs the precision of pitch naming performance of the individuals, which is more informative than the binary correctness of the responses as measured by general naming accuracy."
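Research studies track error distance and response time because these measures reveal progress that raw accuracy can hide. A subject whose typical error shrinks from a major third to a semitone has clearly improved, even if the percentage of correct answers has not changed between evaluations.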
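As a rough illustration of the error-distance idea, the Python sketch below scores each response by its semitone distance from the target rather than as right or wrong. The sharps-only note spelling (E♭ is written D#), the function names, and the choice to wrap distances around the octave are assumptions of this sketch, not details taken from the cited studies or from HarmoniQ.

```python
# Pitch classes spelled with sharps only (an assumption of this sketch).
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def error_distance(target: str, response: str) -> int:
    """Shortest distance in semitones between two pitch classes (0 to 6)."""
    diff = abs(NOTE_NAMES.index(target) - NOTE_NAMES.index(response))
    # Wrap around the octave so that C vs. B counts as 1 semitone, not 11.
    return min(diff, 12 - diff)


def mean_error(trials) -> float:
    """Average error distance over (target, response) pairs."""
    return sum(error_distance(t, r) for t, r in trials) / len(trials)


# Two hypothetical listeners, both with 50% raw accuracy.
near_misses = [("D", "D#"), ("G", "G"), ("A", "A#"), ("C", "C")]
far_misses = [("D", "F#"), ("G", "G"), ("A", "C#"), ("C", "C")]

print(error_distance("D", "D#"))  # 1 (a semitone)
print(error_distance("D", "F#"))  # 4 (a major third)
print(mean_error(near_misses))    # 0.5
print(mean_error(far_misses))     # 2.0
```

Both listeners would receive the same "50% correct" score, yet their mean error differs by a factor of four.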
Building for Generalization
HarmoniQ's perfect pitch test uses full-quality audio samples across six timbres spanning four octaves, specifically designed to measure generalization. Someone who has developed timbre-specific absolute pitch or memorized specific low-quality recordings will have trouble when a test is optimized for generalization. Not only does a single-instrument test fail to surface that distinction, but relative pitch is also meaningfully easier to use when consecutive notes share a timbre.
To limit the use of relative pitch, every consecutive trial on HarmoniQ's test uses a different timbre and ensures at least 13 semitones between consecutive notes in actual pitch. Using relative pitch to reason toward the correct pitch is one of the most common ways perfect pitch test scores get inflated, and rendering relative pitch strategies less viable leads to a more accurate assessment of pitch categories. Longer tests also statistically improve assessment quality, and many online tests, including HarmoniQ's, offer longer and even infinite test lengths for exactly that reason. But the value of additional trials becomes much more apparent when the results go deeper than raw accuracy.
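The two constraints described above (a new timbre on every trial and at least 13 semitones between consecutive notes) can be sketched in Python as follows. This is a minimal illustration, not HarmoniQ's actual trial generator: the timbre names, the 0 to 47 note indexing for a four-octave range, and the uniform random sampling are all assumptions.

```python
import random

# Hypothetical timbre names; the article only states that there are six.
TIMBRES = ["piano", "guitar", "violin", "flute", "trumpet", "marimba"]
NUM_NOTES = 48  # four octaves of semitones, indexed 0..47 (assumed range)
MIN_GAP = 13    # minimum semitone distance between consecutive notes


def generate_trials(count, seed=None):
    """Return a list of (note_index, timbre) pairs obeying both constraints."""
    rng = random.Random(seed)
    trials = []
    prev_note, prev_timbre = None, None
    for _ in range(count):
        # Keep only notes far enough from the previous one.
        notes = [
            n for n in range(NUM_NOTES)
            if prev_note is None or abs(n - prev_note) >= MIN_GAP
        ]
        # Never repeat the previous trial's timbre.
        timbres = [t for t in TIMBRES if t != prev_timbre]
        prev_note, prev_timbre = rng.choice(notes), rng.choice(timbres)
        trials.append((prev_note, prev_timbre))
    return trials


for note, timbre in generate_trials(5, seed=1):
    print(note % 12, note // 12, timbre)  # pitch class, octave index, timbre
```

With a 48-note range, every note has at least one valid successor, so the candidate list is never empty. A production test would likely also balance how often each of the twelve pitch classes appears, which this sketch does not attempt.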
Results!
After completing a test, you receive an in-depth multidimensional breakdown of your performance. Raw accuracy and average response time are displayed first as a summary, and both are then broken down by timbre to show how your recognition generalizes across instruments. The next section shows individual note accuracy, color-coded by performance, so you can easily see the strength of your pitch categories relative to one another.
The chart at the bottom of your results is an error distance chart, which can easily be one of the most informative visuals of your pitch categories. For each of the twelve notes it shows your average and maximum semitone distance from the correct note. Particularly if you are actively learning absolute pitch, watching your average error distance decrease over successive tests is one of the clearest signals that your pitch categories are developing, independent of raw accuracy. You can also download your test results as a PDF report.
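The numbers behind a chart like this are straightforward to compute. The sketch below assumes each trial is a (target, response) pair of pitch classes numbered 0 to 11 (C = 0) and that distances wrap around the octave. Both are assumptions made for illustration, not a description of HarmoniQ's internal data format.

```python
from collections import defaultdict


def per_note_error(trials):
    """Map each target pitch class to (average, maximum) error in semitones."""
    by_note = defaultdict(list)
    for target, response in trials:
        diff = abs(target - response) % 12
        # Shortest distance around the octave, between 0 and 6 semitones.
        by_note[target].append(min(diff, 12 - diff))
    return {
        note: (sum(dists) / len(dists), max(dists))
        for note, dists in sorted(by_note.items())
    }


# Hypothetical results: D (2) answered as D# and F#, C (0) answered correctly.
print(per_note_error([(2, 3), (2, 6), (0, 0)]))
# {0: (0.0, 0), 2: (2.5, 4)}
```

Running the same function on successive tests and comparing the averages note by note gives the trend described above.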
The downloadable PDF contains everything you already see after each test, but goes a step further. It includes a confusion matrix, which plots every note you heard against every note you answered across all trials. In this view you can see whether you were consistently sharp or flat, whether you regularly confuse specific notes with their chromatic or harmonic neighbors, and whether any other systematic errors reveal something structural about how your pitch categories are organized. The PDF also contains the complete data for every individual trial, including each note, timbre, octave, response, response time, and error distance. That granularity is useful whether you are tracking your own progress or evaluating perfect pitch as part of a research study.
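For readers who want to reproduce this kind of analysis from trial-level data, here is a minimal Python sketch of a 12 by 12 confusion matrix and a signed error that separates sharp from flat answers. The pitch-class numbering (C = 0 through B = 11), the function names, and the convention that a tritone counts as +6 are assumptions for illustration only.

```python
def confusion_matrix(trials):
    """Count responses: rows are the notes heard, columns the notes answered."""
    matrix = [[0] * 12 for _ in range(12)]
    for target, response in trials:
        matrix[target][response] += 1
    return matrix


def signed_error(target, response):
    """Signed offset in semitones (-5..6); positive means the answer was sharp."""
    return (response - target + 5) % 12 - 5


# Hypothetical listener who tends to answer one semitone sharp.
trials = [(2, 3), (2, 3), (2, 2), (9, 10), (0, 11)]

matrix = confusion_matrix(trials)
print(matrix[2][3])  # 2: D was heard and D# answered twice

bias = sum(signed_error(t, r) for t, r in trials) / len(trials)
print(bias)  # 0.4: on average, answers lean sharp
```

A mean signed error near zero with a large unsigned error suggests scattered mistakes, while a clearly positive or negative mean points to a consistent sharp or flat bias.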
Measuring these dimensions of absolute pitch recognition is neither new nor unique, yet public tests almost always overlook them. Response time, error distance, and per-note accuracy are consistently used by leading absolute pitch researchers to distinguish between levels of performance and to track progress during training, and the confusion matrix is standard in any serious analysis of pitch categorization data. If you're curious where your pitch recognition stands, you can now get your own detailed analysis with the HarmoniQ perfect pitch test in as little as 5 minutes.