Neuroscience has upended the old myth that perfect pitch is a "you've either got it or you don't" gift. Modern training studies consistently show that adults can learn to name notes without a reference tone. Yet whenever these studies appear, skeptics are quick to note that each experiment trains only a handful of volunteers. Why are the sample sizes so tiny? And what happens when you scale that training to thousands of people?

Small Studies, Big Questions
To understand the criticism, it helps to look at the numbers. A 2019 case study from the University of Chicago trained just six musicians in an eight‑week program. The protocol required participants to complete three different training exercises four times a week, 32 sessions in all, roughly 32 hours of work. Despite the small cohort, the results were striking: two participants (identified as S2 and S5) finished the program with near‑perfect accuracy across all timbres and octaves, while several others showed above‑chance improvements.
More recently, a team from the University of Surrey trained twelve adult musicians online for eight weeks. The researchers designed the program to teach pitch classes (note names) rather than specific octave heights and required learners to master each level multiple times. On average, participants learned to identify just over seven pitches at 90% accuracy, and two participants reached genuinely perfect‑pitch performance (all twelve notes, accurately and quickly). Commentary on the study notes that it involved at least 25 hours of training and began with a single pitch across multiple octaves before gradually adding more notes.
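To make that protocol concrete, here is a minimal sketch of a level‑progression trainer of that kind in Python. The overall shape follows the description above (start with one pitch, require repeated passes, unlock notes gradually), but the note ordering, accuracy threshold, and pass counts are my own illustrative assumptions, not the Surrey study's published parameters.

```python
import random

# Illustrative level-progression trainer in the spirit of the Surrey
# protocol: level 1 drills a single pitch class across several octaves,
# and each mastered level unlocks one more note. The note ordering,
# thresholds, and pass counts below are assumptions, not the study's.
PITCH_CLASSES = ["C", "G", "D", "A", "E", "B",
                 "F#", "C#", "G#", "D#", "A#", "F"]
ACCURACY_TO_PASS = 0.90   # block accuracy required to pass a level
PASSES_TO_ADVANCE = 3     # "master each level multiple times"
TRIALS_PER_BLOCK = 40
OCTAVES = [2, 3, 4, 5, 6]

def run_block(active_notes, play_and_ask):
    """One block of trials over the currently unlocked pitch classes."""
    correct = 0
    for _ in range(TRIALS_PER_BLOCK):
        note = random.choice(active_notes)
        octave = random.choice(OCTAVES)
        answer = play_and_ask(note, octave)  # play tone, collect a name
        correct += (answer == note)          # score pitch class only
    return correct / TRIALS_PER_BLOCK

def train(play_and_ask):
    level, passes = 1, 0  # level 1 = one pitch class
    while level <= len(PITCH_CLASSES):
        accuracy = run_block(PITCH_CLASSES[:level], play_and_ask)
        passes = passes + 1 if accuracy >= ACCURACY_TO_PASS else 0
        if passes >= PASSES_TO_ADVANCE:   # level mastered
            level, passes = level + 1, 0  # unlock the next note
```

Scoring on pitch class rather than octave mirrors the study's design choice: the skill being trained is note naming, not absolute frequency height.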
Even the "large" investigations aren't that large. A 2020 follow‑up involving adults who speak non‑tonal languages enrolled 43 participants; six learned to name all twelve pitches at ≥90% accuracy. Combining these three studies (6, 12, and 43 participants) yields an average sample size of about twenty. That isn't unusual in cognitive neuroscience: for many fMRI and behavioral experiments the median sample size is around 28. Researchers often take a "small‑N" approach that focuses on detailed measurements from a few individuals rather than broad surveys.
Why So Few Participants?
There are practical reasons behind these tiny cohorts. First, training programs are demanding: participants in the Chicago and Surrey studies committed to multiple hours of practice every week, which makes recruitment difficult. Second, researchers try to control as many variables as possible (instrument timbre, octave range, response time limits), which requires close monitoring. That level of control becomes prohibitively labor‑intensive as sample sizes grow.
Even outside of perfect‑pitch training, neuroscience has been criticized for the low statistical power that comes with small samples. An editorial in The Journal of Neuroscience notes that reproducibility suffers when experiments use small cohorts and stresses that statistical power increases with larger samples. A separate analysis of fMRI research points out that collecting data from large groups is expensive: even modest studies can cost tens of thousands of dollars. Because of these financial and logistical constraints, most cognitive studies settle for samples of 20–30 people and treat individuals as their own case studies.
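To see why power matters, here's a quick back‑of‑the‑envelope calculation with statsmodels. The effect sizes are Cohen's standard benchmarks, used only for illustration, not values from any of the studies above:

```python
# How many participants per group does a two-sample t-test need to
# reach 80% power at alpha = 0.05? Effect sizes are Cohen's standard
# benchmarks (d), shown here only for illustration.
from statsmodels.stats.power import TTestIndPower

power = TTestIndPower()
for d in (0.8, 0.5, 0.2):  # large, medium, small effect
    n = power.solve_power(effect_size=d, alpha=0.05, power=0.80)
    print(f"d = {d}: ~{n:.0f} participants per group")

# d = 0.8: ~26 participants per group
# d = 0.5: ~64 participants per group
# d = 0.2: ~394 participants per group
```

A cohort of six or twelve can reliably detect only very large effects, which is why these experiments are best read as existence proofs rather than population estimates.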
Small sample sizes aren't the only limitation. Most training experiments end once the eight‑week program is complete; few follow participants long‑term. Consequently, we know very little about whether participants keep improving after the program ends or whether their skills fade. Tracking dozens of individuals over years would require far more resources than most labs can spare.
Because these studies involve so few people, critics argue that the successes may be flukes or artifacts, even though skeptical papers acknowledge that meaningful pitch‑naming improvements occur. Rather than dismissing the findings over sample size, one might focus on what they demonstrate: some adults do achieve perfect‑pitch‑level performance within weeks to months of focused training. The fact that any adult can do it contradicts the long‑held belief that the skill must be acquired in childhood.
Scaling Up: What Thousands of Learners Tell Us
When I built HarmoniQ, I wasn't trying to run a study; I was a software engineer and lifelong musician who wanted to teach my kids the skill I'd learned myself. The app draws on the principles used in these academic programs but delivers them in a game‑like format. For nearly two years I've iterated based on user feedback; however, I only started collecting detailed usage data about 100 days ago, when I integrated Google Analytics. Those metrics have opened a window into how a much larger community learns.
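The telemetry itself is unremarkable: each completed trial becomes an analytics event. Here's a simplified sketch of what that looks like using Google Analytics 4's Measurement Protocol; the event name and parameters are an illustrative schema, not HarmoniQ's exact production code.

```python
# Simplified sketch: logging one training trial as a GA4 event via the
# Measurement Protocol. The event name and params are illustrative,
# not HarmoniQ's exact schema; the credentials are placeholders.
import requests

GA_ENDPOINT = "https://www.google-analytics.com/mp/collect"
MEASUREMENT_ID = "G-XXXXXXXXXX"  # placeholder
API_SECRET = "YOUR_API_SECRET"   # placeholder

def log_trial(client_id, note_played, note_answered, response_ms):
    payload = {
        "client_id": client_id,
        "events": [{
            "name": "pitch_trial",
            "params": {
                "note_played": note_played,
                "note_answered": note_answered,
                "correct": int(note_played == note_answered),
                "response_ms": response_ms,
            },
        }],
    }
    requests.post(
        GA_ENDPOINT,
        params={"measurement_id": MEASUREMENT_ID, "api_secret": API_SECRET},
        json=payload,
        timeout=5,
    )
```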
HarmoniQ now has thousands of active users. Hundreds have already met what most people would consider "full" perfect pitch: they can correctly and quickly label single pitches from all twelve note classes across at least seven octaves with over 95% accuracy. I've watched a few users go from zero to full perfect pitch in those three months. I don't yet have enough data to estimate an average time to mastery, but I can say this: virtually everyone who uses the app regularly for at least a week improves their pitch‑naming ability. The handful of users who don't improve is vanishingly small next to the thousands who do.
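That "full perfect pitch" bar is easy to state precisely. Here's a simplified version of the check; the recent‑trial window and the speed cutoff are illustrative values, not the app's exact production thresholds.

```python
from collections import defaultdict

# Simplified "full perfect pitch" check: all 12 pitch classes, at least
# 7 octaves seen, and at least 95% of recent responses both correct and
# fast. Window size and speed cutoff are illustrative, not production.
ACCURACY_BAR = 0.95
MIN_OCTAVES = 7
MAX_RESPONSE_MS = 2000  # "quickly": illustrative cutoff
MIN_RECENT_TRIALS = 20  # enough recent trials per note to judge

def has_full_perfect_pitch(trials):
    """trials: dicts with 'pitch_class', 'octave', 'correct', 'response_ms'."""
    by_class = defaultdict(list)
    octaves_seen = set()
    for t in trials:
        by_class[t["pitch_class"]].append(t)
        octaves_seen.add(t["octave"])
    if len(by_class) < 12 or len(octaves_seen) < MIN_OCTAVES:
        return False
    for history in by_class.values():
        recent = history[-MIN_RECENT_TRIALS:]
        if len(recent) < MIN_RECENT_TRIALS:
            return False
        good = sum(t["correct"] and t["response_ms"] <= MAX_RESPONSE_MS
                   for t in recent)
        if good / len(recent) < ACCURACY_BAR:
            return False
    return True
```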
There are important caveats. Unlike controlled experiments, HarmoniQ's "participants" are self‑selected. Some download the app out of curiosity and never return; others commit for months. I have little demographic data, so I can't tell you how many users are musicians or how old they are. Without a control group, I also can't rule out every alternative explanation. But the sheer volume of data (millions of trials) paints a consistent picture: deliberate training improves pitch labeling for the vast majority of learners, and acquiring perfect pitch as an adult is clearly possible.

Small‑scale studies are indispensable for demonstrating what is possible and for refining training techniques. They require tight control over variables and close interaction with participants, which naturally keeps sample sizes low. But they leave open questions about generalizability and long‑term outcomes.
Real‑world data from tools like HarmoniQ can fill in those gaps. While an app can't replace carefully controlled experiments, it can complement them by showing how methods scale and by revealing patterns that only emerge when you watch thousands of people learn. The evidence is cracking the perfect‑pitch "myth": when training is accessible and engaging, adults are developing absolute pitch.
If you've ever wanted perfect pitch, or you're just curious, come join the growing community of learners showing how attainable it really is.