Machine learning is not inspiration

Jan 12, 2024

My biggest gripe with ‘fair use’ for creative content AI training is the idea that training a model is just the machine equivalent of a human being inspired.

It’s like a musician hears a piece of music, thinks “that sound was dope, let me do something similar”. If the musician just copies that sound, we call that stealing. But when the musician reinterprets it and writes something completely new, but influenced by the sound they heard, we call that inspiration. 

Large AI models listen to hundreds of thousands of music tracks and create a statistical model that morphs new sound into something like the original tracks. That’s an oversimplification, but for our purposes, that’s accurate enough. 

But is this process analogous to inspiration?

A recent paper from the University of Oxford suggests that the way that a neural network learns is different from how our brains learn. If we look at learning through the prism of ‘error busting’, we have analogues between machine and human learning. We try something, we see it doesn’t quite work and we readjust and try to reduce the error. But the way the brain optimises and adjusts is far ahead of even the best ML algorithms.

When we learn, we need very few examples of something to understand the context and to be able to use the learning in an abstract way. So I would argue that the essential ingredient for inspiration, rather than copying, is already pre-installed in our brain. Learning a new concept unlocks our creativity and allows us to develop our musical/artistic/writing brain. 

But AI models don’t do this. There’s a model sure, but it typically can’t do anything without lots of data. Once trained, it’s a cat and mouse game of getting close to the training data without actually replicating it. And as the recent Midjourney update shows, that’s very difficult to do. As large music models get better, I think we’ll see even more egregious replication.

As a musician and generative music creator, I see no analogue between these two processes. One is driven by innate creativity, the other is designed to copy with some error. 

And this is why the data needs to be traceable. If any data is replicated, we need to know who it came from.

Because it belongs to them.

Otherwise it’s a free for all stealathon.