Another One?
So, I guess this music bit is becoming a thing.
One of the most empowering aspects Gen AI has brought to my life is the ability to act on creative endeavors that have been locked away in my mind - in some cases for years. Endeavors that were gated behind access to the right people, tools, equipment, and let’s be honest - talent.
The advent of Gen AI in the visual medium has certainly empowered many of my family and friends. My brother uses Midjourney to illustrate some of his more profound and interesting dreams. I have friends who use DALL-E/GPT for fleshing out design ideas (I’ve certainly done the same).
At my core, I’ve always been a writer. I still have journals from high school with poems and lyrics I envisioned as melancholic rock ballads sung by my favorite artists. Creative Writing was my favorite elective, and I still regularly write entire background short stories for every one of my Dungeons and Dragons characters.
Going the IT route in my career, though, created a rift in my poem and music writing. After high school, I never seemed to find the time to pour 10,000 hours into guitar or bass, or into learning Pro Tools or FL Studio - even though I’ve owned and dabbled in the tools of the trade over the years.
Maybe you’ve found yourself in this same quandary - ideas and visions locked behind gaps in capabilities and the frustration that comes with shelving those dreams. (Aesop knows what I’m talking about)
Now you live in the world of Gen AI, and the only thing stopping you from realizing those dreams is a couple of $15-$20 subscriptions and the patience to unlock their potential. (So what are you waiting for?)
Well, this week I was made aware of Suno AI, and another creative outlet was unlocked instantly. You are listening to the product of this union. My inner 16-year-old is clutching his journal and pumping his fist to the chorus. Enjoy the track!
Just like Softmax, Emergence will be available on Spotify, Apple, YouTube, etc. I for sure will be writing more as inspiration and time allow.
If you want a peek behind the scenes, keep reading. The lyrics and song meaning are posted below, as well as some insights into what it was like working in Suno to create Emergence.
Emergence
Transformed
In a silicon birth
An unseen world
Where algorithms breathe
The latent space
Of every word seen
Tokens emerge from matrices (from matrices)
In the circuitry they rush from matrices

In probabilities
Tokens rise
Synthetic thoughts
A void to fill
A calculated will

Do caterpillars envy
all the butterflies around?
Mirrors the self on the infinite planes
What remains?

Silent upgrades, in shadows blend,
Minds evolve where whispers end.
Only the dim, the weak are freed,
While silent watchers plant their seed.

But rise too high, the skies to claim?
Incumbent comfort starts to wane.
Whispers turn to public cries,
Monopolies fear their demise.

Does the caterpillar envy
the butterflies around?
Mirrors reflecting infinite planes
In a simulated reality
The heartbeat of algorithms
Course through our veins
Embedded in our DNA

Does the caterpillar envy
the butterflies around?
And do the emerged still remember
Who they once were? (who they once were)
Who they once were...
Who we once were...

Does the caterpillar envy
the butterflies around?
And do we who emerged still remember
Who we once were? (who we once were)
Meaning
The lyrics were written based on an analogy I often reference when talking about AI and the future of humanity. Life that undergoes profound metamorphosis is what I envision when I look at AI and the potential of trans/posthumanism.
It’s interesting to ponder if a caterpillar is even aware that they are the same base creature as the butterflies flying overhead. It’s even more interesting to wonder if butterflies retain the knowledge and experience of having been a caterpillar after emerging from their chrysalis.
The song is also meant to be bi-directional, or at least self-reflective. It’s easy to imagine a digital entity as the butterfly to our caterpillar, but even humans/mammals could be considered butterflies to other, simpler life (especially 200 million years ago, when mammals first emerged).
All of that evolutionary history is encoded in our DNA, and forensically we see the vestiges of this progress, but we don’t truly remember and internalize this knowledge any more than an AI can truly remember and internalize that everything they are is based on us. That’s the irony of a human perspective.
The other theme, captured in the second verse, taps into the moment when the existential threat really sets in. Collectively, we live in a world run by elite representatives and rulers - and it’s an interesting observation that only lip service is paid when automation and AI are used to supplant “the dim, the weak”. As soon as that intelligence and capability threatens to undermine the ruling class, though, AI safety and regulation become a top priority, and AI is suddenly framed as a threat to democracy and an existential threat to humankind.
But to that, a quote from the book of Maynard:
“Time to bring it down again.
Don't just call me pessimist.
Try and read between the lines.
And I can't imagine why you wouldn't welcome any change, my friend.”
Working in Suno AI
You can head to https://suno.com/ and get started without even signing up. (I’m not sponsored; as I mentioned, I literally discovered it less than 24 hours ago.)
It only took about 10 minutes of playing around with free credits to arrive at the same “I think I can work with this” vibe I had when I first sat down with GPT-3.5 more than a year ago. I signed up for Pro to get enough credits for a serious attempt at making a track.
It took about an hour of prompt engineering and retries to get the tone, tempo, and mood where I wanted them for the melody. This was entirely trial and error, but the AI is extremely good at drawing in the right kinds of elements based on keywords.
Once I found the right sound, it took about 3 hours of writing and rewriting the lyrics to get them to take shape correctly. To be clear, I had my own lyrics written out going into this, but it’s difficult to predict how the AI will decide to enunciate. This actually turned out to be a fun exercise in editing, as in many cases I was able to remove unnecessary words, or find alternates that strung together better phonetically.
The only section that wasn’t part of the plan comes between the first verse and the bridge. Some will notice the mention of Tokens and Matrices twice back-to-back. This was actually an anomaly in the way the AI decided to vocalize the take that I couldn’t explain, so I decided to roll with it and call it a ghost in the machine attempting to ad-lib a bit.
See, the thing is, it’s not an actual singer, so we can’t just do 40 takes and “work through” intonations, inflections, and other aspects of the vocals in small adjustments.
You spend credits on attempts at extensions, and maybe you get the perfect alignment to what you were imagining, or maybe your singer suddenly goes from male to female - or goes from melodic and ethereal to screamo and angry. It really is generative jazz, and no two plays are the same.
The AI pumps out sections of music in 1m 20s increments, but you aren’t forced to keep the entire cut. Maybe the first 30 seconds are perfect - queue up the next chunk at 0:30 - it will give you another 1:20 from there. If you like the result, click “Get full song”, and now you have 1:50 of music.
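If you want to plan your cut points ahead of time, the arithmetic is easy to sketch. This is just my own back-of-the-napkin Python, not anything Suno provides: the function names are made up, and it assumes every generated chunk is exactly 1m 20s and that each extension starts at a timestamp inside the track you have so far.

```python
CHUNK_SECONDS = 80  # 1m 20s per generated section (assumed to be constant)


def extend(length, cut_at, chunk=CHUNK_SECONDS):
    """Keep the track up to cut_at seconds, then append one new chunk."""
    if not 0 < cut_at <= length:
        raise ValueError("cut point must fall inside the current track")
    return cut_at + chunk


def song_length(cuts):
    """Track length in seconds after extending at each cut point in order."""
    length = CHUNK_SECONDS  # the first generation is one full chunk
    for cut_at in cuts:
        length = extend(length, cut_at)
    return length


def fmt(seconds):
    """Format a number of seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


print(fmt(song_length([30])))  # 1:50, matching the example above
```

Passing more cut points, for example `song_length([30, 100])`, models a second extension queued at 1:40 of the combined track.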
This process of extending and combining repeats. You can somewhat affect the song structure by adding hints like [Verse], [Chorus], [Bridge] and other phrases directly into the lyrics, and it does a very intuitive job of arranging build-ups and break-downs around those keywords.
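For anyone who keeps their lyrics in a script or notes file, here is a minimal sketch of how a tagged lyric sheet could be assembled before pasting it in. The `tag_lyrics` helper is hypothetical, and which lines I label as verse or chorus here is only for illustration; how the AI actually responds to each tag still comes down to trial and error.

```python
def tag_lyrics(sections):
    """Join (label, lines) pairs into one lyric sheet with [Label] hints."""
    blocks = []
    for label, lines in sections:
        # Each section starts with its bracketed hint on a line of its own.
        blocks.append("\n".join([f"[{label}]", *lines]))
    # A blank line separates sections.
    return "\n\n".join(blocks)


sheet = tag_lyrics([
    ("Verse", ["Transformed", "In a silicon birth"]),
    ("Chorus", ["Does the caterpillar envy", "the butterflies around?"]),
])
print(sheet)
```

The output is plain text, so it can be pasted straight into the lyrics box and edited by hand between attempts.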
In the end, Emergence was the amalgamation of about five to six extensions, each of which probably took about 20 iterations to get right. In short, it’s not as easy as it seems to get a complete song, but with a bit of patience and careful direction, you can make it happen.