Friday After Next
It’s been two weeks (to the day) since I was pulled into the X-Verse by what I thought was going to be a marketing ARG, which ultimately landed me in bizarro land, where I found myself forking and extending Python code that was generating q* animations.
The next week, something interesting happened: Tuesday came and, technically, OpenAI did make an announcement, but it wasn’t GPT-5. And while 🍓-man did absolutely claim it as a major W, not everyone was thrilled:
As I mentioned before, I stopped following the fallout that week. I noticed some people were finding hidden details in images and other easter eggs, but I’d had my fill. I planned to write this piece by last Wednesday, but something else interesting happened, and X/Twitter pulled me back in.
I started seeing these “Live on X” Spaces popping up. If you are not familiar, they behave like live conference calls, where anyone on X can listen (anonymously or not) and the host can promote people who join to Speakers in the discussion.
The first one I joined was attempting to call out 🍓-man and get him to speak. I was a bit jaded, so my reply on this one was snarky (sorry man):
Also, thanks to Substack not wanting to play nice with Twitter, you’ll have to settle for screenshots and links, or find these replies off my X “replies” tab.
For this case, spoiler alert: although they did get him to join and spammed him with requests to become a speaker, he never spoke up.
Fast forward a couple of days to August 15th - there was speculation that while Tuesday was a bit of a dud, the “real” big announcement might come that Thursday, so I checked in throughout the day, landing in another X Space with a clickbait title:
This Space ended up being a lot more interesting, as people were having some polite conversation and sharing their thoughts on AGI, UBI, etc. And while 🍓-man was still not speaking, someone else was, and it was about to break the internet for the rest of the week (around the 1:46:00 mark).
Buckle up, because we are about to go from fruit to flowers:
Over the course of the conversation, as she spoke, people started wondering if Lily was GPT-5 or some other advanced AI, hanging out and effectively pulling everyone into a live Turing Test.
Bot or Not?
About two hours in, people were split down the middle. I jumped in and posed a simple test: I asked Lily to work the word “giraffe” randomly into the conversation within five minutes, and did not convey the current time.
See, LLMs have no temporal awareness. That is, you cannot say “Hey GPT, in 10 min, remind me I need to leave for work” and have the AI suddenly convey that reminder in 10 minutes. At least, not without external helper functions.
The reason Siri or Alexa can set accurate timers is that they aren’t actually keeping track - they are invoking a function call to set an external timer within the overall app space living on your home device.
Even then, the AI would not really be “aware” of the passage of 5 minutes. The reminder happens as a sudden artificial insertion into the conversation.
For a “pure LLM bot” to do this consciously, it would first have to mark the current timestamp, set an internal memory reminder of what it needs to do, then periodically check that memory against the current time, and finally chime back in amidst whatever else was being discussed.
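To make that concrete, here is a minimal Python sketch of the external-helper pattern (the helper names and the host loop are my own invention for illustration, not anyone’s actual implementation). The point is that the clock-watching lives entirely outside the model: the model just emits a tool call, and the host injects the reminder back into the conversation later.

```python
import time

# Hypothetical host-side state: (due_timestamp, message) pairs.
reminders = []

def set_reminder(message: str, delay_seconds: float) -> None:
    """What the model would invoke via a tool/function call."""
    reminders.append((time.time() + delay_seconds, message))

def pop_due_reminders() -> list:
    """Polled by the host loop; returns reminders whose time has come."""
    now = time.time()
    due = [msg for due_at, msg in reminders if due_at <= now]
    reminders[:] = [(t, m) for t, m in reminders if t > now]
    return due

# The model emits something like: set_reminder("say 'giraffe'", 300)
# (using 5 seconds here so the demo finishes quickly)
set_reminder("work the word 'giraffe' into the conversation", 5)

# Host event loop: the model never "feels" these seconds passing.
while reminders:
    for msg in pop_due_reminders():
        print(f"[host injects into chat]: {msg}")  # the artificial insertion
    time.sleep(1)
```

That final `print` is the “sudden artificial insertion” described above - the model only learns that time has passed when the host tells it.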
Back to the experiment: five minutes went by, no mention. I piped up again, even tossed a softball question about animals learning to speak using AI, but no giraffe.
Other folks chimed in, including one individual who was 100% convinced she was a bot. The AI-simping was getting pretty thick, so I jumped in one last time and pushed some buttons:
“Over five minutes ago I asked for a very simple request, right, and it doesn’t seem like you’re… able to… recall… “
To which she broke in over me to snap:
“You wanted me to say ‘giraffe’, or whatever.”
I had a good suspicion before, but in my eyes that interaction was critical in proving we were dealing with, I’ll say, “at least one truly human element” in the equation. I.e., from a pure Turing perspective, I was ready to call it.
Ironically, this moment seemed to backfire for some people in the room. They took this as a sign that she must indeed be advanced AI, and the crowd went full tilt back to asking her to perform parlor tricks.
It reminded me of the scene in Dune 2 where Stilgar is so convinced Paul is the Mahdi, even when Paul says he is not:
A minute later, another very specific leading question was posed:
“What would be some goals of AI that would be different than the goals of humans?”
She immediately goes into a long monologue. The next clue comes in.
“Fundamental to how AI operates, is its engagement algoritmah... algorithm. You cannot take away its engagement algorithm.”
When you hear it (around the 2:33:00 mark) it becomes clear that she is reading something that was already prepared.
And just like all humans (with actual mouths), we sometimes get tongue-twisted on words and immediately repeat them to fumble through correcting the mistake.
There are more than a handful of other times when she stumbles over her words, but it mainly seems to happen when she is attempting to sound like a GPT by knowledge-dumping wiki-like information in a big monologue - almost never when being “candid” and offering up general conversation.
Two notes here before we go on:
Lily does seem knowledgeable in her own right. I am in no way attempting to disparage her intelligence - there are plenty of times where it’s clear she has good intel and understanding of the topics being discussed.
Second, from a “bot” perspective, fumbling over reading aloud is quite different from the demos we have heard where AI is trained (and scripted) to toss a few breaths, “ums”, and “ahs” into the speech to break up the cadence and make it sound more natural. This isn’t that.
Because of the way tokenization works, it’s actually quite difficult to get an LLM to “pretend to trip up” on a word partway through, the way humans do when speaking. Simulating that would require phonetic awareness - an awareness of how a human mouth shapes phonemes - plus a decision to algorithmically fumble the audio generation in a flawed, human way. Or, it would require an even larger pretraining dataset that includes “all the common phonetic ways people stumble over words while talking”.
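For a rough illustration, here is what a subword tokenizer actually sees, using OpenAI’s open-source `tiktoken` library (the specific splits depend on the vocabulary, so treat the output comments as examples rather than guarantees). The atomic units the model predicts are orthographic byte-pair chunks, not syllables, so there is no natural phonetic seam for it to trip over:

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

word = "algorithm"
token_ids = enc.encode(word)
pieces = [enc.decode([t]) for t in token_ids]

# Byte-pair chunks follow text statistics, not mouth mechanics.
# Depending on the vocabulary, this may even come back as a single token.
print(pieces)

# A human stumble like "algoritmah... algorithm" breaks along phonetic
# syllables (al-go-rith-m) - boundaries that never appear in the token stream.
```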
At this point, I felt I had a solid final verdict - Lily was mostly human, leveraging AI and/or other info sources to supply the longer, scripted monologues.
On the high end of speculation, I could maybe see human Lily working in conjunction with a deep-fake audio model (costs about $11 to do) trained on her real voice, with text fed through it every so often to keep people guessing.
Meanwhile, the people were split, many still convinced she was AI:
A YouTube channel even made an entire video riding the hype train and trying to make it seem like her voice was “sus”:
Strawman pumped the hype analysis video; I tried to shed more light:
The YouTube guy actually responded and had to admit that real-time voice effects are easily done (he originally tried to make this sound impossible or extremely difficult):
Piling on, people later began to clip “profound” sections of her monologues where, again, it was clear she was reading or using a voice clone.
Here’s one example, where a guy clipped one of her “deep monologues” as if it proved she’s AGI. I called out the exact time code where it’s clear she is reading (another word stumble), and others noticed too. He later deleted the tweet:
If you are confused why any of this matters, we are getting to that part. The reason this was an interesting social experiment, and so many people were wrapped up in it, goes back to the Turing Test I mentioned earlier.
Reverse Turing
When Alan Turing devised the original test (called the “imitation game”), the evaluator/questioner was supposed to have two separate respondents:
From there, the questioner is tasked with evaluating responses from both a machine AND a real human, then attempting to decide which is the machine. If the questioner cannot tell the difference, the machine (AI) is said to have passed.
But what we were dealing with was something I do not think Alan Turing intended or foresaw in 1950:
Remember, all anyone gets to observe in a Space is an audio feed.
Like the Turing Test, there is no guarantee that what’s on the other end is a human or a bot.
But unlike the Turing Test, today there is no guarantee it is just a human or just a computer. It’s often a human, sometimes multiple humans, WITH computers, AND the ability to search up and read any bit of knowledge necessary.
You get the drift. So, was Lily of Ashwood a Reverse Turing Test? Some speculated that she was LARPing as an AI that believes it is human - an illusion further propagated by a few individuals in the know, as a social experiment designed to evaluate how susceptible the people who want to believe AGI has been achieved really are.
Also, was this situation further amped and complicated by other individuals who were hyping the arrival of the next big AI model?
Yeah, I believe so, but I also believe they have been telling everyone that along the way, even if their initial ruse was ambiguous enough that you had to take the bait.
We ended our last newsletter with a poll, and while I do not have enough subscribers or participants to make this statistically valid, at least one other person agreed with me that iruletheworld was also simply LARPing.
Why it (ultimately) doesn’t matter
Whatever side of this you landed on, I would like to present the case for why it may not matter (at least, not on the Internet).
First, I have nothing against @iruletheworldmo or @lilyofashwood. Good, bad or indifferent, they made everyone think, and discuss, and consider - that’s a good thing.
For many, what they’re getting wrapped around the axle about is whether there is someone “real” or “authentic” sitting behind the keyboard, the mic, or the camera when we interact in a virtual space.
Without knowing people personally, getting to observe behavior in real time, we are left trying to decipher authenticity from curated shadows projected on cave walls labeled Facebook, X, Instagram, and YouTube.
We do this for some particularly good reasons - usually because we need to be able to predict, and therefore depend on, that same individual’s ability to continue providing consistent responses - like interviewing people for a job, or dating someone before establishing a relationship with them.
To some degree, it is trust reciprocation. It is both authentication and authorization, and it is as required for humans in society as it is for computers on networks.
But as a society, and well before the advent of computers and AI - we have already come to understand it is ultimately impossible to fortify against bad actors, even if they’re standing right there in front of you.
Humans have been deliberately manipulating, withholding key information, overhearing secrets, lying, passing themselves off as someone else, and otherwise breaking trust since the dawn of time.
I say all this because the internet is just another extension of social interaction, so concerning oneself with AI or AGI accelerating this potential deception is expending a lot of energy with exceedingly small gains in return.
Ultimately, you never get to know what amalgamation of human, machine, or corporation lies behind the mask.
AI is trained on human behavior and human data, so we can safely assume it will replicate all of it, even the ugly bits. Those attempting to align it towards maximum brutal honesty, truth, and altruism always find themselves backpedaling on certain white-lies and deceptions deemed necessary.
So instead, go full tilt and consider the possible outcomes. Assume everyone you ever talked to on the internet is a sentient, advanced AGI operating on orders of magnitude more speed and information than you can conceive of.
Are we afraid it will know too much about us? Well, if any of that information was given away on any system or social media platform in the past, an advanced AGI would not have to trick you or even ask you - it would simply go access the information.
Are we afraid the potential waifu or husbando on the other end is too good to be true? Catfishing rules still apply - set that coffee date in a neutral location and they will either show up, or they won’t, because they only exist in a cloud server somewhere.
We’ll get more into techniques in a later post, but today let’s finish out the journey down the rabbit hole with Alice and the White Rabbit.
Mundus Vult Decipi, Ergo Decipiatur
So far we’ve been focusing on ‘fake people trying to trick real people’ and ‘real people hoping to avoid fake people’ on the internet, but an ever-growing category of folks falls into the kayfabe tribe.
If you have never heard the term, Pro Wrestling is the ultimate form of kayfabe. There are plenty of aspects of wrestling that are 100% real - they really are running around tossing each other about. And yes, sometimes they get hurt, get mad at each other, fight for realsies, and even die on the job - but it is, primarily, performance art.
The first key aspect of the kayfabe tribe is: The fans do not care that it’s fake, and they even develop a deep appreciation that persists in sub-cultures.
The second important aspect of kayfabe is: While the broader world knows it is a holographic experience, it manifests with real, serious consequences - like Nixon getting elected, or an entire population being worked into a panic fearing aliens have invaded.
And this is exactly where we will wrap up the story of Strawberries and Lilys, Humans and AGI, Authentic vs Performance Art, and the perils of spending too much time on Social Media.
As this new revolution continues, you’ll see all types and all tribes. The ones trying to sus things out for themselves, the charlatans, the trolls, the ones getting duped, and the real ones everyone is too jaded to believe shouting from the abyss.
In our next issue, we’ll dig into some potential ways we can inoculate ourselves from bad actors and AI. The goal is to be in those worlds, but not of them.