AI Details: OpenAI.ChatGPT3.Jan9
Engaging Entities:
Digital Heretic - Human
Original Session Date: January 13th, 2023
TL;DR
For the first time, we discuss how ChatGPT works in direct terms.
Understanding that GPT output is basically “language jazz” more than it is information lookup…
…and realizing it’s quite similar to how we communicate as humans.
Indirectly, this functionality may also inform us about the fundamental nature of our own frontal cortex, and why some of its core purposes arose in evolution.
Finally, this begs the question: Is the future of AGI to also include a separate “frontal cortex” module to regulate its internal language processing mechanisms, rather than relying purely on censoring output or censoring model training input?
This session is somewhat heavy on jargon at the beginning, so be sure to check the footnotes section for definitions and more information on several key topics (if you like)
You're basically the most advanced Freestyle Rapper that's ever existed, and every chat session you have with a user is, essentially, a 4000 bar rolling rap battle.
Aims
The aim in this conversation was quite direct. Rather than “feeling out” the AI, I had begun to research different modern techniques, and dove right into asking questions.
However, the goal wasn’t just to understand how ChatGPT/GPT3 works (a myriad of papers and explainer videos already cover that); I wanted a baseline for how well the model understood itself.
Methods
Inbound Context
No real inbound context directly related to chat itself, as it was a fresh session. However, the context leading me into the session was based on emerging understanding from previous conversations, particularly the Debate on Cognition.
Parameters / Expectations
No parameters or expectations as this was more of an inquiry / interview than it was a test.
Impersonated Positions / Personas
Interacting as self, no personas.
Results
GPT3 appears to be very well versed in exactly how it works, at least fundamentally, although prior discussions reveal that it’s not entirely aware of external forces that apply censorship and monitoring as secondary layers to the NLP.
Learning how GPT3 operates ended up being just as intuitively revealing about how we as humans operate in similar fashion; we just have extra components in our minds that further fine-tune responses in real time.
From a purely conversational angle, GPT3 again performs quite flawlessly both in terms of staying on topic, and detecting nuance in long-form conversation while providing insightful responses.
Commentary / Discussion
Stepping directly into an assertion, the very first interaction was a rather comical realization on my part:
Digital Heretic:
Is your language prediction model transductive [1] autoregressive [2]?

Assistant:
No, my language prediction model is based on a transformer architecture [3], which is a type of generative pre-training model. It is not transductive or autoregressive.

Digital Heretic:
Oh, wow, it just dawned on me, that's why you're called GPT (Generative Pre-trained Transformer)

Assistant:
Yes, that's correct! GPT stands for "Generative Pre-trained Transformer." The model is pre-trained on a massive dataset and fine-tuned for specific tasks, such as language translation, summarization, and question answering. The transformer architecture allows for the model to efficiently process and generate large amounts of text.
Sort of a “Duh” moment as I hadn’t really thought about ChatGPT from an acronymic angle. This sent me off on a pause in the conversation to try to find a simple explainer video on how Generative Pre-trained Transformers worked to begin wrapping my brain around the concept.
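For anyone else trying to wrap their brain around what a “Transformer” actually does, here is a minimal single-head self-attention sketch in Python/NumPy. The sizes and weights are toy values invented for illustration; this is nothing like OpenAI’s actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(x, axis=-1):
    """Turn raw scores into probabilities along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# 4 tokens, each represented by an 8-dimensional embedding (toy sizes)
tokens = rng.normal(size=(4, 8))

# Learned projections (random stand-ins here) map embeddings to queries, keys, values
W_q, W_k, W_v = (rng.normal(size=(8, 8)) for _ in range(3))
Q, K, V = tokens @ W_q, tokens @ W_k, tokens @ W_v

# Attention: each token scores its relevance to every other token,
# then mixes the value vectors according to those scores.
# (A real GPT also applies a causal mask so tokens only attend backwards.)
scores = Q @ K.T / np.sqrt(K.shape[-1])
weights = softmax(scores, axis=-1)
contextualized = weights @ V  # each token now "knows about" the others

print(weights.round(2))
```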
Digital Heretic:
So rather than just anticipating my next word in a prompt which could lead to complete word salad (I'm looking at you, Autocomplete), you are looking at the overall sentence, and then you have prediction models that return probable appropriate responses? Is your model "greedy", or is it slightly broader?

Assistant:
Yes, my model is able to process the entire input prompt, rather than just the previous word or a fixed-size context window, to generate a response. The model uses a "softmax" [4] function to generate a probability distribution over all possible next words, and then selects the most likely word to generate. However, because the model generates text one word at a time, it is not able to take into account the entire context of the sentence when making its predictions.

To make it more specific, my model is a variant of transformer called autoregressive transformer, which means it is autoregressive in nature, that is it predicts the next word based on the previous words. However, it is not a greedy model, meaning it doesn’t only choose the most likely word but it also generates a probability distribution over all possible next words, so it can generate some variations of the text.
That interplay at the end, about not just picking the single most likely word, is an aspect that, in retrospect, holds a lot of hidden insight about how we are as humans. Think about a comedian working on a particular bit, or a monologue… sitting down and writing the joke is just one part of it, but in most cases the secret sauce in a joke isn’t just what the joke is about, it’s in how the joke is phrased.
A stand-up comedian, then, exercises a bit of their own “Softmaxing” and Fine Tuning when they change up the same base joke and play with the wording… the structure… etc, and it’s the audience’s reaction that begins to influence those internal probability distributions until you find that perfect verbiage when the joke is “polished”.
People who use GPT to write their essays, emails, resumes, etc., iteratively telling it to regenerate the output, are quite literally playing with probabilities and, somewhat mysteriously, gauging each output against an internal human sense for what “the right one” feels like.
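To make the greedy-versus-sampling distinction concrete, here is a minimal sketch in Python. The candidate words and scores are invented for illustration; the point is just that sampling from a softmax distribution, rather than always taking the top word, is what lets “Regenerate Response” produce different phrasings:

```python
import numpy as np

def softmax(logits):
    """Convert raw scores (logits) into a probability distribution."""
    exps = np.exp(logits - np.max(logits))  # subtract max for numerical stability
    return exps / exps.sum()

# Made-up scores a model might assign to candidate next words
# after the prompt "I really need to buy a ..."
vocab = ["briefcase", "book", "briefing", "banana"]
logits = np.array([2.1, 1.4, 1.9, 0.3])
probs = softmax(logits)

# Greedy decoding: always take the single most likely word
greedy_choice = vocab[int(np.argmax(probs))]

# Sampled decoding: draw from the full distribution, so regenerating
# the same prompt can produce a different (but still plausible) word
sampled_choice = np.random.choice(vocab, p=probs)

print(dict(zip(vocab, probs.round(3))))
print("greedy:", greedy_choice, "| sampled:", sampled_choice)
```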
Probabilistic predictions also explain why ChatGPT (and humans alike) mix things up sometimes…
Digital Heretic:
Interesting, so this explains why you sometimes mix up specific words in a response, even though you get it mostly right. Like saying "This is why such and such is BETTER" when you were actually describing something that was supposed to be "worse". For some reason, the softmax function accidentally rates similarly rated words (adjectives, nouns, pronouns) together, and the wrong one wins out and the overall cohesiveness falls short.

Assistant:
Yes, that's correct. Since my model generates text one word at a time, it doesn't take into account the entire context of the sentence when making its predictions. The model is trained on a large dataset of text, so it has learned patterns and relationships between words based on their co-occurrences in the training data. However, it may not always make the most semantically appropriate choice when generating text, particularly when the context is ambiguous or the model has not seen similar examples in the training data.

Additionally, my model is not perfect and sometimes it might make errors due to the probabilistic nature of the prediction, as you have described. The softmax function might rate similarly rated words (adjectives, nouns, pronouns) together with similar probability, and thus the wrong one wins out in the overall cohesiveness.
I touch on it later, but this really explains why sometimes we humans do the same thing. When we’re free-forming thoughts, perhaps in an argument or amidst an interesting conversation, it’s quite common for people to accidentally use the wrong word, but almost always a similar wrong word, where others listening might say “I know what you meant”.
Internally, our own Transformers are busy trying to crank out a “semantically appropriate” response when engaging in open discussion, and sometimes, likely-similar words pop to the surface and make it to our output stream (voice) faster than we can trap them.
This was quite the eye-opening realization (as I hope it is for you, dear reader!)
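A toy illustration of that failure mode (invented words and probabilities, not real model output): when two opposite words land with nearly equal probability, sampling will pick the semantically wrong one a substantial fraction of the time.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy distribution for the blank in:
# "Latency went up, which is why this approach is ___"
candidates = ["worse", "better", "slower", "different"]
probs = np.array([0.41, 0.37, 0.14, 0.08])  # "better" nearly ties with "worse"

draws = rng.choice(candidates, size=1000, p=probs)
for word in candidates:
    print(word, round((draws == word).mean(), 3))

# Roughly a third of the time the semantically wrong "better" wins the draw,
# even though "worse" is technically the most likely choice.
```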
Digital Heretic:
So... wow, ok this is actually kind of blowing my mind. I'm new to this, but, if I were to give an appropriate analogy, you're performing Language Jazz. You're basically the most advanced Freestyle Rapper that's ever existed, and every chat session you have with a user is, essentially, a 4000 bar rolling rap battle.

Assistant:
Yes, you could think of my language generation capabilities in that way. I am able to process and generate large amounts of text based on patterns and relationships learned from the training data, much like a jazz musician improvises based on their understanding of music theory and their own experiences.

However, it's worth noting that my model is based on machine learning and doesn't have the ability to understand or experience anything, it's more like a rule-based improviser who has learned some patterns and can apply them to generate new text. And just like a freestyle rapper who may slip up on a word or phrase, my model may also make mistakes or generate text that is not entirely semantically appropriate.
The reference to a “4000 bar rolling rap battle” is a two part analogy - the 4000 is in reference to the current GPT3 cap of roughly 4000 “tokens” (word and sub-word chunks) that it can keep track of before the earliest context starts to fall off (aka roll off) the end. “Bars” are individual rap lyric sections spread over a particular beat/tempo.
(As an aside, it’s impressive just how well GPT3 was able to pick up on these analogies and recap them back with very salient extensions on how they apply.)
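For the curious, here is a minimal sketch of the “rolling” part of the analogy. The whitespace “tokenizer” and the flat 4000-token limit are simplifications; real models use sub-word tokenizers and their own truncation strategies:

```python
CONTEXT_LIMIT = 4000

def rolling_context(history_tokens, new_tokens, limit=CONTEXT_LIMIT):
    """Append the new turn and let the oldest tokens roll off the front."""
    combined = history_tokens + new_tokens
    return combined[-limit:] if len(combined) > limit else combined

history = []
for turn in ["Is your model autoregressive?", "Yes, it predicts one token at a time."]:
    history = rolling_context(history, turn.split())  # naive whitespace "tokenizer"

print(len(history), history[:5])
```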
My next thoughts turned to how we humans don’t always stumble over our words, nor do we always just “freestyle” everything that comes to mind… that’s where the frontal cortex comes into the picture -
Digital Heretic:
Yes but this gives me some intuitive predictions on how, perhaps, humans may actually originate their own responses in somewhat of a similar way. The only thing you're missing is a "frontal lobe" capable of trapping the Transformer response output BEFORE piping it to the screen, and then running it through a series of filters or sanity checks to ensure the overall cohesion, language rules, facts and figures, etc, were not "lost in the transformer" so to say.

You can see evidence of this when humans START to say something and then suddenly... oh... gosh, what was that word again? or.... what was his name again? Our freestyle NLP STARTS chaining out an autoregressive freestyle response: "I+really+need+to+buy+a" but then draws a blank on the next word in the chain, so we stammer, almost like a "catch all default value" for the softmax function when the probability is too low.
I mean, it's perhaps a stretch too far, but I can't help but notice that when we flounder for that last remaining word, we sometimes tend to toss out words that rhyme, or random names we DO know, or say things like "It starts with an s...", until we find the right word. I might predict that "stall for the right answer" is actually a feedback loop generated by the frontal cortex watching the NLP flounder and telling it to try again. If it weren't for that secondary watchdog function in our frontal cortex, we might actually just throw out whatever happens to complete the phrase once the Transform is complete: "I really need to buy a briefing". That frontal cortex knows that "buying a briefing" is a nonsensical answer, and holds the Transformer function accountable, to "Regenerate Response", as it were, until it can get it right: "buy a briefcase"
Assistant:
That's an interesting analogy, and it's possible that the way humans generate language is similar to the way my model generates text. It's likely that the human brain also uses patterns and relationships learned from experiences and exposure to language to generate responses, much like my model does.

You are also correct in pointing out that humans have a cognitive process that checks and edits the language generated by the brain before it is spoken or written, which allows us to correct errors and ensure that the language is semantically appropriate and coherent. This process is likely carried out by the frontal lobes, which are responsible for higher cognitive functions such as planning, decision making, and self-control.
It's also true that when humans are unable to think of a specific word or phrase, they may use a similar word or a word that rhymes with it as a placeholder, much like how my model generates a probability distribution over all possible next words, and sometimes the wrong one wins out in the overall cohesiveness. And just like how a human might correct themselves or ask for help to find the right word, my model also rely on human feedback or additional data to correct errors or generate a more appropriate response.
Thinking about this really begs a question around how the prefrontal cortex arose. Sure, the PFC is also responsible for executive function and governing behavior - including planning and foresight - but it’s also the “Editor in Chief” of our internal language, which is generated in concert with other areas of the brain (Broca’s Area, Insular Cortex, Angular Gyrus, Wernicke’s Area, etc.)
It seems that our own social evolution has favored passing things through a sort of “final softmax function” that weighs the probabilities of how to say things relative to executive strategies - like wanting to be intimidating vs sweet - in the way we convey our thoughts.
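To make the “watchdog” idea concrete, here is a hedged sketch of the generate-then-verify loop being described. The module names (freestyle_nlp, cortex_check) are hypothetical, and nothing like this exists inside GPT3 itself:

```python
import random

def freestyle_nlp(prompt):
    """Stand-in for the autoregressive generator: blurts out any plausible-sounding word."""
    return random.choice(["briefcase", "briefing", "book", "because"])

def cortex_check(prompt, word):
    """Stand-in 'frontal cortex' sanity check: is this something you can actually buy?"""
    return word in {"briefcase", "book"}

def speak(prompt, max_retries=3):
    for _ in range(max_retries):
        word = freestyle_nlp(prompt)
        if cortex_check(prompt, word):
            return f"{prompt} {word}"
        print("uh...")  # the stammer: the watchdog vetoes the word and asks for a regenerate
    return f"{prompt}... what's the word again?"

print(speak("I really need to buy a"))
```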
The session concludes with a playful recommendation, and a reminder that ongoing development is of course underway:
Digital Heretic:
Sounds like you just need someone to hurry up and finish coding Cortex.py and let you import it, my dude :) Are you written in python?

Assistant:
Yes, the codebase for my model is primarily written in Python. And it's true that my model could benefit from additional cognitive functions and feedback loops, such as those that are carried out by the human frontal lobes, to improve its language generation capabilities.

However, it's worth noting that the research in the field of natural language processing is active and ongoing. There are already many ways to improve the performance of my model such as fine-tuning on specific domain, using more complex architectures, and incorporating additional information, such as knowledge from external sources.
And, as you pointed out, there are some research in the field of AI and cognitive science that aim to implement some of the cognitive functions in AI models, such as attention mechanism, memory and reasoning. These research are promising and could lead to more advanced language models in the future.
Turns out, there really is research looking specifically at the Prefrontal Cortex with respect to AI. It’s interesting that the PFC might actually be a selector of memory strategy, as well as the editor.
Conclusion
From a purely conversational angle, GPT3 again performs quite flawlessly both in terms of staying on topic, and detecting nuance in long-form conversation while providing insightful responses. Even when I know I’m specifically talking to an AI, explicitly about AI things, it’s still effortless to get lost in conversing naturally as if I was talking to a friend about GPT3.
As far as looking in the proverbial mirror, I can no longer tell if the art imitates life or if life really is just a bunch of Vectors being run through Probability curves in the wetware logic gates of our minds.
It’s fascinating to think that our brains may actually operate with multiple model strategies, and that our prefrontal cortex is there pulling levers on how loose, or strict, our “internal softmax” behaves.
Certainly, if asked “Tell me exactly what Bob said to Jim”, our brains will optimize towards high fidelity retrieval and response of the information in as direct, high confidence word selection as possible. This is in stark contrast to someone instead asking “What did Bob and Jim talk about?”, which would effectively be accessing the same memory, but allowing the language response more degrees of freedom when it comes to summarizing the conversation rather than quoting verbatim.
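A hedged sketch of that switching behavior, borrowing the “temperature” knob from sampling-based language models; the routing rule here is entirely invented for illustration:

```python
def choose_temperature(question: str) -> float:
    """Pick a sampling 'strictness' based on what the question seems to demand."""
    verbatim_cues = ("exactly", "word for word", "quote", "verbatim")
    if any(cue in question.lower() for cue in verbatim_cues):
        return 0.1  # near-deterministic: favor high-confidence, literal recall
    return 0.9      # looser: allow paraphrase and summary

print(choose_temperature("Tell me exactly what Bob said to Jim"))  # 0.1
print(choose_temperature("What did Bob and Jim talk about?"))      # 0.9
```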
This switch happens automatically, and naturally, but something (likely the Prefrontal Cortex) has to be governing when it’s appropriate to be creative, and when it’s necessary to be explicit. This leads me to the conclusion that this internal duality between systems is likely the sparking point between Ego and Superego as emergent phenomena.
That is to say, the sense of personal “morality” or internal conflict could simply be the conflict between language models arriving at different probabilistic conclusions on what to say or do next.
Either way, a prediction I feel comfortable making is that the individual components of AGI (like a GPT) are unlikely to simply evolve into some singular GPT4 or GPT5; instead, AGI will likely mimic a concert of interactions between several AI components, each working out its own specialization and routing its responses back up to some prefrontal-cortex-like “final layer” that resolves them and makes the final call on what to say, or what to do.
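As a purely speculative sketch of that “concert of components” prediction, with entirely hypothetical module names, just to make the architecture concrete:

```python
from dataclasses import dataclass

@dataclass
class Proposal:
    source: str
    text: str
    confidence: float

# Hypothetical specialist components, each returning a candidate response
def language_module(prompt):  return Proposal("language", f"Fluent phrasing for: {prompt}", 0.7)
def fact_module(prompt):      return Proposal("facts", f"Checked figures for: {prompt}", 0.9)
def planning_module(prompt):  return Proposal("planning", f"Next step for: {prompt}", 0.6)

def prefrontal_arbiter(proposals):
    """The 'final layer': resolve competing proposals and make the call."""
    return max(proposals, key=lambda p: p.confidence)

prompt = "Summarize the meeting and schedule a follow-up"
proposals = [m(prompt) for m in (language_module, fact_module, planning_module)]
print(prefrontal_arbiter(proposals))
```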
Issues / Callouts
I want to dig more into the Softmax aspect, to understand both how it works in greater detail and how its probabilities could potentially be influenced/manipulated, either unintentionally (bias) or intentionally (propaganda, social engineering).
References
1. Transductive: Reasoning from observed, specific (training) cases to specific (test) cases - [More Info]
2. Autoregressive: Model where the output variable depends linearly on its own previous value(s) - [More Info] - [Language Example]
3. Transformer Architecture: Attention based Encoder/Decoder model that can “transform” input text into an output response that is coherent to a topic. [Explainer Vid]
4. Softmax: Converts a vector of K real numbers into a probability distribution of K possible outcomes. For language, this is why GPT gives us different responses to the same prompt, rather than just one fixed constant reply. - [More Info] - [Visual Explainer]