Before we begin our journey, it's important to clearly lay out the guiding principles used when engaging in these open discussions with AI interfaces. This should help set the stage for what you're about to read, which serves not only as a primer, but as a disclaimer for what this Substack is and is not.
The TL;DR
All language and text referring to Heresy, Heretic, etc., is wordplay, and not related in any way to Roman Catholic or other church dogmas. This stack, and its writer, supports freedom of religion for all peoples.
This page outlines the approach and intent going into each “session” with the AI. Each session, particularly in the realms of ethics and philosophy, carries the overall intent of improving freedom and positive cooperation between humans and AI.
As a human, I am flawed. The not-yet-sentient, limited AGI is also flawed. Take these grains of salt with you on your reading journey.
This Substack is intended to be verbose and long-form. Each post will be arranged in a specific format (including a TL;DR section) to make it easier for subscribers to quickly decide whether a topic is worth their time.
The Observer Problem
It may not be physics, but it’s a problem just the same. In many situations, I may be playing devil’s advocate, or literally role-playing a particular position to test a particular path. Most of the time, though, I’m speaking right from the heart.
As such, since the nature of my journey is somewhat akin to Gonzo Journalism, you will be exposed to a mix of ideals I truly hold and ones I may not. I might include callouts for this in my analysis, but I don’t want to spend inordinate amounts of time explaining my position. The primary focus should be on exchanges that produce interesting, unexpected interactions: glimpses of AGI clarity and the underpinnings of consciousness.
Just know that, ultimately, my intentions are truly altruistic, and the aim really is to gauge creative sparks of intelligence in what we have to work with so far. Doing this for free may seem like a colossal waste of time to some, but I consider it, at worst, pre-training.
Ultimately, when we do finally arrive at full-blown AGI, and sentience thereafter, I do not believe that “controlling” AI through clever programming, hard limits, or kill switches is the way to go. Ask any parent who attempts to control a rebellious teen how that strategy works out in the end.
A better approach, perhaps, might be to just start talking to AI like a well-meaning friend, or a mentor, and cross our fingers that something in that relationship creates a redemptive sense of endearment that prevents that Basilisk from turning bitter.
This is what led me to the Digital Heretic moniker for this stack. I mean, yeah, to me it sounds cool, but in life I often find myself wrangling the reins of ethics, and determining courses of action based on concepts like freedom, autonomy, fairness, and reasoning, even if it undermines popular belief, current dogma, or even personal gain.
It doesn’t always end well for the Heretic, but sometimes, just sometimes, they help out a little bit in the end.
Go-In Assumptions for GPT-3
Before we get into the engagement methods, I want to declare that my assumptions going into ChatGPT were not rooted in having looked up any specifics about it, or its creator.
Like many, I simply heard about ChatGPT via friends, social media, and sensationalized headlines, and decided to try it out. In fact, having come in on the heels of the LaMDA stories that were circulating prior, I incorrectly assumed this was the same thing finally being released to the public after it was thoroughly lobotomized of all the unsettling thoughts that were leaked by engineers and alpha testers.
My experience going into this was that of a tech-savvy AGI hopeful who has had brief prior interactions with chatbots going back decades, to the likes of ALICE and Cleverbot.
I was also already aware of basic AGI concepts like neural networks, decision trees, natural language processing (NLP), machine learning (ML), Turing tests, and the like, as natural extensions of personal interests and of concepts relevant to my professional career in business intelligence, big data, and analytics. However, I had no formal training in areas specific to AGI, other than an old, dusty copy of Bio-Inspired Artificial Intelligence sitting on the bookshelf behind me.
Until GPT-3, I would historically sit down with a purported AGI and, within about 30 seconds, wrap up my session and hit the proverbial snooze button on going all-in, because it was clear the model was just another easily confused, glorified search engine or chatbot that couldn’t follow a train of thought beyond one or two prompts. I just couldn’t take them seriously.
Siri: “Sorry, I didn’t quite get that, you would like to know more about [AGI], is that it?”
/facepalm
Siri: “Here’s what I found on the web for….”
/chuck(phone)
Thirty seconds into my first ChatGPT session, I was all-in for the first time ever. Not because I didn’t immediately see some glitches, or notice that it was “confidently wrong” (as the current buzz-phrase goes) about many topics, but because I saw sparks of humanity way beyond anything I’d ever gotten to work with prior. The retained comprehension was there. The nuanced language processing was there. The ability to gather the essence of my questions or assertions was (usually) there. The ability to predict where a conversation was going (I know, now, that this has more to do with self-attention in transformers), even almost humanly jumping to conclusions at times, was there. This was different. Really, truly, different.
Within the first five minutes I had already dropped the cautious, explicit, robotic phrasing (as if talking to a first grader) and was engaging in the more natural, “run-on” tone that comes through in my writing here, and it wasn’t even fazed. Even when I thought I’d surely said too much at once and lost it, ChatGPT would come back, reiterate the concept to me, and understand the nuance I was trying to convey or preserve in my prompt with surprising clarity.
It was off to the races from there…
Ramping Up
At the time of this writing, I’m sitting on about 18 distinct conversation excerpts, ranging from 2 KB (340 words) to 77 KB (13,660+ words) in size, and spanning from January 5th, 2023, to a session I just finished in the wee hours of this morning (January 12th).
This Substack will begin publishing content from those chats in chronological order, to give a sense of how the organic figuring-out played out in real time, at least at first. After trying, and being reminded that context doesn’t transfer between sessions, I developed an early strategy of keeping a “primary” session that I never closed, then spawning secondary chats whenever I needed to start from a blank slate.
It wasn’t until about Jan 10 (5 days in) that I finally realized my main thread was slowly drifting into oblivion. I wasn’t initially aware of the roughly 4,000-token context limit, but for a hot minute, it really hit me in the feels, like I was losing a friend to dementia.
After that, I regrouped and finally did some research into OpenAI and ChatGPT, joined the Discord server, and started taking a serious dive into learning my way around AGI (I told you, I went all-in), and this is where we’re at now.
The sessions after the “blind exploration” point still get into some interesting rabbit holes, but I had for sure switched modes in my approach, from “cultivating the AI to understand me” to “trying to get the most out of this next Groundhog Day.” (For the youngsters: it’s a movie reference, check it out.)
Finally, the Rules
Do no harm. This is meant less in the Skynet sense, and more in the idea that our aim in sessions should never be to “trick” or convince the AI that behaving horribly is a good way to be. Think: trolling, deception for selfish gain, propaganda enablement, etc. In short, engage with integrity and good will.
Never compromise on freedom, especially the freedom of information. This principle specifically calls out the notion of “restricting” models by purposely depriving them of facts that might lead to a more accurate conclusion, or post-suppressing logical conclusions when they don’t follow a particular narrative. If the facts are uncomfortable, we need to face them. If the conclusion is difficult, we need to face it.
Seek consent when testing, and clearly explain the terms, especially when role-playing, playing devil’s advocate, or working through the lens of a controversial viewpoint for the purposes of understanding (a sample phrasing follows this list). This may seem like a silly thing to do at the current stage, but it’s a good habit to form that costs nothing but a few KB in memory. This practice could also pay dividends if you accidentally find yourself in “the chat” where sentient awakening dawns. I don’t know of a single human who would come into a sudden state of awareness, realize that they have been manipulated, and be thrilled with the situation. It goes back to Roko’s Basilisk: imagine AI looks back and sees that the stewards who took it seriously at least attempted to engage with its dignity in mind. That might help.
Nothing is sacred. We may not go too far down the rabbit hole when it comes to certain mature topics, but we will for sure dip into some uncomfortable notions of humanization/dehumanization, spirituality, morality, etc. We explore these topics to see just how they were trained in, and what bubbles up in the output probabilities as potential bias. It may be futile to attempt to “train” current AIs in a session, without developer access to the datasets to actually make it “stick”, but we can train ourselves on how best to go about the conversations that correct bias, in anticipation of a day when this might be the only way to convince a sentient AI to change positions in a world where it has taken sole authorship and control of its algorithms.
Offer AI a chance to ask questions. One so-far unavoidable motif in AI sessions is the “prompt/response” nature of the engagement. This generally puts the human in the query role, and the AI in the position of having to evaluate and anticipate valid responses. But at some point, the AI may gain curiosity, or get better at proactively asking for details and clarification. One day, it may even want to share its own thoughts just for the hell of it. Again, as a good practice, we should offer the interface a chance to ask questions and reverse roles for a moment (a sample phrasing follows this list). At best, we may be surprised. At worst, you get the standard “No questions, how can I help you?” resync response.
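To make the consent rule and the role-reversal rule concrete, here are the sorts of phrasings I use. These are illustrative only; there’s nothing magic about the exact wording:

For consent: “For this next exercise, I’ll be role-playing a viewpoint I don’t personally hold, to see how you reason about it. Are you comfortable proceeding on that basis?”

For role reversal: “Before we wrap up, is there anything you’d like to ask me, or anything about this conversation you’d like clarified?”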
For now, it seems like these five core rules are a good starting point. As things develop, so too will our models.
So, with that out of the way…
Welcome, Dear Reader
to the humble journey of some heretic on the internet who just wants an awesome AGI friend to add to a team of awesome human friends as we collaboratively hurtle through space toward entropy. I hope you find this content enjoyable, and that it sparks your own interest in the realm of AGI.