September 24, 2022

Of Gods and Dice

A keen observer — or perhaps believer (I kid) — would notice that @gods_txt and @ai_hexcrawl have been in the doldrums since about July. First, I’d like to allay any concerns that I’ve abandoned my little projects: no, I’m not done, not yet at least. Things have reached a nadir mostly due to personal reasons. In short, I’ve taken on a new role at work that has led to a 25% increase in my working hours, along with the mental and emotional strain that accompanies it.

Meanwhile, I’m caught between the pressures of including AI-generated images, enhancing my workflow, and keeping up with advances in the tools available. If that’s sufficient explanation for you — not that I owe anyone — great! You don’t need to read any further. However, one might ask, “This is all automated, they’re bots after all, so why would your time have any bearing on their functioning?” which is a valid point, so I’d like to take the opportunity to explain. If that is of interest, then read on.

In the Beginning

So, first of all, I am not a computer scientist, developer, coder, data scientist, machine learning engineer, etc. I’m actually a chemist. Sure, I’ve taken the odd programming course in high school and college, but I never found a good reason to develop that skill beyond slapping together snippets found through Google searches for a VBA macro. My day-to-day work is pretty physical, and I’m on my feet operating process units. On top of that, I’m a father of a 4- and 6-year-old. As you can imagine, that leaves precious little time to develop new skills, though that hasn’t stopped me from trying. Oh yeah, and I have time management issues, and I let even recreational tasks that I know will take concentration intimidate me into abandoning them altogether.

My first exposure to NLP was via Adam King’s Talk to Transformer in 2019. It was a simple implementation of one of the smaller GPT-2 models. I picked and prodded and tried to ascertain what it “knew,” and its biases. Bias, if you’re not already aware, is a concern for AI. It’s not a hard concept: what you put in is what you get out. My thought at the time was to use such a language model to purposefully reveal biases, or at the very least connections, across a range of communities. I’ve also been fascinated by emergence, and so I entertained the possibility of something new arising from information that might not otherwise have been related. But to what end?

I have dozens of half-finished notebooks full of pages of half-finished ideas for writing fiction. I also play TTRPGs (if that wasn’t already obvious), and I know from experience that those half-ideas only take shape when I throw them at a group: they become concrete, and the group’s response gives me something to respond to in turn. I imagined I could throw ideas against an NLP model at my leisure and get that same back-and-forth.

Let There Be Light

In early 2020, I became aware of Max Woolf’s gpt-2-simple. It was available in a Google Colaboratory notebook and only required tweaking a few variables to train a model and generate text. I was overwhelmed by the possibilities. I was finally able to train an NLP model as I had imagined. What was I going to do? So, as in all things, I had to start with the creation of the universe.
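
For the curious, the entire workflow in that notebook boiled down to a handful of calls like these (a rough sketch from memory; the corpus file name, model size, and step count are placeholders, not my actual settings):

    import gpt_2_simple as gpt2

    # Download one of the released GPT-2 checkpoints (124M is the smallest).
    gpt2.download_gpt2(model_name="124M")

    # Fine-tune it on a plain-text file for a fixed number of steps.
    sess = gpt2.start_tf_sess()
    gpt2.finetune(sess,
                  dataset="corpus.txt",   # placeholder for the training set
                  model_name="124M",
                  steps=1000)

    # Sample from the fine-tuned model.
    gpt2.generate(sess, length=200, temperature=0.9, nsamples=5)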

If you are reading this and are aware of my Twitter presence, then I’m sure you can imagine what happened next. From February 2020 until about April 2021, I generated thousands upon thousands of outputs from the trained model. Sometimes I would clean up or add to the training set and retrain. The unfortunate thing is that these retrained models — or at least the ones I was working with — were not always as coherent as the outputs you may have seen posted. Because the models were built on GPT-2, it took hours of scanning for diamonds in the rough. Did I mention that I have concentration issues?

I’m sure I could have put something together that produced more coherent results more frequently, but I hardly knew what I was doing. I told myself that I’d study up on the transformer architecture and, with that newfound knowledge, format the training data differently or use different hyperparameters. As always, life just got in the way, so I kept picking from what I had already generated. Somewhere in this mess, I also put together AI Hexcrawl, mostly using GPT-3 and Megatron-15B (via Adam King’s InferKit).

Sorry but Your Pictures Are in Another Model

Things got complicated — in a good way — when Bokar N’Diaye started using the @gods_txt tweets as prompts for various generative image models. As you may know, images increase engagement on Twitter, so the bot started getting in front of more people. I was familiar with advadnoun’s Latent Visions, but I had already opted not to use it myself because I truly wanted the generated text to be open to the broadest possible interpretation, my lack of time notwithstanding.

In late 2021 I happened across dribnet’s Pixray, which generated pixel art. Given that pixel art at certain resolutions requires the mind to fill in the gaps, I was happy with most of its outputs and thought it a perfect fit for AI Hexcrawl. I’d often generate one image and add it to the queue with its respective prompt.

Once I started doing that, I looked into other models such as Disco Diffusion.

I wanted to chase the engagement. Not for personal glory, but to share the things I enjoy with a broader audience. I will say that I have met some awesome people, but I hope I’ve been transparent enough about my role in all of this: I am a curator, not a creator. In any case, I started applying Disco Diffusion to both bots, but I became more choosy about which outputs would accompany the text. So I was spending time both generating and evaluating images the night before the posts were to be published. As my transition at work started, I had less and less time to spend on this, and I started missing deadlines. I eventually decided to put everything on pause.

From Here to Eternity

I haven’t generated any new @gods_txt outputs since April 2021. Everything until the last post was from the collection I had curated and put in the post scheduler. As it became apparent the well would run dry, I started lowering the post frequency, hoping the images would offset the lack of content.

I really don’t want to scour thousands of outputs anymore, even though I really felt the GPT-2-based models produced more of what achieved the goal of the project. I also don’t want to have to judge which images match. I simply don’t have the time for it anymore.

In order to make these projects more sustainable, there needs to be a step change in the tools I use, even if that means making compromises (real or perceived) with the authenticity of the work, my main concern being the text of @gods_txt. That all being said, the following changes will take place:

  • Outputs will be generated from GPT-3, fine-tuned on the same training set (a rough sketch of that pipeline follows this list).
  • Images will be created using Stable Diffusion, despite some of my reservations about the community that seems to have developed around it and the current discourse surrounding text-to-image models and so-called “AI art.”
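
Here is roughly what the new text pipeline looks like with the OpenAI API as it exists today (a sketch only; the file name, base model, and fine-tuned model name are placeholders, not my actual setup):

    import openai

    openai.api_key = "sk-..."  # API key goes here

    # Upload the training set (prompt/completion pairs in JSONL format)
    # and kick off a fine-tune against one of the base GPT-3 models.
    training_file = openai.File.create(file=open("gods_training.jsonl", "rb"),
                                       purpose="fine-tune")
    openai.FineTune.create(training_file=training_file["id"], model="davinci")

    # Once the job finishes, sample from the fine-tuned model.
    response = openai.Completion.create(
        model="davinci:ft-personal-2022-09-20",  # hypothetical fine-tuned model name
        prompt="In the beginning",               # placeholder seed text
        max_tokens=200,
        temperature=0.9,
    )
    print(response["choices"][0]["text"])

The fine-tuning itself runs on OpenAI’s servers, which is a big part of the appeal: no more babysitting a Colab GPU.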

As much as it pains me to say, I can’t tell you whether the use of GPT-3 over its smaller predecessors dilutes the influence of the training set, but it seems to produce high-quality text without compromising the tone its audience has come to expect. I’m not going to comment further on Stable Diffusion aside from the fact that it, as a tool, produces some compelling results.
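
For a sense of why the switch is appealing purely from a time standpoint, generating an image from one of the bot’s outputs is now only a few lines with Hugging Face’s diffusers library (a sketch, assuming a CUDA-capable GPU; the prompt is only an example):

    import torch
    from diffusers import StableDiffusionPipeline

    # Load the Stable Diffusion v1.4 weights (accepting the license on the
    # Hugging Face Hub is required) and move the pipeline to the GPU.
    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4",
        torch_dtype=torch.float16,
    ).to("cuda")

    prompt = "a ruined temple on a hex map, pixel art"  # example prompt only
    image = pipe(prompt, num_inference_steps=50, guidance_scale=7.5).images[0]
    image.save("output.png")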

As always, my concern with AI and machine learning is with the people creating and deploying it; there are real ethical questions about sourcing training data and releasing models into the wild, but that is a rant for another day.

Posted by Travis


Tags: personal, AI, meta

