People often speak about many different concerns they have for the future of AI: mass surveillance, unmanned weaponry, bioweapons, an unprecedented consolidation of wealth, et cetera. But if we're being honest, these are the same old boring problems we've always had to deal with. Not even my grandparents knew a time when oligarchs and mustard gas were merely a conception of science fiction. Sure, these pursuits are greatly hastened by the advent of AI, but there are much more novel and devious forms of punishment that we will be subjected to, such as sloppification, and the worst part is that we will be culpable for our own sloppification.
What do I mean by the great sloppification of the world, and why does it sound like the sixth mass extinction? It sounds like that because it is, but not through traditional measures of destruction; rather, it will be an extinction through extinguishing. Just as a flame dies in a blanket of carbon dioxide, so too will we die in a blanket of slop. Our voices will be silenced and marginalized not through force but through the tyranny of the majority. We will obscure our own humanity and, in doing so, extinguish our souls.
In a sense, generative AI is not malevolent; it just gives us the machinery to be consumed by our own most gluttonous desires, and so consumed we shall become. When everything becomes easy, what is stopping us from doing everything? That's why we should write all of our emails with AI, and all of our essays and our speeches and our love letters and our poetry. Let's make websites with it, and films and music and photographs. And since it's so easy, let's make a million times more than we've ever made before, because why not (Check out My AI Handbook for more tips on responsible AI use).
But now, since we've expanded everything and there is so much to consume, we don't have time to consume it anymore! But that's no problem, because we can merely summarize everything we've generated and consume that instead. So in this way we have built ourselves this weird world where we communicate simple ideas through these colossal manifestations. Under one interpretation, we've actually made a reverse autoencoder, where the latent space is far more expansive than the input and output dimensions. The point of an autoencoder is to compress information so that we can represent the most important features of a certain input, the premise being that, due to the structure of our world, we are able to reconstruct inputs with much less information than was originally given. But when I write a twenty-page project specification using AI, I've essentially encoded a small amount of information in even MORE dimensions than I started with. Perhaps then it would be more accurate to call this process decoding, but then that makes our thinking space the latent space, which is weird. I also think it is quite funny to visualize an autoencoder with a massive latent dimension (Thank you Greg from Everstar for planting this seed in my head). Then, when you summarize my AI specification, you're decoding it back to a normal level of information per dimension, a reasonable level of information density.
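To make the shape of that analogy concrete, here is a minimal sketch in PyTorch (the dimensions are arbitrary and purely illustrative, not taken from any real system): a standard autoencoder squeezes a wide input through a narrow bottleneck, while the reverse autoencoder of the analogy blows a tiny prompt up into an enormous latent before squeezing it back down to summary size.

```python
import torch
import torch.nn as nn

# A standard autoencoder: a wide input compressed through a narrow bottleneck.
standard_autoencoder = nn.Sequential(
    nn.Linear(784, 32),    # encoder: compress
    nn.ReLU(),
    nn.Linear(32, 784),    # decoder: reconstruct
)

# The "reverse autoencoder" of the analogy: a tiny input (the prompt) expanded
# into an enormous latent (the twenty-page spec), then summarized back down.
reverse_autoencoder = nn.Sequential(
    nn.Linear(32, 4096),   # "encoder": prompt -> slop
    nn.ReLU(),
    nn.Linear(4096, 32),   # "decoder": slop -> summary
)

prompt_vector = torch.randn(1, 32)
print(reverse_autoencoder(prompt_vector).shape)  # torch.Size([1, 32])
```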
I've been thinking about this phenomenon and how we would all be better off if we just stopped filling up all of our breathing room with slop and instead just sent each other the prompts we wrote. So that's what this project is.
There is a bit of a nuance to this reverse autoencoder analogy, which is that when you summarize the AI email you received, you aren't actually getting out what was put in (which is what a traditional autoencoder attempts to do). Instead, you're getting an approximation of it: it has the same number of dimensions, but it might be rotated a bit in our thinking space. In other words, the encoder is not the inverse of the decoder in this scenario, even though the encoder E goes from n to m dimensions (where m >> n) and the decoder D goes from m to n dimensions. D is instead some other function such that D(E(x)) = h(x), where h is some stretching or rotation of the idea, but not an expansion. All this to say, when I summarize your AI email, I don't get the prompt you used to generate the email; I instead get a summary of the email it actually generated. The function h is a bit confusing, but it may be something like: what information does this text hope to convey in the thing it is creating?
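Here is a toy linear version of that nuance (the dimensions and random maps are made up for illustration): even though the round trip D∘E lands back in the original idea space, it is generally a stretching and rotation of the idea rather than the identity.

```python
import numpy as np

n, m = 8, 512                               # small "idea" space, huge "slop" space
rng = np.random.default_rng(0)

E = rng.normal(size=(m, n)) / np.sqrt(n)    # encoder: n -> m (prompt -> slop)
D = rng.normal(size=(n, m)) / np.sqrt(m)    # decoder: m -> n (slop -> summary)

h = D @ E                                   # the composed map on the idea space
x = rng.normal(size=n)                      # an "idea" (the prompt)

# h is n x n, so no dimensions are gained or lost, but it is not the identity:
# the summary is a stretched and rotated version of the original idea.
print(np.allclose(h @ x, x))                # False
```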
For this project, I wanted to see if I could approximate that true inverse, E⁻¹, using generative AI. So that's what I've done. Use this on all of the AI-generated slop you receive to see what the original input was, or might have looked like. Now of course it's impossible to get the exact input that was used to generate any particular output; namely because generative models are probabilistic and generally not open-weight, but also because generative models are not injective functions. Every output could have been made by any number of inputs. For example, the prompts "output just the phrase: hello world" and "translate this phrase: hola mundo" might result in the same phrase coming out. Or you could theoretically get the same email by having one prompt generate it from scratch and another shorten an even longer email. So generative models are not injective, but they are surjective, since theoretically you could output any possible string you wanted, assuming the model is well behaved enough (e.g. "output the following: [desired-string]" with temperature equal to one). I should note that I'm speaking strictly of language models right now, as this is not always the case. For example, current architectures for diffusion models have compressed latent spaces, so not every possible output is reachable. And with current language models having fixed context window lengths, they too would not be surjective (again assuming a temperature of one).
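As a toy illustration of that non-injectivity (the canned generator below is a made-up stand-in, not a real model), two different prompts can collapse onto the exact same output, so there is no unique preimage to recover:

```python
def toy_generate(prompt: str) -> str:
    # A stand-in for a language model that happens to produce identical
    # text for two different prompts.
    canned_outputs = {
        "output just the phrase: hello world": "hello world",
        "translate this phrase: hola mundo": "hello world",
    }
    return canned_outputs.get(prompt, "...")

# Two distinct inputs, one output: the mapping cannot be uniquely inverted.
assert toy_generate("output just the phrase: hello world") == toy_generate(
    "translate this phrase: hola mundo"
)
```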
The point is, for any possible output, any number of prompts could have generated it, so to choose the one to present, we will apply Occam's razor and choose the simplest one: what is the shortest possible prompt that could reasonably generate this output? Obviously, any input could theoretically generate any possible output due to the probabilistic nature of the models (assuming no top-p sampling is used), even though it might as well be impossible, so we'll choose one that could at least reasonably generate this output with a high probability (where that high probability is still essentially zero, but reasonable in log space). Essentially, we are minimizing the length of the prompt to remove as much slop as possible while simultaneously maximizing the likelihood of the output given that prompt (that is, minimizing its perplexity). This is the balance we will try to find.
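A minimal sketch of what approximating that inverse might look like (this is not the project's actual prompt or few-shot examples; the OpenAI client, the model name, and the instructions below are stand-in assumptions): just ask a model for the shortest prompt that could plausibly have produced the slop.

```python
# Sketch only: assumes the OpenAI Python client (openai>=1.0) and an API key
# in the environment. The model name and instructions are placeholders, not
# what this project actually uses.
from openai import OpenAI

client = OpenAI()

UNSLOP_INSTRUCTIONS = (
    "You will be given a piece of AI-generated text. Reply with the shortest "
    "prompt that could reasonably have generated it. Output only the prompt."
)

def unslop(slop: str, model: str = "gpt-4o-mini") -> str:
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": UNSLOP_INSTRUCTIONS},
            {"role": "user", "content": slop},
        ],
    )
    return response.choices[0].message.content

# print(unslop(some_suspiciously_long_ai_generated_email))
```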
In the end, I figured out a prompt and few-shot examples that worked well enough, so here it is! Try it out with any AI slop that is sent to you or that you find. It is still limited to text, but maybe one day I will expand it to include images, songs, videos, code, websites, etc. Also, it is rate-limited, but I didn't put that much effort into making sure it isn't abused, so please be kind.
Why do we engage in this hyper-inflated latent space when it clearly doesn't convey much more than was originally said in the prompt? Well, for me at least, it is because of growing expectations of output. In the age of AI, each worker is now expected to produce ten times as much work, so we have to use AI. And since everyone else ends up producing a ridiculous amount of work, we too must engage in that culture. The expectation enforces the behavior, and the behavior reinforces the expectation. But it doesn't have to be this way. Writing your emails by hand is an act of rebellion. Writing your pull request descriptions by hand is resistance. Give people the summary they want instead of slop that is only intended to give the impression of productivity. Use your brain.
I suppose that is my call to action. Fight back against the mass creation of slop so that we exist in a world of beauty, not one of diamonds in the rough.