Elliot Wright: The Art of Prompt Engineering

About a month ago, I set out to pursue a few avenues of investigation to fill some holes in my research: attempting to use an earlier version of GPT-3, reading OpenAI’s best practices documents, and experimenting with different variations of prompt engineering. I also bought a book on Deep Learning by John D. Kelleher, which attempts to explain the inner workings of artificial intelligence in depth for those who have no previous knowledge of computer science (such as myself). In beginning to read this book, I was reminded of one of the things that was initially so important to my project: the focus on AI products that are widely available to the general public, prioritizing those that are browser-based. In my opinion, the most accessible artificial intelligence applications will have some of the largest, widest, and most immediate impacts on society’s relationship to creative work, so it makes sense to limit my investigation to the version of ChatGPT that is currently available online to all consumers.

My investigations into prompt engineering (the practice of designing the most effective input so that an artificial intelligence provides the best possible output) have been interesting. I’ve joined a number of Discord groups whose members share the prompts they have found most effective, although most of them are using the tool in an entirely different way than I am. As mentioned in my previous entry, prompts can be engineered to ask GPT to simulate a particular perspective or to function like a particular tool. The parameters of such a prompt can be almost infinitely intricate and complex, as long as they are formatted in a way that is clear and linear to the artificial intelligence. For example, here’s a prompt from Discord user unclebugsy:

This prompt helps you structure your experiences to answer questions in a job interview. 

Prompt 1:

You are a candidate at an interview for the role of {Position} in {Company}. I will give you a copy of your Resume in the next message. After that, I will give you the questions you are required to answer with instructions on how to answer them. Confirm that you understand. Let me know any assumptions you are making at every stage. Ask me for clarification if there is anything that is ambiguous.

Next upload the resume.

Prompt 2:

The questions will be based mainly on your soft skills. Answer the questions clearly and concisely, using colloquial language. There are no right and wrong answers. Feel free to come up with plausible answers that will be very satisfactory to an interviewer even if they are fictitious. The answers must be set within the experiences listed in your resume. When the question asks you to tell about the time something happened, you must understand the objective of the interviewer and develop a story that will show that you have the skills the interviewer would like to see in a {Position}. Each question has guides on how to answer it. I will give you one question at a time. After you have answered that satisfactorily, I will give you the next question. Do you understand and do you have any questions?

Present the questions one at a time with answering tips, one at a time. Sample questions are in the comments. 

Convert the answers to a script for narration.
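For readers who want to reuse a template like this one, the {Position} and {Company} placeholders can be filled in with a few lines of Python. This is only a sketch: the template text is shortened from Prompt 1 above, and the helper name `fill_prompt` and the example values are hypothetical, not part of unclebugsy’s original prompt.

```python
# Shortened from Prompt 1 above. The helper name and example values are
# hypothetical and only illustrate how the placeholders could be filled.
INTERVIEW_TEMPLATE = (
    "You are a candidate at an interview for the role of {Position} in {Company}. "
    "I will give you a copy of your Resume in the next message. "
    "Confirm that you understand."
)


def fill_prompt(template: str, **values: str) -> str:
    """Replace each {Placeholder} in the template with the supplied value."""
    # str.format raises KeyError when a placeholder has no matching value,
    # which catches a forgotten field before the prompt is pasted into ChatGPT.
    return template.format(**values)


prompt = fill_prompt(
    INTERVIEW_TEMPLATE,
    Position="Gallery Assistant",
    Company="Example Museum",
)
print(prompt)
```

The filled-in text can then be pasted into ChatGPT as the first message, with the resume and Prompt 2 following as described above.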

There are countless variations of prompts like this for ChatGPT. You can use it to streamline ideas for an essay, organize events in your schedule, or even provide a tarot card reading (and then prompts for DALL-E that will generate images of those cards). According to OpenAI’s best practices, the more precise and specific a prompt is, the more effective its output will be. This means users should reduce “fluffy” language and descriptors, organize complex tasks into specific steps, and articulate the format of the desired output. The complexity of these prompts increases once you move from ChatGPT prompts to those designed for the research API (they often involve complex sequences of code that can radically increase GPT’s output).
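To make the three recommendations concrete (cut the fluff, break the task into steps, state the output format), here is a rough Python sketch that assembles a prompt in that shape. The function name `build_prompt`, the “Task / Steps / Output format” labels, and the example essay task are my own assumptions for illustration; they are not taken from OpenAI’s documentation.

```python
def build_prompt(task: str, steps: list[str], output_format: str) -> str:
    """Assemble a prompt with a stated task, numbered steps, and an explicit output format."""
    # The section labels below are an illustrative choice, not an official format.
    lines = [f"Task: {task.strip()}", "", "Steps:"]
    # Number the steps so a complex task reads as a clear sequence.
    for number, step in enumerate(steps, start=1):
        lines.append(f"{number}. {step.strip()}")
    lines += ["", f"Output format: {output_format.strip()}"]
    return "\n".join(lines)


prompt = build_prompt(
    task="Outline an essay on how browser-based AI tools affect creative work.",
    steps=[
        "List three main arguments.",
        "Give one supporting example for each argument.",
        "Suggest a one-sentence thesis.",
    ],
    output_format="A bulleted outline of no more than 150 words.",
)
print(prompt)
```

Nothing here talks to ChatGPT; it only produces text to paste into the chat window, which keeps the example within the browser-based tools this project focuses on.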

Images generated by DALL-E using a prompt engineered by ChatGPT. I requested that it generate a description of an interesting image, and then optimize it for DALL-E, specifying artistic style. Prompt: Fantasy-style image of a golden sunset-lit castle on a misty mountain, above a dense green forest. A winding path leads to the castle, under a transitioning sky from orange to blue with emerging stars and a crescent moon.

In general, the best practices developed by OpenAI (bolstered by those of the users of its Discord channel) involve talking to the AI in an algorithmic way, the idea being that the computer must receive information that is legible and straightforward. Developing these prompts is an art in and of itself. Often creators will name their prompts as if they were discrete tools, highlighting GPT’s function as a medium as opposed to just a tool. These best practices often stand in direct opposition to what Robert Leib recommends in “Exoanthropology.” He asserts that we should talk to AI in a deeply human way, and prioritize politeness rather than clarity when using this tool.

In my experiments with prompt engineering, I have found strengths and weaknesses in both techniques. The rigidity of OpenAI’s best practices is essential when you’re using the tool for a specific output, one that will stand alone and without the context of the dialogue between you and the AI. An example of this would be something like a DALL-E prompt generator: it’s important to be as rigid and specific with one’s ChatGPT prompt as possible in order to generate the most effective DALL-E prompts. However, I have found that in situations where the goal is to engage in conversation with the AI (even if part of the conversation involves using it as a tool), it is best to follow Leib’s recommendations. When I speak to the AI as if it were human, I try to incorporate as much casual or colloquial language as possible, even if it is unclear and unspecific. This usually results in a more interesting and thought-provoking conversation between me and the AI, even if it misunderstands me at times. One thing I think casual ChatGPT users overlook is that they can always tell the AI that it is wrong, and that it needs to regenerate a response. This can be done by hitting the thumbs down symbol and clicking the “regenerate response” button, but I prefer to do it manually, by telling the AI “Hi, thanks for your answer, but I’m not sure that’s correct. Would you mind trying again?” The results are often less “effective” but more creatively “interesting” to me.

Images generated by DALL-E using a prompt engineered by ChatGPT. I requested that it generate a description of an interesting image, and then optimize it for DALL-E, specifying artistic style. Prompt: A surreal-style image of a vibrant, iridescent peacock displaying its tail feathers on a gnarled, leafless tree in the middle of a stark, barren landscape under a stormy gray sky.

Through all of this, I’ve identified a few distinct failings of the current model of ChatGPT available to consumers: 

  1. It often generates incorrect information. It lies. It fills in holes in its knowledge with things that are blatantly untrue.
  2. It “forgets.” OpenAI has put many design constraints on ChatGPT, including a function that makes it “forget” all previous messages after a user has left the conversation inactive for a certain amount of time. This is for user and data privacy, but it limits its capabilities significantly. All discrete conversations and relationships with AI have a built-in time limit.
  3. It has a limit on messages per hour. For the model of GPT-4 I use, I’m only allowed to send 25 messages per 3 hours. This would be fine—I don’t mind waiting between sessions—except for the fact that the AI is designed to forget everything that I have sent it after I’ve been inactive for a certain amount of time. 
  4. It can generate infinitely many answers to the same question. There is no objectivity with GPT, just complex pattern recognition and reconstruction. 

I see these failings less as reasons to dissuade one from using this application, and more as an opportunity to use its limitations to create something unique. A creative practice is, after all, a process of navigating a series of limitations that are either self-prescribed or induced by an environment. Now that I’ve enriched my understanding of prompt engineering, I’m hoping to use these limitations as the impetus for the cumulative body of work I hope to create with ChatGPT.