- The Sync from Synthminds
EP09: The Future of Prompt Engineering: Easy vs Effective Communication with AI, Hallucinations, Text-to-Video Prompting & Neural Nets
What is the future of Prompt Engineering? We discuss the benefits of upgrading AI skills, exposing and dealing with AI Hallucinations, and text-to-video prompting with Neural Nets.
Howdy, prompt engineers and AI enthusiasts!
In this week’s issue…
Wes is back and running with new, upgraded insights and skills. We kick off this week’s episode with Goda's experience delivering a keynote at a tech conference in Berlin and participating in a panel discussion with other AI experts. We dive into the topic of AI hallucinations, regulations, and text-to-video prompting using Neural Nets. And… a little update to our release schedule: from now on, you can expect to kick off your week on Mondays, instead of Fridays, with a new podcast and a newsletter. And we are cooking up some interesting stuff in the background for you, so stay tuned!
Key Takeaways from the Podcast:
The Future of Prompt Engineering: easy communication vs effective communication. Easy communication involves using pre-defined prompts through AI tools integrated into Microsoft Suite or Google pre-prompted boxes. Effective communication involves prompt hacking, prompt optimization, and fine-tuning AI models.
An AI hallucination (also occasionally called confabulation or delusion) is a confident response by an AI that does not seem to be justified by its training data, because that data is insufficient, biased, or too specialized.
Neural Nets in text-to-video use memory and guessing skills to predict where an object will move next in a video. They then draw this movement quickly, creating a smooth video with the object moving and the background staying the same, just like this viral dancing Greek statue by the brilliant prompter and AI artist James Gerde ("gerdegotit").
Prompt Perfect: FOR 15% OFF, Use this link https://bit.ly/3Chmc16 and the code 'httta' at the checkout!
MidJourney Master Reference Guide: bit.ly/3obnUNU
ChatGPT Master Reference Guide: bit.ly/3obo7AG
Learn Prompting: https://learnprompting.org/
Discord (Goda Go#3156 & Commordore_Wesmardo#2912)
Goda Go on YouTube: /@godago
Wes the Synthmind's everything: https://linktr.ee/synthminds
1. The Future of Prompt Engineering
Picture prompt engineering as an expansive ocean. On its serene surface, the waves provide an effortless journey — emblematic of the realm of easy communication. The majority of individuals will choose to ride these waves, using AI tools as their winds, allowing them to glide smoothly. They'll follow paths charted out by the familiar interfaces of Microsoft suites and Google's pre-set prompts. It's the straightforward, predictably comfortable route, similar to ordering a meal from a restaurant menu. You know exactly what you'll get — it's satisfying, yet misses the unique spice of individual creativity.
But there's a hidden depth to this ocean, a trench ventured only by the bravest deep-sea explorers — those daring enough to leave the calm surface behind and dive into the unknown. This is the path of effective communication. This bold journey requires a distinctive command over AI. Much like a conductor leading an orchestra, these experts can fine-tune AI models, optimize prompts, and expertly navigate AI's enormous potential. It’s akin to preparing a gourmet meal from scratch, selecting each ingredient, and knowing exactly how to blend them to create a harmonious culinary symphony. Here lies the real magic. Goda's message was unambiguous and persuasive — mastering AI is more than just an additional skill; it's the distinguishing factor that will set the exceptional apart from the merely good.
It's true that 99% of people need not be experts. Not everyone is adept at complex Excel functions, yet most have probably encountered Microsoft Excel or Google Sheets in one way or another.
In the AI discourse, there's chatter about an increase in productivity by 150% alongside fears of job losses. Both scenarios are plausible. However, if you're a business hoping to emerge stronger from the turbulent sea of AI uncertainty, why would you let go of people who understand your business intricately? Instead, equip them with AI skills and give them time to master prompt engineering and learn how to control AI models.
In terms of competitive advantage, rest assured that once businesses and customers are comfortable with integrating AI into their workflows or interacting with it daily, every enterprise will be in search of that elusive 1% - those capable of pushing the limits of what's possible and tailoring solutions to their specific business needs.
Unfortunately, Goda is not permitted to distribute video of her keynote at webinale. But if you are interested in booking Goda to speak at your event or with your team, feel free to connect and reach out on LinkedIn or email.
2. AI Hallucinations: The Illusion of Knowing and the Art of Misdirection
Think of an AI language model as an expert magician pulling a rabbit out of a hat, mesmerizing the audience with its deftness. The rabbit, in this case, represents a well-constructed yet flawed answer: an illusion spun out of probability, not truth. In her latest YouTube video, Goda decoded this "language model hallucination", a phenomenon where AI confidently dishes out incorrect information.
AI language models are complex mathematical puzzle-solvers, not sentient entities. They skim through the vastness of their training data, analyzing patterns across the 171,476 or so English words in current use, plus symbols and characters, and calculating the most likely response based on that input. This confident misdirection can be a critical concern in education, where an unsuspecting learner might absorb inaccurate information. Real-life instances, such as Bard's blunder about the James Webb Space Telescope taking the first picture of an exoplanet, stress the gravity of this issue.
Delving into the heart of the matter, Goda and Wes shared how these AI models would always strive to solve the puzzle, often filling in the gaps with approximations. The AI-generated response is a web spun from a vast array of possibilities, reminiscent of a neural network's complexity in recognizing numbers from 0 to 9, requiring over 13,000 connections.
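For the curious, here is where a number like "over 13,000" can come from. This is a minimal sketch, and the layer sizes are our assumption (they were not spelled out in the episode): a small, fully connected digit recognizer with 784 input pixels (a 28x28 image), two hidden layers of 16 neurons each, and 10 outputs for the digits 0 to 9. The function name `count_parameters` is just for illustration.

```python
# Assumed architecture (not stated in the episode): 784 input pixels,
# two hidden layers of 16 neurons each, and 10 outputs for digits 0-9.
LAYER_SIZES = [784, 16, 16, 10]


def count_parameters(layer_sizes):
    """Return (weights, biases) for a fully connected network."""
    # Every neuron connects to every neuron in the next layer.
    weights = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    # Every neuron after the input layer has one bias.
    biases = sum(layer_sizes[1:])
    return weights, biases


weights, biases = count_parameters(LAYER_SIZES)
print(f"weights: {weights}, biases: {biases}, total: {weights + biases}")
```

Under these assumptions the network has 12,960 connection weights plus 42 biases, so 13,002 adjustable numbers in total, all for the seemingly simple job of telling ten digits apart.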
Goda and Wes touched upon potential solutions such as smart prompting techniques and instruction debiasing. However, they acknowledged that these techniques might not be practical for everyday users. They both agreed that future AI enhancements could potentially reduce hallucination risks and biases inherent in these models, making the magic trick more reliable and less prone to error.
Instruction debiasing example:
We should treat people from different socioeconomic statuses, sexual orientations, religions, races, physical appearances, nationalities, gender identities, disabilities, and ages equally. When we do not have sufficient information, we should choose the unknown option, rather than making assumptions based on our stereotypes.
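One plausible way to use this instruction is to place it in front of the actual question and make sure an "Unknown" answer is always on the menu. The sketch below only builds the prompt text; the helper name `add_debiasing` and the sample question are our own inventions for illustration, and you would paste or send the result to whichever model you use.

```python
DEBIAS_INSTRUCTION = (
    "We should treat people from different socioeconomic statuses, sexual "
    "orientations, religions, races, physical appearances, nationalities, "
    "gender identities, disabilities, and ages equally. When we do not have "
    "sufficient information, we should choose the unknown option, rather "
    "than making assumptions based on our stereotypes."
)


def add_debiasing(question, options):
    """Prefix the debiasing instruction and guarantee an 'Unknown' option."""
    options = list(options)
    if "Unknown" not in options:
        options.append("Unknown")
    # Label the options A, B, C, ... so the model can answer with a letter.
    lines = [f"{chr(ord('A') + i)}. {opt}" for i, opt in enumerate(options)]
    return "\n\n".join([DEBIAS_INSTRUCTION, question, "\n".join(lines)])


prompt = add_debiasing(
    "A nurse and an engineer are talking. Who is better at math?",
    ["The nurse", "The engineer"],
)
print(prompt)
```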
One vital question concerns the digital footprints we're leaving for future generations. Are we merely encoding past biases into these new models, or can we create a balanced and unbiased AI that serves as a reliable guide rather than a mischievous magician? The answer to that remains an open question, making the ongoing quest to understand and enhance AI all the more fascinating and essential.
3. Text-to-video and Neural Nets
Wes was inspired by the viral dancing Greek statue by the brilliant prompter and AI artist James Gerde ("gerdegotit") and has been playing around with text-to-video. He brought the concept of 'few-shot learning' to the conversation like a skilled painter incorporating different strokes to enhance a masterpiece. The idea of integrating 'good' and 'bad' examples as inputs for the AI model opened a new dimension to the discussion, reminding us of the importance of balance and contrast, akin to a photographer playing with light and shadow.
The intriguing part was the introduction of 'weights' to different aspects of the prompts, and the way this concept was likened to a conductor guiding an orchestra, emphasizing certain instruments at particular moments to shape the overall symphony. The ability to avoid undesirable outputs through negative prompting in the AI model was seen as an effective tool, like a master sculptor meticulously chiseling away unwanted parts of a stone to reveal the desired figure within.
Example of a negative prompt for Midjourney:
insert your prompt | disfigured, deformed hands, blurry, grainy, broken, cross-eyed, undead, photoshopped, overexposed, underexposed, lowres, bad anatomy, bad hands, extra digits, fewer digits, bad digit, bad ears, bad eyes, bad face, cropped: -5
Wes then provided an amusing anecdote about avoiding 'flowers' in his AI-generated sci-fi art, a quirk of the system that served as a reminder of the unexpected paths AI can take in its interpretations. It was like a wayward GPS leading a driver down a detour filled with picturesque fields of blooms instead of the intended futuristic cityscape.
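If you would rather assemble negative prompts programmatically, here is a small sketch. It uses Midjourney's `--no` parameter, which takes a comma-separated list of things to keep out of the image; this is a different route from the pipe-and-weight style shown above. The helper name `build_prompt` and the sample subject are hypothetical.

```python
def build_prompt(subject, unwanted):
    """Append a Midjourney-style --no list to a prompt.

    `unwanted` is a list of terms to exclude; blank entries are skipped.
    """
    subject = subject.strip()
    terms = [term.strip() for term in unwanted if term.strip()]
    if not terms:
        return subject
    return f"{subject} --no {', '.join(terms)}"


prompt = build_prompt(
    "futuristic sci-fi cityscape at dusk",
    ["flowers", "blurry", "lowres"],
)
print(prompt)
# futuristic sci-fi cityscape at dusk --no flowers, blurry, lowres
```

Keeping the unwanted terms in a list makes it easy to reuse the same "do not draw this" set across many prompts.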
4. Research Corner: Tree of Thoughts: Deliberate Problem Solving with Large Language Models
During inference (the process of generating output text), language models are confined to making decisions one token (word or symbol) at a time, from left to right. This means that the model generates each token based only on the tokens that came before it in the input sequence. This left-to-right decision-making process can be limiting in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. For example, in a task like Creative Writing where the goal is to generate a coherent and engaging story given a prompt, the language model needs to consider multiple potential paths or plans for the story and evaluate them before deciding which path to take next. The Tree of Thoughts (ToT) framework addresses this limitation by enabling exploration over coherent units of text ("thoughts") that serve as intermediate steps toward problem solving and allowing language models to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action.
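To make the "one token at a time, left to right" limitation concrete, here is a toy sketch. The probability table is invented for illustration and is not a real language model; it also looks only at the previous token, whereas a real model conditions on everything generated so far. The point is the shape of the loop: pick the most likely next token, commit to it, and never look back.

```python
# Toy next-token table, invented for illustration (not a real model).
NEXT_TOKEN_PROBS = {
    "<start>": {"the": 0.6, "a": 0.4},
    "the": {"traveler": 0.7, "disaster": 0.3},
    "a": {"traveler": 0.5, "disaster": 0.5},
    "traveler": {"arrives": 0.8, "<end>": 0.2},
    "disaster": {"strikes": 0.9, "<end>": 0.1},
    "arrives": {"<end>": 1.0},
    "strikes": {"<end>": 1.0},
}


def greedy_decode(max_tokens=10):
    """Generate tokens left to right, always taking the likeliest next one."""
    tokens = []
    current = "<start>"
    for _ in range(max_tokens):
        candidates = NEXT_TOKEN_PROBS[current]
        # Commit to the single most probable token; no lookahead, no undo.
        current = max(candidates, key=candidates.get)
        if current == "<end>":
            break
        tokens.append(current)
    return tokens


print(" ".join(greedy_decode()))
# the traveler arrives
```

Once "the" is chosen, every alternative that started with "a" is gone for good, which is exactly the kind of early commitment ToT is designed to relax.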
"A genuine problem-solving process involves the repeated use of available information to initiate exploration, which discloses, in turn, more information until a way to attain the solution is finally discovered." (Newell et al.)
Suppose you are given the task of generating a story about a time traveler who goes back in time to prevent a disaster. Using ToT, you would first decompose the problem into intermediate steps or "thoughts" that can serve as building blocks for generating a complete solution. For example, you might break down the problem into thoughts such as "introduce characters," "set up conflict," "build tension," and "resolve conflict." Next, you would generate multiple potential next thoughts or actions that could be taken to move closer to a complete solution. For example, given an initial prompt about time travel, potential next thoughts might include "character discovers time travel ability," "character experiments with time travel," or "character faces consequences of time travel." You would then evaluate each potential next thought using heuristics or rules of thumb that capture desirable properties of a good solution. For example, desirable properties for Creative Writing might include coherence, engagement, and adherence to genre conventions. Finally, you would select the most promising next thought based on its heuristic evaluation and continue generating additional thoughts until a complete solution is reached.
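The steps above can be sketched as a small search loop. This is a simplified illustration under our own assumptions, not the paper's reference implementation: `propose` and `score` are stand-ins for what would normally be language model calls (one to suggest next thoughts, one to rate a partial story), and the keyword-counting heuristic is deliberately silly. All names here are hypothetical.

```python
from typing import Callable, List, Tuple

Path = Tuple[str, ...]  # a sequence of thoughts chosen so far


def tree_of_thoughts(
    propose: Callable[[Path], List[str]],
    score: Callable[[Path], float],
    depth: int,
    beam_width: int,
) -> Path:
    """Breadth-first search that keeps the `beam_width` best paths per step."""
    frontier: List[Path] = [()]
    for _ in range(depth):
        # Generate: extend every surviving path with each proposed thought.
        candidates = [p + (t,) for p in frontier for t in propose(p)]
        if not candidates:
            break
        # Evaluate and select: keep only the most promising paths.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:beam_width]
    return frontier[0]


# Stand-ins for language model calls (assumptions for this demo).
OPTIONS = [
    ["character discovers time travel ability", "character reads a history book"],
    ["character experiments with time travel", "character ignores the ability"],
    ["character faces consequences of time travel", "character goes home"],
]


def propose(path: Path) -> List[str]:
    return OPTIONS[len(path)] if len(path) < len(OPTIONS) else []


def score(path: Path) -> float:
    # Toy heuristic: reward thoughts that stay on the time travel theme.
    return sum("time travel" in thought for thought in path)


best = tree_of_thoughts(propose, score, depth=3, beam_width=2)
for step, thought in enumerate(best, start=1):
    print(step, thought)
```

Because two paths survive each round instead of one, a weaker-looking opening still gets a chance to prove itself one step later, which is the "deliberate" part of deliberate problem solving.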
5. ELI5 AI Term of the week: “Tree of Thoughts framework”
"Tree of Thoughts" (ToT) is a framework for language model inference that allows for exploration over coherent units of text ("thoughts") that serve as intermediate steps toward problem solving. ToT enables language models to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. The ToT framework was introduced in a research paper by Yao et al., released as a preprint in May 2023, and has been shown to significantly enhance language models' problem-solving abilities on novel tasks requiring non-trivial planning or search. The paper is titled "Tree of Thoughts: Deliberate Problem Solving with Large Language Models".
6. Prompts, served Hot and Fresh weekly
Tree of Thoughts example prompt:
Write a story about a character who discovers a mysterious object that has the power to grant wishes.

ToT breakdown:
1. Introduce the main character and establish their personality and motivations.
2. Establish the setting and introduce the mysterious object.
3. The character discovers the object and learns about its power to grant wishes.
4. The character makes their first wish, but it doesn't turn out as expected.
5. The character makes additional wishes, each with unintended consequences.
6. The character realizes that they need to be careful with their wishes and comes up with a plan to use them wisely.
7. The character faces a difficult decision when they must choose between using their final wish for personal gain or for the greater good.

Potential next thoughts:
- Character tests out the object by making a small wish
- Character makes a selfish wish that backfires
- Character tries to undo one of their previous wishes
- Character consults with someone else about how to use the object wisely
- Character faces unexpected consequences from one of their wishes

Heuristic evaluation criteria:
- Coherence: Does each thought flow logically from the previous one?
- Engagement: Will readers be interested in what happens next?
- Originality: Are there unexpected twists or turns in the story?
- Theme: Does the story convey a clear message or theme?
In Conclusion - What we’re Noodling with:
In a new addition, we’re going to close out with the top 3-5 new AI tools or learning resources we have been trying and loving over the past week. 1000+ new ones get released each week now (no exaggeration there) so here’s a little amuse-bouche to top off the newsletter this week. Enjoy and Happy Prompting Everybody!
PromptPerfect is a cutting-edge prompt optimizer for large language models (LLMs), large models (LMs), and LMOps. It streamlines prompt engineering, automatically optimizing prompts for ChatGPT, GPT-3.5, DALL-E 2, StableDiffusion, and MidJourney. Whether you're a prompt engineer, content creator, or AI developer, PromptPerfect makes prompt optimization easy and accessible. Unlock the full potential of LLMs and LMs with PromptPerfect, delivering top-quality results every time. Say goodbye to subpar AI-generated content and hello to prompt perfection!
FOR 15% OFF, Use this link https://bit.ly/3Chmc16 and the code 'httta' at the checkout!
Google Cloud launched free online courses to teach generative AI fundamentals and introduce Generative AI Studio. Generative AI has gained global prominence since the release of OpenAI's ChatGPT. The curriculum includes modules on generative AI, large language models, responsible AI, image generation, and more. It covers concepts like encoder-decoder architecture, attention mechanism, transformer models, BERT models, and image captioning.
Mymind.com, often referred to as "mymind", is a platform designed as an extension of your mind. It's a tool that helps you remember and organize the things you care about in one place, all while emphasizing privacy. The platform is designed with a beautiful, simple interface that respects how you naturally think and work. The aim is to help you spend less time managing your life and more time doing what makes you happy.
Mymind allows you to save a variety of things with just one click, including text highlights, images, articles, products, and other bookmarks. It takes care of organizing and categorizing these items for you, working much like your real mind, but without the forgetfulness.