Nuts N Bolts

Explanations and How-To Prompts


DALL-E

Go to Prompt Page: DALL-E

Images produced by DALL-E when given the text prompt "a professional high-quality illustration of a giraffe dragon chimera. a giraffe imitating a dragon. a giraffe made of dragon."

Revealed in 2021, DALL-E is a Transformer model that creates images from textual descriptions.

Also revealed in 2021, CLIP does the opposite: it creates a description for a given image. DALL-E uses a 12-billion-parameter version of GPT-3 to interpret natural language inputs (such as "a green leather purse shaped like a pentagon" or "an isometric view of a sad capybara") and generate corresponding images. It can create images of realistic objects ("a stained-glass window with an image of a blue strawberry") as well as objects that do not exist in reality ("a cube with the texture of a porcupine"). As of March 2021, no API or code was available.

In March 2021, OpenAI released a paper titled Multimodal Neurons in Artificial Neural Networks, presenting a detailed analysis of CLIP (and GPT) models and their vulnerabilities. A new type of attack on such models was described in this work.

DALL·E 2 has been trained on millions of stock images, making its output more sophisticated and well suited to enterprise use. DALL·E 2 produces a much better picture than Midjourney or Stable Diffusion when there are more than two characters. It is useful for everyday users; Microsoft's Bing image generator uses DALL-E.

Stable Diffusion 

Go to Prompt Page: Stable Diffusion

Stable Diffusion is a deep learning, text-to-image model released in 2022. It is primarily used to generate detailed images conditioned on text descriptions, though it can also be applied to other tasks such as inpainting, outpainting, and generating image-to-image translations guided by a text prompt. It was developed by the start-up Stability AI in collaboration with a number of academic researchers and non-profit organizations.

Stable Diffusion is a latent diffusion model, a kind of deep generative neural network. Its code and model weights have been released publicly, and it can run on most consumer hardware equipped with a modest GPU with at least 8 GB VRAM. This marked a departure from previous proprietary text-to-image models such as DALL-E and Midjourney which were accessible only via cloud services.

Stable Diffusion is an open-source model accessible to everyone. It also has a relatively good understanding of contemporary artistic illustration and can produce highly detailed artwork. However, it requires careful interpretation of complex prompts. Stable Diffusion is excellent for intricate, creative illustrations but falls short when creating general images such as logos. Stable Diffusion is also the most advanced option in one respect: you can train custom models that the AI will pick up to produce excellent, very detailed results.

Compared to Midjourney, Stable Diffusion needs specific prompts for the best results. Therefore, you need to tell it exactly what you want. For example, if you want it to generate pictures of cute grey cats, you must mention the color of the cats; otherwise, it will simply create generic cute cat images. Useful for gamers and programmers.
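To illustrate this kind of prompt specificity, here is a small sketch of a helper function. It is purely hypothetical (not part of Stable Diffusion or any of its libraries) and just shows how explicit attributes such as color turn a vague prompt into a specific one:

```python
# Hypothetical helper, not part of any Stable Diffusion API: assembles a
# specific text prompt from a subject, explicit attributes, and a style cue.
def build_prompt(subject, attributes=None, style=None):
    if attributes:
        # Prepend descriptive attributes so details like color are explicit
        subject = ", ".join(attributes) + " " + subject
    parts = [subject]
    if style:
        parts.append(style)
    return ", ".join(parts)

# A vague prompt leaves the color to chance; a specific one pins it down.
print(build_prompt("cute cat"))
# cute cat
print(build_prompt("cute cat", attributes=["grey"],
                   style="highly detailed illustration"))
# grey cute cat, highly detailed illustration
```

The point is not the helper itself but the habit it encodes: every detail you care about (color, style, composition) should appear explicitly in the prompt text you hand to Stable Diffusion.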


Midjourney

Go to Prompt Page: Midjourney

Midjourney is an artificial intelligence program created by Midjourney, Inc., a San Francisco-based independent research lab. Midjourney generates images from natural language descriptions, called "prompts", similar to OpenAI's DALL-E and Stable Diffusion. It is speculated that the underlying technology is based on Stable Diffusion. The tool is currently in open beta, which it entered on July 12, 2022. The Midjourney team is led by David Holz, who co-founded Leap Motion. Holz told The Register in August 2022 that the company was already profitable. Users create artwork with Midjourney using Discord bot commands.

Midjourney, on the other hand, is a tool best known for its artistic style. Midjourney uses its Discord bot to send requests to and receive results from its AI servers, and almost everything happens on Discord. The resulting image rarely looks like a photograph; it looks more like a painting. More useful for artists!
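Since everything happens through Discord bot commands, the basic workflow is typing a `/imagine` command into a Discord channel. The `--ar` (aspect ratio) and `--v` (model version) flags are real Midjourney parameters; the little formatter below is hypothetical and only illustrates how such a command string is put together:

```python
# Hypothetical formatter for a Midjourney Discord command. The /imagine
# command and the --ar / --v flags are real; this helper is illustrative only.
def imagine_command(prompt, aspect_ratio=None, version=None):
    cmd = f"/imagine prompt: {prompt}"
    if aspect_ratio:
        cmd += f" --ar {aspect_ratio}"   # aspect ratio, e.g. 16:9
    if version:
        cmd += f" --v {version}"         # model version to use
    return cmd

print(imagine_command("an oil painting of a lighthouse at dusk",
                      aspect_ratio="16:9", version=5))
# /imagine prompt: an oil painting of a lighthouse at dusk --ar 16:9 --v 5
```

You paste the resulting command into a channel where the Midjourney bot is present; the bot replies in the channel with a grid of generated images.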


Go to Prompt Page: ChatGPT

The GPT model

The original paper on generative pre-training (GPT) of a language model was written by Alec Radford and his colleagues, and published in preprint on OpenAI's website on June 11, 2018. It showed how a generative model of language is able to acquire world knowledge and process long-range dependencies by pre-training on a diverse corpus with long stretches of contiguous text.


Current Version

On March 14, 2023, OpenAI announced the release of Generative Pre-trained Transformer 4 (GPT-4), capable of accepting text or image inputs. OpenAI announced the updated technology passed a simulated law school bar exam with a score around the top 10% of test takers; by contrast, the prior version, GPT-3.5, scored around the bottom 10%. GPT-4 can also read, analyze or generate up to 25,000 words of text, and write code in all major programming languages.

Example outputs generated from the same prompt illustrate the differences between the model versions.