
The Technology Behind Storique

Valentina
Published on Oct 26, 2024
At Storique, we’re passionate about bringing your personalized stories to life with stunning illustrations. Central to this magic is the cutting-edge technology behind our platform—specifically, a family of diffusion models.

In this blog post, we’ll dive into the fascinating world of diffusion models, particularly how simpler models like Stable Diffusion work, and explain how they power the unique storytelling experience we offer.

What Are Diffusion Models?

Diffusion models are a class of generative models that have gained significant attention for their ability to produce high-quality, aesthetic images. Unlike traditional generative methods, diffusion models gradually transform a simple distribution (like random noise) into a complex one that represents the data we want to generate—like the illustrations in your storybook.

How Diffusion Works

The process involves two main phases: forward diffusion and reverse diffusion.

Forward Diffusion: In this phase, Gaussian noise is added to the data (such as images) step by step, progressively obscuring the original content. This forward process is fixed rather than learned: it turns clear images into noisy versions at known noise levels, producing the training examples the model will later learn to undo.

Reverse Diffusion: The model then learns to reverse this process: it takes noisy data and refines it back into a coherent image. This happens iteratively, with each step removing a little noise and bringing the image closer to a recognizable form. Training uses a denoising objective (closely related to denoising score matching), in which the model learns to predict the noise that was added so it can be subtracted back out.
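To make the forward phase concrete, here is a minimal NumPy sketch of the noising step. This is an illustration of the standard closed form, not Storique's actual pipeline, and the schedule values are just a commonly used example:

```python
import numpy as np

def forward_diffuse(x0, t, betas, rng):
    """Jump straight to noise level t using the closed form
    q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I)."""
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]      # cumulative signal kept up to step t
    noise = rng.standard_normal(x0.shape)  # the Gaussian noise being added
    xt = np.sqrt(alpha_bar) * x0 + np.sqrt(1.0 - alpha_bar) * noise
    return xt, noise

rng = np.random.default_rng(0)
x0 = rng.standard_normal((8, 8))          # stand-in for an image
betas = np.linspace(1e-4, 0.02, 1000)     # an illustrative linear schedule
xt, noise = forward_diffuse(x0, 999, betas, rng)
# At the final step alpha_bar is tiny, so x_t is almost pure noise.
```

During training, the model sees pairs like (xt, t) and learns to predict `noise`; at generation time it runs the same schedule in reverse.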

The Role of Stable Diffusion

Stable Diffusion (SD) is a state-of-the-art diffusion model that has revolutionized image generation and is an important subprocess in the illustration pipeline of Storique. Here’s a deeper look at how it works:

Latent Space: Unlike some diffusion models that operate directly in pixel space, SD works in a lower-dimensional latent space. This means it encodes images into a more compact representation, making the generation process faster and more efficient. The model operates on these latent representations, allowing for high-quality output without the computational load of pixel-space generation.
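The efficiency gain from working in latent space is easy to quantify. The arithmetic below uses the layout typical of SD v1-class models (8x spatial downsampling, 4 latent channels) as a representative example:

```python
# Pixel-space size of a 512x512 RGB image vs. an SD v1-style latent
# (8x spatial downsampling, 4 latent channels).
pixel_elems = 512 * 512 * 3                   # 786,432 values per image
latent_elems = (512 // 8) * (512 // 8) * 4    # 64 x 64 x 4 = 16,384 values
ratio = pixel_elems // latent_elems
print(latent_elems, ratio)                    # 16384 48
```

Every denoising step operates on roughly 48x fewer values than it would in pixel space, which is why latent diffusion is so much cheaper to run.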

Conditional Generation: SD is designed for conditional image generation, meaning it can generate images based on specific input prompts. When you describe a scene or character in your story, the model uses this context to guide the diffusion process, ensuring that the generated images align with your narrative.
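A common mechanism for steering conditional generation in Stable Diffusion is classifier-free guidance, where the model's prompt-conditioned noise prediction is extrapolated away from its unconditional one. A minimal sketch of that combination step (with toy arrays standing in for real noise predictions):

```python
import numpy as np

def cfg_noise(eps_uncond, eps_cond, guidance_scale):
    """Classifier-free guidance: push the denoising direction toward the
    prompt-conditioned prediction by extrapolating past the unconditional one."""
    return eps_uncond + guidance_scale * (eps_cond - eps_uncond)

eps_u = np.zeros(4)                     # toy unconditional prediction
eps_c = np.ones(4)                      # toy prompt-conditioned prediction
print(cfg_noise(eps_u, eps_c, 7.5))     # [7.5 7.5 7.5 7.5]
```

A scale of 1 reproduces the conditioned prediction exactly; larger scales make the output follow the prompt more strongly, at some cost to diversity.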

Customization: SD allows us to create highly aesthetic pictures. Our custom-trained model learns special features from the photos you provide, capturing the unique attributes of your characters. However, personalizing these models is no small feat; it requires a deep understanding of the mathematical operations involved in adapting the diffusion process.

Challenges in Personalization

Creating a highly specific personalized model is a complex task. To reach a reliable result, we combine the technique described above with several other algorithms that dictate how the model learns from your input images and how the resulting image is composed. Fine-tuning such models also demands significant computational resources, which makes it relatively expensive.

To address this, we employ various techniques from computer science, such as quantization. This process reduces the precision of certain calculations, making operations less expensive while still maintaining output quality. By optimizing our models in this way, we can offer Storique at an affordable price, ensuring that more people can access the joy of personalized storytelling.
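As a simple illustration of the idea behind quantization (not Storique's actual scheme), here is symmetric int8 quantization of a weight array: each float32 value is mapped to an 8-bit integer plus one shared scale, cutting storage to a quarter with only a small rounding error:

```python
import numpy as np

def quantize_int8(w):
    """Symmetric per-tensor quantization: float32 weights -> int8 values
    plus a single float scale factor."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print(q.nbytes, w.nbytes)              # 1024 4096 -- 4x smaller
print(float(np.abs(w - w_hat).max()))  # small rounding error, below one scale step
```

The same trade-off applies at model scale: lower-precision weights and activations mean less memory traffic and cheaper inference, which is what keeps per-book generation costs down.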

The AI Training Process

To ensure that our diffusion models accurately reflect your characters, we undertake a careful training process:

Photo Collection: You upload eight photos of each character you want to feature in your book. A diverse set of images helps the model learn various angles, expressions, and styles. Your role is important in this step: the higher the quality and variety of your pictures, the better the model we can train for you.

Model Training: Our automated process trains the multimodal AI pipeline on your images, ensuring that it captures the essence of your characters. The resulting model is uniquely yours and accessible to no one else, ensuring privacy and exclusivity.

Image Generation: When you write your story and request illustrations, your unique AI model generates images based on the characters and scenes you describe, bringing your story to life in a way that feels authentic and engaging.

At Storique, we believe in the power of storytelling and the magic of personalized experiences. By harnessing the capabilities of diffusion models like Stable Diffusion, we enable you to create unique storybooks that celebrate the people in your life. Our custom-trained models not only produce high-quality illustrations but also reflect the unique traits of your characters.

Whether you’re crafting a tale for your children, friends, or loved ones, our technology ensures that every illustration is a beautiful representation of your narrative. We’re excited to continue pushing the boundaries of what’s possible with personalized AI. If you’re ready to embark on your creative journey, try Storique today and start crafting your personalized storybook!

Resources:
Podell et al. 2023 - SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis