Text-to-image synthesis – UC Berkeley researchers demonstrate vector graphics with text-conditioned diffusion models
Diffusion models have achieved outstanding results in text-to-image synthesis. Trained on enormous datasets of captioned images, they learn to produce raster images of highly diverse objects and scenes. However, for digital icons, graphics, and stickers, designers typically work with vector representations of images such as Scalable Vector Graphics (SVG). Vector graphics are compact and can be scaled to any size.
UC Berkeley researchers demonstrate how to produce vector graphics that can be exported as SVG using a text-conditioned diffusion model trained only on pixel representations of images. The method accomplishes this without extensive collections of captioned SVGs. Instead, motivated by recent work on text-to-3D synthesis, the researchers vectorize a sample from a text-to-image diffusion model and fine-tune it with a Score Distillation Sampling loss.
Example generated vectors
A gallery of freshly generated SVGs is available on the project page.
Vector graphics are small yet maintain their sharpness when scaled to any size. The Berkeley researchers optimize vector graphics directly with an image-text loss based on Score Distillation Sampling. VectorFusion uses the DiffVG differentiable SVG renderer, which makes this inverse-graphics optimization possible.
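The loop described above can be sketched schematically. This is a toy stand-in, not the actual implementation: `render` substitutes for the DiffVG differentiable renderer, and `sds_grad` substitutes for the Score Distillation Sampling signal, in which the real method noises the rendering and uses a frozen text-conditioned diffusion model's denoising error as the gradient. All function names and constants here are illustrative assumptions.

```python
def render(params):
    # Toy stand-in for a differentiable renderer: maps SVG path
    # parameters to pixel values (here, a simple linear map).
    return [p * 0.5 for p in params]

def sds_grad(pixels, target):
    # Toy stand-in for the SDS gradient: in VectorFusion, a frozen
    # diffusion model scores the noised rendering against the caption;
    # here we just compare pixels to a fixed target.
    return [px - t for px, t in zip(pixels, target)]

def optimize(params, target, steps=200, lr=0.1):
    # Gradient descent on path parameters *through* the renderer:
    # the key idea enabled by differentiable rendering.
    for _ in range(steps):
        pixels = render(params)
        g = sds_grad(pixels, target)
        # Chain rule through the toy renderer (d pixels / d params = 0.5).
        params = [p - lr * gi * 0.5 for p, gi in zip(params, g)]
    return params
```

The structure mirrors the method: gradients computed on the rasterized output flow back into the vector path parameters, so no SVG training data is ever needed.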
Additionally, VectorFusion supports a multi-stage configuration that is faster and of higher quality. This pipeline first draws raster samples from the text-to-image diffusion model Stable Diffusion, which VectorFusion then automatically traces into vector form using LIVE. These traced samples, however, frequently lack detail, are bland, or translate poorly to vector graphics, so Score Distillation Sampling is then applied to enhance their vibrancy and consistency with the text.
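The three stages above can be summarized as a simple pipeline. This is a structural sketch only: each stage is a placeholder for the real component named in the text (Stable Diffusion sampling, LIVE tracing, SDS fine-tuning), and every function body here is an assumption for illustration.

```python
def sample_raster(prompt):
    # Stage 1: draw a raster sample from a text-to-image diffusion
    # model (Stable Diffusion in the paper). Placeholder output.
    return {"prompt": prompt, "pixels": "raster sample"}

def trace_with_live(raster):
    # Stage 2: automatically trace the raster into an initial SVG
    # path set (the paper uses LIVE). Placeholder output.
    return {"paths": ["initial path set"], "prompt": raster["prompt"]}

def finetune_with_sds(svg, steps=500):
    # Stage 3: refine the traced paths with the Score Distillation
    # Sampling loss to restore detail and text consistency.
    return dict(svg, refined=True, steps=steps)

def vectorfusion(prompt):
    raster = sample_raster(prompt)
    svg = trace_with_live(raster)
    return finetune_with_sds(svg)
```

The point of the staging is initialization: tracing a diffusion sample gives SDS a strong starting point, which is why this configuration is both faster and higher quality than optimizing from scratch.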
VectorFusion can produce pixel art in the style of old video games by limiting SVG paths to squares on a grid.
The approach also extends easily to text-to-sketch generation. To learn an abstract line drawing that faithfully represents the user-supplied text, the researchers initialize 16 randomly placed strokes and then optimize them with the latent Score Distillation Sampling loss.
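The initialization step can be sketched as follows. The stroke representation is a toy assumption (the method draws Bézier strokes; here each stroke is a two-point segment in the unit square), but the structure matches the text: start from 16 random strokes, then hand them to the SDS optimizer.

```python
import random

def init_strokes(n=16, seed=0):
    # Initialize n randomly placed strokes (the article specifies 16).
    # Each stroke here is a toy 2-point segment in [0, 1]^2; the real
    # method uses Bezier curves rendered with DiffVG.
    rng = random.Random(seed)
    return [((rng.random(), rng.random()), (rng.random(), rng.random()))
            for _ in range(n)]
```

From this random initialization, the same latent SDS loss used for full vector graphics is then minimized over the stroke control points to produce the abstract sketch.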