WordArt Designer: User-Driven Artistic Typography Synthesis using Large Language Models

Jun-Yan He¹* Zhi-Qi Cheng²⁺ Chenyang Li¹ Jingdong Sun² Wangmeng Xiang¹ Xianhui Lin¹ Xiaoyang Kang¹ Zengke Jin⁴ Yusen Hu³ Bin Luo¹ Yifeng Geng¹ Xuansong Xie¹ Jingren Zhou¹

*Project Lead, Alibaba Team

⁺Co-Lead, CMU/ICL/RCA Team

¹DAMO Academy Alibaba Group ²Carnegie Mellon University

³Imperial College London ⁴Royal College of Art


This paper introduces WordArt Designer, a user-driven framework for artistic typography synthesis that relies on Large Language Models (LLMs). The system incorporates four key modules: the LLM Engine, SemTypo, StyTypo, and TexTypo. 1) The LLM Engine, powered by an LLM (e.g., GPT-3.5-turbo), interprets user inputs and generates actionable prompts for the other modules, thereby transforming abstract concepts into tangible designs. 2) The SemTypo module optimizes font designs using semantic concepts, striking a balance between artistic transformation and readability. 3) Building on the semantic layout provided by the SemTypo module, the StyTypo module creates smooth, refined images. 4) The TexTypo module further enhances the design's aesthetics through texture rendering, enabling the generation of inventive textured fonts. Notably, WordArt Designer highlights the fusion of generative AI with artistic typography.

Existing solutions mainly generate semantically coherent and visually pleasing typography within predefined concepts. These solutions often lack adaptability, creativity, and computational efficiency. To overcome these limitations, we introduce WordArt Designer, a system composed of three primary modules: the LLM Engine, SemTypo Module, and StyTypo Module, supplemented by the TexTypo Module for texture rendering.

Framework Overview

The framework comprises the LLM Engine, the SemTypo module for Semantic Typography, the StyTypo module for Stylization Typography, and the TexTypo module for Texture Typography. These modules operate coherently, guided by a preset control flow, to facilitate a seamless and innovative transformation of text into artistic typography.
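The control flow described above can be pictured as a short chain of stages. The following is only a minimal sketch, not the project's actual code: every function, class, and field name here (`llm_engine`, `sem_typo`, `sty_typo`, `tex_typo`, `Design`, `run_pipeline`) is hypothetical, and each stage is a stub that merely records that it ran. The stage order (SemTypo, then StyTypo, then TexTypo) follows the description on this page.

```python
from dataclasses import dataclass, field


@dataclass
class Design:
    """Hypothetical container passed from one module to the next."""
    text: str
    concept: str
    history: list = field(default_factory=list)


def llm_engine(user_request: str) -> dict:
    # Stub: a real LLM Engine would query a language model here.
    # We only fabricate one placeholder prompt per downstream module.
    return {
        "semantic": f"shape prompt for: {user_request}",
        "style": f"style prompt for: {user_request}",
        "texture": f"texture prompt for: {user_request}",
    }


def sem_typo(design: Design, prompt: str) -> Design:
    design.history.append(("SemTypo", prompt))  # stub: no geometry is changed
    return design


def sty_typo(design: Design, prompt: str) -> Design:
    design.history.append(("StyTypo", prompt))  # stub: no image is generated
    return design


def tex_typo(design: Design, prompt: str) -> Design:
    design.history.append(("TexTypo", prompt))  # stub: no texture is rendered
    return design


def run_pipeline(text: str, user_request: str) -> Design:
    """Preset control flow: LLM Engine first, then the three typography stages."""
    prompts = llm_engine(user_request)
    design = Design(text=text, concept=user_request)
    stages = ((sem_typo, "semantic"), (sty_typo, "style"), (tex_typo, "texture"))
    for stage, key in stages:
        design = stage(design, prompts[key])
    return design


result = run_pipeline("A", "a cat")
stage_names = [name for name, _ in result.history]
```

Keeping each module behind the same `(design, prompt)` signature is a design choice of this sketch; it makes the preset order explicit in one place and lets a stage be swapped without touching the others.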

The Semantic Typography (SemTypo) module alters typographies based on a given semantic concept. It unfolds in three stages: (1) Character Extraction and Parameterization, (2) Region Selection for Transformation, and (3) Semantic Transformation and Differentiable Rasterization.
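To make the three stages concrete, here is a toy sketch under strong simplifying assumptions. It is not the paper's method: the character is a hand-written polygon rather than an outline extracted from a font, the region is chosen by a simple height threshold, and the "semantic" objective is replaced by a squared distance to hand-picked target points. The sketch applies gradient descent directly to the control points and does not implement a differentiable rasterizer. All names (`parameterize`, `select_region`, `transform`) are hypothetical.

```python
def parameterize(outline):
    """Stage 1 stand-in: copy outline vertices into mutable control points."""
    return [list(p) for p in outline]


def select_region(points, y_min):
    """Stage 2 stand-in: indices of control points with y >= y_min."""
    return [i for i, (_, y) in enumerate(points) if y >= y_min]


def transform(points, region, target, lr=0.2, steps=50):
    """Stage 3 stand-in: gradient descent on the squared distance to `target`,
    updating only the control points listed in `region`."""
    pts = [p[:] for p in points]
    for _ in range(steps):
        for i in region:
            for d in (0, 1):
                grad = 2.0 * (pts[i][d] - target[i][d])  # d/dp of (p - t)^2
                pts[i][d] -= lr * grad
    return pts


# A unit square stands in for a glyph outline (made-up data).
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
# Hand-picked target: pull the two top corners together into a peak.
target = [(0.0, 0.0), (1.0, 0.0), (0.5, 1.5), (0.5, 1.5)]

points = parameterize(square)
region = select_region(points, y_min=0.5)
result = transform(points, region, target)
```

Only the selected top corners move; the bottom edge is left untouched. This loosely mirrors the balance between artistic transformation and readability mentioned above, since restricting the deformation to a region leaves the rest of the character intact.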

Qualitative & Quantitative Results

We compare our method against six state-of-the-art baselines, both qualitatively and quantitatively. Below are some examples of images generated by our method. For the full analysis, please refer to our paper.

Ablation study of the ranking model on the validation set. 'p', 'r', and 's' stand for precision, recall, and success rate, respectively. 'X' in 'TopX' indicates the number of stylized images retained. In the ranking-based method, the 'TopX' images are selected based on ranking scores, while in the random-based method, the 'TopX' images are selected randomly. Results of the random-based method are obtained by averaging over 10,000 iterations. Increased values are indicated in blue.
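The comparison in this caption can be sketched in a few lines. The metric definitions below are assumptions, since this page does not spell them out: precision is taken as the fraction of retained images that are acceptable, recall as the fraction of acceptable images that are retained, and success as whether at least one retained image is acceptable. The scores and labels are made-up illustration data, and all function names are hypothetical.

```python
import random


def top_x_metrics(labels, chosen):
    """Precision, recall, and success for one set of retained images.

    labels: 0/1 flags, 1 meaning an acceptable stylized image (assumed).
    chosen: indices of the retained images.
    """
    hits = sum(labels[i] for i in chosen)
    total_good = sum(labels)
    precision = hits / len(chosen)
    recall = hits / total_good if total_good else 0.0
    success = 1.0 if hits > 0 else 0.0
    return precision, recall, success


def rank_top_x(scores, x):
    """Ranking-based selection: keep the x highest-scoring images."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:x]


def average_random_baseline(labels, x, iterations=10_000, seed=0):
    """Random-based selection, averaged over many draws."""
    rng = random.Random(seed)
    sums = [0.0, 0.0, 0.0]
    for _ in range(iterations):
        chosen = rng.sample(range(len(labels)), x)
        metrics = top_x_metrics(labels, chosen)
        sums = [s + m for s, m in zip(sums, metrics)]
    return tuple(s / iterations for s in sums)


# Made-up ranking scores and quality labels for six stylized images.
scores = [0.9, 0.2, 0.75, 0.4, 0.1, 0.6]
labels = [1, 0, 1, 0, 0, 0]

ranked = top_x_metrics(labels, rank_top_x(scores, 2))
baseline = average_random_baseline(labels, 2)
```

In this toy data the ranking scores happen to place both acceptable images first, so Top2 precision, recall, and success are all 1.0, while the random baseline averages about 0.33 precision and 0.6 success.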

