How to Write AI Image Prompts: A Practical Guide

Most weak AI images come from weak prompts, not weak models. The fix is structural: instead of typing whatever comes to mind, fill in five slots every time.

The formula

Subject + Environment + Lighting + Style + Composition

Subject — what or who the image is about, with concrete details
Environment — where the subject is, and what surrounds it
Lighting — quality, direction, and color of light
Style — medium, aesthetic, or era (photo, watercolor, 3D render…)
Composition — framing, angle, lens, aspect ratio
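
One way to keep the slots explicit is to assemble the prompt from named fields. The sketch below is a hypothetical helper, not part of any generator's API: the class name PromptSlots and its render method are assumptions made for illustration. Empty slots are skipped, so a partial prompt still renders cleanly.

```python
from dataclasses import dataclass


@dataclass
class PromptSlots:
    """Hypothetical container for the five slots of the formula."""
    subject: str = ""
    environment: str = ""
    lighting: str = ""
    style: str = ""
    composition: str = ""

    def render(self) -> str:
        # Join the filled slots in formula order, skipping empty ones.
        parts = [self.subject, self.environment, self.lighting,
                 self.style, self.composition]
        return ", ".join(p.strip() for p in parts if p.strip())


slots = PromptSlots(
    subject="a cup of coffee",
    environment="on a weathered wooden windowsill, rain streaking the glass behind it",
    lighting="soft window light from the left",
)
print(slots.render())
```

Keeping each slot in its own field makes it obvious at a glance which ones are still empty.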

You don't need all five in every prompt, but when a result disappoints, the missing slot is usually the reason. Let's walk through each one with before/after pairs you can copy into the generator.

Subject: be specific, not poetic

The model can't read your mind. "A dog" gives you the statistical average of every dog photo ever taken.

Product photography:

Before: a perfume bottle
After:  a frosted glass perfume bottle with a brushed-gold cap,
        droplets of water on the glass, label-free

Portrait:

Before: a beautiful woman
After:  a woman in her 60s with silver hair pulled back, deep
        laugh lines, wearing a chunky knit turtleneck

The "after" versions tell the model which dog, which bottle, which face — age, material, texture, state. Adjectives like beautiful do almost nothing; nouns and physical details do everything.

Environment: ground the subject somewhere

Without an environment, models default to plain studio backdrops or generic blur.

Before: a cup of coffee
After:  a cup of coffee on a weathered wooden windowsill,
        rain streaking the glass behind it, a blurred city street outside

Before: a mountain biker
After:  a mountain biker mid-jump on a dusty alpine trail,
        pine forest falling away below, storm clouds on the horizon

Notice that environment also sets mood. Rain, dust, and storm clouds are doing emotional work the word "moody" never could.

Lighting: the fastest quality upgrade

Lighting language is the single highest-leverage addition to any prompt. Name the source, direction, and temperature.

Before: a bowl of ramen, looks delicious
After:  a bowl of ramen, soft window light from the left,
        steam backlit and glowing, warm tones

Before: a portrait of a man, dramatic
After:  a portrait of a man, single hard rim light from behind,
        face half in shadow, cool blue fill

Useful vocabulary: golden hour, overcast softbox light, neon glow, candlelight, hard noon sun, backlit, rim light, volumetric light.

Style: name the medium and the era

If you don't specify a style, the model picks one for you — usually a glossy default.

Illustration:

Before: a fox in a forest, cartoon style
After:  a fox in a forest, flat vector illustration, limited
        palette of rust orange and deep teal, mid-century
        children's book style, visible paper texture

Logo:

Before: a logo for a coffee shop
After:  a minimal line-art logo of a coffee cup merged with a
        sunrise, single-weight strokes, monochrome, centered on
        a plain white background, flat vector style

For logos, always add flat, vector, and plain background; otherwise you tend to get a 3D-rendered badge instead of a usable mark.

Composition: direct the camera

Framing words are cheap to add and dramatically change results.

Before: a chess piece
After:  a black knight chess piece, extreme close-up, macro lens,
        shallow depth of field, off-center on the right third

Other handles: wide shot, low angle, bird's-eye view, 85mm portrait lens, symmetrical, negative space on the left. Pick the aspect ratio in the generator settings rather than describing it in text.

Five common mistakes

Cramming two images into one prompt. "A castle at sunrise and also a dragon battle at night" forces the model to average them. One prompt, one scene.
Stacking empty quality words. "Masterpiece, best quality, ultra HD, 8k" adds little on modern models. Replace them with concrete lighting and material details.
Contradicting yourself. "Minimalist composition" followed by a list of twelve props. Decide which one you mean.
Ignoring composition entirely. You'll get centered, eye-level shots every time — then wonder why everything looks the same.
Rewriting from scratch after a near-miss. If the result is 80% right, change one variable. A full rewrite throws away what was already working.
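
The second mistake is easy to check for mechanically. The following is a minimal sketch, assuming the filler terms are just the four quoted above; the function name find_filler and the word list are illustrative, not a standard tool.

```python
import re

# Filler terms taken from the example in this guide; extend as needed.
EMPTY_QUALITY_WORDS = ["masterpiece", "best quality", "ultra hd", "8k"]


def find_filler(prompt: str) -> list[str]:
    """Return the filler terms that appear in the prompt (case-insensitive)."""
    lowered = prompt.lower()
    found = []
    for term in EMPTY_QUALITY_WORDS:
        # Match whole words or phrases only, so "8k" does not match inside "128kb".
        if re.search(r"(?<!\w)" + re.escape(term) + r"(?!\w)", lowered):
            found.append(term)
    return found


print(find_filler("a bowl of ramen, masterpiece, best quality, 8k"))
```

Anything the function returns is a candidate for replacement with a concrete lighting or material detail.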

Different models, different prompting styles

Prompting is not one-size-fits-all, and the differences are worth knowing:

Model	What it responds well to
GPT Image 2	Long natural-language descriptions, multi-sentence scene direction, and prompts that include text to render (signs, labels, posters)
Nano Banana series	Editing-style instructions on an existing image — "make the sky overcast", "swap the jacket to denim". Qualitative descriptions are enough; no need for exhaustive detail

On Bno AI you can write one prompt and route it to either — browse the prompt library to see how the same idea reads across models.

Negative prompts and iteration

A negative prompt lists what you don't want: blurry, extra fingers, watermark, text artifacts. Keep it short — a long negative list constrains the model more than it helps. Negatives can't add anything; they can only remove.

A practical iteration loop:

Run the formula prompt.
Identify the single biggest problem in the output.
Change only the slot responsible (wrong mood → lighting; wrong look → style).
Re-run, compare, repeat.
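
The loop above can be sketched in code as a one-slot edit. This is an illustration under stated assumptions: the slot dictionary, the PROBLEM_TO_SLOT mapping, and the revise function are hypothetical names, and the mapping holds only the two pairings mentioned in the steps.

```python
# Mapping from a diagnosed problem to the slot to edit. The two entries come
# from the loop above; the key names are illustrative.
PROBLEM_TO_SLOT = {
    "wrong mood": "lighting",
    "wrong look": "style",
}


def revise(slots: dict[str, str], problem: str, new_value: str) -> dict[str, str]:
    """Return a copy of slots with only the responsible slot replaced."""
    slot = PROBLEM_TO_SLOT[problem]
    revised = dict(slots)  # copy, so the previous attempt stays available for comparison
    revised[slot] = new_value
    return revised


first = {
    "subject": "a portrait of a man",
    "lighting": "soft window light from the left",
    "style": "photo",
}
second = revise(first, "wrong mood", "single hard rim light from behind, cool blue fill")

# Only the lighting slot differs between the two attempts.
changed = [name for name in first if first[name] != second[name]]
print(changed)
```

Because the first attempt is left untouched, the two versions can be compared side by side before the next change.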

On Bno AI's free tier you get 10 credits per day, and a GPT Image 2 image at 1K costs 2 credits. That works out to five free generations daily, enough for several one-change iterations. For heavier iteration, Pro plans remove the daily ceiling. Once an image works as a starting frame, you can also push it into motion with the AI video generator.
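
The credit arithmetic is simple integer division. A small sketch, using only the two figures quoted above; the function name free_generations is an assumption for illustration.

```python
def free_generations(daily_credits: int, credits_per_image: int) -> int:
    """Whole images that fit in one day's credits."""
    return daily_credits // credits_per_image


# Figures quoted above: 10 free credits per day, 2 credits per 1K image.
print(free_generations(10, 2))
```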

FAQ

How long should a prompt be? Long enough to fill the five slots, short enough that every word earns its place — usually 20 to 60 words. Past that, models start ignoring details, and you lose track of which words mattered.

Do comma-separated keyword lists still work? Yes, but full sentences carry relationships that keywords can't ("a cat watching a goldfish" vs. "cat, goldfish"). Modern models, GPT Image 2 in particular, reward natural phrasing.

Why does the same prompt give different results each time? Generation is sampled, not deterministic — the same prompt explores different points in the model's possibility space. Treat this as a feature: run the prompt a couple of times before deciding it needs editing.
