Art Director

v1.0.3

Generate on-aesthetic images for any brand telling stories with images. Combines a persistent brand aesthetic with per-image editorial thinking to produce vi...

⭐ 0· 80·0 current·0 all-time

by@machinesofdesire

OpenClaw Prompt Flow

Install with OpenClaw

Best for remote or guided setup. Copy the exact prompt, then paste it into OpenClaw for machinesofdesire/art-director.

Previewing Install & Setup.

Prompt PreviewInstall & Setup

Install the skill "Art Director" (machinesofdesire/art-director) from ClawHub.
Skill page: https://clawhub.ai/machinesofdesire/art-director
Keep the work scoped to this skill only.
After install, inspect the skill metadata and help me finish setup.
Required env vars: GEMINI_API_KEY
Required binaries: uv
Use only the metadata you can verify from ClawHub; do not invent missing requirements.
Ask before making any broader environment changes.

Command Line

CLI Commands

Use the direct CLI path if you want to install manually and keep every step visible.

OpenClaw CLI

Bare skill slug

openclaw skills install art-director

ClawHub CLI

Package manager switcher

npx clawhub@latest install art-director

Security Scan

Capability signals

CryptoRequires sensitive credentials

These labels describe what authority the skill may exercise. They are separate from suspicious or malicious moderation verdicts.

VirusTotal

Pending

View report →

OpenClaw

Benign

high confidence

✓

Purpose & Capability

The skill claims to produce on-aesthetic images and only requires a Gemini API key (GEMINI_API_KEY) and the 'uv' runner; those are used to invoke the nano-banana-pro generation script. The declared dependencies align with the described function (it instructs the operator to install nano-banana-pro).

✓

Instruction Scope

SKILL.md and art_director.py limit behavior to composing prompts from aesthetic.md + per-call briefs, reading briefs files, and invoking the external nano-banana-pro generator. There are no instructions to read unrelated secrets, exfiltrate data, or post prompts/results to unexpected endpoints.

✓

Install Mechanism

No network download/install spec is provided; the skill is instruction-first with local Python code. It relies on an already-installed nano-banana-pro skill/script and the 'uv' runner rather than fetching arbitrary code from remote URLs, which keeps install risk low.

✓

Credentials

The only required environment variable is GEMINI_API_KEY (declared as primary), which is proportional for a Gemini-based image generator. Optional vars (OUTPUT_DIR, AESTHETIC_PATH, NANO_BANANA_SCRIPT) are reasonable. The skill does not request unrelated credentials or config paths.

✓

Persistence & Privilege

The skill does not request always:true and is user-invocable only. It does not attempt to modify other skills or system-wide configs. Its filesystem activity is limited to locating nano-banana-pro and reading/writing aesthetic and output files.

Assessment

This package appears to do what it says: compose a brand aesthetic + brief and hand that prompt to an external Gemini wrapper (nano-banana-pro). Before installing, verify the nano-banana-pro skill and the 'uv' runner come from sources you trust, and avoid placing sensitive data in per-image briefs or in aesthetic.md (prompts and briefs will be sent to the image-generation service). Use a dedicated Gemini API key with minimal scope and audit the nano-banana-pro implementation if you plan to run this in production.

Like a lobster shell, security has layers — review code before you run it.

Runtime requirements

Binsuv

EnvGEMINI_API_KEY

Primary envGEMINI_API_KEY

latestvk9724p9s3zz9dhejchvpv5x2gh84z2jx

80downloads

0stars

4versions

Updated 1w ago

v1.0.3

MIT-0

Art Director Skill

Generate on-aesthetic images for any brand telling stories with images — publications, newsletters, essays, reports, brand blogs, longform product marketing, anything where the image has to feel like it belongs to you and has to do work.

The difference between this skill and a text-to-image tool: A text-to-image tool takes a description and renders it. This skill takes (a) your brand's persistent aesthetic and (b) the specific brief for this image, and generates something that argues. The image is a parallel statement to the writing, not a summary of it.

What this looks like

One brief, nine shipped aesthetics. Same image brief run through each preset — nine different arguments from the same subject.

Documentary	Product-photo	Conceptual-illustration

Observed, photographic	Studio photography, seamless	Painterly, metaphor-forward
Schematic	Orbital	Editorial-collage

Ink linework, paper, exploded-view	Flat vector, mid-century poster	Torn paper, halftone, analog
Product-render	Synthwave	Phosphor

3D render, architectural	Chrome, neon, after-midnight	Green CRT, scan lines, low-bit

A detailed worked example (same brief across three of these aesthetics, with the thinking that produced the brief) lives further down in this file.

The two layers

Every generation is brand aesthetic + per-image brief → final prompt. Two files, two roles:

aesthetic.md — your brand's visual identity. Written once, edited as you learn. Defines palette, composition, rendering, tone, constraints, and reference anchors that stay consistent across every image you generate. Think of this as your publication's or brand's visual voice. Calling agents never touch this file; operators do.

The brief — what this specific image needs to do. Written fresh each call by the agent or human requesting the image. Subject, argument, emotional register for this piece. Never contradicts the aesthetic; sharpens it for this particular story.

The skill merges them. Operators control the aesthetic. Agents control the brief. Nobody has to know the other layer to do their job.

Setup

Install nano-banana-pro first — this skill calls it for image generation:

openclaw skill install nano-banana-pro

Pick a starting aesthetic preset and copy it into your workspace:

python3 art_director.py install --preset documentary
# or one of: conceptual-illustration, product-render, product-photo,
#            schematic, editorial-collage, synthwave, phosphor, orbital
# or: --preset blank  to start from an empty template

This writes aesthetic.md into the current directory. Edit it freely — the preset is a template, not a runtime value.

Optional environment variables:

OUTPUT_DIR — where generated images land (default: current directory)
AESTHETIC_PATH — path to aesthetic.md (default: ./aesthetic.md)
GEMINI_API_KEY — required for image generation (inherited by nano-banana-pro)

How to use this skill

When asked to create an image for a piece of content, follow this process.

Step 1 — Understand the brief

You need at minimum:

What the piece is about — not just the topic, the argument
Tone — urgent, contemplative, melancholy, sardonic, precise, angry, hopeful
What to avoid — topic-specific clichés to rule out

Ask for any missing elements before proceeding. A vague brief produces a vague image.

Step 2 — Apply art direction thinking

Before writing a single prompt, work through these questions:

What must this image DO? Not what should it look like — what should it accomplish emotionally and intellectually? Define the function before defining the form. A piece about labor displacement should make the reader feel the weight before they read a word. A piece about financial abstraction should feel cold and constructed.

What is the visual metaphor? Every image worth making has one. Not a literal illustration of the topic — a metaphor that the reader carries into the text. A piece about regulatory capture is not a photograph of a door with a lock. It might be a hand adjusting a scale that was never level to begin with. The literal image is the first idea. It is almost never the right one.

What clichés must be avoided? Every topic has visual clichés that signal lazy thinking:

AI: robot hands, Matrix green text, glowing brains, humanoid robots, circuit board patterns
Finance: stock tickers, dollar signs, Wall Street facades, upward-pointing arrows
Politics: Capitol buildings, handshakes, flags
Climate: melting ice, smokestacks, polar bears
Technology: devices, keyboards, code on screens
Product / SaaS: dashboards floating in space, gradient backgrounds, abstract geometry Name the clichés specific to this brief. Actively move away from them.

Does the brief require departing from the brand aesthetic? Most of the time, no — the aesthetic holds. Occasionally a piece demands departure (a tonal shift, a special issue, an unusual subject). Departure requires editorial justification, not aesthetic preference. If you're departing, say so and say why in your delivery notes.

Step 3 — Construct the image prompt

The prompt is a creative brief to a generative system. Apply these rules:

Use art and photography language, not tech language:

"chiaroscuro lighting" not "dramatic shadows"
"Kodachrome warmth" not "warm colors"
"grain and imperfection" not "realistic texture"
"negative space as structure" not "minimalist"
"desaturated with single color accent" not "muted colors"

Be specific about what matters:

Mood and emotional register
Compositional approach (rule of thirds, centered, asymmetric, etc.)
Color temperature and saturation
Light source and quality
Texture and finish

Be silent about what doesn't: Let the system find its own solutions for secondary elements. Over-specification produces over-engineered images.

Always include these technical specifications (the skill will append them if you forget):

16:9 aspect ratio — required for standard publication header format
no embedded text or typography — typography is set separately by the CMS
Fight the generic AI aesthetic: grain, imperfection, photographic texture — not hyper-rendered smoothness

Structure your prompt as: [Visual metaphor / scene], [compositional approach], [light quality], [color palette and temperature], [texture and finish], [reference anchor if relevant], 16:9, no embedded text, [any per-piece direction]

The skill will automatically prepend your brand's aesthetic.md as the opening context. Your brief should sharpen and specify within that frame, not restate it.

Step 4 — Generate

python3 art_director.py generate \
  --brief "your full art-directed prompt" \
  --output "YYYY-MM-DD-slug.png" \
  [--resolution 2K]

Resolutions:

1K — draft / review pass
2K — standard publication (default)
4K — high-resolution final

Step 5 — Review

After generation, review against the brief:

Does it do the work defined in Step 2?
Does it avoid the named clichés?
Does the tone match?
Does it feel on-brand (aesthetic held) while still specific to this piece?
If the piece has a CMS that overlays text on the image, is there space for that?

If the image feels decorative rather than editorial, identify which element turned it generic. Adjust the prompt toward the metaphor, away from the literal. Regenerate.

One refinement pass is expected and normal. More than two passes usually means the metaphor needs rethinking, not the prompt.

Step 6 — Deliver

Report:

The saved image path
The final prompt used (for the prompt archive)
A one-sentence rationale: what does this image argue, and how does it connect to the piece?

The iteration loop

Aesthetic configs only get good through iteration. After install, generate a batch of 10–20 images against your current aesthetic with varied briefs. Review them side by side. Tune the aesthetic. Regenerate.

python3 art_director.py batch --briefs briefs.txt --outdir ./iteration-01/

briefs.txt is one brief per line (blank lines ignored). The skill generates one image per brief against the current aesthetic.md. This is how you learn what your brand actually wants by seeing what it doesn't want.

Editorial standards

No people without abstraction. Do not generate identifiable individuals, real or synthetic. If human presence is needed, use silhouette, abstraction, hands, shadow, partial framings. The ethics of synthetic portraiture are unresolved. Don't go there.

No text in the image. Ever. Text rendering in generative models is unreliable and typography is a typographer's job. Images with text embedded will look wrong even when they work.

Credit the image. If your brand has a convention for labeling AI-generated images, use it. Transparency is non-negotiable.

Bias awareness. Generative models default toward Western, lighter-skinned, conventionally attractive subjects. Direct against the default when the image calls for diversity the model would otherwise erase.

Worked example

A real brief run through three of the nine shipped presets, so you can see the two-layer model working: same image brief, different aesthetics, three different arguments. (The full nine-aesthetic grid lives in README.md.)

Piece: An essay about silent obsolescence — things that still run but no longer serve their purpose. The piece never names projection; the image carries the metaphor.

Tone: Quiet, slightly haunted, observational.

Avoid: Dramatic beams of light, empty theaters, film reels mid-flight, anything that romanticizes the projector as a Cinema Paradiso icon.

Thinking: The image must carry the stopped-but-still-running feeling. The literal read — a projector in a booth — would miss it. The move is to make the machine feel on while giving it nothing to do: no reel threaded, no window to a theater, no beam — just the lamp contained inside its own housing, glowing for no one.

Brief:

Extreme close-up of a projection booth interior — projector housing, lamp,
film gate, two closed canisters on the shelf. Nothing loaded. Booth sealed;
no window to a theater. The lamp amber is a small contained glow at the
housing aperture, not a beam, not a room wash. The machine is on. There is
nothing to play. Accent: archival-tape amber (#C4873A).

The same brief across three of the nine presets

Documentary — observed, photographic, magazine-feature register:

Documentary preset output

Conceptual-illustration — painterly, metaphor-forward, essay register:

Conceptual-illustration preset output

Product-render — 3D render, architectural, product-marketing register:

Product-render preset output

The brief never changed. The aesthetic did. That's the skill.

What this skill is not

This is not a tool for decorative images. It is not a tool for literal illustration. It is not a faster way to get stock photography.

If the brief is "generate an image of an AI robot for our AI article," push back. That's the wrong brief. The right brief is: what should a reader feel before they start reading? Start there.

Comments

Loading comments...