The spec for YouTube Shorts in 2026
Here are the constraints and best-practice targets you should design to before you record a single second:
- Aspect ratio: 9:16 vertical. Render at 1080x1920. Keep all critical text and UI inside a center-safe zone to avoid overlays.
- Duration: up to 3 minutes, since YouTube raised the cap from 60 seconds in October 2024. Aim for 15 to 35 seconds for brand videos. Fit one clear message and a single CTA.
- Captions: assume sound-off is common. Use on-screen captions at all times, 2 lines max, 28 to 32 characters per line to avoid mobile wrap.
- Sound behavior: playback is sound-on if the viewer's device volume is up, but many watch in public with sound off. Design for both.
- File basics: H.264 in an MP4 container, 8 to 16 Mbps for clean motion graphics, 30 or 60 fps is fine. Peak audio at -1 dBTP, dialog around -16 to -18 LUFS.
- Safe zones: avoid the top and bottom UI overlays. Keep text and important graphics inside roughly the center 80 percent of frame height and width.
- Thumbnail reality: in-feed thumbnails are inconsistent. Make frame 0 instantly legible with a bold, contrasting visual and clean hook text.
- Hashtags: useful but optional. Relevance beats #shorts spam. Use 1 to 3 targeted tags.
You will win if your Short is legible at arm's length, understandable on mute, and satisfying with sound.
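The safe-zone rule above is simple arithmetic, so it can be checked before you export. Here is a minimal sketch, assuming the "center 80 percent" figure applies equally to width and height of a 1080x1920 frame. The `Rect` type and the `center_safe_zone` helper are hypothetical names for illustration, not part of any YouTube tooling, and the real overlay positions should still be verified on a device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle in pixels, origin at the top-left of the frame."""
    left: int
    top: int
    right: int
    bottom: int

    def contains(self, other: "Rect") -> bool:
        """True when `other` lies fully inside this rectangle."""
        return (
            self.left <= other.left
            and self.top <= other.top
            and other.right <= self.right
            and other.bottom <= self.bottom
        )


def center_safe_zone(width: int = 1080, height: int = 1920, fraction: float = 0.8) -> Rect:
    """Return the centered rectangle covering `fraction` of each dimension."""
    margin_x = round(width * (1 - fraction) / 2)
    margin_y = round(height * (1 - fraction) / 2)
    return Rect(margin_x, margin_y, width - margin_x, height - margin_y)


safe = center_safe_zone()

# Hypothetical layout boxes: a caption block and a full-width banner at the bottom edge.
caption_box = Rect(140, 1400, 940, 1560)
bottom_banner = Rect(0, 1800, 1080, 1920)

print(safe)
print("caption inside safe zone:", safe.contains(caption_box))
print("banner inside safe zone:", safe.contains(bottom_banner))
```

At the default values this leaves a 108 px margin on each side and a 192 px margin at the top and bottom. Treat those numbers as a starting template, not a guarantee.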
The structure that works for a YouTube Shorts brand video
Shorts are unforgiving. You need a tight beat map that front-loads value and never stalls. Choose a framework based on your message complexity.
15 second framework - single message
- 0.0 to 1.5s - Visual hook: a bold motion graphic or a surprising before state. On-screen text: 4 to 6 words that state the payoff, not the product.
- 1.5 to 4.0s - Problem in one sentence: name the friction your audience feels. Cut fast, maintain motion.
- 4.0 to 10.0s - Proof in action: 1 to 2 punchy shots. Show the feature solving the problem. Add a one-line overlay that completes the thought.
- 10.0 to 14.0s - Brand imprint + CTA: logo bug in a corner, quick sonic tag, one clear action like "Try a demo" or "See how". Keep text to a single line.
- 14.0 to 15.0s - End frame: hold for 0.5 to 1.0 seconds to let the CTA register and for replay to feel seamless.
30 second framework - value + proof + CTA
- 0.0 to 2.0s - Hook: speak or show the transformation. Example: "Turn 60 minutes of edits into 6."
- 2.0 to 6.0s - Problem: one concrete pain point. Keep on-screen text under 2 lines with verbs up front.
- 6.0 to 18.0s - Proof: 2 to 3 micro demos. Each demo beat gets 4 to 6 seconds so the demos fill the 12 second window. Use cursor or finger circles to focus attention. Add concise overlays.
- 18.0 to 24.0s - Social proof or credibility: one stat, one logo wall, or one short quote. Never pile all three.
- 24.0 to 29.0s - CTA: say it once, show it once. Example: "Start free" with your URL or handle and a subtle arrow animation.
- 29.0 to 30.0s - Brand lockup: a 0.5 second brand mnemonic over an end frame that matches your visual system. Keep it fast.
60 second framework - narrative with tension
- 0.0 to 2.0s - Hook: provocative question or strong claim.
- 2.0 to 10.0s - Stakes: who gets hurt if the problem stays unsolved. Specific, not generic.
- 10.0 to 40.0s - Sequence of proof: three beats of roughly 10 seconds each - setup, action, payoff. Hard cut every 3 to 5 seconds within each beat. Keep captions crisp.
- 40.0 to 54.0s - Objection handling: answer one common doubt in a single sentence while showing the UI or outcome.
- 54.0 to 60.0s - CTA + brand recall: deliver the ask, flash the brand, end clean.
Never let your premise land after second two. On Shorts, attention is earned in frames, not minutes.
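A beat map is easy to get subtly wrong when you edit timings by hand. The sketch below writes the 15 second framework from this section as data and checks three things: the beats are contiguous, they end at the target duration, and the hook finishes by second two. The `validate_beat_map` function and its `hook_deadline` parameter are hypothetical names for illustration.

```python
import math
from typing import List, Tuple

Beat = Tuple[float, float, str]  # (start seconds, end seconds, label)

# The 15 second framework from this section, written as data.
FRAMEWORK_15S: List[Beat] = [
    (0.0, 1.5, "visual hook"),
    (1.5, 4.0, "problem"),
    (4.0, 10.0, "proof in action"),
    (10.0, 14.0, "brand imprint + CTA"),
    (14.0, 15.0, "end frame"),
]


def validate_beat_map(beats: List[Beat], total: float, hook_deadline: float = 2.0) -> List[str]:
    """Return a list of problems; an empty list means the map is usable."""
    if not beats:
        return ["beat map is empty"]

    problems: List[str] = []
    first_start, first_end, _ = beats[0]
    if not math.isclose(first_start, 0.0, abs_tol=1e-9):
        problems.append("first beat must start at 0.0s")
    if first_end > hook_deadline:
        problems.append(f"hook ends at {first_end}s, after the {hook_deadline}s deadline")

    for start, end, label in beats:
        if end <= start:
            problems.append(f"'{label}' has no positive duration")

    # Each beat must start exactly where the previous one ended.
    for (_, prev_end, prev_label), (next_start, _, next_label) in zip(beats, beats[1:]):
        if not math.isclose(prev_end, next_start, abs_tol=1e-9):
            problems.append(f"gap or overlap between '{prev_label}' and '{next_label}'")

    if not math.isclose(beats[-1][1], total, abs_tol=1e-9):
        problems.append(f"last beat ends at {beats[-1][1]}s, expected {total}s")
    return problems


print(validate_beat_map(FRAMEWORK_15S, total=15.0))
```

The same check works for the 30 and 60 second frameworks if you list their beats in the same tuple format.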
Hooks that earn attention
Use formulas that promise a payoff your audience cares about, then deliver immediately.
- Time save claim: "From X to Y in Z." Examples:
  - "From 5 tools to 1 in 30 seconds."
  - "Turn a messy brief into a polished short in 6 clicks."
- Before or after contrast: "Old way vs new way." Examples:
  - "Old way: 10 tabs. New way: 1 timeline."
  - "Stop trimming audio by hand - auto-sync in seconds."
- Myth bust: "You do not need X to get Y." Examples:
  - "You do not need a studio to make studio-quality shorts."
  - "You do not need a big budget to look on brand."
- Micro tutorial: "Do X in 3 steps." Examples:
  - "Design mobile-safe captions in 3 steps."
  - "Fix muddy audio with one EQ move."
- Audience callout: "If you are a [role], watch this." Examples:
  - "If you ship dev tools, show this feature, not that one."
  - "Marketers, here is the 15 second brand template that works."
Write 5 to 10 hook lines, then test the first second as a freeze frame. If it is not legible as a still, it will not stop the scroll.
Brand + voice: why consistency beats any single Short
Shorts behave like billboards that happen to move. One great video helps, but brand recall comes from consistent visual identity and a steady voice. Build a minimal brand kit and apply it to every cut so audiences recognize you within one second.
What your brand kit should include
- Colors: one primary, one accent, one neutral. Ensure 4.5:1 contrast for overlays on typical footage.
- Typography: a bold headline font and a clean caption font. Avoid ultra-thin weights.
- Logo treatments: a micro logo bug for corners and a lockup for end frames. Use transparent PNG or vector where possible.
- Lower thirds: two sizes - one for names or handles, one for feature callouts. Keep to 2 lines, max 32 characters per line.
- Motion language: 2 to 3 transitions and a consistent entry animation for key text. Keep transitions under 250 ms.
- Sonic tag: a 0.5 to 1 second mnemonic or whoosh that binds cuts together and trains recognition.
Set these once and you remove dozens of micro decisions per video. In many teams, the difference between consistent output and chaos is a per-project brand kit that locks fonts, colors, lower thirds, and end frames. HyperVids applies a brand kit at the project level so every Short inherits your visual system without manual rework.
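The 4.5:1 contrast target can be checked numerically when you pick kit colors. The sketch below assumes the target refers to the WCAG 2.x contrast ratio, which this article does not state explicitly. It uses the sample Acme palette from the prompt later in this article as test colors. The function names are hypothetical, and a flat-color check only approximates how text reads over moving footage.

```python
def _linear(channel: int) -> float:
    """Convert one 8-bit sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a '#RRGGBB' color, from 0.0 (black) to 1.0 (white)."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0 (black on white)."""
    l1 = relative_luminance(foreground)
    l2 = relative_luminance(background)
    lighter, darker = max(l1, l2), min(l1, l2)
    return (lighter + 0.05) / (darker + 0.05)


def passes_overlay_contrast(foreground: str, background: str, minimum: float = 4.5) -> bool:
    """True when the pair meets the minimum ratio used for overlays in this article."""
    return contrast_ratio(foreground, background) >= minimum


# Sample palette from the Acme brand kit in this article, with white as the text color.
palette = {"primary": "#2D6DF6", "accent": "#0BD3D3", "neutral": "#0B1026"}
for name, color in palette.items():
    ratio = contrast_ratio("#FFFFFF", color)
    print(f"white on {name} {color}: {ratio:.2f}:1, passes={ratio >= 4.5}")
```

Running a check like this when you define the kit means a failing pair is caught once, instead of in every video.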
Captions + accessibility that actually get watched
Captions are not optional. They are how you win sound-off viewers and how you reinforce meaning for sound-on viewers.
Practical caption rules for Shorts
- Always on: burn in high-contrast captions, or upload an accurate caption file and design on-screen text that complements it instead of duplicating every word.
- Characters per line: 28 to 32 max. Two lines only. Break on phrase boundaries, not mid-word.
- Read time: plan 150 to 180 ms per word on mobile. A 7 word line needs about 1.1 to 1.3 seconds on screen, so hold it for at least 1.2 seconds.
- Contrast: meet 4.5:1 against the background. Use a subtle 2 to 4 px stroke or a semi-opaque background box at 60 to 80 percent opacity to ensure legibility on busy footage.
- Placement: keep captions inside center-safe zones. Avoid bottom UI overlays by placing captions in the lower-middle of the frame, not at the bottom edge, then test on device.
- Typeface and size: medium or semibold weight, avoid thin fonts. At 1080x1920, 60 to 72 px is a reliable starting size for captions.
- Motion: animate captions with fast but gentle fades or slides under 150 ms. No bouncing or excessive movement that fights with the footage.
- Language accessibility: if your audience is global, add a pinned comment with a translated summary and include a language-neutral on-screen signpost, such as a checkmark next to the result.
Good captions guide the eye. Bad captions create clutter, collide with UI, and cost you retention.
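The line-length and read-time rules above can be turned into a quick pre-flight check. This is a minimal sketch using Python's standard `textwrap` module. It wraps on whitespace only, so it never splits a word, but it does not know where phrases end, and you should still adjust breaks by hand. The helper names and the sample sentence are hypothetical.

```python
import textwrap
from typing import List

MAX_CHARS = 32  # upper bound per line from the rules above
MAX_LINES = 2   # two lines per caption card


def wrap_caption(text: str, max_chars: int = MAX_CHARS, max_lines: int = MAX_LINES) -> List[List[str]]:
    """Split text into caption cards of at most `max_lines` lines each."""
    lines = textwrap.wrap(
        text,
        width=max_chars,
        break_long_words=False,  # never cut a word in half
        break_on_hyphens=False,  # keep terms like "auto-sync" intact
    )
    return [lines[i:i + max_lines] for i in range(0, len(lines), max_lines)]


def min_hold_seconds(card: List[str], ms_per_word: int = 180) -> float:
    """Minimum on-screen time for one card at the given reading pace."""
    words = sum(len(line.split()) for line in card)
    return words * ms_per_word / 1000


script = "Stop trimming audio by hand and let auto-sync line up every cut in seconds"
for card in wrap_caption(script):
    print(card, f"hold >= {min_hold_seconds(card):.2f}s")
```

A single word longer than 32 characters would still overflow its line because words are never split, so check long URLs or product names manually.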
A sample HyperVids prompt
Here is a realistic one-line prompt and brand context for a 28 second YouTube Shorts brand video. This assumes a project-level brand kit is already configured and the /hyperframes skill is active in your Claude CLI subscription.
Brand kit: Acme Dev Tools - primary #2D6DF6, accent #0BD3D3, neutral #0B1026 - headline font Inter Bold - caption font Inter Medium - logo acme-logo.png - handle @acmedev. Voice: developer-friendly, direct, no fluff. Goal: show how Acme auto-generates clean API docs. CTA: Try free at acme.dev/docs.
Create a 28 second YouTube Short with this structure: 0-2s hook that says "Ship docs in minutes, not days" with a bold motion graphic, 2-6s problem "Spec changes break docs", 6-20s proof montage with 3 beats - import OpenAPI file, preview docs, publish - each beat gets 4 to 5 seconds with concise overlays under 32 characters per line, 20-24s social proof "Trusted by 2,300 teams", 24-28s CTA with URL and logo bug. Always-on captions, center-safe placement, high-contrast. Add subtle whoosh transitions and a 0.5 second sonic tag at the end. Deliver 1080x1920 MP4, 30 fps, dialog at -16 LUFS.
Output: a tight vertical cut with a complete script, on-screen captions aligned to your brand kit, smart crop on UI shots, and an end frame that matches your visual system. You can regenerate beats or swap the hook without touching colors, fonts, or lower thirds.
Common failure modes on YouTube Shorts brand videos
- Burying the hook: if your premise lands after 2 seconds, you have already lost the viewer. Fix it by writing the first line last and previewing frame 0 as a still.
- Cluttered captions: three lines of tiny text are unreadable on mobile. Keep it to two lines of 28 to 32 characters each, and boost contrast.
- Feature soup: listing five features with no outcome. Lead with the result, then show one or two features that cause it.
- No safe-zone awareness: overlays collide with YouTube UI. Use a center-safe template and test uploads before you scale.
- Slow pacing: long pauses and slow zooms kill retention. Cut every 2 to 4 seconds, add purposeful motion, and compress silences.
- Weak audio: inconsistent levels and noisy rooms. Use a close mic, high-pass at 80 Hz, gentle compression, and normalize dialog.
- Off-brand visuals: random fonts and colors per video. Lock a brand kit and reuse lower thirds, animations, and end frames.
- Muddy promise: vague hooks like "We help you grow." Replace with a quantifiable claim or a specific before-and-after.
- Overusing stock: generic b-roll screams ad. If you must use stock, grade it to your palette and overlay purposeful UI or text.
- No CTA: viewers do not guess the next step. Ask for one action only and make it visible.
Conclusion: ship tight, branded, watchable Shorts
The winning YouTube Shorts brand video is focused, visual, and unmistakably yours. Lock your brand kit, write hooks that promise a real payoff, show proof instead of telling, and keep captions always on with clean contrast. Treat the first second like a thumbnail and the rest like a guided tour that never stalls. If you want to move from idea to finished cut faster, HyperVids turns a brand context and a single-line prompt into a vertically perfect Short that respects these constraints.
FAQ
Do YouTube Shorts need to be exactly 60 seconds?
No. Sixty seconds was the hard cap until October 2024, when YouTube raised it to 3 minutes, and neither number is a target. Most brand videos perform best between 15 and 35 seconds because the message stays focused and retention remains high.
Should I add burned-in captions or rely on auto captions?
Do both when possible. Upload accurate captions for accessibility and SEO, and design high-contrast on-screen text that supports your visuals without clutter. Keep to two lines and inside safe zones.
How often should a brand post Shorts?
Consistency beats bursts. Aim for 2 to 4 Shorts per week that share a visual system and voice. Rotate topics across hook formulas and frameworks, then review three numbers in analytics: retention through the first 3 seconds, the share of viewers who reach the 50 percent mark, and CTR on CTAs. When a hook pattern wins, iterate variations quickly with tools like HyperVids to scale output without losing brand consistency.