The spec for YouTube Shorts
Technical baseline for short-form video
- Aspect ratio: 9:16 vertical is the standard. Export 1080x1920. Square and 4:5 can render, but 9:16 wins the feed more reliably.
- Duration cap: Up to 60 seconds. Plan for 58 seconds if you want a clear end card without trimming risk.
- Frame rate: 24, 30, or 60 fps. If you shoot at 60 fps, keep motion crisp and use faster shutter to avoid blur.
- Bitrate: Aim for 8-16 Mbps for 1080p. Avoid crushing gradients with overly aggressive compression.
- Audio: Stereo AAC at 320 kbps if possible. Normalize to -14 LUFS integrated, peaks below -1 dBFS.
- Hashtag and metadata: Include #Shorts and 1-2 topic keywords in your title or description. Keep titles under 60 characters for mobile readability.
Sound-on vs sound-off reality
YouTube Shorts auto-plays sound on by default for most viewers, but many people still browse with volume low or muted. Design for dual-consumption: the story must work without sound, and the audio must enrich it for those listening.
Caption expectations and UI safe areas
- Always-on captions. Assume sound-off users must follow every beat.
- Safe areas: keep text inside the center 80 percent of frame. On 1080x1920, leave at least 90 px padding on left and right and keep captions away from the bottom control cluster and top username area. Avoid the bottom 220 px and top 150 px for persistent captions.
- Visual pacing: 2-4 cuts every 10 seconds works well. Punch in and out to emphasize points without disorienting the viewer.
The structure that works for YouTube Shorts
A proven 45-60 second beat map
- 0-2s - Hook: Interrupt the scroll with a bold claim, a pattern break, or a visual surprise. On-screen caption mirrors the hook exactly.
- 2-5s - Context: One line that frames the problem or goal. Make it specific, not generic.
- 5-20s - Step 1 payoff: Deliver a concrete action or insight. Show the result instantly if possible.
- 20-40s - Step 2-3 proof: Add supporting steps, a quick before/after, or a micro-demo. Use cutaways or b-roll to keep motion on-screen.
- 40-55s - Wrap and CTA: Summarize the win in one sentence. CTA should be native to Shorts, for example, "Save this", "Comment 'guide' for the checklist", or "Watch the full tutorial on my channel".
- 55-60s - Loop moment: A visual or verbal callback that makes the first second interesting again, prompting replays. This helps retention and increases reach.
Alternate micro-structures by length
- 15 seconds: Hook (0-2s), one insight with visual proof (2-10s), CTA (10-15s). Use jump cuts and a single caption line.
- 30 seconds: Hook (0-2s), context (2-5s), two quick wins (5-20s), CTA (20-30s). Keep each win under 7 seconds.
Hooks that earn attention
Use formulas that map to real outcomes
- Problem-to-outcome: "Cut your Docker image from 1.2 GB to 350 MB in 3 steps."
- Myth bust: "Everyone says you need daily uploads. Wrong, here is what actually scales."
- Before/after shock: "This page loaded in 4.2s. Now 1.1s after one tiny change."
- Challenge the audience: "If your bounce rate is over 60 percent, try this for 7 days."
- Micro-tutorial: "Add dark mode to your app in under 60 seconds."
Concrete example hooks you can copy
- "Stop scrolling if your tests are flaky. This one flag fixes half of them."
- "I reduced AWS bill by 27 percent. Here is the 2-minute audit checklist."
- "Your captions are costing you views. Fix them in 30 seconds with this rule."
- "Turn one blog post into 5 Shorts without sounding repetitive. Watch this."
Brand + voice that compound over time
Why consistency beats any single video
Short-form reach is spiky, but brand trust compounds. A clear brand kit and consistent voice create instant recognition in the feed, reduce cognitive load, and make split-second viewers more likely to stay. When your colors, type, lower thirds, and phrasing are uniform, your audience can focus on the message instead of recalibrating to a new style every upload.
Build a lean brand kit for Shorts
- Palette with contrast pairs: One primary, one accent, one neutral background. Verify WCAG AA at minimum for text on color.
- Type stack: A bold sans for headings and a highly readable face for captions. Avoid thin weights on mobile.
- Caption and lower-third rules: Size, line height, shadow, and placement. Lock to a style that respects safe areas.
- Logo usage: Keep it subtle, 48-64 px wide in 1080x1920, placed top right or top left with at least 24 px padding from edges.
- Voice guide: 3-5 tone rules, for example, direct verbs, no hedging, one benefit per sentence, ban filler like "just" or "basically".
How a per-project brand kit keeps you consistent
Set your colors, fonts, caption styles, CTA templates, and tone once, then reuse them across every cut. In tools like HyperVids, you attach a per-project brand kit that locks styling on exported caption layers, lower thirds, and end cards. The /hyperframes skill and your Claude CLI subscription power consistent script beats and shot suggestions that match your voice, so every Short looks and sounds like the same creator, even when the topics vary.
Captions + accessibility for Shorts
Always-on captions that are actually readable
- Two lines max, with 28-32 characters per line for mobile. Shorter words scan faster.
- Font size: For 1080x1920, 62-72 px for body captions, line height at 120-140 percent. Adjust to your typeface's x-height.
- Contrast: At least 4.5:1 against background. Add a 2-4 px text shadow or a semi-opaque box at 20-30 percent black for busy scenes.
- Placement: Keep captions above the lower UI. Center them or align to lower third center with 220 px bottom padding.
- Timing: Reveal captions 100-200 ms before the spoken word to help comprehension for sound-off users.
Clarity and motion guidelines
- One idea per card. Avoid long sentences across multiple cards. Break up complex points into tight phrases, for example, "Open settings", then "Enable caching".
- Motion: Keep caption movement minimal. Use a subtle pop or fade. Excessive kinetic text is tiring at phone distance.
- Color as meaning: Never rely on color alone to convey meaning. Use icons or labels if a color indicates state.
Quick accessibility checklist
- Test with volume muted from start to finish.
- Avoid jargon unless your audience expects it. If you must use a term, add a two-word definition in overlay.
- Do not hide critical information behind UI elements like the like/comment bar.
A sample HyperVids prompt
One-line prompt for a developer-facing YouTube Shorts
"Show how to cut a 1.2 GB Docker image to under 400 MB in 45 seconds, with 3 concrete steps and on-screen before/after, captions in my brand style, and end with 'Comment DOCKER for the checklist'."
What this yields
With your brand kit attached, HyperVids uses the /hyperframes skill to draft a 45-60 second talk track, propose b-roll inserts of the Dockerfile and image sizes, generate on-brand captions, and produce a 1080x1920 vertical export. The Claude CLI-backed scripting step aligns transitions to beats and inserts a loop moment at the end. You get a Shorts-ready file, matching your colors and type, plus a caption file you can tweak if needed.
Common failure modes for YouTube Shorts
Why short-form videos flop and how to fix them
- Weak first second: If your hook starts at second 2, you already lost. Solution: Open with the outcome, a number, or a visual that breaks pattern.
- Buried lead: Spending 10 seconds on context is too long. Solution: One-line context, then show the win.
- Illegible captions: Thin fonts, low contrast, or too many characters per line. Solution: Enforce the caption rules above and test on a phone at arm's length.
- Horizontal footage squeezed vertical: Black bars destroy retention. Solution: Shoot vertical or reframe with proper crop and background blur that respects safe areas.
- Over-produced intros: Animated titles that last longer than 1 second kill watch time. Solution: Static logo bug only, or none.
- Muddy audio: Loud background music or inconsistent voice levels. Solution: Sidechain music to -18 LUFS during dialogue, keep voice at -14 LUFS integrated, tame sibilance with a de-esser.
- No payoff: Teasing a result without showing it. Solution: Always show the before/after or the metric change.
- CTA mismatch: Using long-form CTAs like "Click the link in description for details". Solution: Use platform-native micro-CTAs such as "Save", "Comment for the template", or "Watch the full video on my channel".
- Ignoring retention data: Not reviewing audience drop-off points. Solution: In YouTube Analytics, watch the absolute retention curve, then re-cut the dead zones in future posts.
Step-by-step workflow to produce a Shorts-ready cut
Pre-production
- Define outcome: One measurable win your audience cares about. Write it as a 12-15 word sentence.
- Script beats, not paragraphs: Write the 6 beats listed above in one line each. Leave room for natural delivery.
- Assemble proof: Screenshots, logs, or metrics you can show on-screen in under 3 seconds.
Production
- Lighting: One key light 45 degrees off-axis, one fill at half power. Shoot at 1/120 shutter for 60 fps or 1/60 for 30 fps.
- Framing: Headroom tight, eyes on the top third. Keep a clean, on-brand background.
- Audio capture: Lavalier or directional mic 15-25 cm from mouth. Record room tone for noise profiling.
Post-production
- Cut to the beat map: Place your hook first, even if you recorded it last. Remove every "um" and hedge.
- Caption to spec: Two lines max, high contrast, safe placement. Check spelling and abbreviations.
- Sound design: Music low and consistent. Add light whooshes for transitions only when they clarify the cut.
- Export settings: H.264, high profile, 1080x1920, 10-16 Mbps, AAC 320 kbps stereo.
- Upload metadata: Title with outcome and keyword. Description with 1-2 supporting lines and #Shorts. Pin a comment that restates the CTA.
Conclusion: Make Shorts that stack, not stall
Short-form success is about compound consistency. Nail the specs, structure your beats for retention, use captions that pass the arm's length test, and keep your brand and voice locked in. With a repeatable workflow and a per-project brand kit, you can ship more often without drifting off-style. Tools like HyperVids help automate the pieces that should be automatic, so you can focus on finding sharper hooks and delivering faster payoffs.
FAQ
How long should a YouTube Shorts video be for maximum retention?
Thirty to forty five seconds is a strong sweet spot. It gives enough room for a hook, one to two concrete wins, and a CTA. If you can land the payoff earlier, do it. Shorter is usually better as long as the result is clear.
Do I need #Shorts in the title or description?
It helps the system classify your upload, though YouTube can detect aspect ratio and length without it. Include #Shorts plus one topical keyword in the title or description for clarity.
Can I reuse TikTok or Reels content on Shorts?
Yes, if you remove other platform watermarks and reframe to respect YouTube's UI safe areas. Adapt the CTA to Shorts, for example, ask for comments or direct viewers to the channel for the long-form version. If you want a hands-off conversion, HyperVids can ingest an existing cut and restyle captions and layouts to your brand kit.