The spec for YouTube in {{year}}
YouTube short-form lives in the Shorts shelf and the mobile feed. Ship to this spec if you want consistent delivery and eligibility.
- Aspect ratio: 9:16 vertical, 1080x1920 pixels. YouTube can accept 1:1 to 9:16, but 9:16 is the cleanest for the Shorts feed.
- Duration cap: 60 seconds. In practice, target 20-45 seconds. If you push 60, stop at 58-59 to avoid awkward cutoffs and give the end card a beat.
- Frame rate: 30 or 60 fps. Use 60 for action or product demos, 30 for talking head.
- Audio defaults: On mobile, audio typically starts on if the device volume is on. Many viewers scroll with sound, but design for sound-off clarity.
- Captions: Required in effect. YouTube auto-captions exist, but upload an .srt or burn-in your own to control accuracy and styling.
- Safe zones: Keep core text away from UI overlays. For 1080x1920, avoid the bottom 200 px and the top 130 px. Keep vital content inside a 720x1420 center box.
- File format: H.264 MP4, AAC audio, 8-12 Mbps target bitrate for 1080p. Loudness normalized to around -14 LUFS integrated, peaks below -1 dB.
- Metadata: Title under 60 characters, include a clear payoff. Use 1-2 relevant keywords and optionally #Shorts.
The structure that works for YouTube short-form
Shorts are ruthless about pacing. Here is a beat-by-beat blueprint that repeatedly lands high retention in {{year}}.
0:00 - 0:02 - The instant hook
- Visual or verbal pattern break in the first 1-2 seconds.
- Show a fast payoff preview - a before/after, a number, or a bold claim you will prove.
- On-screen text mirrors the hook in 6-9 words.
0:02 - 0:05 - Setup and context
- One sentence: who it is for and what you will show.
- Cut to a close-up or screen capture quickly. Avoid logo stings.
0:05 - 0:20 - Rapid value delivery
- Three crisp steps, tips, or moments - one per cut.
- Each beat delivers a micro-win. Keep shots 1-3 seconds each. Use a whip cut or hard cut, no flashy transitions.
- Reserve part of the frame for captions. Keep hands and pointer movements big and readable.
0:20 - 0:35 - Proof and specificity
- Show the result, metric, or live outcome. If it is a workflow, show the tool executing.
- Add one sentence of technical detail so experienced viewers feel respected.
0:35 - 0:50 - Payoff condensed
- Summarize the result in seven seconds or less. Put the summary in on-screen text.
- Optional: quick B-roll cutaway to reset attention.
0:50 - 0:58 - Call to action
- CTAs are specific, not needy: Get the repo link in the description, Watch the full teardown on my channel.
- Keep the visual moving - a circle highlight, cursor glide, or quick pan. End on a resolved frame.
Keep the density high. If a moment does not create clarity or intrigue within two seconds, trim it.
Hooks that earn attention
Hooks are formulaic because human attention is formulaic. Ship variations of these templates with concrete payoffs.
1) Numbered speed promise
- Formula: X ways to Y in Z seconds
- Example: 3 terminal tricks to halve your Git commit time in 30 seconds
2) Before-after flip
- Formula: We used to [pain], now it is [measurable outcome]
- Example: We used to wait 12 minutes to build, now it is 90 seconds - here is the diff
3) Challenge and reveal
- Formula: I bet you're doing [common practice] wrong - fix it in 10 seconds
- Example: I bet you're squashing Git history wrong - one flag fixes it
4) Visual shock
- Formula: Start with a striking visual that begs a question, then answer it fast.
- Example: Show a CPU graph dropping from 90 percent to 30 percent, then say, This one config cut CPU usage by two thirds - here is how
5) If-then shortcut
- Formula: If you [role or tool], do this once to save [time or money]
- Example: If you deploy on Docker, add this healthcheck to avoid 3 a.m. restarts
Brand + voice that compounds results
A single viral Short is luck. A recognizable brand system compounds watch time, subscriber growth, and conversion. Build and apply a kit that travels across every Short in {{year}}.
- Visual identity: Consistent color palette, safe-zone title card, and lower-third styling. Use a 2-4 px stroke on text, avoid drop shadows that muddle compression.
- Voice and POV: Pick a narrator style - teacher, teammate, or tactician - and stick to a tense and tone. For technical audiences, say the number, then the nuance.
- Reusable components: Introspective sound bed at -24 LUFS, whoosh SFX for cuts at -18 LUFS peak, subtle brand sting under 400 ms if used at all.
- Content pillars: Define 3-4 recurring themes so viewers know what the channel stands for. Rotate pillars to avoid fatigue.
If you use HyperVids, set up a per-project brand kit once - fonts, palettes, caption style, lower thirds, end card - and it applies across renders. You get consistency without babysitting details, so every Short looks and sounds like your channel in one click.
Captions + accessibility for YouTube
Shorts are consumed in noisy environments and while multi-tasking. Always-on captions and contrast-safe graphics drive comprehension and retention.
- Always-on captions: Even if you upload an .srt, consider burned-in captions for Shorts. It guarantees styling and visibility in the feed.
- Line length: Max 28-32 characters per line, max 2 lines. Short lines avoid covering faces and important UI. Break on phrase boundaries, not exact syllable timing.
- Font size: 5-7 percent of frame height for body captions. On 1080x1920, that is roughly 54-72 px. Headline text can be 9-12 percent.
- Contrast: Maintain at least WCAG AA contrast between text and background. Use a 2-4 px stroke or a subtle 60-80 percent opacity box if the background is busy.
- Positioning: Keep captions in the lower third but above the 200 px bottom safe zone. Center-align for speaking lines, left-align for lists or code.
- Color coding: Use one color for the narrator and another for code or commands. Avoid red for general text since it signals errors.
- Motion safety: Limit rapid captions movement. If you animate, use snappy in-place pops under 150 ms, not long slides.
- Upload .srt tips: Title-language code like video-file.en.srt. Fix proper nouns and technical terms the auto-captioner butchers.
A sample HyperVids prompt
Here is a realistic single-line prompt for a YouTube Short that teaches a developer workflow. Brand context is assumed to be set in your project kit.
Prompt: Vertical YouTube Short, 9:16, title "3 Git tricks to commit faster" - hook in 2 seconds, on-screen captions, code overlays, show before-after timing, CTA to full tutorial in description.
With the brand kit active, HyperVids applies your fonts, colors, caption template, and end card. Using the /hyperframes skill and your existing Claude CLI setup, it generates a 30-40 second vertical MP4, clean burned-in captions, an .srt for accessibility, and suggested title and description copy. The result slots straight into the YouTube upload flow and meets Shorts requirements without extra tweaks.
Common failure modes on YouTube Shorts
- Soft open: Logos, long pauses, or a greeting burn the first two seconds. Fix by cutting straight to the payoff or a provocative visual, then identify the topic in five words.
- Too many ideas: One Short, one promise. If you have three tips, frame them as one theme and state the number up front.
- Low shot density: Shots longer than three seconds without purpose lose viewers. Add insert shots, tight crops, or quick screen captures to reset attention.
- Muddy audio: Compression artifacts and background noise kill retention. Use a dynamic mic close to the mouth, cut sub-80 Hz rumble, gate gently, and loudness-normalize to -14 LUFS.
- Illegible captions: Long lines, low contrast, or text in unsafe zones. Apply the caption rules above and test on a 6-inch screen.
- Overcooked transitions: Star wipes and long zooms distract. Use hard cuts and 100-150 ms whip sounds if needed.
- Weak CTA: Asking for likes without a next step undercuts momentum. Offer a direct benefit: Get the checklist in the description, Watch the longer demo linked.
- Misaligned metadata: The title promises one thing, the video delivers another. Match the hook and the thumbnail text precisely to the content.
- Ignoring comments: Shorts discoverability compounds when you respond and pin useful clarifications. Mine comments for the next Short's hook.
Practical production checklist
- Script to beats, not paragraphs. One sentence per shot, max 12 words per line on captions.
- Record A-roll first, then layer B-roll that shows each beat. Maintain left-to-right visual flow for continuity.
- Capture screen at native 1080x1920 or 1920x1080 and crop strategically. Zoom UI elements to at least 250 px width for legibility.
- Mix music at -24 to -20 LUFS under speech. Duck 6 dB during speech with a fast attack sidechain.
- Color grade lightly. Lift mids for faces, reduce saturation a touch to prevent compression banding.
- Export H.264, High profile, 10-12 Mbps CBR for 1080p60. Verify the first frame is readable and the last frame holds for 0.5 seconds.
- Upload with a descriptive title and two keywords. First two lines of the description should repeat the payoff and CTA.
Conclusion
YouTube short-form in {{year}} rewards clarity, pace, and proof. Build for the feed: a sharp two-second hook, fast beats that deliver value, captions that are always readable, and a CTA that moves viewers deeper into your world. Systematize your brand so every Short looks intentional and familiar. Tools like HyperVids remove the mechanical overhead so you can ship on schedule and learn from the results.
FAQ
How long should a YouTube Short be for best retention?
Target 20-45 seconds. Shorter than 20 often feels rushed, longer than 45 needs exceptional pacing. If you go to 60, make sure every five seconds adds new value.
Do I need to use #Shorts in the title or description?
It is optional. Shorts eligibility depends on aspect ratio and length, not the hashtag. Use the hashtag if it fits your keyword plan, but do not rely on it.
Should I upload burned-in captions or an .srt file?
For Shorts, burned-in captions guarantee style and visibility in the feed. Upload an .srt as well for accessibility and to help search. Many creators do both.