The spec for X (Twitter)
Ship to spec first, then get clever. Here is the current, practical baseline for brand videos in the X feed in 2026.
- Aspect ratio - vertical 9:16 for maximum mobile reach (1080x1920), square 1:1 for cross-posting (1080x1080), landscape 16:9 for demos and screen capture (1920x1080).
- Duration cap - 2 minutes 20 seconds for most accounts. Long-form is available to some premium tiers, but feed performance usually drops after 45-60 seconds. Optimize core cuts for 30-45 seconds and keep longer versions as threads or replies.
- Frame rate and codec - 24-30 fps is plenty. H.264 video, AAC audio, MP4 container is the safest bet.
- Audio default - autoplay is muted. Viewers tap for sound. Design for sound-off first, then reward sound-on with tasteful SFX and music.
- Captions - upload an .srt on web or burn in open captions. Always include captions since sound starts off.
- File weight - keep under a few hundred MB for fast upload and quick processing. Premium tiers allow much larger files, but lighter encodes finish processing sooner and start playing faster in the feed.
- Safe areas - reserve the bottom 12-15 percent and top 8-10 percent for UI overlays and progress bars. Keep captions just above the reserved bottom band.
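The baseline above can be captured as a small encode preset. The sketch below is one way to express it in Python, assuming ffmpeg with libx264 is installed. The function names, the 128k audio bitrate, and the default margins (10 percent top, 15 percent bottom, the upper ends of the ranges above) are illustrative assumptions, not X requirements. It only builds the argument list and does not run ffmpeg.

```python
# Aspect-ratio presets from the spec list above (width, height in pixels).
PRESETS = {
    "vertical": (1080, 1920),
    "square": (1080, 1080),
    "landscape": (1920, 1080),
}


def build_ffmpeg_args(src, dst, preset="vertical", fps=30):
    """Return an ffmpeg argument list for an H.264/AAC MP4 at the preset size."""
    width, height = PRESETS[preset]
    # Scale to fit inside the target frame, then pad to the exact size.
    vf = (
        f"scale={width}:{height}:force_original_aspect_ratio=decrease,"
        f"pad={width}:{height}:(ow-iw)/2:(oh-ih)/2"
    )
    return [
        "ffmpeg", "-i", src,
        "-vf", vf,
        "-r", str(fps),
        "-c:v", "libx264", "-pix_fmt", "yuv420p",
        "-c:a", "aac", "-b:a", "128k",   # bitrate is an assumed default
        "-movflags", "+faststart",       # move the index up front for faster start
        dst,
    ]


def safe_area(height, top_pct=10, bottom_pct=15):
    """Return (top_y, bottom_y): the vertical pixel range that stays clear of UI."""
    top_y = round(height * top_pct / 100)
    bottom_y = height - round(height * bottom_pct / 100)
    return top_y, bottom_y


if __name__ == "__main__":
    print(" ".join(build_ffmpeg_args("master.mov", "x_vertical.mp4")))
    print(safe_area(1920))
```

For the 1080x1920 vertical cut, these margins leave the band between y=192 and y=1632 for text and logos.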
The structure that works for a brand video on X
You get seconds, not minutes. Build a tight arc that lands the value fast and gives a clear next step. Here is a proven 30-45 second blueprint that respects X's cap and feed behavior.
0-2s - Hook
- On-screen text: one line that states a bold outcome or problem.
- Visual: dynamic cut-in, fast punch-zoom, or quick motion graphic of the outcome.
- Audio: crisp hit or whoosh if sound is on. Never rely on it.
2-5s - Problem in plain language
- Script: name the pain in the user's words.
- Visual: b-roll of the pain in action, or a tight talking-head crop with empathetic expression.
5-15s - What your product does
- Script: one sentence value prop, then one sentence on how it works.
- Visual: quick over-the-shoulder demo or animated UI sequence. Make legibility the top priority.
- Overlay: 2-3 word labels that call out the moment of value, for example "1-click import", "Auto-sync".
15-25s - Social proof or credibility
- Script: a metric, a customer logo, or a short testimonial line.
- Visual: lockup of partner logos, quick chart, or tweet screenshot with blur on sensitive bits.
25-35s - Call to action
- Script: one clear next step, for example "Try the free starter plan" or "Watch the full demo in the thread".
- Visual: brand lockup, URL slug, QR code only if it remains small and does not clutter the frame.
If you must extend to 45-60 seconds, add a brief "how it works" sequence after the product beat, ahead of the social proof, and shift the later beats back. Keep each added beat under 8 seconds. In editing, prioritize cuts that can be trimmed to 29-31 seconds for higher completion in the feed.
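The blueprint is easy to keep honest if it lives as data. The sketch below models the five beats and checks that they run back to back within a chosen total length. The names Beat, check_timeline, and insert_beat are hypothetical, and placing the extra "how it works" beat right after the product beat is one reading of the guidance above, not a rule.

```python
from typing import NamedTuple


class Beat(NamedTuple):
    name: str
    start: float  # seconds
    end: float    # seconds


# The 35-second blueprint described above.
BLUEPRINT = [
    Beat("hook", 0, 2),
    Beat("problem", 2, 5),
    Beat("product", 5, 15),
    Beat("proof", 15, 25),
    Beat("cta", 25, 35),
]


def check_timeline(beats, max_total=45.0):
    """Return a list of problems; an empty list means the timeline is consistent."""
    problems = []
    if beats and beats[0].start != 0:
        problems.append("timeline does not start at 0s")
    for prev, cur in zip(beats, beats[1:]):
        if cur.start != prev.end:
            problems.append(f"gap or overlap between {prev.name} and {cur.name}")
    for beat in beats:
        if beat.end <= beat.start:
            problems.append(f"{beat.name} has no duration")
    if beats and beats[-1].end > max_total:
        problems.append(f"total length {beats[-1].end}s exceeds {max_total}s")
    return problems


def insert_beat(beats, after, name, duration):
    """Insert a new beat after the named one and shift every later beat back."""
    out = []
    shift = 0.0
    for beat in beats:
        out.append(Beat(beat.name, beat.start + shift, beat.end + shift))
        if beat.name == after:
            start = beat.end + shift
            out.append(Beat(name, start, start + duration))
            shift += duration
    return out


if __name__ == "__main__":
    extended = insert_beat(BLUEPRINT, "product", "how_it_works", 8)
    for beat in extended:
        print(f"{beat.start:>5.1f}-{beat.end:<5.1f} {beat.name}")
    print(check_timeline(extended) or "timeline ok")
```

Adding an 8-second beat to the 35-second blueprint gives a 43-second cut, which still fits the 30-45 second target.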
Hooks that earn attention
These formulas open strong on X. Pair each with on-screen text in the first 2 seconds. Examples assume a developer-facing product, but the patterns generalize.
- Outcome first - "Ship X in Y time"
- Example: "Ship OAuth in 5 minutes, not 5 days."
- Practical teardown - "We replaced X with Y"
- Example: "We replaced 7 scripts with one CLI command."
- Counterintuitive lesson - "Stop doing X, do Y instead"
- Example: "Stop screenshotting logs, stream them to Slack in 30 seconds."
- Numbered micro-guide - "3 steps to X"
- Example: "3 steps to zero-downtime deploys."
- Before vs after - "Before: pain, After: outcome"
- Example: "Before: flaky webhooks. After: 99.9 percent delivery with retries."
Brand and voice that compound
One viral video is a spike. A recognizable brand system turns every post into memory reinforcement. Consistency beats novelty on X because the feed is chaotic and fast. Viewers should recognize your videos by tone, color, and pacing before they see the handle.
- Color and typography - commit to a minimal palette and 1-2 typefaces. Use the same caption style across posts. Do not swap fonts weekly.
- Lower-third and label system - design a reusable label strip for feature callouts. Keep it within the safe area and legible against footage.
- Voice - pick a tone and stick with it. For developer audiences, be plainspoken, direct, and example heavy. Avoid buzzwords and passive voice.
- Motion rules - define how you animate cuts, zooms, and list builds so the energy feels yours, not a template pack.
- CTA patterns - two or three consistent CTAs you rotate. For example "Reply 'demo' for the thread", "Link in bio", or "Open the docs".
The per-project brand kit in HyperVids locks your logo, color hex values, font stack, lower-thirds, and default caption style. That means every output inherits your system automatically without manual tweaks. The /hyperframes skill lets you iterate on the same structure across multiple topics while preserving the look and pacing.
Captions and accessibility for X
Design for silent autoplay and fast scanning. Good captions increase watch time and comprehension. Treat them as part of your brand system, not a last-minute overlay.
- Always-on - add open captions in the master render or upload an .srt. When in doubt, burn them in so they survive re-uploads and quotes.
- Line length - 28-42 characters per line, maximum 2 lines. Shorter lines scan better on smaller phones.
- Reading speed - target 12-15 characters per second. Keep each subtitle on screen for at least 1.0s and no more than 4.0s, extending only when a full two-line caption needs longer at that reading speed.
- Contrast - meet a 4.5:1 ratio minimum. Use a semi-opaque backdrop or stroke. Avoid pure white on bright b-roll.
- Safe placement - position captions above the bottom 12-15 percent of the frame to clear the progress bar and UI. Center-align for short phrases, left-align for longer sentences to ease scanning.
- Emphasis - bold only key words, do not rainbow the whole line. Consider subtle word-by-word highlights synced to speech for sound-on viewers.
- Localization - if you have multilingual audiences, cut region-specific versions instead of jamming three languages into two lines.
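The line-length and reading-speed rules above are simple enough to automate. The sketch below wraps a script into caption cues of at most 2 lines and 42 characters per line, then estimates on-screen time from the 15 characters-per-second upper target. The function names are hypothetical, and taking the upper end of each range is an assumption you may want to tighten for smaller phones.

```python
import textwrap

MAX_CHARS = 42      # upper end of the 28-42 characters-per-line range
MAX_LINES = 2
CPS = 15            # upper end of the 12-15 characters-per-second target
MIN_SECONDS = 1.0
MAX_SECONDS = 4.0


def split_into_cues(script):
    """Wrap text to MAX_CHARS and group the lines into cues of MAX_LINES lines."""
    lines = textwrap.wrap(script, width=MAX_CHARS)
    return [
        "\n".join(lines[i:i + MAX_LINES])
        for i in range(0, len(lines), MAX_LINES)
    ]


def cue_duration(cue):
    """Seconds on screen: characters / CPS, never below MIN_SECONDS."""
    chars = len(cue.replace("\n", " "))
    return max(MIN_SECONDS, chars / CPS)


def needs_trim(cue):
    """True when the cue would stay up longer than MAX_SECONDS at this speed."""
    return cue_duration(cue) > MAX_SECONDS


if __name__ == "__main__":
    script = (
        "Stop waiting on OAuth. Auth should take minutes, not days. "
        "Acme handles flows, tokens, and refresh in one config."
    )
    for cue in split_into_cues(script):
        flag = " (consider trimming)" if needs_trim(cue) else ""
        print(f"[{cue_duration(cue):.1f}s]{flag}\n{cue}\n")
```

Cues flagged by needs_trim are candidates for a script edit rather than a faster read; shortening the sentence usually works better than shrinking the text.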
A sample HyperVids prompt for X
This is a realistic single prompt with embedded brand context that produces a 35-second vertical brand video tailored for X. It assumes you have your Claude CLI set up and the /hyperframes skill available.
/hyperframes
Brand:
- Name: Acme DevTools
- Tagline: Ship backend features in minutes
- Colors: #101820 primary, #00D1B2 accent, #F5F7FA background
- Fonts: Inter for UI, IBM Plex Mono for code
- Voice: direct, concise, example-driven, no buzzwords
- Logo safe area: 5 percent from corners
- Caption style: white bold on 70 percent black rounded box, 2 lines max
Goal:
- 35s brand video for X (Twitter), vertical 1080x1920, 30 fps
- Structure: hook, problem, value prop, quick demo, social proof, CTA
- Silent-autoplay friendly with open captions and punchy on-screen text
- Safe areas: keep text above bottom 15 percent and below top 8 percent
- CTA: "Open the docs - link in bio, thread below"
Script seed:
- Hook: "Stop waiting on OAuth."
- Problem: "Auth should take minutes, not days."
- Value: "Acme handles flows, tokens, and refresh - one config."
- Demo beats: CLI init, one config file, test login
- Proof: "12k developers, 2.3M monthly logins"
- CTA: "Try free, docs linked in the thread"
Deliver:
- 1 talking-head + UI cut, 35s, 9:16, H.264, AAC
- On-screen labels: "1-file setup", "Live in minutes", "Retry-safe"
- Music: light percussive bed at -18 LUFS, tasteful hits
- Export captions burned-in and also provide .srt
Output will include a fully structured script, on-screen text, shot list, and a rendered vertical video that follows your brand kit. You can regenerate alternates by swapping the Script seed lines or tweaking the Goal section while the kit stays fixed.
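Before locking a palette into a kit, it is worth checking it against the 4.5:1 contrast guideline from the captions section. The sketch below uses the WCAG relative-luminance formula on the hex values from the sample prompt. The function names are hypothetical, and the simple alpha blend over pure white is an assumed worst case for bright b-roll, not a measurement of real footage.

```python
def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def _rgb(hex_color):
    h = hex_color.lstrip("#")
    return tuple(int(h[i:i + 2], 16) for i in (0, 2, 4))


def relative_luminance(hex_color):
    r, g, b = _rgb(hex_color)
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two colors, from 1.0 up to 21.0."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


def blend_over(overlay, opacity, backdrop):
    """Composite a semi-opaque overlay color over a backdrop color."""
    mixed = [
        round(opacity * o + (1 - opacity) * b)
        for o, b in zip(_rgb(overlay), _rgb(backdrop))
    ]
    return "#{:02X}{:02X}{:02X}".format(*mixed)


if __name__ == "__main__":
    pairs = {
        "background on primary": ("#F5F7FA", "#101820"),
        "accent on primary": ("#00D1B2", "#101820"),
        # Caption box: 70 percent black over an all-white frame (assumed worst case).
        "caption on box": ("#FFFFFF", blend_over("#000000", 0.7, "#FFFFFF")),
    }
    for label, (fg, bg) in pairs.items():
        ratio = contrast_ratio(fg, bg)
        verdict = "ok" if ratio >= 4.5 else "too low"
        print(f"{label}: {ratio:.1f}:1 ({verdict})")
```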
Common failure modes on X and how to avoid them
- Slow start - if the first frame is a logo fade, you lose viewers. Put the payoff or problem on screen in the first second.
- Wall-of-text captions - long sentences that wrap to 3 lines hurt legibility. Edit the script for line breaks and brevity.
- Illegible UI - zoom into interfaces and crop smartly. Do not display dense screens at full height on a phone.
- Wrong aspect - posting a 16:9 screen recording without reframing tanks retention. Create a vertical cut with crop-safe overlays.
- Overmixing music - loud beds mask speech. Normalize dialog around -14 LUFS integrated and keep music 8-12 dB under.
- No proof - pure claims without metrics or logos feel like ads. Add real usage numbers or recognizable partners.
- Weak CTA - "Learn more" is vague. Tell viewers exactly what to do next and where it lives in the thread or profile.
- Inconsistent visual brand - every week a new style confuses recognition. Use a stable kit and evolve slowly.
- Overlength - 90-second monologues rarely hold in feed. Cut a 30-45 second hero and place the deep dive in a reply.
- No subtitles file - burned-in captions are pixels, not text, so screen readers cannot read them. Pair open captions with an .srt when possible.
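Producing the .srt alongside burned-in captions takes very little code once you have cue timings. The sketch below writes SubRip text from a list of (start, end, text) tuples. The function names and the sample cues are illustrative assumptions; the timings would normally come from your edit.

```python
def srt_timestamp(seconds):
    """Format seconds as the SubRip timestamp HH:MM:SS,mmm."""
    ms = round(seconds * 1000)
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues):
    """Build .srt text from an iterable of (start_seconds, end_seconds, text)."""
    blocks = []
    for index, (start, end, text) in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    cues = [
        (0.0, 2.0, "Stop waiting on OAuth."),
        (2.0, 5.0, "Auth should take minutes,\nnot days."),
    ]
    with open("captions.srt", "w", encoding="utf-8") as handle:
        handle.write(to_srt(cues))
```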
Conclusion
Winning brand videos on X are short, structured, and legible with the sound off. Lead with the outcome, show the value in under 15 seconds, prove it, then point to a next action. Bake your visual system into every cut so the audience builds recognition that compounds over time.
If you want a faster path from idea to ready-to-post verticals, HyperVids pairs a per-project brand kit with /hyperframes so you can keep structure, style, and pacing consistent while swapping topics. Iterate weekly, test two hooks per post, and let metrics guide the next edit.
FAQ
What is the best length for a brand video on X?
30-45 seconds is the current sweet spot for feed completion and shares. If you have more to say, post a reply with the deeper walkthrough and pin the thread.
Should I go vertical or square?
Vertical 9:16 performs best in mobile feed and gives you the most screen real estate. Square is fine for cross-posting, but design the master in vertical and export a square reframed version if needed.
Open captions or .srt uploads?
Do both if you can. Burn in high-contrast captions for immediate legibility in quotes and re-uploads, and attach an .srt on web for accessibility and search. HyperVids can export both from the same timeline so you do not have to choose.