The spec for Instagram Reels
Here is the practical baseline for audiograms that win on Instagram Reels:
- Aspect ratio - 9:16 vertical. Render at 1080 x 1920 px. Keep all critical text in a center-safe area.
- Duration cap - up to 90 seconds. Most audiograms perform best at 20 to 45 seconds.
- Frame rate and codec - 30 fps H.264 in an MP4 container. Target 10 to 16 Mbps for clean type and waveforms.
- Audio - AAC, 44.1 kHz or 48 kHz, -16 LUFS integrated, peaks below -1 dBTP. Duck background music 8 to 12 dB under voice.
- Captions - assume sound-off. Burn open captions onto video, then optionally enable Instagram's auto captions for redundancy.
- Safe zones - avoid UI overlays. Leave ~130 px clear at the top and ~250 px clear at the bottom. Keep text at least 60 px from the left and right edges.
- Cover - export or design a clean vertical cover with a short title under 40 characters and high contrast. Avoid tiny type.
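The export settings above map fairly directly onto an ffmpeg command line. The sketch below only builds the argument list and does not run anything. It assumes an ffmpeg build with libx264 and the loudnorm filter, and a source that is already 9:16 (otherwise the scale step would distort it). The function name, the file names, and the 12 Mbps default are illustrative choices, not part of any particular tool.

```python
def reels_ffmpeg_args(src, dst, video_mbps=12, sample_rate=48000):
    """Build an ffmpeg argument list for a 1080x1920, 30 fps H.264/AAC MP4."""
    if not 10 <= video_mbps <= 16:
        raise ValueError("target 10 to 16 Mbps for clean type and waveforms")
    if sample_rate not in (44100, 48000):
        raise ValueError("use 44.1 kHz or 48 kHz audio")
    return [
        "ffmpeg", "-y", "-i", src,
        "-vf", "scale=1080:1920",            # assumes the source is already 9:16
        "-r", "30",                          # 30 fps output
        "-c:v", "libx264", "-b:v", f"{video_mbps}M",
        "-pix_fmt", "yuv420p",               # widely compatible pixel format
        "-af", "loudnorm=I=-16:TP=-1",       # single-pass loudness normalization
        "-c:a", "aac", "-ar", str(sample_rate),
        "-movflags", "+faststart",
        dst,
    ]


if __name__ == "__main__":
    print(" ".join(reels_ffmpeg_args("episode_clip.mov", "reel.mp4")))
```

Single-pass loudnorm is an approximation. If you need to land on -16 LUFS more precisely, a two-pass measurement run is the usual approach.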
The structure that works
Audiograms perform best when the edit respects how quickly attention decays. Here are two beat maps sized for Reels.
30-second cut
- 0 to 2 s - Silent 2-frame flash or motion logo, then an instant hook caption. Waveform already active. No fade in.
- 2 to 6 s - The payoff headline in a large top bar. Keep it under 7 words. Example: Why your logs lie.
- 6 to 22 s - The core quote. Trim all filler. Use punch-in keyframes every 5 to 7 seconds to add motion without distracting.
- 22 to 27 s - Secondary insight or a "do this today" tip. One sentence, one verb.
- 27 to 30 s - Call to action and clean loop. Example: Follow for debugging tips. End on a held frame that doubles as a thumbnail.
60-second cut
- 0 to 3 s - Visual pattern break. Bold text-on-color or a quick speaker cut-in. No music-only intro.
- 3 to 10 s - Claim or tension: what the listener will gain or what mistake they are making.
- 10 to 35 s - Story chunk 1. One idea, one example. Hard cut silences. Emphasize a key phrase with on-screen text.
- 35 to 50 s - Story chunk 2. Contrast or objection and resolution. Use a color pop to reset attention.
- 50 to 57 s - Action step. 1 to 2 concrete steps max.
- 57 to 60 s - CTA and loop. Return to the opening visual so the reel feels seamless if replayed.
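Both beat maps can be kept as plain data and checked before an edit starts. This is a minimal sketch: the timings come from the two cuts above, while the short labels and the helper names (validate_beats, beat_at) are made up for illustration.

```python
# (start_s, end_s, label) tuples; timings taken from the two cuts above.
BEAT_MAPS = {
    30: [
        (0, 2, "flash and hook caption"),
        (2, 6, "payoff headline"),
        (6, 22, "core quote"),
        (22, 27, "secondary insight"),
        (27, 30, "CTA and loop"),
    ],
    60: [
        (0, 3, "pattern break"),
        (3, 10, "claim or tension"),
        (10, 35, "story chunk 1"),
        (35, 50, "story chunk 2"),
        (50, 57, "action step"),
        (57, 60, "CTA and loop"),
    ],
}


def validate_beats(beats, total_seconds):
    """Raise ValueError unless beats run back to back from 0 to total_seconds."""
    cursor = 0
    for start, end, label in beats:
        if start != cursor:
            raise ValueError(f"gap or overlap before {label!r} at {start}s")
        if end <= start:
            raise ValueError(f"{label!r} has no duration")
        cursor = end
    if cursor != total_seconds:
        raise ValueError(f"beats end at {cursor}s, expected {total_seconds}s")


def beat_at(beats, t):
    """Return the label of the beat covering time t, or None if t is outside."""
    for start, end, label in beats:
        if start <= t < end:
            return label
    return None


for total, beats in BEAT_MAPS.items():
    validate_beats(beats, total)
```

A check like this catches the common mistake of stretching one beat without shortening another, which quietly pushes the CTA past the end of the clip.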
Layout that consistently works for audiograms:
- Top - concise title bar with a solid or blurred background for contrast.
- Middle - square or circular speaker image or logo anchored left or centered, with the waveform occupying the opposite side.
- Bottom - open captions in a high-contrast subtitle band with stroke or shadow. Include a thin progress bar just above the bottom safe zone.
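The layout above can be turned into pixel rectangles once the safe zones are fixed. The sketch below uses the margins from the spec (130 px top, 250 px bottom, 60 px sides on a 1080 x 1920 canvas). The title bar, caption band, and progress bar heights are assumptions chosen for illustration, so adjust them to your own kit.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    w: int
    h: int

    @property
    def bottom(self):
        return self.y + self.h


def layout(width=1080, height=1920, top=130, bottom=250, side=60,
           title_h=200, caption_h=260, progress_h=8):
    """Split the safe area into title, middle, captions, and progress regions."""
    safe = Rect(side, top, width - 2 * side, height - top - bottom)
    title = Rect(safe.x, safe.y, safe.w, title_h)
    # The progress bar sits at the very bottom of the safe area.
    progress = Rect(safe.x, safe.bottom - progress_h, safe.w, progress_h)
    # Captions stack directly above the progress bar.
    captions = Rect(safe.x, progress.y - caption_h, safe.w, caption_h)
    # Speaker image and waveform share whatever remains in between.
    middle = Rect(safe.x, title.bottom, safe.w, captions.y - title.bottom)
    return {"safe": safe, "title": title, "middle": middle,
            "captions": captions, "progress": progress}


if __name__ == "__main__":
    for name, rect in layout().items():
        print(name, rect)
```

With the defaults, the safe area is 960 x 1540 px starting at (60, 130), and nothing is placed in the bottom 250 px where the Reels UI sits.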
Hooks that earn attention
Strong hooks are specific, time-bound, and counterintuitive. Use these formulas and examples:
- Counterintuitive truth + timebox - You do not need a better mic, you need this 10 second filter.
- Numbered micro-framework - 3 steps to clean voice fast - high-pass, de-ess, normalize.
- Data point shock - Most Reels are 40 percent too loud. Here is the LUFS target.
- Myth bust + proof - Silence does not make it awkward. It makes your point land.
- Benefit-first question - Want cleaner sound in 60 seconds without plugins?
Test 3 to 5 variants for the same clip. Swap only the first 2 to 3 seconds and keep the rest identical to attribute lift correctly.
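One way to keep that test honest is to generate the variants from a single body clip so that only the hook can differ. The sketch below is a hypothetical helper, not a feature of any tool. The dictionary keys and the file name are placeholders.

```python
def hook_variants(hooks, body_clip, hook_seconds=3.0):
    """Return one spec per hook; every variant shares the same body clip."""
    if not 3 <= len(hooks) <= 5:
        raise ValueError("test 3 to 5 hook variants per clip")
    if not 2.0 <= hook_seconds <= 3.0:
        raise ValueError("swap only the first 2 to 3 seconds")
    return [
        {
            "variant": chr(ord("A") + i),   # A, B, C, ...
            "hook_text": hook,
            "hook_seconds": hook_seconds,
            "body_clip": body_clip,         # identical across variants
        }
        for i, hook in enumerate(hooks)
    ]


variants = hook_variants(
    [
        "You do not need a better mic, you need this 10 second filter.",
        "3 steps to clean voice fast.",
        "Want cleaner sound in 60 seconds without plugins?",
    ],
    body_clip="clean_voice_body.mp4",  # placeholder file name
)
```

Because the body is shared by construction, any difference in retention between variants can be attributed to the opening seconds.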
Brand + voice
One high-performing audiogram is good. A consistent branded series compounds reach and recall. A brand kit gives you repeatability so each new clip feels like part of a system, not a one-off.
What to lock in your kit:
- Color set - two primaries, one accent, one neutral background. Verify 4.5:1 contrast ratio minimum for text.
- Type - heading and caption fonts, with sizes for 1080 x 1920. Example: Title 84 px, captions 64 px, line height 120 percent.
- Caption style - background boxes vs stroke, maximum lines, and placement rules.
- Waveform style - bar vs line, thickness, animation speed, and color behavior on peaks.
- Motion rules - cut cadence, zoom amounts, transition types. Avoid arbitrary wipes.
- Voice and tone - write a 3 sentence voice charter: audience, promise, and what you will never say.
- CTAs - a small library of 3 to 5 action lines mapped to objectives like follow, newsletter, or site visit.
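The color and type choices above are easy to capture as data and check automatically. The sketch below is a generic illustration, not the HyperVids kit format: the BrandKit class and its field names are assumptions. The contrast math follows the WCAG relative luminance formula, and the colors are the ones used in the sample prompt later in this article.

```python
from dataclasses import dataclass


def _linear(channel):
    """Convert an 8-bit sRGB channel to linear light."""
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color):
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(fg, bg):
    """WCAG contrast ratio between two hex colors, from 1.0 up to 21.0."""
    hi, lo = sorted((relative_luminance(fg), relative_luminance(bg)), reverse=True)
    return (hi + 0.05) / (lo + 0.05)


@dataclass(frozen=True)
class BrandKit:
    background: str
    text: str
    accent: str
    title_px: int = 84         # sizes for a 1080 x 1920 canvas
    caption_px: int = 64
    line_height: float = 1.2   # 120 percent

    def text_is_legible(self, minimum=4.5):
        return contrast_ratio(self.text, self.background) >= minimum


kit = BrandKit(background="#0B1220", text="#FFFFFF", accent="#0EA5E9")
print(kit.text_is_legible())
```

With this palette, white text on the dark background clears 4.5:1 comfortably, while white text placed directly on the accent color does not, so captions over accent-colored areas would need a backdrop.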
HyperVids' per-project brand kit maps these choices to templates. You define colors, fonts, caption rules, waveform style, safe zones, and CTAs once, then every audiogram inherits those settings. That saves edit time and keeps your series recognizable in the feed.
Captions + accessibility
Design for sound-off by default and aim for maximum legibility:
- Always-on captions - open captions burned in with a semi-opaque background or a 3 to 4 px stroke. Avoid thin hairlines.
- Character limits - 24 to 32 characters per line, maximum 2 lines. Split on natural phrase boundaries. Reading rate 12 to 16 characters per second.
- Placement - keep captions in the bottom third but above overlays. Leave ~250 px above the bottom to clear the like/comment bar.
- Contrast - meet WCAG AA. White on near-black or near-black on light. For brand colors, test contrast and add a backdrop if needed.
- Emphasis - bold single keywords only. Do not rainbow every line. Use casing consistently - sentence case is most readable.
- Color vision - avoid red/green only semantic cues. Pair color with icons or weight.
- Audio clarity - gentle noise reduction and a light de-esser before normalization. Keep sibilance in check so captions are not doing all the work.
- Alt and description - add a plain-language description in the post text for screen readers and search. Summarize the clip in one sentence.
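The character limits and reading rate above can be applied mechanically as a first pass. This is a minimal sketch using Python's textwrap module: it breaks only on whitespace, so splitting on natural phrase boundaries still needs a human pass. The caption_cues name and the output shape are illustrative.

```python
import textwrap


def caption_cues(text, max_chars=32, max_lines=2, chars_per_second=16):
    """Wrap text into cues of at most max_lines lines of max_chars characters.

    min_seconds is how long a cue must stay on screen at the given
    reading rate; 16 cps is the fast end of the range, so treat it as a floor.
    """
    lines = textwrap.wrap(text, width=max_chars)
    cues = []
    for i in range(0, len(lines), max_lines):
        block = lines[i:i + max_lines]
        n_chars = sum(len(line) for line in block)
        cues.append({
            "lines": block,
            "min_seconds": round(n_chars / chars_per_second, 2),
        })
    return cues


quote = ("People think logs tell the truth, but sampling, buffering, and "
         "clock skew lie to you. Trust traces and timestamps, not vibes.")

for cue in caption_cues(quote):
    print(cue["min_seconds"], cue["lines"])
```

If a cue's spoken duration is shorter than its min_seconds, the line is too dense for the audio and the transcript needs trimming rather than smaller type.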
A sample HyperVids prompt
Create a 45s Instagram Reels audiogram from this quote: "People think logs tell the truth, but sampling, buffering, and clock skew lie to you. Trust traces and timestamps, not vibes."
- 9x16 at 1080x1920
- H.264 30fps
- normalize to -16 LUFS and cap peaks at -1 dBTP
- trim silences over 150ms
- top title: "Why your logs lie"
- open captions large, 2 lines max, Inter SemiBold 64px with 3px stroke
- colors #0B1220 bg, #0EA5E9 accent, #FFFFFF text
- left circular headshot, right bars waveform in #0EA5E9
- progress bar above bottom safe zone
- end frame 2s with CTA "Follow for debugging tips" and @janedoe_dev
- export MP4 12 Mbps and a matching vertical cover.
When you run this in HyperVids, you receive a vertical MP4 ready for Reels, a clean cover image that matches the title, and consistent captions aligned to your brand kit. The waveform, safe zones, audio levels, and CTA are applied automatically so you can publish without manual tweaks.
Common failure modes
- Soft or noisy audio - viewers swipe within 1 second. Fix - light noise reduction, a high-pass at 80 to 100 Hz, de-ess, then normalize to -16 LUFS.
- Slow start - music intro or logo sting with no value. Fix - start with a hook line or the strongest sentence immediately.
- Cramped captions - 3 to 4 lines of tiny text. Fix - hard edit the transcript and keep lines under 32 characters with larger type.
- Blocked UI - text under the like/comment bar. Fix - respect safe zones. Add a progress bar just above the bottom area, not in it.
- Low contrast - brand colors that are pretty but unreadable. Fix - add a text backdrop or switch to neutral text colors.
- Flat visuals - static image with a tiny waveform. Fix - add subtle punch-ins every 5 to 7 seconds and increase waveform amplitude responsiveness.
- Dead air - untrimmed pauses and filler words. Fix - cut silences over 150 ms and remove um/uh where it does not harm meaning.
- Wrong length - 75 to 90 seconds of ramble. Fix - publish a tighter 25 to 45 second version. Save long cuts for other placements.
- Watermarks - reposting TikTok exports with logos. Fix - render clean masters and upload directly to Instagram.
- No loop planning - the ending crash-cuts to black. Fix - design a final frame that visually matches the first second.
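The dead-air fix above is simple to automate if you have word-level timestamps from a transcript. The sketch below assumes a sorted list of (start, end) times in seconds, which is a hypothetical input format. It also assumes that "cut silences over 150 ms" means shortening long pauses to 150 ms rather than removing them completely, so speech does not sound jammed together.

```python
def keep_intervals(words, max_gap=0.150):
    """Return (start, end) intervals to keep, shortening long pauses to max_gap.

    words: list of (start, end) times in seconds, sorted by start.
    """
    if not words:
        return []
    pad = max_gap / 2
    intervals = []
    cur_start, cur_end = words[0]
    for start, end in words[1:]:
        if start - cur_end > max_gap:
            # Long pause: close this interval and keep half the allowed
            # pause on each side of the cut.
            intervals.append((cur_start, cur_end + pad))
            cur_start = start - pad
        cur_end = max(cur_end, end)
    intervals.append((cur_start, cur_end))
    return intervals


words = [(0.0, 0.5), (0.6, 1.0), (2.0, 2.5)]   # a 1.0 s pause before the last word
kept = keep_intervals(words)
total = sum(end - start for start, end in kept)
print(kept, round(total, 3))
```

The resulting intervals can be fed to whatever editor or script does the actual cutting. The short pause between the first two words is left untouched because it is under the threshold.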
Conclusion
An audiogram that works on Instagram Reels is clear, brisk, and reliably branded. Nail the vertical spec, compress the message into a 20 to 45 second arc, keep captions ultra readable, and let the waveform support the story instead of stealing attention. If you want speed without sacrificing consistency, HyperVids can encode your brand kit and apply it to every audiogram so publishing becomes a repeatable habit, not a heroic edit.
FAQ
What is the ideal length for an audiogram on Instagram Reels?
20 to 45 seconds is the sweet spot for completion and replays. If your quote genuinely needs more air, cap it at 60 seconds and tighten every pause.
Can I reuse a square audiogram on Reels?
You can upload it, but it will be letterboxed, the text will shrink, and elements may collide with the Reels UI. Rebuild for 9:16 with larger type and safe margins for best results.
Should I add music under the voice?
Only if it reinforces energy. Keep it simple, sidechain it 8 to 12 dB under the voice, and mute it during dense phrases. Prioritize clarity over vibe.
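If your editor asks for a linear gain instead of decibels, the 8 to 12 dB range converts with the standard amplitude formula, gain = 10^(-dB/20). The sketch below assumes the voice and music stems start at the same loudness, so the result is the multiplier to apply to the music. The duck_gain name is illustrative.

```python
def duck_gain(db_under_voice):
    """Linear amplitude multiplier that places music N dB under the voice."""
    if not 8 <= db_under_voice <= 12:
        raise ValueError("keep music 8 to 12 dB under the voice")
    return 10 ** (-db_under_voice / 20)


for db in (8, 10, 12):
    print(db, "dB ->", round(duck_gain(db), 3))
```

In practice this means the music sits at roughly 25 to 40 percent of the voice's amplitude.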