How to use LinkedIn voice notes at scale with Aimfox 2026: when voice notes outperform text DMs, where to insert them in the Aimfox sequence, and how to script 30-second messages that get replies.
Ryan Mercer
SDR turned cold email consultant, 8 years outbound · Updated June 24, 2026
TL;DR — 7 things to know before reading
LinkedIn voice notes are one of the most underused differentiation tactics in B2B outreach. In eight years of outbound sales, I have seen text DM reply rates in the 10–15% range from well-crafted Aimfox sequences. For a curated subset of high-value prospects who have not replied to text follow-ups, a personalised 30-second voice note has produced reply rates in the 28–35% range in my own campaigns.
The mechanism is straightforward: voice notes are unusual enough that they interrupt the scroll. They also communicate tone, specificity, and genuine effort in a way that no text message can replicate at the same word count. A prospect who has received and ignored two text DMs has already filtered out the category "text message from an outbound sales person." A voice note does not fit that category and gets processed differently.
This guide covers when to use voice notes, how to integrate them with an Aimfox text-based campaign, how to script a 30-second message that sounds personalised (because it is) without sounding scripted (because it should not be), how to adapt the approach for different prospect personas, what to do when voice note outreach does not produce the expected reply rates, and the advanced tactics that experienced outbound operators use to maximise voice note performance.
Aimfox handles the volume layer; Quarvio provides the verified contact data that populates the campaigns; Instantly and Inframail run the parallel email channel. Voice notes are the tactical layer you apply selectively on top of this infrastructure.
LinkedIn allows users to record and send voice messages to 1st-degree connections via the LinkedIn mobile app. Key specifications:
This means: voice notes require you to be physically present on the LinkedIn mobile app, finding the conversation, pressing record, and speaking.
This is not a limitation — it is the point. Voice notes are differentiated precisely because they require this manual effort. A prospect who receives a voice note knows it was sent by a human specifically to them, not by automation to 200 people simultaneously.
The technical constraints of LinkedIn voice notes also define the workflow: Aimfox automates the connection and the text follow-up sequence to the full prospect list; voice notes are layered on top for the small, curated subset where manual effort is commercially justified. This combination produces the highest efficiency: automated volume at the top of the funnel, differentiated manual effort only where it matters most.
Voice notes are not universally better than text messages. They are specifically better in these situations:
Situation 1: Senior decision-makers who receive high text DM volume
VPs, C-level executives, and founders receive significant text-based outreach daily. They have developed efficient filters for deprioritising or ignoring it. A voice note bypasses these filters because it is rare enough in their LinkedIn inbox to register as different.
Situation 2: High-value prospects who have not replied to 2 text follow-ups
If Step 1 and Step 2 of your Aimfox sequence produced no reply from a prospect where the deal is strategically important, a voice note is the differentiated intervention worth the manual effort. For standard volume accounts, this is not economical. For the top 5–10 prospects from a campaign where one conversion would justify the time, it is.
Situation 3: Prospects where personalised verbal context is more credible than text
If the reason you are reaching out is genuinely specific to something this person published, led, or achieved, a voice note delivers that specificity more convincingly than text. "I actually read your post on [topic] and have a specific thought on the approach you described" lands differently when spoken.
Situation 4: After a positive but stalled text reply
A prospect who replied positively ("Thanks, I'll have a look") but then went quiet is a warm conversation that has stalled. A voice note is a natural, human way to re-engage without the repetitive feel of another text follow-up.
When voice notes are NOT the right tool:
The correct integration of voice notes into an Aimfox campaign:
Phase 1: Aimfox runs the connection campaign (automated)
Configure the standard Aimfox connection request campaign for your full audience. This runs automatically: connection request → Step 1 follow-up (Day 3) → Step 2 follow-up (Day 7–8). Configure stopping rules for all accepted connections.
Phase 2: Triage accepted connections into tiers (daily manual step)
Check Aimfox Unibox daily. As connections accept, classify them:
The triage decision should be made within 48 hours of a connection being accepted. Do not wait until after the full text sequence has completed to decide whether a prospect is Tier 1.
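The triage rule can be sketched in a few lines. The tier definitions are not reproduced in this section, so the criteria below are borrowed from the FAQ's description of Tier 1 (named target accounts, VP-and-above or founder titles, warm signals). The `Prospect` fields, the title keyword list, and the "Tier 2" catch-all label are assumptions for illustration only and are not Aimfox features.

```python
import re
from dataclasses import dataclass

# Hypothetical title words standing in for "VP+, C-level, Founders".
SENIOR_WORDS = {"vp", "svp", "evp", "chief", "founder", "cofounder",
                "ceo", "cro", "coo", "cto", "cfo", "cmo"}


@dataclass
class Prospect:
    name: str
    title: str
    named_target_account: bool = False  # on your named-account list
    warm_signal: bool = False           # e.g. accepted quickly, viewed your profile


def is_senior(title: str) -> bool:
    """Whole-word match on the job title (a rough heuristic, not a seniority model)."""
    lowered = title.lower()
    words = set(re.findall(r"[a-z]+", lowered))
    return bool(words & SENIOR_WORDS) or "vice president" in lowered


def triage(prospect: Prospect) -> str:
    """Return "Tier 1" for voice note candidates, otherwise "Tier 2"."""
    if prospect.named_target_account or prospect.warm_signal or is_senior(prospect.title):
        return "Tier 1"
    return "Tier 2"


if __name__ == "__main__":
    for p in [Prospect("A", "VP of Sales"), Prospect("B", "Director of Operations")]:
        print(p.name, p.title, "->", triage(p))
```

Matching whole words rather than substrings avoids false positives such as "cto" inside "Director". In practice the final call is still a judgment about deal value, so treat a rule like this as a first-pass filter.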
Phase 3: Voice notes for Tier 1, post-sequence (manual)
For Tier 1 prospects who have not replied to Step 1 or Step 2 of the Aimfox sequence:
Phase 4: Track responses in Aimfox Unibox
When a prospect replies to your voice note (via text message on LinkedIn), the reply appears in Aimfox's Unibox like any other LinkedIn conversation. Label it and follow up as you would any hot lead. If the prospect replies positively, cancel their remaining automated Aimfox sequence steps immediately and handle the conversation manually from this point forward.
Recording a high-quality voice note takes 3–7 minutes per prospect when done correctly. Here is the exact sequence:
Step 1: Research the prospect (2–3 minutes)
Before opening the LinkedIn app, look up the prospect's profile on desktop or another browser tab. Note: their current role and company, any posts they published in the last 30 days, any recent career changes or company announcements, and any mutual connections or shared background that could anchor a specific reference.
Step 2: Prepare the three elements (1 minute)
Do not write a script. Write three bullet points:
Step 3: Find a quiet space and prepare to record
Move away from background noise. Hold the phone at a conversational distance (6–8 inches from your mouth). Do not put the phone on speaker; record the standard way to control audio quality.
Step 4: Record in one take
Open LinkedIn mobile, navigate to the conversation, press the microphone button, and speak. One take is the standard. Speaking naturally for 25–35 seconds is the target. Check the duration indicator after recording but before sending.
Step 5: Listen back once before sending
Play the voice note once. Check: did you say their name? Did the hook reference something specific? Does it sound natural or does it sound read? If the note sounds natural and covers all three elements, send it. If it sounds scripted or you made a material error, re-record once.
Step 6: Send and log
Send the voice note. In your tracking spreadsheet or CRM, note the prospect's name, the date the voice note was sent, and the specific hook you used. This log helps you avoid repeating the same reference in a future re-engagement.
A voice note should not be scripted verbatim, because a read-aloud script sounds like a script. Instead, prepare three elements, plus a brief close, and speak naturally around them:
Element 1: Name and personalised hook (5–7 seconds)
Open with their first name and a specific reference. Not "I noticed your profile" but "your post last week about [specific topic]" or "your move from [Company A] to [Company B], that caught my attention."
Element 2: Why you are reaching out (10–12 seconds)
State the specific relevance. Not "I think we might be a fit" but "the [specific aspect of your work] connects directly to what we do with [specific function] teams at [type of company]."
Element 3: The ask (5–7 seconds)
Make it small and easy. "If this is even 20% relevant to your current priorities, worth 10 minutes? Happy to schedule around your calendar."
The close (3–5 seconds)
End naturally. "Let me know either way, no obligation."
Total: approximately 23–31 seconds of speech before natural pauses.
Practice once silently before recording. Do not prepare a script; prepare the three elements and their order. The naturalness of the delivery matters more than verbal precision.
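A small helper can make the duration check after recording less subjective. This sketch uses the thresholds stated in this guide (25–35 seconds as the target, 40 seconds as the ceiling, under 20 seconds as too thin for the voice format). The function name, the wording of the labels, and the handling of the 20–25 second band are assumptions.

```python
def check_duration(seconds: float) -> str:
    """Classify a recorded voice note by length, using this guide's thresholds."""
    if seconds < 20:
        return "too short: a text message would carry this"
    if seconds < 25:
        return "short: usable, but below the 25-35 second target"
    if seconds <= 35:
        return "on target"
    if seconds <= 40:
        return "long: still under the 40 second ceiling"
    return "too long: trim and re-record once"


if __name__ == "__main__":
    for length in (18, 30, 38, 50):
        print(length, "->", check_duration(length))
```

Read the duration off LinkedIn's indicator after recording and compare it against these bands before sending.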
Practical recording tips:
Different prospect types respond to different approaches within the core three-element structure. Here are frameworks for four common B2B persona types:
Persona 1: VP of Sales at a target account
VPs of Sales receive high outbound message volume and are attuned to sales tactics. The hook for this persona must be genuinely specific; a generic compliment immediately signals template use.
Persona 2: Founder of a Series A/B company
Founders are stretched thin and filter ruthlessly for relevance. They are also more accessible than enterprise executives if the message is genuinely relevant.
Persona 3: Head of People / Talent Acquisition Leader
For recruiting use cases, this persona responds to specificity about candidate type and an understanding of the sourcing challenge.
Persona 4: Director of Operations or RevOps
Operations leaders are analytical and respond to specific metrics and process improvements rather than aspirational language.
Do not read from a script: A monotone reading of prepared text is immediately distinguishable from genuine speech and produces the opposite of the intended effect.
Do not pitch the product in the first voice note: The voice note is a relationship-opening move, not a product demonstration. Reference relevance, not features. The product details come after the prospect replies.
Do not make the voice note too long: Above 45 seconds, voice notes start to feel burdensome, because the prospect has to listen for too long before they reach the point. Keep yours under 40 seconds.
Do not send a voice note immediately after connecting: Allow the Aimfox text sequence to run first. Voice notes are most effective as an escalation after text messages have received no reply, not as a first contact.
Do not send voice notes to everyone on your list: The selective use is what makes them effective. If every prospect receives a voice note, they stop being differentiated and start being just another contact channel. Reserve for the top 5–10% of your campaign audience.
Do not open with "Hi, my name is...": This is recognisable as a sales-call opener and immediately signals a scripted approach. Instead, open directly with the hook: their name + the specific reference.
Do not apologise for reaching out: Statements like "Sorry to bother you" or "I know you're busy" signal insecurity and waste the limited 30-second window. Open with the hook directly.
Voice notes are manual effort applied to a small subset of your campaign audience. Volume expectation:
| Campaign size | Tier 1 (voice note candidates) | Voice notes sent per week | Expected reply rate |
|---|---|---|---|
| 200-person campaign | 10–20 prospects | 10–20 | 28–35% |
| 500-person campaign | 25–50 prospects | 25–50 | 28–35% |
| 1,000-person campaign | 50–100 prospects | 50–100 | 28–35% |
Time investment: approximately 3–5 minutes per voice note (finding the conversation, preparing the 3 elements, recording, sending). At 20 voice notes per week, this is 60–100 minutes per week — appropriate for a high-value prospect list where each conversion has significant commercial value.
For standard SMB outreach where the deal size does not justify manual effort at this level, voice notes are not the right tool. The Aimfox text sequence alone is sufficient.
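The volume table and the time estimate reduce to two small calculations. This sketch assumes Tier 1 is 5–10% of the campaign and that each voice note takes 3–5 minutes, as stated above; the function names are hypothetical.

```python
def tier1_range(campaign_size: int, low_pct: int = 5, high_pct: int = 10) -> tuple[int, int]:
    """Return the (low, high) number of Tier 1 voice note candidates."""
    # Integer arithmetic avoids floating-point surprises on round percentages.
    return campaign_size * low_pct // 100, campaign_size * high_pct // 100


def weekly_minutes(notes_per_week: int, min_per_note: int = 3, max_per_note: int = 5) -> tuple[int, int]:
    """Return the (low, high) weekly time investment in minutes."""
    return notes_per_week * min_per_note, notes_per_week * max_per_note


if __name__ == "__main__":
    for size in (200, 500, 1000):
        low, high = tier1_range(size)
        print(f"{size}-person campaign: {low}-{high} Tier 1 prospects")
    print("20 notes per week:", weekly_minutes(20), "minutes")
```

If the upper bound of the weekly time range is more than you can protect in your calendar, narrow the Tier 1 definition before shortening the research step.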
ROI calculation framework:
To decide whether voice notes are worth the effort for a specific campaign, use this calculation:
Example: $25,000 average deal × 30% reply rate × 15% close rate = $1,125 expected value per voice note sent. At 5 minutes per voice note, this is $13,500/hour of expected output. Very few other outbound activities produce this ratio, which is why selective voice note use for high-value accounts is consistently worth the time investment.
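The worked example can be reproduced in a few lines. The formula here is inferred from the example itself (average deal value × reply rate × close rate among replies); the function and parameter names are assumptions, and the inputs are the example's figures, not benchmarks.

```python
def expected_value_per_note(avg_deal: float, reply_rate: float, close_rate: float) -> float:
    """Expected revenue attributable to one voice note sent."""
    return avg_deal * reply_rate * close_rate


def expected_value_per_hour(value_per_note: float, minutes_per_note: float) -> float:
    """Expected revenue per hour spent recording voice notes."""
    return value_per_note * 60 / minutes_per_note


if __name__ == "__main__":
    per_note = expected_value_per_note(avg_deal=25_000, reply_rate=0.30, close_rate=0.15)
    per_hour = expected_value_per_hour(per_note, minutes_per_note=5)
    print(f"Per voice note: ${per_note:,.0f}")
    print(f"Per hour:       ${per_hour:,.0f}")
```

Rerun it with your own deal size and close rate. The result is an expected value, so it is only as reliable as the reply and close rates you feed in.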
The same high-value Tier 1 prospects receiving voice notes should also be in an Instantly email sequence using contact data from Quarvio. Coordinate:
Per Woodpecker's multichannel outreach study, multichannel approaches (email + LinkedIn) produce 40–60% higher total reply rates. The voice note layer on top of this compounds the reply rate further for the specific subset where it is applied.
Timing coordination for Tier 1 prospects:
This staggered coordination ensures the prospect hears from you across channels at intervals that feel like genuine persistence rather than automated bombardment. The voice note at Day 12 is the differentiated step that breaks the pattern of the prior text-based touches.
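The LinkedIn side of this cadence can be turned into concrete calendar dates. This sketch uses the day offsets given in this guide (Step 1 on Day 3, Step 2 on Day 7, voice note on Day 12, a final text DM at the earliest 5 days later). It assumes Day 0 is the date the connection was accepted, which the text does not state explicitly, and it leaves out the email touches because their timing is not specified here.

```python
from datetime import date, timedelta

# (label, days after Day 0). Day 0 = connection accepted (an assumption).
LINKEDIN_TOUCHES = [
    ("Aimfox Step 1 text follow-up", 3),
    ("Aimfox Step 2 text follow-up", 7),
    ("Manual voice note", 12),
    ("Final text DM if no reply", 17),
]


def build_schedule(day_zero: date) -> list[tuple[str, date]]:
    """Return each LinkedIn touch with its calendar date."""
    return [(label, day_zero + timedelta(days=offset)) for label, offset in LINKEDIN_TOUCHES]


if __name__ == "__main__":
    for label, when in build_schedule(date(2026, 6, 1)):
        print(f"{when.isoformat()}  {label}")
```

The dates are raw offsets, so a touch can land on a weekend; shift those manually to the next suitable business day.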
| Parameter | Recommended value | Notes |
|---|---|---|
| Duration | 25–35 seconds | 40 seconds maximum |
| Timing in sequence | After 2 text DMs with no reply | Do not use as first contact |
| When to apply | Top 5–10% of accepted connections | High-value prospects only |
| Preparation time | 2–3 minutes research + 1 minute prep | Per prospect |
| Recording time | 30–90 seconds (one take) | One take standard |
| Hook type | Specific reference from their profile/posts | Not a generic compliment |
| Script approach | 3 bullet points, speak naturally | No verbatim scripts |
| Recording environment | Quiet, low background noise | Critical for audio quality |
| Phone distance | 6–8 inches from mouth | Closer = distortion |
| Takes before sending | 1–2 maximum | More produces stiff delivery |
| Follow-up if no reply | 1 text DM after 5–7 days | Then close the sequence |
| Use for recruiting | Yes, Tier 1 candidates | Same framework applies |
| Sequence cancel on reply | Immediately | Manual action required |
| Email cancel on reply | Within same day | Check Instantly after voice note reply |
Aimfox reviews on G2 include practitioners who describe using Aimfox for the volume layer of LinkedIn outreach and manual voice notes for high-value accounts, with reply rates from voice notes cited as materially higher than text DM follow-ups for senior decision-maker audiences.
Across reviews in G2's LinkedIn automation tools category, voice notes are mentioned as a practitioner differentiation tactic in senior B2B prospecting; the manual-effort-as-signal aspect is the primary reason practitioners use them selectively.
"I use Aimfox for the connection campaign and the first two text follow-ups to everyone. For the 15 most strategically important accounts in each campaign, I send a voice note as Step 3 instead of another text DM. The reply rate from voice notes is consistently higher than Step 2 text follow-ups for those accounts. The extra 5 minutes per prospect is worth it when the deal is worth six figures."
— Verified G2 reviewer, enterprise account executive, B2B SaaS, Aimfox on G2
"Voice notes work because they cannot be faked at scale. The prospect knows you recorded it specifically for them. That specificity — even if the content is similar to what a text message would say — changes how they receive it. It is not automation. That is the point."
— Verified G2 reviewer, business development director, management consulting, Aimfox on G2
Symptoms: Voice notes are being sent to Tier 1 prospects, but the reply rate is not materially different from the text DM reply rate. The differentiation benefit is not materialising.
Diagnosis steps:
Fix: Increase the specificity of the hook. The hook should reference something that would surprise the prospect with your research depth: a specific post they published, a company announcement, a career transition detail. Generic compliments ("Your background is impressive") do not produce the differentiation effect. If timing is the issue, ensure voice notes are sent only after 2 text DMs have received no reply, not as the first or second contact.
Symptoms: When trying to send a voice note, the microphone button does not appear in the message input field.
Diagnosis steps:
Fix: Confirm 1st-degree connection status and update the LinkedIn app. If the feature is still not available after confirming both, the prospect's account settings may have restrictions on who can send them audio messages. Send a text DM instead.
Symptoms: Voice notes are sending but the audio quality is noticeably poor: muffled, distorted, or with significant background noise.
Diagnosis steps:
Fix: Find a quieter recording environment for voice note sessions. If working in an open office, a phone booth, conference room, or even a car provides significantly better audio quality. Hold the phone 6–8 inches from the mouth. Remove any phone case attachment that might obstruct the microphone. Test audio quality by recording a test message to yourself before starting a voice note session.
Symptoms: LinkedIn shows that voice notes have been listened to (the listen indicator appears), but the prospect does not reply. Reply rates are low despite evidence of engagement.
Diagnosis steps:
Fix: Sharpen the ask. The ask at the end of the voice note should be a specific question with a low-friction response path: "Happy to send a quick overview — would that be useful?" or "I have 2 slots open Thursday — does either work?" A binary question is easier to respond to than an open-ended invitation. If the hook is the issue, increase research depth before future notes.
Symptoms: A Tier 1 prospect received a voice note from you manually and an automated Step 3 text DM from the Aimfox sequence on the same day. The prospect replied negatively about receiving multiple messages simultaneously.
Diagnosis steps:
Fix: Establish a protocol: before sending a voice note to any Tier 1 prospect, pause or cancel their remaining Aimfox sequence steps first. This takes 30 seconds per prospect but prevents the dual-message problem. For prospects where the automated step sent simultaneously, acknowledge the overlap in your next message: "Apologies for the double message — I wanted to reach out personally." Most prospects who would have responded will still respond; the apology signals awareness.
Symptoms: Trying to send a voice note to a specific Tier 1 prospect, but finding their conversation in LinkedIn mobile takes significant time or the conversation is buried.
Diagnosis steps:
Fix: When triaging Tier 1 prospects in Aimfox, create a simple tracking list (even a text note on your phone) of the specific names and companies you plan to send voice notes to that day. Search for each name directly in LinkedIn messaging rather than scrolling. Alternatively, send a placeholder text message to the prospect first (from desktop) so the conversation is elevated in the recent messages list, making it easy to find in mobile.
Symptoms: A Tier 1 prospect replied to the voice note on LinkedIn. The conversation is visible in LinkedIn native but does not appear in Aimfox Unibox.
Diagnosis steps:
Fix: Wait 30 minutes after a prospect's reply before concluding it is not in Unibox. If it still does not appear, re-authenticate the LinkedIn account in Aimfox settings. As a temporary measure, manage voice note replies directly in LinkedIn while the sync issue is investigated.
Symptoms: With multiple active campaigns and a list of Tier 1 prospects, it is becoming difficult to track who has received a voice note, when, and what the current status of each conversation is.
Diagnosis steps:
Fix: Create a simple spreadsheet or CRM tracking view specifically for Tier 1 voice note prospects. Columns: name, company, LinkedIn URL, date voice note sent, hook used, current status (sent, replied, booked, declined, no reply). Update this list daily. In Aimfox Unibox, apply a specific label (e.g., "Voice Note" or "VN Sent") to conversations where a voice note has been sent. This filter allows you to quickly review all voice note conversations separately from automated sequence conversations.
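If you prefer a file to a hand-maintained spreadsheet, the same tracking view can be sketched as a small record type that exports to CSV. The columns and status values follow the list above; the class name, field names, and `to_csv` helper are hypothetical.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from datetime import date

STATUSES = {"sent", "replied", "booked", "declined", "no reply"}


@dataclass
class VoiceNoteEntry:
    name: str
    company: str
    linkedin_url: str
    date_sent: date
    hook_used: str
    status: str = "sent"

    def __post_init__(self) -> None:
        # Reject typos so the status column stays filterable.
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status!r}")


def to_csv(entries: list[VoiceNoteEntry]) -> str:
    """Serialise entries to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(VoiceNoteEntry)])
    writer.writeheader()
    for entry in entries:
        row = asdict(entry)
        row["date_sent"] = entry.date_sent.isoformat()
        writer.writerow(row)
    return buffer.getvalue()


if __name__ == "__main__":
    log = [VoiceNoteEntry("Jane Doe", "Acme", "https://www.linkedin.com/in/example",
                          date(2026, 6, 1), "post on onboarding metrics")]
    print(to_csv(log))
```

Recording the hook alongside the date is what prevents you from reusing the same reference in a later re-engagement.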
For the very highest-value prospects — the top 1–2% of your audience where a single conversion would be a significant outcome — a personalised video DM is the next tier beyond a voice note. LinkedIn supports native video messaging to 1st-degree connections through the mobile app.
A 45–60 second video adds the visual element to the audio signal: the prospect sees a person talking to them, not just a voice. This further reduces the "automated" perception and increases the feeling of genuine personal contact. The tradeoff: video requires more setup (camera-facing environment, appropriate background), more preparation (knowing what you are going to say on camera), and more recording time.
Reserve video DMs for named target accounts where the deal would be material to your business. Voice notes for the broader Tier 1 group; video DMs for the top 5–10 prospects from the entire campaign.
Prospects who engaged warmly but went quiet — responded positively to a connection request or text DM months earlier but never booked a meeting — represent a warm audience for re-engagement. A voice note is the highest-signal re-engagement tactic for these conversations because it signals that you specifically remembered the prior conversation and thought of them.
Voice note re-engagement script for stalled conversations:
This re-engagement approach produces strong reply rates (20–30%) from previously warm conversations because it demonstrates genuine attention to the specific prior interaction.
The highest time cost in voice note outreach is the research required to produce a genuinely specific hook. Over time, you will develop patterns: the most memorable posts, the most common career transitions, and the most resonant company milestones in each vertical you target.
Build a notes document organised by vertical (SaaS founders, VP of Sales at series B, Head of Talent at growth-stage companies) that lists:
This library does not script your voice notes. It gives you a starting point that speeds up the research and preparation phase without removing the personalisation. A hook from the library still needs to be made specific to the individual prospect before recording.
For the most strategically important Tier 1 prospects, sending a voice note on LinkedIn and a personalised email in Instantly on the same day creates a coordinated multi-channel moment that is hard to ignore without being aggressive.
The coordination works because the channels are different enough that it does not feel like harassment:
This cross-channel acknowledgment shows multi-channel awareness without being sycophantic. The prospect understands you are making a genuine effort to reach them. Per Woodpecker's multichannel outreach study, same-day multi-channel touches are among the highest-performing combinations for senior decision-maker audiences.
After running voice note outreach for 2–3 months across multiple campaigns, review the reply patterns:
Document the findings and use them to:
| Need | Tool | Notes |
|---|---|---|
| Verified B2B contacts | Quarvio | One-time purchase, no subscription |
| Email inboxes | Inframail | Microsoft 365 inboxes, auto DNS |
| Cold email sending | Instantly | Sequences, warm-up, reply tracking |
| LinkedIn outreach | Aimfox | Connection campaigns, Unibox |
Can Aimfox automate LinkedIn voice notes?
No. LinkedIn voice notes must be recorded and sent manually through the LinkedIn mobile app. No LinkedIn automation tool, including Aimfox, can send voice notes programmatically — LinkedIn does not expose voice note functionality through any external API. Aimfox handles the automated text-based connection campaign and follow-up sequences; voice notes are a manual tactic layered on top for high-priority prospects within your existing connections.
How long should a LinkedIn voice note be for outbound prospecting?
25–35 seconds is the optimal range. Under 20 seconds does not provide enough context to justify the voice format (a text message would deliver the same content more conveniently). Over 45 seconds starts to feel like a burden on the listener's time before they know whether the content is worth it. Structure the note as: personalised hook (5–7 seconds) + specific relevance (10–12 seconds) + small ask (5–7 seconds) + graceful close (3–5 seconds).
Should I send a voice note as the first message after a connection accepts?
No. Let the Aimfox text sequence run first. Voice notes are most effective as an escalation tactic after one or two text follow-ups have not received a reply. Sending a voice note as the first follow-up (before text DMs) removes the escalation dynamic — you have used your highest-differentiation tactic as your opening move, with no escalation option remaining.
Which LinkedIn prospects are worth the manual effort of a voice note?
Tier 1 prospects where the commercial value of a single conversion justifies 5–8 minutes of manual effort per person. This typically means: named target accounts, senior decision-makers (VP+, C-level, Founders), or prospects who have shown warm signals (accepted quickly, viewed your profile after connecting) but have not replied to text follow-ups. For standard volume campaigns where the deal size is average, the Aimfox text sequence is sufficient without voice note escalation.
What is the best time of day to send LinkedIn voice notes?
Send voice notes during the prospect's business hours. The optimal windows are Tuesday through Thursday between 9 am and 11 am, or 2 pm and 4 pm, in the prospect's local timezone. Avoid Monday mornings (prospects are catching up on the week) and Friday afternoons (attention is elsewhere). Voice notes sent outside business hours are more likely to be overlooked in the morning notification backlog.
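When prospects sit in several timezones, a quick check helps before you pick up the phone. This sketch encodes the windows above (Tuesday to Thursday, 9–11 am or 2–4 pm, prospect's local time). The function name is an assumption, and the fixed UTC offset in the example is a stand-in; for real use, `zoneinfo.ZoneInfo` with the prospect's region handles daylight saving.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo

SEND_DAYS = {1, 2, 3}  # Tuesday, Wednesday, Thursday (Monday is 0)
SEND_WINDOWS = [(time(9, 0), time(11, 0)), (time(14, 0), time(16, 0))]


def in_send_window(moment: datetime, prospect_tz: tzinfo) -> bool:
    """Return True if `moment` (timezone-aware) falls in a recommended window."""
    local = moment.astimezone(prospect_tz)
    if local.weekday() not in SEND_DAYS:
        return False
    return any(start <= local.time() < end for start, end in SEND_WINDOWS)


if __name__ == "__main__":
    prospect_tz = timezone(timedelta(hours=-4))  # example offset only
    now = datetime.now(timezone.utc)
    print("Good time to send:", in_send_window(now, prospect_tz))
```

Because voice notes are recorded live, the check applies to the moment you record, not to a scheduled send.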
Should I listen to the voice note before sending?
Yes, once. A single listen-back takes 30 seconds and lets you catch material errors (missed their name, wrong company reference, audio quality issue) without introducing the rehearsed quality that comes from recording and re-recording multiple times. If the note sounds natural and covers the three elements, send it. If it sounds scripted, record once more. Do not record more than twice — the second take should be enough.
How do I know if a prospect listened to my voice note on LinkedIn?
LinkedIn shows a "listened" indicator in some message threads once the recipient has played the voice note. This indicator is not always reliable and depends on the prospect's notification and listening settings. Do not use the absence of a "listened" indicator to conclude the voice note was not heard — some prospects listen without triggering the indicator.
What should I do if a prospect replies to a voice note saying they are not interested?
Thank them for listening and responding. Apply the "Not Interested" label in Aimfox Unibox immediately. Cancel any remaining Aimfox sequence steps and any Instantly email sequence steps for that prospect. Do not re-engage for at least 6 months. A prospect who listened to a personal voice note and explicitly declined has made an informed decision that should be respected.
Can I use voice notes for recruiting outreach as well as sales outreach?
Yes. The same approach applies to recruiting: Aimfox runs the connection campaign and text sequence for all candidates; voice notes go to Tier 1 candidates (senior roles, named target companies) after 2 text steps with no reply. The hook for a recruiting voice note references the candidate's specific career background or achievement rather than a company milestone. See the Aimfox recruiting guide for full candidate-specific frameworks.
Is it worth sending a second voice note if the first received no reply?
Generally no. If a well-crafted voice note from a genuine, researched hook received no reply after 7 days, sending a second voice note is unlikely to produce a different outcome. The exception: if significant time has passed (60+ days) and a new, distinct hook has emerged (new company milestone, recent post, industry development), a single re-engagement voice note may be appropriate. Sending a second voice note too quickly feels like persistence bordering on pressure.
What should a follow-up text DM say after a voice note received no reply?
Wait 5–7 days after the voice note before sending a text DM. Keep it brief: 2 sentences maximum. Acknowledge the voice note without being apologetic about it: "Sent you a voice note last week about [brief context] — happy to connect if timing is ever right." Then close the sequence. Do not reference the voice note as evidence of effort ("I already went out of my way to send a voice note") — this frames the follow-up as a demand for reciprocity, which is counterproductive.
What reply rate should I realistically expect from voice notes sent to senior decision-makers?
For well-crafted voice notes with specific hooks sent to genuinely Tier 1 prospects after 2 unreplied text DMs, expect reply rates in the 28–35% range. This is 2–3 times the typical reply rate from a third text DM to the same audience. A rate consistently below 15% suggests either that the hooks are not specific enough (generic compliments rather than research-based references) or that the prospect tier is not genuinely Tier 1 (not senior enough, or not high enough in deal value to justify the manual effort).
The voice note is the manual layer. The data is the foundation.
Aimfox runs the connection campaign at volume; voice notes go to the top 5–10% of high-value accepted connections. Quarvio ensures that top 5–10% is from a verified, targeted list of the right prospects, not random connections. One-time purchase, credits valid for 12 months, no subscription.
Pricing from $129 for 5,000 contacts to $699 for 50,000 contacts.