A speech-to-text and AI summarization app powered by your own OpenAI API key — no subscription required. You only pay OpenAI's usage fees when you use it. From recording to transcription, AI summaries, meeting minutes, and Slack posting, the entire workflow is handled in one app.
Download on the App Store
WhisperDirect has no in-app subscription. You pay OpenAI directly, with no middleman markup and no monthly fees.
≈ $0.36/hour
The Whisper API is priced at approximately $0.006/minute, which works out to about $0.36 per hour. With $5, you can transcribe about 14 hours of audio.
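The arithmetic behind these figures can be checked with a short sketch. It assumes the $0.006/minute rate quoted above holds and that cost scales linearly with audio length. The function names are illustrative and are not part of WhisperDirect.

```python
# Assumed rate, taken from the figure quoted on this page.
# OpenAI's actual pricing may change.
WHISPER_USD_PER_MINUTE = 0.006


def transcription_cost_usd(audio_minutes: float) -> float:
    """Estimated transcription cost for a given audio length in minutes."""
    return audio_minutes * WHISPER_USD_PER_MINUTE


def hours_for_budget(budget_usd: float) -> float:
    """Hours of audio a given budget would cover at the assumed rate."""
    return budget_usd / WHISPER_USD_PER_MINUTE / 60


print(f"1 hour costs about ${transcription_cost_usd(60):.2f}")
print(f"$5 covers about {hours_for_budget(5):.1f} hours")
```

At the assumed rate, one hour comes to about $0.36 and $5 covers a little under 14 hours.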
A few cents/run
Choose from GPT-4.1-nano, GPT-4.1-mini, GPT-5-nano, or GPT-5-mini. Even long texts of 1,000–2,000 words can typically be processed for just a few cents.
Available models may change based on OpenAI's offerings.
$0/month (app fee)
No monthly subscription to WhisperDirect. After the one-time unlock, you pay OpenAI directly for what you use — nothing more.
Free trial: 5 sessions included before purchase.
From setup to your first transcription in just a few minutes.
From audio capture to transcription, AI summarization, and team sharing — everything you need to streamline your post-meeting workflow.
Record in real time with the microphone button, including from Bluetooth microphones, which are switchable via the selector above the record button. Import audio files via the Share Sheet, or load video files with automatic audio extraction and compression. Supported audio: mp3, m4a, aac, wav, flac, ogg, opus, wma, amr, mpga, webm, aiff, caf. Supported video: mp4, mov, m4v, webm, mkv, avi, mpeg, mpg.
Automatically post transcripts, summaries, and meeting minutes to a specified Slack channel. Automatic backup to Google Drive is also supported. When a meeting ends, your notes are delivered to the team without any manual steps.
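For readers curious what posting to Slack might involve, here is a minimal sketch. It assumes a Slack incoming webhook, which accepts a JSON body with a `text` field. This page does not say which Slack mechanism WhisperDirect uses, so the function names, the webhook URL, and the message layout are all hypothetical.

```python
import json
import urllib.request


def build_slack_request(webhook_url: str, title: str, summary: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    # Hypothetical message layout: bold title followed by the summary text.
    payload = {"text": f"*{title}*\n{summary}"}
    return urllib.request.Request(
        webhook_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_to_slack(request: urllib.request.Request) -> int:
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Keeping request construction separate from sending makes the payload easy to inspect before anything reaches a channel.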
Transcript text highlights automatically in sync with audio playback. Jump to any segment with a tap. Timeline markers are configurable in 5-second steps, making it easy to navigate long recordings.
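One way such synchronized highlighting could work is sketched below. It assumes each transcript segment has a start time in seconds and that the list is sorted. The app's actual implementation is not described here, and both function names are illustrative.

```python
from bisect import bisect_right


def active_segment_index(segment_starts, playback_seconds):
    """Index of the segment containing the playback position.

    Returns None when playback is before the first segment starts.
    segment_starts must be sorted in ascending order.
    """
    index = bisect_right(segment_starts, playback_seconds) - 1
    return index if index >= 0 else None


def timeline_markers(duration_seconds, interval_seconds=5):
    """Marker positions from 0 up to the duration, at a fixed interval."""
    return list(range(0, int(duration_seconds) + 1, interval_seconds))


starts = [0.0, 4.2, 9.8, 15.1]
print(active_segment_index(starts, 10.0))  # 2
print(timeline_markers(22, 5))             # [0, 5, 10, 15, 20]
```

A binary search keeps the lookup fast even for recordings with thousands of segments.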
Extract text from photos of whiteboards or documents using high-accuracy OCR. Supports batch processing of multiple images. All processing happens on-device — no API cost.
Edit the prompts used for summaries and meeting minutes directly from Settings. Tailor the AI output to your team's needs — extract action items only, output in bullet points, or any format you choose.
Export audio files, transcripts, summaries, and meeting minutes. Also supports VTT and SRT subtitle file generation for use with video content and editing tools.
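As an illustration of the SRT format, the sketch below turns timed transcript segments into subtitle text. The `(start, end, text)` segment shape is an assumption made for this example and is not WhisperDirect's internal data model.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def to_srt(segments) -> str:
    """Render (start_seconds, end_seconds, text) tuples as SRT cues."""
    blocks = []
    for number, (start, end, text) in enumerate(segments, start=1):
        blocks.append(
            f"{number}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text.strip()}"
        )
    # Cues are separated by a blank line.
    return "\n\n".join(blocks) + "\n"


print(to_srt([(0.0, 2.5, "Welcome, everyone."), (2.5, 6.0, "Let's get started.")]))
```

VTT output is similar, but it begins with a `WEBVTT` header line and uses a period instead of a comma before the milliseconds.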
Estimated API costs are calculated in real time based on audio length and character count, so you always have a clear sense of usage before you run a job.
Select your LLM model, adjust timeline intervals in 5-second steps, edit prompts, and configure Slack and Google Drive settings — all from a single settings screen.
Record your meeting and generate AI meeting minutes immediately after. Auto-post to Slack to eliminate manual note-taking entirely.
Transcribe long interviews with high accuracy. Use timeline markers to quickly reference specific statements.
Record lectures and automatically extract key points for study notes and review materials.
Import a video file and export VTT or SRT subtitle files ready for YouTube or your video editing software.