A desktop Electron app that listens to a live interview through the microphone, transcribes speech in real time with Deepgram, and streams AI-generated answers from Groq, OpenAI, Anthropic, or Google Gemini. It also includes screenshot analysis for coding and whiteboard screens.
| Feature | Description |
|---|---|
| Live Transcription | Captures microphone audio, streams it to Deepgram via WebSocket, and displays real-time & final transcripts |
| AI Answer Generation | Automatically sends the last few transcript sentences to an LLM and displays generated answers in real-time |
| Screenshot Analysis | One-click screenshot capture + AI vision analysis for coding questions, shared documents, or video call screens |
| Multi-Provider LLM | Supports Groq, OpenAI, Anthropic (Claude), and Google Gemini — switch in Settings |
| Custom Context | Upload your resume, job description, and extra instructions so answers are tailored to you |
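
The split between real-time and final transcripts in the Live Transcription row can be sketched as follows. This is a minimal illustration, not the app's actual code: `TranscriptBuffer` and its fields are hypothetical names, and the message shape assumes Deepgram's streaming `Results` messages (`is_final`, `channel.alternatives[0].transcript`).

```typescript
// Subset of a Deepgram streaming "Results" message; only the fields used here.
interface DeepgramResult {
  type?: string;
  is_final?: boolean;
  channel?: { alternatives?: { transcript?: string }[] };
}

// Hypothetical helper: one interim line plus a rolling list of final lines.
class TranscriptBuffer {
  interim = "";
  readonly finals: string[] = [];
  private readonly maxFinals: number;

  constructor(maxFinals = 50) {
    this.maxFinals = maxFinals;
  }

  // Feed one raw WebSocket text frame; returns true if a new final line was added.
  ingest(raw: string): boolean {
    let msg: DeepgramResult | null;
    try {
      msg = JSON.parse(raw) as DeepgramResult | null;
    } catch {
      return false; // ignore malformed frames
    }
    if (msg?.type !== "Results") return false; // skip metadata and other messages
    const text = msg.channel?.alternatives?.[0]?.transcript?.trim() ?? "";
    if (!text) return false;
    if (!msg.is_final) {
      this.interim = text; // interim text is overwritten by each update
      return false;
    }
    this.interim = "";
    this.finals.push(text);
    if (this.finals.length > this.maxFinals) this.finals.shift(); // keep the list rolling
    return true;
  }
}
```

A caller could trigger answer generation whenever `ingest` returns `true`, since that marks a newly finalized sentence.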
┌─────────────────────────────────────────────┐
│ [Start Transcription] [AI] [Screen] [End]   │
├─────────────────────────────────────────────┤
│ Live Transcript (top, compact, rolling)     │
│ "What is the time complexity of..."         │
│ "Can you explain how React hooks work?"     │
├─────────────────────────────────────────────┤
│ AI Answers (bottom, main area)              │
│ Q: "What is the time complexity..."         │
│ A: "O(n log n) because..."                  │
├─────────────────────────────────────────────┤
│ Debug Log (amber, shows pipeline status)    │
└─────────────────────────────────────────────┘
| Layer | Technology |
|---|---|
| Framework | Electron 35 + React 18 + TypeScript |
| Bundler | Webpack (via ERB — Electron React Boilerplate) |
| Styling | Tailwind CSS |
| Transcription | Deepgram Streaming API (WebSocket, nova-2 model) |
| AI LLM | Groq / OpenAI / Anthropic / Google Gemini (streaming) |
| Screenshot | screenshot-desktop (native module) |
| WebSocket | ws library (Node.js, main process) |
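
To illustrate the "streaming" entry for the LLM layer, here is a rough sketch of extracting text deltas from an OpenAI-style server-sent event stream. Groq and OpenAI both expose this chat-completions format; Anthropic and Gemini use different event shapes and would need their own parsers. `parseSseChunk` is a hypothetical helper name, not something taken from this repository.

```typescript
// Extracts text deltas from buffered OpenAI-style SSE data.
// Returns the text found, any trailing partial line to prepend to the next
// chunk, and whether the "[DONE]" sentinel was seen.
function parseSseChunk(buffered: string): { text: string; rest: string; done: boolean } {
  const lines = buffered.split("\n");
  const rest = lines.pop() ?? ""; // the last element may be an incomplete line
  let text = "";
  let done = false;
  for (const line of lines) {
    const trimmed = line.trim();
    if (!trimmed.startsWith("data:")) continue; // skip blank lines and comments
    const payload = trimmed.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    try {
      const json = JSON.parse(payload);
      text += json?.choices?.[0]?.delta?.content ?? "";
    } catch {
      // ignore a malformed event rather than aborting the stream
    }
  }
  return { text, rest, done };
}
```

A reader loop would decode each `fetch` body chunk to a string, call `parseSseChunk(rest + chunk)`, append `text` to the displayed answer, and carry `rest` forward until `done` is true.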
- macOS (11+ / Big Sur or newer)
- Node.js 20+ and npm
- API keys:
  - Deepgram (required for transcription): console.deepgram.com
  - At least one LLM provider: Groq (fastest, free tier available; console.groq.com), OpenAI, Anthropic, or Gemini
git clone https://github.com/yaxit24/pm.git
cd pm
npm install

This installs all dependencies and rebuilds native modules for Electron.
Launch the app in dev mode:
npm start

Click Settings (gear icon) and enter:
- Deepgram API Key — required for transcription
- LLM API Key — required for AI answers (pick one provider)
  - Groq (recommended, fastest): llama-3.3-70b-versatile
  - OpenAI: gpt-4o-mini
  - Anthropic: claude-3-5-sonnet-20241022
  - Gemini: gemini-2.0-flash
In Settings, fill:
- Job Description — so AI knows the role
- Resume — so AI tailors answers to your background
- Extra Instructions — any custom rules (e.g., "Keep answers under 50 words")
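
One way these three fields could be combined with the recent transcript is sketched below. The function name, the system prompt wording, and the message layout are assumptions made for illustration; only the three context fields and the five-sentence limit come from this README.

```typescript
interface InterviewContext {
  jobDescription: string;
  resume: string;
  extraInstructions: string;
}

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Hypothetical prompt builder: non-empty context fields go into the system
// message, and the last few final transcript sentences form the user message.
function buildMessages(
  ctx: InterviewContext,
  finals: string[],
  maxSentences = 5,
): ChatMessage[] {
  const sections = [
    "You are helping a candidate answer interview questions.", // placeholder wording
    ctx.jobDescription ? `Job description:\n${ctx.jobDescription}` : "",
    ctx.resume ? `Resume:\n${ctx.resume}` : "",
    ctx.extraInstructions ? `Extra instructions:\n${ctx.extraInstructions}` : "",
  ].filter((s) => s.length > 0);
  const recent = finals.slice(-maxSentences).join(" ");
  return [
    { role: "system", content: sections.join("\n\n") },
    { role: "user", content: recent },
  ];
}
```

Capping the transcript at a handful of sentences keeps the request small, which matters more for latency than for cost in a live setting.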
npm run package

Output:
- release/build/mac-arm64/pmodule.app: Apple Silicon (M1/M2/M3)
- release/build/mac/pmodule.app: Intel Mac
- release/build/pmodule-3.2.28-arm64.dmg: DMG installer
- Open the app before your interview starts
- Click Start Transcription — the mic icon turns green when connected
- Speak normally — transcripts appear live in the top panel
- AI answers appear automatically in the bottom panel as new sentences finalize
- Click Screen to capture & analyze any coding screen or document
- Click End when finished
┌────────────────┐     ┌────────────────┐     ┌─────────────────────┐
│ Renderer       │────▶│ Main Process   │────▶│ Deepgram WebSocket  │
│ (React)        │     │ (Node.js)      │     │ wss://api.deepgram  │
│                │     │                │     │ /v1/listen          │
│ - Mic capture  │     │ - ws library   │     │                     │
│ - MediaRecorder│     │ - Auth header  │     │ Audio → Transcripts │
│ - IPC send     │◀────│ - IPC relay    │◀────│                     │
│                │     │                │     │                     │
│ - Transcript   │     │                │     │ LLM Providers       │
│   display      │     │                │     │ (fetch, streaming)  │
│ - AI answers   │◀────│                │◀────│ Groq/OpenAI/etc     │
└────────────────┘     └────────────────┘     └─────────────────────┘
Why main process for WebSocket? Browser WebSockets cannot set custom headers like Authorization: Token <key>. We use the Node.js ws library in the main process, then relay transcripts back to the renderer via IPC.
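
As a rough sketch of what the main process needs before opening the socket, the helper below assembles the Deepgram URL and the auth header. `buildDeepgramRequest` is a hypothetical name, and the query parameters shown (`model`, `interim_results`) are an assumed minimal set; the app may send others.

```typescript
interface DeepgramRequest {
  url: string;
  headers: Record<string, string>;
}

// Hypothetical helper: builds the streaming endpoint URL and the
// Authorization header that a browser WebSocket could not attach.
function buildDeepgramRequest(apiKey: string, model = "nova-2"): DeepgramRequest {
  const params = new URLSearchParams({ model, interim_results: "true" });
  return {
    url: `wss://api.deepgram.com/v1/listen?${params.toString()}`,
    headers: { Authorization: `Token ${apiKey}` },
  };
}
```

With the ws library, the result would be passed as `new WebSocket(url, { headers })`, which forwards the headers on the upgrade request. Keeping the key in the main process also means it never has to be exposed to renderer code.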
| Symptom | Likely Cause | Fix |
|---|---|---|
| "Transcript 0 lines" but connection shows true | Mic returning silence (macOS hardened runtime blocks microphone access) | Rebuild with mic entitlements (already included). Reset permission: tccutil reset Microphone com.electron.pmodule |
| "WebSocket was closed before connection" | Race condition on reconnect | Fixed — uses terminate() + generation counter |
| No AI answers appearing | autoGen disabled or no LLM key | Enable "Auto-generate answers" in Settings and add an LLM key |
| Gatekeeper warning on other Macs | App is signed but not notarized | User runs xattr -cr /Applications/pmodule.app once |
| Audio chunks are ~70 bytes | Mic stream is silent (permission or hardware) | Check System Settings → Sound → Input level is not at zero |
| Slow AI answers | Slow provider or large context | Use Groq and keep answers brief; transcript context is already limited to the last 5 sentences |
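
The reconnect fix mentioned in the table (terminate() plus a generation counter) follows a common pattern, sketched here under assumptions. `ReconnectGuard` and `SocketLike` are hypothetical names, and this is not the repository's actual code; the only requirement on the socket is a `terminate()` method, which the ws library's WebSocket provides.

```typescript
// Minimal socket shape needed by the guard.
interface SocketLike {
  terminate(): void;
}

// Hypothetical guard: each new connection gets a generation number, and
// callbacks from older connections can detect that they are stale.
class ReconnectGuard<S extends SocketLike> {
  private generation = 0;
  private current: S | null = null;

  // Tears down any previous socket, then creates and tracks a new one.
  replace(create: () => S): { socket: S; isCurrent: () => boolean } {
    const myGeneration = ++this.generation;
    this.current?.terminate(); // hard close, skipping the closing handshake
    const socket = create();
    this.current = socket;
    return { socket, isCurrent: () => myGeneration === this.generation };
  }
}
```

Event handlers attached to `socket` would start with `if (!isCurrent()) return;`, so late `close` or `error` events from a superseded connection cannot trigger another reconnect or overwrite fresh state.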
- The app is signed with a Mac Developer Distribution certificate
- It is not notarized — users on other Macs will see a Gatekeeper warning
- To distribute without warnings, enable notarization in package.json and set Apple ID credentials before building
- Supports both arm64 (Apple Silicon) and x64 (Intel) architectures
MIT — based on Electron React Boilerplate.