🎤 Sherpa ONNX - Speech Models

Upload and manage Sherpa ONNX models for Speech-to-Text and Speaker Diarization

📡 API Endpoints

POST /api/v1/speech/stt - Speech-to-Text
curl -X POST [API_URL]/api/v1/speech/stt \
  -H "X-API-Key: your-key" \
  -F "file=@audio.wav"
POST /api/v1/speech/diarization - Speaker Diarization
curl -X POST [API_URL]/api/v1/speech/diarization \
  -H "X-API-Key: your-key" \
  -F "file=@audio.wav"
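
The two calls above differ only in the endpoint name, so they can be wrapped in one small helper. The following is a minimal sketch, not part of the service itself: API_URL and API_KEY are assumed environment variables, speech_url and speech_post are hypothetical function names, and the response is printed as-is because its format is not described here.

```shell
# Sketch only: API_URL and API_KEY are assumed environment variables;
# speech_url and speech_post are hypothetical helper names.

# Build the full endpoint URL for "stt" or "diarization".
speech_url() {
  local endpoint="$1"
  case "$endpoint" in
    stt|diarization) ;;
    *) echo "unknown endpoint: $endpoint" >&2; return 2 ;;
  esac
  # Strip one trailing slash from API_URL so the path is not doubled.
  printf '%s/api/v1/speech/%s\n' "${API_URL%/}" "$endpoint"
}

# POST an audio file to the chosen endpoint and print the raw response body.
speech_post() {
  local endpoint="$1" audio="$2" url
  [ -f "$audio" ] || { echo "no such file: $audio" >&2; return 1; }
  url="$(speech_url "$endpoint")" || return $?
  curl -sS --fail -X POST "$url" \
    -H "X-API-Key: ${API_KEY}" \
    -F "file=@${audio}"
}

# Usage (the URL below is a placeholder, not a documented default):
#   export API_URL="http://localhost:8000" API_KEY="your-key"
#   speech_post stt audio.wav
#   speech_post diarization audio.wav
```

With --fail, curl exits non-zero on an HTTP error status, so the helper can be used directly in scripts that stop on failure.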

🎤 Upload Sherpa ONNX STT Models

Upload Speech-to-Text models (.onnx, .txt files) to app/models/sherpa-onnx/

⚠️ File Name Requirements:
  • tokens.txt - Vocabulary/token mappings (exact name required)
  • encoder.onnx - Encoder model (exact name required)
  • decoder.onnx - Decoder model (exact name required)
  • joiner.onnx - Joiner model (exact name required)

All four files must be uploaded with exactly these names. The system will not work if any file is named differently!
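
Because a misnamed file is easy to miss, it can help to check a local folder before uploading. This is a minimal sketch under the assumption that the four files sit together in one directory; check_stt_models is a hypothetical helper name, and it only checks that the files exist, not that they are valid models.

```shell
# Sketch only: check_stt_models is a hypothetical helper.
# It reports which of the four required STT file names are absent
# from the given directory (default: current directory).
check_stt_models() {
  local dir="${1:-.}" missing=0 name
  for name in tokens.txt encoder.onnx decoder.onnx joiner.onnx; do
    if [ ! -f "$dir/$name" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  # Returns 0 when all four files are present, 1 otherwise.
  return "$missing"
}

# Usage:
#   check_stt_models ./my-stt-model && echo "ready to upload"
```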


👥 Upload Sherpa ONNX Diarization Models

Upload Speaker Diarization models (.onnx files) to app/models/sherpa-onnx/diarization/

⚠️ File Name Requirements:

Option 1 (Recommended - Two files):

  • encoder.onnx - Encoder model (exact name required)
  • decoder.onnx - Decoder model (exact name required)

Option 2 (Single file - experimental):

  • model.onnx - Single model file (will be used for both encoder and decoder)

Note: Standard Sherpa ONNX diarization uses two separate models, a segmentation model and a speaker-embedding model. If you have only one file, you can upload it as model.onnx to try the single-file option. For best results, download both models from the Sherpa ONNX releases and upload them as encoder.onnx and decoder.onnx.
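
To see which of the two options a local folder would satisfy before uploading, a check along these lines may help. It is a sketch only: diarization_layout is a hypothetical helper name, the labels it prints are illustrative, and it looks at file names alone.

```shell
# Sketch only: diarization_layout is a hypothetical helper.
# Prints "two-file" (Option 1), "single-file" (Option 2), or "incomplete".
diarization_layout() {
  local dir="${1:-.}"
  if [ -f "$dir/encoder.onnx" ] && [ -f "$dir/decoder.onnx" ]; then
    echo "two-file"
  elif [ -f "$dir/model.onnx" ]; then
    echo "single-file"
  else
    echo "incomplete"
    return 1
  fi
}

# Usage:
#   diarization_layout ./my-diarization-model
```

The two-file check comes first so that a folder containing all three files is reported as the recommended option.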
