Introducing WhisperKit, because it is pretty good

March 28, 2024

  1. About me
    • Shuichi Tsutsumi
    • @shu223 (GitHub, Qiita, Zenn, note, 𝕏, YouTube, Podcast, etc.)
    • Books (4 commercially published, many self-published @BOOTH):
  2. What is Core ML?
    • Apple's framework and model format for integrating machine learning models into iOS, macOS, etc.
    • Designed to use the CPU, GPU, and Neural Engine (ANE) to maximize performance while minimizing memory footprint and power consumption
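  3. Lineage: from whisper.cpp to WhisperKit
    • Optimized to the limit to exploit Apple hardware even further than the Core ML version of whisper.cpp
    • The developer, Argmax, is the company of the people behind Apple's ml-ane-transformers, that is, the originators of the effort to optimize Transformer models with Core ML so they can run on iOS/macOS
    • 1.86x to 2.85x faster than the Core ML version of whisper.cpp (source: WhisperKit, Argmax)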
  4. WhisperKit model sizes

    Model      WER    File Size (MB)
    small.en   3.12   483
    small      3.45   483
    base.en    3.98   145
    base       4.97   145
    tiny.en    5.61   66
    tiny       7.47   66

    (Source: https://huggingface.co/argmaxinc/whisperkit-coreml)
  5. How to integrate it into an app
    With the Swift Package, it can be implemented in 2 lines:

    let pipe = try? await WhisperKit()
    let transcription = try? await pipe!.transcribe(audioPath: path)?.text
  6. Swift CLI
    • Install: brew install whisperkit-cli
    • Run: swift run whisperkit-cli transcribe --model-path "foo" --audio-path "bar"
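
The table on slide 4 implies a trade-off between accuracy (WER) and download size. As a rough illustration, the sketch below picks the lowest-WER variant that fits a size budget. The numbers are copied from the slide. The `ModelVariant` type and the `bestVariant` helper are hypothetical names introduced here, not part of WhisperKit, and the code assumes the `.en` suffix marks English-only models, following the usual Whisper naming.

```swift
// Rows copied from the slide's table
// (source: https://huggingface.co/argmaxinc/whisperkit-coreml).
struct ModelVariant {
    let name: String
    let wer: Double      // WER as listed on the slide
    let fileSizeMB: Int  // file size in MB as listed on the slide
}

let variants = [
    ModelVariant(name: "small.en", wer: 3.12, fileSizeMB: 483),
    ModelVariant(name: "small",    wer: 3.45, fileSizeMB: 483),
    ModelVariant(name: "base.en",  wer: 3.98, fileSizeMB: 145),
    ModelVariant(name: "base",     wer: 4.97, fileSizeMB: 145),
    ModelVariant(name: "tiny.en",  wer: 5.61, fileSizeMB: 66),
    ModelVariant(name: "tiny",     wer: 7.47, fileSizeMB: 66),
]

/// Hypothetical helper: returns the lowest-WER variant whose file size fits
/// the budget. When `multilingual` is true, variants with the ".en" suffix
/// (assumed English-only) are excluded.
func bestVariant(maxSizeMB: Int, multilingual: Bool) -> ModelVariant? {
    return variants
        .filter { $0.fileSizeMB <= maxSizeMB }
        .filter { !multilingual || !$0.name.hasSuffix(".en") }
        .min { $0.wer < $1.wer }
}
```

For example, `bestVariant(maxSizeMB: 200, multilingual: true)?.name` evaluates to `"base"`, while a 50 MB budget yields `nil` because even the tiny models are 66 MB.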
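
The two-line snippet on slide 5 combines `try?` with a force unwrap (`pipe!`), so the app would crash if initialization failed. A slightly more defensive sketch follows. It assumes the same API shape shown on the slide: `WhisperKit()` as an async throwing initializer, and `transcribe(audioPath:)` returning an optional result with a `text` property. Other WhisperKit versions may differ. The function name `transcribeFile` is hypothetical.

```swift
import WhisperKit  // assumes the WhisperKit Swift Package has been added to the project

// Hypothetical helper name; the WhisperKit calls mirror the slide's snippet.
// Assumes transcribe(audioPath:) returns an optional result with a `text`
// property, as on the slide; other WhisperKit versions may differ.
func transcribeFile(at path: String) async -> String? {
    do {
        // Same initializer call as on the slide, without discarding the error.
        let pipe = try await WhisperKit()
        let result = try await pipe.transcribe(audioPath: path)
        return result?.text
    } catch {
        // Report the failure instead of crashing on a force unwrap.
        print("Transcription failed: \(error)")
        return nil
    }
}
```

The `do`/`catch` form keeps the error available for logging, at the cost of a few more lines than the slide's version.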