Using OCR to Turn Game Items into Data
Kishikawa Katsumi
May 22, 2026
Techniques for turning a prototype into a product
Transcript
Techniques for turning a prototype into a product
Using OCR to Turn Game Items into Data
kishikawakatsumi
@kishikawakatsumi@hachyderm.io

Techniques for turning a prototype into a product
• Accurately turn … items into data (accuracy)
• Read each item within … seconds
• Read the text whether the game language is Japanese or English (flexibility)

Sample Code
github.com/kishikawakatsumi/CameraOCR

Size, color, type, effect text

Step 1: Raw OCR

import Vision

actor OCRRunner {
    private var busy = false

    func process(_ cgImage: CGImage) async -> [RecognizedTextObservation]? {
        // Drop this frame if a previous recognition is still in flight.
        guard !busy else { return nil }
        busy = true
        defer { busy = false }

        var request = RecognizeTextRequest()
        request.recognitionLanguages = [
            Locale.Language(identifier: "ja-JP"),
            Locale.Language(identifier: "en-US"),
        ]
        request.recognitionLevel = .accurate
        request.usesLanguageCorrection = false
        return try? await request.perform(on: cgImage)
    }
}

Step 2: Stability Filter

let windowSize: Int = 5
let minHits: Int = 3

func updateStability(with results: [RecognizedTextObservation]) {
    // Collect the non-empty, trimmed top candidates recognized in this frame.
    let textsThisFrame = Set(
        results.compactMap { (observation) -> String? in
            let raw = observation.topCandidates(1).first?.string ?? ""
            let t = raw.trimmingCharacters(in: .whitespacesAndNewlines)
            return t.isEmpty ? nil : t
        }
    )
    // Keep only the most recent windowSize frames.
    recentTextSets.append(textsThisFrame)
    if recentTextSets.count > windowSize {
        recentTextSets.removeFirst(recentTextSets.count - windowSize)
    }
    // Count in how many of the retained frames each text appeared.
    var counts: [String: Int] = [:]
    for set in recentTextSets {
        for t in set { counts[t, default: 0] += 1 }
    }
    let stable = counts.filter { $0.value >= minHits }
    stableTexts = Set(stable.keys)
    stableLines = stable
        .map { StableLine(text: $0.key, hits: $0.value) }
        .sorted { $0.hits == $1.hits ? $0.text < $1.text : $0.hits > $1.hits }
}

Adopt text that appears at least 3 times within 5 frames.

Improving stability with a ring buffer
Adopt text that appears at least M times in N frames.
Example readings: "Fire attack power up", "Fire attack power up", "Thunder attack power up", "Fire attack power up", "Thunder attack power up"
Example readings: "Adds attack power to the weapon on sortie" (2 times), "Adds thunder attack power to the weapon on sortie" (3 times)
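
The N-frames / M-hits rule can be tried in isolation with a small sketch like the one below. `StabilityWindow` and its `push` method are hypothetical names introduced here for illustration, and the English strings stand in for the per-frame OCR readings; this is not the deck's own type.

```swift
/// Hypothetical helper (not from the deck): keeps the last `windowSize`
/// frames of OCR text and reports strings seen in at least `minHits` of them.
struct StabilityWindow {
    let windowSize: Int
    let minHits: Int
    private var frames: [Set<String>] = []

    init(windowSize: Int = 5, minHits: Int = 3) {
        self.windowSize = windowSize
        self.minHits = minHits
    }

    /// Adds one frame of recognized strings and returns the currently stable ones.
    mutating func push(_ texts: Set<String>) -> Set<String> {
        frames.append(texts)
        if frames.count > windowSize {
            frames.removeFirst(frames.count - windowSize)
        }
        var counts: [String: Int] = [:]
        for frame in frames {
            for text in frame { counts[text, default: 0] += 1 }
        }
        return Set(counts.filter { $0.value >= minHits }.keys)
    }
}

// Five frames in which the item name flickers between two readings.
var window = StabilityWindow()
var stable: Set<String> = []
let readings = [
    "Fire attack power up", "Fire attack power up", "Thunder attack power up",
    "Fire attack power up", "Thunder attack power up",
]
for reading in readings {
    stable = window.push([reading])
}
// The first reading was seen in 3 of 5 frames, the second in only 2.
print(stable)
```

With a window of 5 and a minimum of 3 hits, only the majority reading survives, so a minority misread never reaches the matching step.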

Step 3: ROI (Region of Interest)

Read only the area inside the guide: unrelated text is not read.

static let roiOnScreen = CGRect(x: 0.08, y: 0.30, width: 0.84, height: 0.32)

static let visionROI: NormalizedRect = {
    let r = roiOnScreen
    // Flip the y axis: the on-screen rect is measured from the top edge.
    return NormalizedRect(x: r.minX, y: 1 - r.maxY, width: r.width, height: r.height)
}()

func process(
    _ cgImage: CGImage,
    roi: NormalizedRect
) async -> [RecognizedTextObservation]? {
    ...
    request.regionOfInterest = roi
    ...
}

Convert the UI coordinates to Vision framework coordinates before setting the region of interest.
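
The coordinate conversion is just a vertical flip, because Vision's normalized coordinates put the origin at the lower left while the on-screen rect is measured from the top. A minimal worked example, using plain `CGRect` in place of Vision's `NormalizedRect` so it runs without the framework (the function name `flipToBottomLeftOrigin` is made up for this sketch):

```swift
import Foundation

/// Flips a normalized rect from a top-left origin (UI style) to a
/// bottom-left origin. CGRect stands in for Vision's NormalizedRect here.
func flipToBottomLeftOrigin(_ r: CGRect) -> CGRect {
    CGRect(x: r.minX, y: 1 - r.maxY, width: r.width, height: r.height)
}

let roiOnScreen = CGRect(x: 0.08, y: 0.30, width: 0.84, height: 0.32)
let visionROI = flipToBottomLeftOrigin(roiOnScreen)

// The band 0.30...0.62 measured from the top becomes
// roughly 0.38...0.70 measured from the bottom.
print(visionROI.minY, visionROI.maxY)
```

Only the y origin changes; x, width, and height carry over unchanged.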

Step 4: Master Matching

Match against master data.

func bestMatch(for input: String) -> (master: Master, distance: Int)? {
    let n = normalize(input)
    var best: (Master, Int)?
    for (key, master) in normalizedKeys {
        // Skip candidates whose length differs too much to be a plausible match.
        if abs(key.count - n.count) > 5 { continue }
        let d = levenshtein(n, key)
        if best == nil || d < best!.1 { best = (master, d) }
    }
    return best
}

// Accept: distance ≤ max(1, |master| × 0.3) ≈ score ≥ 0.70
let threshold = max(1, Int(Double(master.textJa.count) * 0.3))
guard match.distance <= threshold else { return nil }

Matches are decided by edit distance (Levenshtein distance).
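
The slides call `levenshtein` without showing its body. The sketch below is the textbook two-row dynamic-programming version, offered as one possible implementation rather than the author's; `isAccepted` is a hypothetical wrapper around the acceptance rule quoted above.

```swift
/// Classic two-row Levenshtein distance over Characters
/// (insertion, deletion, and substitution each cost 1).
func levenshtein(_ a: String, _ b: String) -> Int {
    let s = Array(a), t = Array(b)
    if s.isEmpty { return t.count }
    if t.isEmpty { return s.count }
    var previous = Array(0...t.count)
    var current = [Int](repeating: 0, count: t.count + 1)
    for i in 1...s.count {
        current[0] = i
        for j in 1...t.count {
            let substitution = previous[j - 1] + (s[i - 1] == t[j - 1] ? 0 : 1)
            current[j] = min(previous[j] + 1, current[j - 1] + 1, substitution)
        }
        swap(&previous, &current)
    }
    return previous[t.count]
}

/// Acceptance rule from the slide: distance <= max(1, 30% of the master length).
func isAccepted(distance: Int, masterLength: Int) -> Bool {
    distance <= max(1, Int(Double(masterLength) * 0.3))
}

let d = levenshtein("kitten", "sitting")  // 3 edits
print(d, isAccepted(distance: d, masterLength: 7))  // threshold is 2, so rejected
```

The `max(1, ...)` floor means very short master strings still tolerate a single misread character.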
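
Other matching algorithms

Algorithm: Hamming distance
Cost and mechanism: Fastest (XOR + popcount). Counts the differing positions between two strings of the same length.
Why not adopted: The number of characters OCR returns fluctuates from frame to frame, so it cannot be applied. It is unusable as soon as the lengths differ by even a character.

Algorithm: n-gram Jaccard similarity
Cost and mechanism: Fast (set operations). Builds character N-gram sets and computes |A∩B| / |A∪B|.
Why not adopted: It discards character order, so it cannot tell "attack power up" from "up attack power".

Algorithm: Damerau-Levenshtein
Cost and mechanism: Roughly the same as Levenshtein (slightly slower). Levenshtein plus swapping adjacent characters as a single edit.
Why not adopted: In OCR, plain substitution errors far outnumber transpositions, so the benefit is small.

Algorithm: Jaro-Winkler
Cost and mechanism: Smaller constant factor than Levenshtein. Derives a similarity from matching characters, transpositions, and a bonus for a common prefix.
Why not adopted: It is tuned for short personal names and addresses; on effect texts of … to … characters it hardly differs from Levenshtein.

Algorithm: SymSpell
Cost and mechanism: Effectively O(1) lookup thanks to precomputation. Prebuilds a dictionary that expands each master entry into every variant within edit ≤ k, expands the input the same way, and detects hash collisions.
Why not adopted: It can only look up distances up to ≤ … and copes poorly with length variation. Expanding the dictionary inflates memory (O(L…)), and at a scale of … entries the benefit is marginal.

Algorithm: BK-tree
Cost and mechanism: Neighbor search in roughly O(log N) on average. Builds a metric tree over the string space and narrows the candidates to those within distance d.
Why not adopted: It ends up calling Levenshtein internally anyway. At a scale of … entries the cost of building the index outweighs the benefit.

Algorithm: Soundex / Metaphone (phonetic hash)
Cost and mechanism: Fast (constant time). Groups strings into equivalence classes by similar pronunciation and compares them by hash equality.
Why not adopted: These are algorithms for English phonetics and cannot be applied to Japanese/CJK.

Algorithm: Contextual model embedding distance (BERT)
Cost and mechanism: Far slower (tens of ms per query). Embeds strings into high-dimensional vectors and computes similarity by cosine distance.
Why not adopted: Too heavy for real-time video. The model does not know the master's domain vocabulary, game terms in particular.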
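
Small tricks for cleaning up the input data
• カ → 力: replace a single katakana カ (ka) with the kanji 力 (chikara, "power")
  ‣ Only for fixed patterns such as the 力 in 攻撃力 (attack power)
• Unify full-width and half-width characters
• Remove whitespace
Normalize before matching to remove noise.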
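
The three normalization steps could be combined roughly as follows. This is a sketch under assumptions: the slides name a `normalize` function but do not show it, so the use of NFKC folding and the single hard-coded replacement rule are choices made here for illustration.

```swift
import Foundation

/// Hypothetical normalizer: the steps follow the slide, the implementation
/// choices (NFKC folding, one replacement rule) are assumptions.
func normalize(_ input: String) -> String {
    // NFKC folds full-width ASCII to half-width and half-width katakana to full-width.
    let folded = input.precomposedStringWithCompatibilityMapping
    // Drop all whitespace, including the ideographic space.
    let compact = String(folded.filter { !$0.isWhitespace })
    // Katakana カ misread in place of kanji 力, fixed only inside a known pattern.
    return compact.replacingOccurrences(of: "攻撃カ", with: "攻撃力")
}

print(normalize("攻撃カ 上昇"))
print(normalize("ＡＴＫ　＋１０"))
```

Restricting the カ to 力 replacement to a known pattern leaves legitimate katakana words untouched, and applying the same function to the master keys keeps both sides of the comparison in the same form.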

Small tricks for improving recognition
Original image → 9×8 grayscale → 8×8 bit pattern → 64-bit hash (0xD5D5EBF7EAD5D5EB)
Compare dHash values by Hamming distance and ignore similar frames.
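
A minimal sketch of the dHash comparison, assuming the frame has already been downscaled to a grid 9 pixels wide and 8 pixels tall. The bit convention (left pixel brighter than its right neighbor) and the skip threshold of 4 bits are assumptions for illustration, not values from the deck.

```swift
/// Difference hash over an 8-row by 9-column grayscale grid:
/// each bit records whether a pixel is brighter than its right neighbor.
func dHash(_ gray: [[UInt8]]) -> UInt64 {
    precondition(gray.count == 8 && gray.allSatisfy { $0.count == 9 })
    var hash: UInt64 = 0
    for row in gray {
        for x in 0..<8 {
            hash = (hash << 1) | (row[x] > row[x + 1] ? 1 : 0)
        }
    }
    return hash
}

/// Number of differing bits between two hashes (XOR + popcount).
func hammingDistance(_ a: UInt64, _ b: UInt64) -> Int {
    (a ^ b).nonzeroBitCount
}

let descending: [UInt8] = [80, 70, 60, 50, 40, 30, 20, 10, 0]
let a = dHash(Array(repeating: descending, count: 8))  // every comparison is true
var rows = Array(repeating: descending, count: 8)
rows[0][0] = 0                                         // darken one pixel
let b = dHash(rows)                                    // exactly one bit flips
let shouldSkip = hammingDistance(a, b) <= 4            // assumed threshold
print(hammingDistance(a, b), shouldSkip)
```

Frames whose hashes differ by only a few bits are nearly identical, so skipping OCR for them avoids redundant work.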
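
Summary
• Prepare master data (the definition of the correct answers)
• Filter out misrecognitions frame by frame
  ‣ Multi-frame Consensus
• Clean the input beforehand
  ‣ Text normalization
  ‣ Replace common misrecognitions
• Make the behavior deterministic and observable
For producing high-quality, reproducible results.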

Resources
• https://github.com/kishikawakatsumi/CameraOCR
• https://github.com/kishikawakatsumi/RelicForge
• https://apps.apple.com/us/app/relicforge/id…
• https://relicforge.pages.dev