Upgrade to Pro
— share decks privately, control downloads, hide ads and more …
Speaker Deck
Features
Speaker Deck
PRO
Sign in
Sign up for free
Search
Search
OctoLab Cookbook: how to generate a unique key ...
Search
Kamil Samigullin
May 03, 2016
Education
1
36
OctoLab Cookbook: how to generate a unique key for a sequence
PHP tips and tricks: work with IteratorAggregate, Generator, SplQueue and SplDoublyLinkedList.
Kamil Samigullin
May 03, 2016
Tweet
Share
More Decks by Kamil Samigullin
See All by Kamil Samigullin
OctoLab Cookbook: Go lang tips and tricks - protection of sensitive config data
kamilsk
1
130
OctoLab Cookbook: how to use composer.yml and stop creating issues about
kamilsk
1
57
Enter Cookbook: refactoring under a microscope
kamilsk
1
55
Other Decks in Education
See All in Education
予習動画
takenawa
0
7.8k
新卒研修に仕掛ける 学びのサイクル / Implementing Learning Cycles in New Graduate Training
takashi_toyosaki
1
170
2025年度春学期 統計学 第10回 分布の推測とは ー 標本調査,度数分布と確率分布 (2025. 6. 12)
akiraasano
PRO
0
150
プレゼンテーション実践
takenawa
0
7.2k
SkimaTalk Teacher Guidelines
skimatalk
0
800k
ThingLink
matleenalaakso
28
4.1k
万博非公式マップとFOSS4G
barsaka2
0
400
生成AIとの上手な付き合い方【公開版】/ How to Get Along Well with Generative AI (Public Version)
handlename
0
500
20250625_なんでもCopilot 一年の振り返り
ponponmikankan
0
240
Info Session MSc Computer Science & MSc Applied Informatics
signer
PRO
0
190
SARA Annual Report 2024-25
sara2023
1
180
Pythonパッケージ管理 [uv] 完全入門
mickey_kubo
20
15k
Featured
See All Featured
Building a Modern Day E-commerce SEO Strategy
aleyda
42
7.4k
Keith and Marios Guide to Fast Websites
keithpitt
411
22k
YesSQL, Process and Tooling at Scale
rocio
173
14k
A Tale of Four Properties
chriscoyier
160
23k
Building Flexible Design Systems
yeseniaperezcruz
328
39k
Practical Tips for Bootstrapping Information Extraction Pipelines
honnibal
PRO
20
1.3k
No one is an island. Learnings from fostering a developers community.
thoeni
21
3.4k
A Modern Web Designer's Workflow
chriscoyier
695
190k
KATA
mclloyd
30
14k
Helping Users Find Their Own Way: Creating Modern Search Experiences
danielanewman
29
2.7k
ピンチをチャンスに:未来をつくるプロダクトロードマップ #pmconf2020
aki_iinuma
126
53k
Producing Creativity
orderedlist
PRO
346
40k
Transcript
Episode #1.0.2 (2016-05-22) Level: Intermediate OctoLab Cookbook How to generate
a unique key for a sequence
The context
timestamp payment currency 18.03.2016 09:22:23 100 USD 18.03.2016 09:21:02 50
EUR 18.03.2016 09:19:46 10 USD 08.03.2016 09:31:03 10 USD 07.03.2016 22:52:17 145 USD 18.02.2016 23:17:25 35 EUR Import CSV from tinkoff.ru to TCA application
The problem
Merging timestamp payment currency 18.03.2016 09:22:23 100 USD 18.03.2016 09:21:02
50 EUR 18.03.2016 09:19:46 10 USD timestamp payment currency 18.03.2016 09:19:46 10 USD 08.03.2016 09:31:03 10 USD 07.03.2016 22:52:17 145 USD 18.02.2016 23:17:25 35 EUR timestamp payment currency 18.02.2016 23:17:25 35 EUR 17.02.2016 19:49:59 11 USD 02.02.2016 17:58:36 9 USD a first imported CSV file a second imported CSV file a third imported CSV file
The solution
- first iteration: simple, unique key constraint $table->addUniqueIndex([‘timestamp’, ‘payment’, ‘currency’]);
timestamp payment currency 22.02.2016 20:36:33 15 USD 22.02.2016 19:25:14 25
USD 22.02.2016 00:00:00 0,1 USD 22.02.2016 00:00:00 0,1 USD 21.02.2016 21:19:38 10 USD 21.02.2016 17:04:00 10 USD valid internal operations look exactly the same
- second iteration: buffering and forward hashing $table->addColumn(‘f_hash’, ‘string’, …)
$table->addUniqueIndex([‘f_hash’]);
None
column forward hash row 1 08bb542ead91978a26317a4aa8caf1e5d355706e row 2 e993895c5fd4fcbb22f9af9c7d341c47db87a8a7 row
3 7bb25dfeeaec0235cc5cdc4643a09e753687bded row 4 NULL buffer size = 2 column forward hash row 1 bd74be8fda860811368dfc3dd170f34652c73dea row 2 eaf057a4aac69217b5cc2c996679e7d1e11f411d row 3 NULL row 4 NULL buffer size = 3
- third iteration: bidirectional hashing $table->addColumn(‘b_hash’, ‘string’, …) $table->addUniqueIndex([‘b_hash’]);
None
column forward hash backward hash row 1 … NULL row
2 … 0b48e15756c28dc8a00d54594a1c25cabe707973 row 3 … af8d00a06f6e58ec06a650433df597cf4140d78b row 4 NULL 219fe3369d071daadedf81fde031e0f1e74f571b buffer size = 2 not-nullable combination
Limitations • number of rows in a CSV file must
be greater than or equal to the buffer size Benefits • import CSV files in any order • natural flow of the code with minimal changes in the original iteration
• StringifierInterface and CsvRowStringifier • BufferInterface and HashedBuffer • BufferedIterator
• changes in the original iteration The real-world example
Thank you for your attention! Have questions? Kamil Samigullin a
some developer
[email protected]
@ikamilsk github.com/kamilsk linkedin.com/in/kamilsk