How to write well designed imports with Symfony

Importing data into an application is a common task across many fields. Because imports usually run in the background, they tend to be forgotten until things break or they no longer meet requirements. Let's bring these import scripts out of the shadows and cast some light on how to improve them.

In this talk we will look at a few ways to write imports, from bare-bones scripts to an elaborate import domain that uses all the bells and whistles provided by modern frameworks. We will discuss what to look out for when designing and improving them, and build a checklist of things to consider before getting started. Hopefully, by the end of the talk you will be motivated to look at the imports hidden away in your projects and improve them before things break, to write better imports in the future, or you will simply have the good feeling that you are already on the right track.

Denis Brumann

October 24, 2019

Transcript

  1. 4.

    Requirements / What do I mean by "well designed"?

    testable, modifiable, reusable, fast, memory efficient

    @dbrumann
  2. 7.

    Input

        nconst             nm0004813
        primaryName        Nancy Cartwright
        birthYear          1957
        deathYear          \N
        primaryProfession  actress, soundtrack, producer
        knownForTitles     tt0096697, tt0089153, tt0120685, tt0462538
  3. 10.

        public function execute(ReadContext $readContext, WriteContext $writeContext)
        {
            $reader = $this->reader->open($readContext);
            $writer = $this->writer->open($writeContext);

            $count = 0;
            $items = [];
            $generator = $reader->read();

            while ($generator->valid()) {
                $item = $generator->current();
                $items[] = $this->processor->process($item);
                ++$count;
                $generator->next();

                // Hand the buffered items to the writer every $writeInterval items.
                if (($count % $this->writeInterval) === 0) {
                    $writer->write($items);
                    $items = [];
                }
            }

            // Write the remaining items that did not fill a whole interval.
            $writer->write($items);
        }
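The loop on this slide buffers processed items and passes them to the writer every `writeInterval` items. The following is a minimal, self-contained sketch of that idea. `processInChunks` and its callable parameters are hypothetical stand-ins for the Reader, Processor and Writer objects, not code from the talk, and unlike the slide it skips the final write when the buffer is empty.

```php
<?php

declare(strict_types=1);

/**
 * Reads items lazily, processes each one and hands them to the writer
 * in chunks of $writeInterval items. Returns the number of items processed.
 *
 * All names here are illustrative assumptions, not code from the talk.
 */
function processInChunks(iterable $source, callable $process, callable $write, int $writeInterval): int
{
    if ($writeInterval < 1) {
        throw new InvalidArgumentException('writeInterval must be at least 1.');
    }

    $count = 0;
    $buffer = [];

    foreach ($source as $item) {
        $buffer[] = $process($item);
        ++$count;

        // Flush the buffer whenever a full chunk has been collected.
        if (($count % $writeInterval) === 0) {
            $write($buffer);
            $buffer = [];
        }
    }

    // Write the remainder that did not fill a whole chunk.
    if ($buffer !== []) {
        $write($buffer);
    }

    return $count;
}

// Usage: a generator stands in for the reader, a closure collects the chunks.
$source = (static function (): Generator {
    yield from ['a', 'b', 'c', 'd', 'e'];
})();

$chunks = [];
$total = processInChunks(
    $source,
    'strtoupper',
    static function (array $items) use (&$chunks): void {
        $chunks[] = $items;
    },
    2
);
```

Because the source is consumed one item at a time, memory use is bounded by the chunk size rather than by the size of the input.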
  6. 13.

    Reader

        interface Reader
        {
            /**
             * @return Reader Returns an opened Reader-instance
             *                that you can read from.
             */
            public function open(ReadContext $context): Reader;

            public function read(): Generator;

            /**
             * Counts the number of processable items based on
             * the current file and line position.
             */
            public function count(): int;
        }
  7. 14.

    TsvReader

        public function open(ReadContext $context): Reader
        {
            $reader = clone $this;
            $reader->file = new SplFileObject($context->filename(), 'r', false);
            $reader->file->setFlags(
                SplFileObject::DROP_NEW_LINE
                | SplFileObject::READ_AHEAD
                | SplFileObject::SKIP_EMPTY
                | SplFileObject::READ_CSV
            );
            $reader->file->setCsvControl("\t");
            $reader->linePosition = $context->linePosition();
            $reader->readLimit = $context->readLimit();

            return $reader;
        }
  8. 15.

    TsvReader

        public function read(): Generator
        {
            if ($this->file === null) {
                throw new \RuntimeException('No file opened! Please call open() first.');
            }

            $count = 0;
            // Jump to the configured start line, e.g. 1 to skip the header row.
            $this->file->seek($this->linePosition);

            // Yield rows until the file ends or the optional read limit is reached.
            while ($this->file->valid()
                && ($this->readLimit === null || $count < $this->readLimit)
            ) {
                yield $count => $this->file->current();
                ++$count;
                $this->file->next();
            }
        }
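The reader on this slide yields rows lazily, starting at a line offset and stopping at an optional limit. As a rough, self-contained illustration of the same contract, the sketch below uses plain `fgets()` and `explode()` instead of `SplFileObject`. That is a swapped-in technique, not the talk's implementation. `readTsvRows` and the fixture content are assumptions made for this example.

```php
<?php

declare(strict_types=1);

/**
 * Lazily yields the tab-separated fields of each line, starting at the
 * zero-based line $offset and stopping after $limit rows (null = no limit).
 *
 * Hypothetical helper: it splits on tabs with explode() instead of using
 * SplFileObject::READ_CSV, so quoted fields are not interpreted.
 */
function readTsvRows(string $filename, int $offset = 0, ?int $limit = null): Generator
{
    $handle = fopen($filename, 'rb');
    if ($handle === false) {
        throw new RuntimeException(sprintf('Could not open "%s".', $filename));
    }

    try {
        $lineNumber = 0;
        $yielded = 0;

        while (($limit === null || $yielded < $limit) && ($line = fgets($handle)) !== false) {
            $line = rtrim($line, "\r\n");

            // Skip lines before the offset and empty lines.
            if ($lineNumber++ < $offset || $line === '') {
                continue;
            }

            yield $yielded++ => explode("\t", $line);
        }
    } finally {
        fclose($handle);
    }
}

// Usage: write a tiny fixture, then read it while skipping the header line.
$fixture = tempnam(sys_get_temp_dir(), 'tsv');
file_put_contents($fixture, "nconst\tprimaryName\nnm0000001\tFred Astaire\nnm0000002\tLauren Bacall\n");

$rows = iterator_to_array(readTsvRows($fixture, 1));
$firstOnly = iterator_to_array(readTsvRows($fixture, 1, 1));

unlink($fixture);
```

Only one line is held in memory at a time, which is the property that makes a generator-based reader suitable for large input files.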
  9. 16.

    TsvReaderTest

        public function test_reading_names_full(): void
        {
            $reader = new TsvReader();
            $context = new ReadContext(__DIR__ . '/../Fixtures/name.basics.tsv', 0);
            $openedReader = $reader->open($context);
            $generator = $openedReader->read();

            $rows = [];
            while ($generator->valid()) {
                $rows[] = $generator->current();
                $generator->next();
            }

            self::assertCount(3, $rows);
            self::assertSame('nconst', $rows[0][0]);
            self::assertSame('nm0000001', $rows[1][0]);
            self::assertSame('nm0000002', $rows[2][0]);
        }
  10. 17.

    TsvReaderTest

        public function test_reading_names_skip_headers(): void
        {
            $reader = new TsvReader();
            $context = new ReadContext(__DIR__ . '/../Fixtures/name.basics.tsv', 1);
            $openedReader = $reader->open($context);
            $generator = $openedReader->read();

            $rows = [];
            while ($generator->valid()) {
                $rows[] = $generator->current();
                $generator->next();
            }

            self::assertCount(2, $rows);
            self::assertSame('nm0000001', $rows[0][0]);
            self::assertSame('nm0000002', $rows[1][0]);
        }
  11. 18.

    TsvReaderTest

        public function test_read_limit_skip_headers(): void
        {
            $reader = new TsvReader();
            $context = new ReadContext(__DIR__ . '/../Fixtures/name.basics.tsv', 1, 1);
            $openedReader = $reader->open($context);
            $generator = $openedReader->read();

            $rows = [];
            while ($generator->valid()) {
                $rows[] = $generator->current();
                $generator->next();
            }

            self::assertCount(1, $rows);
            self::assertSame('nm0000001', $rows[0][0]);
        }
  12. 20.

    Processor

        public function process($item)
        {
            if ($item[2] === '' || $item[2] === '\N') {
                $birthYear = null;
            } else {
                $birthYear = (int) $item[2];
            }

            if ($item[3] === '' || $item[3] === '\N') {
                $deathYear = null;
            } else {
                $deathYear = (int) $item[3];
            }

            return new Person($item[0], $item[1], $birthYear, $deathYear);
        }
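The processor applies the same rule to both year columns: an empty string or the `\N` placeholder means "no value", anything else is cast to an integer. One possible way to express that rule once is a small helper, sketched here. `nullableInt` is a hypothetical name, not part of the talk's code.

```php
<?php

declare(strict_types=1);

/**
 * Converts a raw TSV field into an int, treating the empty string and
 * the "\N" placeholder as "no value".
 *
 * Hypothetical helper; the slide repeats this check inline for each field.
 */
function nullableInt(string $value): ?int
{
    // In single quotes, '\N' is the literal two-character sequence \N.
    if ($value === '' || $value === '\N') {
        return null;
    }

    return (int) $value;
}

// Usage with the first four fields of the sample row from the input slide.
$row = ['nm0004813', 'Nancy Cartwright', '1957', '\N'];

$birthYear = nullableInt($row[2]);
$deathYear = nullableInt($row[3]);
```

Keeping the conversion in one function also makes it easy to test on its own, separate from the construction of the `Person` object.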
  13. 22.

    Writer

        public function write(iterable $items): void
        {
            if ($this->entityManager === null) {
                throw new \RuntimeException('No EntityManager. Please call open().');
            }

            $count = 0;
            foreach ($items as $item) {
                $this->entityManager->persist($item);
                ++$count;

                // Flush and detach entities in batches to keep memory usage flat.
                if (($count % $this->batchSize) === 0) {
                    $this->entityManager->flush();
                    $this->entityManager->clear();
                }
            }

            // Flush whatever is left of the last, incomplete batch.
            $this->entityManager->flush();
            $this->entityManager->clear();
        }
  14. 26.

    PartitionManager

        while ($range < $totalItemCount) {
            // All worker slots are taken: wait, then drop finished processes.
            if (count($this->processes) >= $this->processLimit) {
                sleep(2);
                foreach ($this->processes as $index => $process) {
                    if (!$process->isRunning()) {
                        unset($this->processes[$index]);
                    }
                }
            }

            // A slot is free: start a worker for the next partition.
            if (count($this->processes) < $this->processLimit) {
                $process = new Process(['php', 'bin/console', $command, '--amount', (string) $partitionSize, (string) $offset]);
                $process->start();
                // Track the worker on the instance so the process limit is enforced.
                $this->processes[] = $process;
                $offset += $partitionSize;
                $range += $partitionSize;
            }
        }
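The PartitionManager advances an offset by a fixed partition size until the whole input is covered, and starts one worker per step. The arithmetic behind that can be sketched separately from the process handling. `partitionRanges` is a hypothetical helper, and capping the last partition at the remaining item count is an assumption of this sketch. The slide always passes the full partition size.

```php
<?php

declare(strict_types=1);

/**
 * Splits $totalItemCount items into consecutive partitions and yields
 * an [offset, amount] pair for each one.
 *
 * Hypothetical helper. Unlike the slide, it shrinks the last partition
 * so that it never extends past the total item count.
 */
function partitionRanges(int $totalItemCount, int $partitionSize, int $startOffset = 0): Generator
{
    if ($partitionSize < 1) {
        throw new InvalidArgumentException('partitionSize must be at least 1.');
    }

    for ($done = 0; $done < $totalItemCount; $done += $partitionSize) {
        $amount = min($partitionSize, $totalItemCount - $done);

        yield [$startOffset + $done, $amount];
    }
}

// Usage: 10 items after one header line, at most 4 items per worker.
$ranges = iterator_to_array(partitionRanges(10, 4, 1), false);
```

Each pair would map to the `--amount` option and the offset argument of the console command that a worker process runs.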
  16. 37.

    Consuming

        Usage:
          messenger:consume [options] [--] [<receivers>...]
          messenger:consume-messages

        Arguments:
          receivers                        Names of the receivers/transports to consume in order of priority [default: ["async"]]

        Options:
          -l, --limit=LIMIT                Limit the number of received messages
          -m, --memory-limit=MEMORY-LIMIT  The memory limit the worker can consume
          -t, --time-limit=TIME-LIMIT      The time limit in seconds the worker can run
              --sleep=SLEEP                Seconds to sleep before asking for new messages after no messages were found [default: 1]
          -b, --bus=BUS                    Name of the bus to which received messages should be dispatched (if not passed, bus is determine automatically.