Pycon ZA
October 06, 2017

"Grid computing on a budget (Or: Reinventing celery)" by Matthew French

A Lightning talk at PyConZA 2017

Transcript

  1. Paying the Rent…

     • Name: Matthew French
     • Company: FIS
       – Financial services
       – 53,000 employees
       – 20,000 clients
       – 130 countries
       – Offices in Johannesburg and Cape Town
     • Application: Front Arena
       – Core code is C++
       – Embedded Python for customisation
  2. The Problem

     The problem:
     • Large, long-running process (> 8 hours)
     • Need to break it down into smaller pieces and run them concurrently
     But:
     • Running inside a Python-friendly but proprietary environment
     • Cultural issues
     • Change control
     Solution:
     • Roll your own
  3. But wait, there’s more…

     Other requirements:
     • Long startup time – needs to use a running process
     • Needs to run inside different OS processes
     • Needs persistent temporary objects
     • Needs to be able to pause/cancel a process
     • Ability to rerun failed processes
     • Pre-generate the work queue
     • Needs to support checkpoints
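Two of the requirements above (pause/cancel and rerunning failed work) can be illustrated with plain files, in the spirit of the file-based queue the talk goes on to describe. The following is only a rough sketch and not the speaker's code: the `PAUSE` and `CANCEL` marker file names and both function names are hypothetical, and it is written for Python 3, whereas the slides use Python 2.

```python
import os
import shutil


def requested_action(root):
    """Return "cancel", "pause" or "run" based on marker files in root.

    The marker file names are assumptions made for this sketch.
    """
    if os.path.exists(os.path.join(root, "CANCEL")):
        return "cancel"
    if os.path.exists(os.path.join(root, "PAUSE")):
        return "pause"
    return "run"


def requeue_failed(root):
    """Move every file from failed/ back to messages/; return the count."""
    failed_dir = os.path.join(root, "failed")
    messages_dir = os.path.join(root, "messages")
    os.makedirs(failed_dir, exist_ok=True)
    os.makedirs(messages_dir, exist_ok=True)
    moved = 0
    for filename in sorted(os.listdir(failed_dir)):
        shutil.move(os.path.join(failed_dir, filename),
                    os.path.join(messages_dir, filename))
        moved += 1
    return moved
```

A worker would call `requested_action` between messages, so an operator can pause or cancel a run by creating a file rather than by signalling a process.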
  4. The Quick Solution

     File based message queue

        {root directory}
            messages
            success
            failed
            workers
                worker1
                worker2

     • Generators: Generator 1, Generator 2 (messages are JSON files)
     • Workers: several worker processes (each labelled "Worker 1" on the slide)
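The slide does not say how a worker takes ownership of a message. One common approach for a directory layout like this, shown here purely as an assumption, is to claim a file by renaming it into the worker's own directory, since only one process can succeed in renaming a given file. All function names below are hypothetical, and the sketch targets Python 3.

```python
import json
import os


def add_message(root, message_id, payload):
    """Write a message as a JSON file into <root>/messages."""
    messages_dir = os.path.join(root, "messages")
    os.makedirs(messages_dir, exist_ok=True)
    # Write under a temporary name first so workers never see a partial file.
    tmp_path = os.path.join(messages_dir, message_id + ".tmp")
    with open(tmp_path, "w") as handle:
        json.dump(payload, handle)
    os.rename(tmp_path, os.path.join(messages_dir, message_id + ".json"))


def acquire_next_message(root, worker_name):
    """Claim one pending message by renaming it into the worker's directory."""
    messages_dir = os.path.join(root, "messages")
    worker_dir = os.path.join(root, "workers", worker_name)
    os.makedirs(messages_dir, exist_ok=True)
    os.makedirs(worker_dir, exist_ok=True)
    for filename in sorted(os.listdir(messages_dir)):
        if not filename.endswith(".json"):
            continue
        target = os.path.join(worker_dir, filename)
        try:
            os.rename(os.path.join(messages_dir, filename), target)
        except OSError:
            continue  # another worker claimed this file first
        return target
    return None


def finish_message(root, claimed_path, succeeded):
    """Move a processed message to success/ or failed/."""
    outcome_dir = os.path.join(root, "success" if succeeded else "failed")
    os.makedirs(outcome_dir, exist_ok=True)
    final_path = os.path.join(outcome_dir, os.path.basename(claimed_path))
    os.rename(claimed_path, final_path)
    return final_path
```

The rename-to-claim trick relies on source and target being on the same filesystem; on a network share the guarantees depend on the file server.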
  5. Sample Message Handler

        from time import sleep, strftime

        # FileQueueAbstractMessage, TestFileQueuePersistentObject and MESSAGE_TYPE
        # come from the speaker's file queue code, which the slide does not show.

        class TestFileQueueMessage(FileQueueAbstractMessage):
            def __init__(self):
                FileQueueAbstractMessage.__init__(self)

            def get_message_type(self):
                return MESSAGE_TYPE

            def generate_message_id(self):
                return "%s_%s_%i_%02i" % (self.get_message_type(),
                                          strftime("%Y%m%d_%H%M%S"),
                                          self.get_int("message_number"),
                                          self.get_int("delay"))

            def process_message(self, queue_manager):
                delay = self.get_int("delay")
                queue_manager.get_persistent_object(TestFileQueuePersistentObject(delay % 3))
                sleep(delay % 10)
                if delay > 25:
                    # An unhandled exception is one way for a message to fail.
                    raise Exception("Incredibly long delay: %i seconds!" % delay)
                elif delay > 15:
                    # Returning False with a failure message is the other.
                    self.set_failure_message("Delay is a bit long - %i seconds" % delay)
                    return False
                return True
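The handler above calls `queue_manager.get_persistent_object(...)`, but the slides do not show what that method does. A plausible reading of the "persistent temporary objects" requirement is a per-process cache that builds an expensive object once and reuses it for later messages. The sketch below illustrates that idea only; the class names, `get_key` and `initialise` are hypothetical and are not the speaker's API.

```python
class PersistentObjectCache(object):
    """Keeps expensive objects alive for the lifetime of a worker process."""

    def __init__(self):
        self._objects = {}

    def get_persistent_object(self, candidate):
        """Return the cached object matching candidate's key.

        The candidate is initialised and stored only on first use of its key.
        """
        key = (type(candidate).__name__, candidate.get_key())
        if key not in self._objects:
            candidate.initialise()  # expensive setup runs once per key
            self._objects[key] = candidate
        return self._objects[key]


class ExamplePersistentObject(object):
    """Hypothetical stand-in for something like TestFileQueuePersistentObject."""

    def __init__(self, bucket):
        self.bucket = bucket
        self.initialised = False

    def get_key(self):
        return self.bucket

    def initialise(self):
        self.initialised = True
```

With a cache like this, messages that share a key (for example `delay % 3` in the handler above) would reuse one object instead of paying the setup cost each time.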
  6. Sample App

        from random import randint

        # FileQueueManager, get_type_map and the QUEUE_MANAGER_STATS_* constants
        # come from the speaker's file queue code, which the slide does not show.

        manager = FileQueueManager("c:/temp/FileQueue", get_type_map())
        manager.load_all()

        # Pre-generate the work queue: range(1, 10) creates nine test messages.
        for i in range(1, 10):
            message = TestFileQueueMessage()
            message.set("message_number", i)
            message.set("description", "Testing %i" % i)
            message.set("delay", randint(1, 30))
            manager.add_message(message)
        manager.save_all()

        print("There are %i messages..." % len(manager.get_all_messages()))

        # Process messages until none can be acquired.
        while manager.acquire_next_message():
            stats = manager.get_statistics()
            processed = stats[QUEUE_MANAGER_STATS_TOTAL] + 1
            unprocessed = stats[QUEUE_MANAGER_STATS_UNPROCESSED]
            print(" Processing message %i of %i: %s" % (
                processed, processed + unprocessed,
                manager.get_next_message().get_filename()))
            manager.process_next_message()

        manager.close()
        print("Done.")