
Tim McNamara: A look at NuPIC - A self-learning AI engine

= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
Tim McNamara:
A look at NuPIC - A self-learning AI engine
= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =
@ Kiwi PyCon 2013 - Saturday, 07 Sep 2013 - Track 1
http://nz.pycon.org/

**Audience level**

Intermediate

**Description**

NuPIC is an open source platform for building prediction models from data streams, such as sensor data. Two models will be discussed: an earthquake damage predictor built from GeoNet data, and a flood level warning system.

**Abstract**

This talk will discuss the speaker's experience with NuPIC - http://numenta.org/nupic.html - for building useful artificial intelligence applications. In particular, it covers developing damage and flood prediction models based on public sensor data.

About the tool
NuPIC is an open source implementation of algorithms heavily inspired by our understanding of how the neocortex organises information. NuPIC uses online (or continuous) learning, producing a prediction after every input it receives. This is intended to mirror how human brains operate: acting quickly on new information using prior knowledge, while adapting to that information before the next one arrives. Its other main features are a temporal dimension to learning, the partitioning of models into hierarchies of sub-models, and the representation of knowledge as sparse distributed representations modelled on the brain.
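In code, online learning means there is no separate train/test/predict split: the model is created once, then fed one record at a time, and each call both learns from that record and returns a prediction. A minimal sketch using the OPF ModelFactory API that appears later in the slides (the record source here is a placeholder, and the import path is the one NuPIC used at the time):

    from nupic.frameworks.opf.modelfactory import ModelFactory

    import model_params  # a generated parameter module, like the one on slide 23

    model = ModelFactory.create(model_params.MODEL_PARAMS)
    model.enableInference({'predictedField': 'consumption'})

    for record in incoming_records():   # placeholder: any iterable of field dicts
        result = model.run(record)      # learns from this record and predicts the next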

About the talk
The claims made in the documentation of NuPIC are very bold. The developers claim that models can be built without creating training and testing sets, and that models developed in this manner are supposedly self-learning. Surely this must be an exaggeration! This talk presents an evaluation of building models for two sets of input data, one relating to earthquakes and the other relating to flood levels.

The anticipated case studies
GeoNet provides an extensive archive of seismic data, along with associated damage reports from people affected by particular shakes. In principle, we could feed this historical data to NuPIC and then ask it how likely, and how intense, damage will be for any particular quake. Perhaps we could even use NuPIC to predict the likelihood of multiple quakes within a cluster.

With information about rain levels available in near-real time from NIWA's climate database (CliFlo) and information about river catchments and historical flood levels from district councils, it is (in principle) possible to create a flood risk prediction model for one's own use.

**YouTube**

http://www.youtube.com/watch?v=rY7GLyxINFY

New Zealand Python User Group

September 07, 2013

Transcript

  1. Problem statement:
     • New Zealand has excellent environmental sensor data. Can we build useful
       applications based on these sources to provide hazard impact guidance?
       (Hiring machine learning expertise to create bespoke models is very expensive.)
     More concretely:
     • Can we predict floods on rivers automatically, based on rain data and
       historic patterns?
     • Can we get an early hunch for the human impact of an earthquake from
       location and magnitude data only?
  2. So

  3. A system for building AI models, built on neurological foundations.
  4. What are some features of that structure?
     - The neocortex is heavily hierarchical, with 5 cellular layers and 1
       non-cellular layer through most of it.
     - Cortical regions appear to use sparse, distributed representations to store
       information.
     - Temporality is important, e.g. we can distinguish two senses of the same
       input sounds depending on context: "I ate eight apples."
  5. There is a trade off between how much memory is allocated to each level [of
     the model] and how many levels are needed. Fortunately, HTMs automatically
     learn the best possible representations at each level given statistics of the
     input and the amount of resources allocated. — Numenta White Paper (2011), p 28
  6. Real brains are highly “plastic”, regions of the neocortex can learn to
     represent entirely different things in reaction to various changes. If part of
     the neocortex is damaged, other parts will adjust to represent what the
     damaged part used to represent. … The system is self-adjusting. — Numenta
     White Paper (2011), p 28
  7. Information is encoded as a 2048-bit array of 0s and 1s. Within any array,
     only a small number of bits will be active for any given input. Matching
     active bits means that two inputs are similar.
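     The "matching active bits" idea can be made concrete: similarity between two
     such arrays is simply the overlap of their active bits. A toy sketch
     (illustrative only, not NuPIC's own SDR code):

         import numpy as np

         N_BITS = 2048        # width of the representation, as on the slide
         ACTIVE_BITS = 40     # only a small fraction of bits are ever on

         def random_sdr(seed):
             """Build a sparse binary array with a fixed number of active bits."""
             rng = np.random.RandomState(seed)
             sdr = np.zeros(N_BITS, dtype=bool)
             sdr[rng.choice(N_BITS, size=ACTIVE_BITS, replace=False)] = True
             return sdr

         def overlap(a, b):
             """Similarity = number of positions where both arrays are active."""
             return int(np.count_nonzero(a & b))

         a, b = random_sdr(1), random_sdr(2)
         print(overlap(a, a))   # 40: identical inputs overlap completely
         print(overlap(a, b))   # near 0: unrelated inputs share few active bits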
  8. gym,address,timestamp,consumption
     string,string,datetime,float
     S,,T,
     Balgowlah Platinum,Shop 67 197-215 Condamine Street Balgowlah 2093,2010-07-02 00:00:00.0,5.3
     Balgowlah Platinum,Shop 67 197-215 Condamine Street Balgowlah 2093,2010-07-02 00:15:00.0,5.5
     Balgowlah Platinum,Shop 67 197-215 Condamine Street Balgowlah 2093,2010-07-02 00:30:00.0,5.1
     Balgowlah Platinum,Shop 67 197-215 Condamine Street Balgowlah 2093,2010-07-02 00:45:00.0,5.3
     Balgowlah Platinum,Shop 67 197-215 Condamine Street Balgowlah 2093,2010-07-02 01:00:00.0,5.2
     ...
  9.–15. (These slides repeat the CSV sample above.)
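     A note on the three header rows above: the first names the fields, the second
     gives their types, and the third carries special flags. The reading that "S"
     marks the sequence field and "T" the timestamp field is an assumption based on
     NuPIC's CSV conventions, not something stated on the slide. A minimal sketch of
     peeling those rows off before building model inputs:

         import csv

         with open('hotgym.csv') as f:        # the file shown on the slides
             reader = csv.reader(f)
             names = next(reader)             # ['gym', 'address', 'timestamp', 'consumption']
             types = next(reader)             # ['string', 'string', 'datetime', 'float']
             flags = next(reader)             # ['S', '', 'T', ''] special-field flags
             for row in reader:
                 record = dict(zip(names, row))   # one model input per data row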
  16. import csv
      import datetime

      from nupic.frameworks.opf.modelfactory import ModelFactory

      import model_params

      model = ModelFactory.create(model_params.MODEL_PARAMS)
      model.enableInference({'predictedField': 'consumption'})

      reader = csv.reader(open(_DATA_PATH))
      headers = reader.next()
      for i, record in enumerate(reader, start=1):
          modelInput = dict(zip(headers, record))
          modelInput["consumption"] = float(modelInput["consumption"])
          modelInput["timestamp"] = datetime.datetime.strptime(
              modelInput["timestamp"], "%m/%d/%y %H:%M")
          result = model.run(modelInput)
  17.–21. (These slides repeat the code above.)
  22. ModelResult(
        inferences={
          'multiStepPredictions': {
            1: { 5.2825868514199987: 0.69999516634971859,
                 10.699999999999999: 0.07601257054965195,
                 22.100000000000001: 0.055294648127235196,
                 22.899999999999999: 0.052690624183750749, },
            5: { 38.188079999999999: 0.2275438176777452,
                 47.359999999999992: 0.19538808382423584,
                 37.399999999999999: 0.12597931862094047,
                 45.399999999999999: 0.099123261272031596,
                 37.089999999999996: 0.082913215936932752,
                 39.280000000000001: 0.077935781935515161,
                 43.629999999999995: 0.076405289164189288 } },
          'multiStepBestPredictions': {
            1: 5.2825868514199987,
            5: 38.188079999999999 } }
        ...
      )
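     Pulling usable numbers out of this structure is straightforward. A minimal
     sketch, assuming result is the ModelResult returned by model.run() on the
     previous slides:

         # Best single-value forecasts, keyed by how many steps ahead they look.
         one_step_ahead = result.inferences['multiStepBestPredictions'][1]
         five_steps_ahead = result.inferences['multiStepBestPredictions'][5]

         # Full distribution for the 1-step forecast: {predicted value: probability}.
         for value, probability in sorted(
                 result.inferences['multiStepPredictions'][1].items(),
                 key=lambda kv: kv[1], reverse=True):
             print('%.2f with probability %.3f' % (value, probability))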
  23. MODEL_PARAMS = {
          # Type of model that the rest of these parameters apply to.
          'model': "CLA",
          # Version that specifies the format of the config.
          'version': 1,
          # Intermediate variables used to compute fields in modelParams and also
          # referenced from the control section.
          'aggregationInfo': {'days': 0,
                              'fields': [('consumption', 'sum')],
                              'hours': 1,
                              'microseconds': 0,
                              'milliseconds': 0,
                              'minutes': 0,
                              'months': 0,
                              'seconds': 0,
                              'weeks': 0,
                              'years': 0},
          'predictAheadTime': None,
          # Model parameter dictionary.
          'modelParams': {
              # The type of inference that this model will perform
              'inferenceType': 'TemporalMultiStep',
              'sensorParams': {
                  # Sensor diagnostic output verbosity control;
                  # if > 0: sensor region will print out on screen what it's sensing
                  # at each step 0: silent; >=1: some info; >=2: more info;
                  # >=3: even more info (see compute() in py/regions/RecordSensor.py)
                  'verbosity': 0,
                  # Example:
                  #   dsEncoderSchema = [
                  #     DeferredDictLookup('__field_name_encoder'),
                  #   ],
                  #
                  # (value generated from DS_ENCODER_SCHEMA)
                  'encoders': {
                      'consumption': {'clipInput': True,
                                      'fieldname': u'consumption',
                                      'n': 100,
                                      'name': u'consumption',
                                      'type': 'AdaptiveScalarEncoder',
                                      'w': 21},
                      'timestamp_dayOfWeek': {'dayOfWeek': (21, 1),
                                              'fieldname': u'timestamp',
                                              'name': u'timestamp_dayOfWeek',
                                              'type': 'DateEncoder'},
                      'timestamp_timeOfDay': {'fieldname': u'timestamp',
                                              'name': u'timestamp_timeOfDay',
                                              'timeOfDay': (21, 1),
                                              'type': 'DateEncoder'},
                      'timestamp_weekend': {'fieldname': u'timestamp',
                                            'name': u'timestamp_weekend',
                                            'type': 'DateEncoder',
                                            'weekend': 21}},
                  # A dictionary specifying the period for automatically-generated
                  # resets from a RecordSensor;
                  #
                  # None = disable automatically-generated resets (also disabled if
                  # all of the specified values evaluate to 0).
                  # Valid keys is the desired combination of the following:
                  #   days, hours, minutes, seconds, milliseconds, microseconds, weeks
                  #
                  # Example for 1.5 days: sensorAutoReset = dict(days=1,hours=12),
                  #
                  # (value generated from SENSOR_AUTO_RESET)
                  'sensorAutoReset': None,
              },
              'spEnable': True,
              'spParams': {
                  # SP diagnostic output verbosity control;
                  # 0: silent; >=1: some info; >=2: more info;
                  'spVerbosity': 0,
                  'globalInhibition': 1,
                  # Number of cell columns in the cortical region (same number for
                  # SP and TP)
                  # (see also tpNCellsPerCol)
                  'columnCount': 2048,
                  'inputWidth': 0,
                  # SP inhibition control (absolute value);
                  # Maximum number of active columns in the SP region's output (when
                  # there are more, the weaker ones are suppressed)
                  'numActivePerInhArea': 40,
                  'seed': 1956,
                  # coincInputPoolPct
                  # What percent of the columns's receptive field is available
                  # for potential synapses. At initialization time, we will
                  # choose coincInputPoolPct * (2*coincInputRadius+1)^2
                  'coincInputPoolPct': 0.5,
                  # The default connected threshold. Any synapse whose
                  # permanence value is above the connected threshold is
                  # a "connected synapse", meaning it can contribute to the
                  # cell's firing. Typical value is 0.10. Cells whose activity
                  # level before inhibition falls below minDutyCycleBeforeInh
                  # will have their own internal synPermConnectedCell
                  # threshold set below this default value.
                  # (This concept applies to both SP and TP and so 'cells'
                  # is correct here as opposed to 'columns')
                  'synPermConnected': 0.1,
                  'synPermActiveInc': 0.1,
                  'synPermInactiveDec': 0.01,
              },
              # Controls whether TP is enabled or disabled;
              # TP is necessary for making temporal predictions, such as predicting
              # the next inputs. Without TP, the model is only capable of
              # reconstructing missing sensor inputs (via SP).
              'tpEnable': True,
              'tpParams': {
                  # TP diagnostic output verbosity control;
                  # 0: silent; [1..6]: increasing levels of verbosity
                  # (see verbosity in nta/trunk/py/nupic/research/TP.py and TP10X*.py)
                  'verbosity': 0,
                  # Number of cell columns in the cortical region (same number for
                  # SP and TP)
                  # (see also tpNCellsPerCol)
                  'columnCount': 2048,
                  # The number of cells (i.e., states), allocated per column.
                  'cellsPerColumn': 32,
                  'inputWidth': 2048,
                  'seed': 1960,
                  # Temporal Pooler implementation selector (see _getTPClass in
                  # CLARegion.py).
                  'temporalImp': 'cpp',
                  # New Synapse formation count
                  # NOTE: If None, use spNumActivePerInhArea
                  #
                  # TODO: need better explanation
                  'newSynapseCount': 20,
                  # Maximum number of synapses per segment
                  #  > 0 for fixed-size CLA
                  # -1 for non-fixed-size CLA
                  #
                  # TODO: for Ron: once the appropriate value is placed in TP
                  # constructor, see if we should eliminate this parameter from
                  # description.py.
                  'maxSynapsesPerSegment': 32,
                  # Maximum number of segments per cell
                  #  > 0 for fixed-size CLA
                  # -1 for non-fixed-size CLA
                  #
                  # TODO: for Ron: once the appropriate value is placed in TP
                  # constructor, see if we should eliminate this parameter from
                  # description.py.
                  'maxSegmentsPerCell': 128,
                  # Initial Permanence
                  # TODO: need better explanation
                  'initialPerm': 0.21,
                  # Permanence Increment
                  'permanenceInc': 0.1,
                  # Permanence Decrement
                  # If set to None, will automatically default to tpPermanenceInc
                  # value.
                  'permanenceDec': 0.1,
                  'globalDecay': 0.0,
                  'maxAge': 0,
                  # Minimum number of active synapses for a segment to be considered
                  # during search for the best-matching segments.
                  # None=use default
                  # Replaces: tpMinThreshold
                  'minThreshold': 12,
                  # Segment activation threshold.
                  # A segment is active if it has >= tpSegmentActivationThreshold
                  # connected synapses that are active due to infActiveState
                  # None=use default
                  # Replaces: tpActivationThreshold
                  'activationThreshold': 16,
                  'outputType': 'normal',
                  # "Pay Attention Mode" length. This tells the TP how many new
                  # elements to append to the end of a learned sequence at a time.
                  # Smaller values are better for datasets with short sequences,
                  # higher values are better for datasets with long sequences.
                  'pamLength': 1,
              },
              'clParams': {
                  'regionName': 'CLAClassifierRegion',
                  # Classifier diagnostic output verbosity control;
                  # 0: silent; [1..6]: increasing levels of verbosity
                  'clVerbosity': 0,
                  # This controls how fast the classifier learns/forgets. Higher values
                  # make it adapt faster and forget older patterns faster.
                  'alpha': 0.0001,
                  # This is set after the call to updateConfigFromSubConfig and is
                  # computed from the aggregationInfo and predictAheadTime.
                  'steps': '1,5',
              },
              'trainSPNetOnlyIfRequested': False,
          },
      }
  24. 'spParams' {
        ...
        'encoders': {
          'consumption': {'clipInput': True,
                          'fieldname': u'consumption',
                          'n': 100,
                          'name': u'consumption',
                          'type': 'AdaptiveScalarEncoder',
                          'w': 21},
          'timestamp_dayOfWeek': {'dayOfWeek': (21, 1),
                                  'fieldname': u'timestamp',
                                  'name': u'timestamp_dayOfWeek',
                                  'type': 'DateEncoder'},
          'timestamp_timeOfDay': {'fieldname': u'timestamp',
                                  'name': u'timestamp_timeOfDay',
                                  'timeOfDay': (21, 1),
                                  'type': 'DateEncoder'},
          'timestamp_weekend': {'fieldname': u'timestamp',
                                'name': u'timestamp_weekend',
                                'type': 'DateEncoder',
                                'weekend': 21}},
        ...
      }
  25.–27. (These slides repeat the encoder parameters above.)
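     The encoder parameters are easier to read once n and w are unpacked: a scalar
     value becomes an n-bit array containing a contiguous run of w active bits whose
     position depends on the value, so nearby values share bits. A toy sketch of the
     idea (illustrative only, not NuPIC's AdaptiveScalarEncoder, which also adapts
     its value range as data arrives):

         def encode_scalar(value, min_val, max_val, n=100, w=21):
             """Toy scalar encoder: n output bits with w contiguous active bits."""
             value = max(min_val, min(max_val, value))   # clip into range
             buckets = n - w + 1                         # possible start positions
             start = int(round((value - min_val) / (max_val - min_val) * (buckets - 1)))
             return [1 if start <= i < start + w else 0 for i in range(n)]

         a = encode_scalar(5.2, 0.0, 50.0)
         b = encode_scalar(5.5, 0.0, 50.0)
         print(sum(x & y for x, y in zip(a, b)))   # high overlap for nearby values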
  28. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  29. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  30. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  31. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  32. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  33. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  34. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  35. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  36. { "includedFields": [ { "fieldName": "timestamp", "fieldType": "datetime" }, {

    "fieldName": "consumption", "fieldType": "float"} ], "streamDef": { "info": "test", "version": 1, "streams": [ { "info": "hotGym.csv", "source": "file://extra/hotgym/hotgym.csv", "columns": [ "*" ], "last_record": 100 } ], "aggregation": { "years": 0, "months": 0, "weeks": 0, "days": 0, "hours": 1, "minutes": 0, "seconds": 0, "microseconds": 0, "milliseconds": 0, "fields": [ [ "consumption", "sum" ], [ "gym", "first" ], [ "timestamp", "first" ] ], } }, "inferenceType": "MultiStep", "inferenceArgs": { "predictionSteps": [ 1 ], "predictedField": "consumption" }, "iterationCount": -1, "swarmSize": "medium" }
  37. - comprehensive, open data on NZ earthquakes
      - accessible via an easy, flexible, unauthenticated HTTP API
      - includes ~50 variables per quake
      - includes "felt reports"
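     The deck does not show the API call itself, so the snippet below is only an
     illustration of the kind of record one might build from a quake feed; the
     endpoint and field names are hypothetical stand-ins, not GeoNet's actual API:

         import requests

         QUAKE_FEED = 'https://api.example.org/quakes?since=2013-01-01'  # hypothetical URL

         for feature in requests.get(QUAKE_FEED).json()['features']:
             props = feature['properties']              # field names are stand-ins
             model_input = {
                 'timestamp': props['origintime'],
                 'magnitude': float(props['magnitude']),
                 'depth_km': float(props['depth']),
                 'felt_reports': int(props.get('felt', 0)),
             }
             # result = model.run(model_input)          # same loop shape as slide 16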
  38. ...

  39. Problems
      - swarming takes a lot of time
      - predictedField is singular - wanted to predict likely values for numbers
        of felt reports between Modified Mercalli 0-10
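     One workaround for predictedField being singular is to run one model per
     Modified Mercalli band. Whether the speaker did this is not stated, so the
     sketch below is just one possible approach, reusing the ModelFactory pattern
     from slide 16 (each band would also need params whose encoders match its
     field):

         # One independent model per Modified Mercalli intensity band (0-10).
         models = {}
         for mmi in range(11):
             field = 'felt_reports_mmi_%d' % mmi     # hypothetical per-band field
             m = ModelFactory.create(model_params.MODEL_PARAMS)
             m.enableInference({'predictedField': field})
             models[mmi] = m

         # Feed every incoming quake record to each band's model:
         # for mmi, m in models.items():
         #     result = m.run(record)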
  40. - comprehensive, open(ish) data on NZ weather
      - accessible via an easy(ish) HTTP API
      - real-time(ish)
  41. ...

  42. Problems
      • regional council flood level data is harder to access than I had anticipated
      • some licence uncertainty around CliFlo reuse
  43. Reflections on NuPIC
      • terminology is difficult, but you'll get there
      • lots of tools for building tools
      • well documented code
      • excellent community