in data reservoir as Hive tables •Upload triggered by manual call to BDD Data Processing CLI ‣Runs Oozie job in the background to profile,
enrich and then ingest data into DGraph Ingesting Logs and Tweet Data Samples into DGraph [oracle@bddnode1 ~]$ cd /home/oracle/Middleware/BDD1.0/dataprocessing/edp_cli [oracle@bddnode1 edp_cli]$ ./data_processing_CLI -t access_per_post_cat_author [oracle@bddnode1 edp_cli]$ ./data_processing_CLI -t rm_linked_tweets Hive Apache Spark pageviews X rows pageviews >1m rows Profiling pageviews >1m rows Enrichment pageviews >1m rows BDD pageviews >1m rows { "@class" : "com.oracle.endeca.pdi.client.config.workflow.
ProvisionDataSetFromHiveConfig", "hiveTableName" : "rm_linked_tweets", "hiveDatabaseName" : "default", "newCollectionName" : “edp_cli_edp_a5dbdb38-b065…”, "runEnrichment" : true, "maxRecordsForNewDataSet" : 1000000, "languageOverride" : "unknown" } 1 2 3 34