@sysadmindays @ovh 18 Our cluster architecture
• Warp10 Ingress
• Kafka
• Warp10 Store
• Warp10 Directory
• Warp10 Egress
• 4 × (Region server + Datanode)
Hardware pitfalls
• Make sure the number of controllers matches the number of disks and SATA ports.
• Make sure your network link can handle your disk I/O capacity.
• Check how threads are distributed (IRQs, NUMA surprises, ingest + processing + GC + ...).
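The network-versus-disk point can be checked with simple arithmetic. The sketch below compares aggregate disk throughput with link bandwidth. The function name and the figures (12 disks at about 150 MB/s each, one 10 Gbit/s link) are hypothetical values chosen for illustration, not numbers from the talk.

```python
def link_is_bottleneck(disks, disk_mb_s, link_gbit_s):
    """Compare aggregate disk throughput with network link bandwidth.

    Returns (aggregate disk MB/s, link MB/s, True if the link is slower).
    """
    disk_total = disks * disk_mb_s
    # Convert Gbit/s to MB/s using decimal units (1 Gbit = 1000 Mbit, 8 bits per byte).
    link_total = link_gbit_s * 1000 / 8
    return disk_total, link_total, link_total < disk_total


# Hypothetical node: 12 disks at ~150 MB/s sequential each, one 10 Gbit/s link.
disk_total, link_total, bottleneck = link_is_bottleneck(12, 150, 10)
print(disk_total, link_total, bottleneck)  # 1800 1250.0 True
```

With these assumed figures the disks can deliver more than the link can carry, so replication or rebalancing traffic would be limited by the network rather than by the disks.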
Use cases
• Datacenter temperature/electricity/cooling map
• Pay-as-you-go billing (PCI/IPLB)
• GSCAN
• Monitoring
• ML model scoring (anti-fraud)
• Pattern detection for medical applications
Xceiver limit
HDFS:
ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(...): DataXceiver: java.io.IOException: xceiverCount 258 exceeds the limit of concurrent xcievers 256
HBASE:
INFO org.apache.hadoop.dfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Could not read from stream
INFO org.apache.hadoop.dfs.DFSClient: Abandoning block blk_-546...
WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block blk_-546.. bad dn[0]
FATAL org.apache.hadoop.hbase.regionserver.Flusher: Replay of hlog required. Forcing server shutdown
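One way to catch this failure before a region server shuts down is to scan datanode logs for the xceiver message. The sketch below is a minimal illustration, not tooling from the talk: the helper name is hypothetical and the pattern is matched against the message text quoted above, including its "xcievers" spelling. The limit itself is, as far as I know, set in hdfs-site.xml through `dfs.datanode.max.xcievers` (renamed `dfs.datanode.max.transfer.threads` in later Hadoop releases); check the documentation for your version.

```python
import re

# Pattern taken from the datanode message shown on the slide; the
# "xcievers" spelling is how the message itself is written.
XCEIVER_RE = re.compile(
    r"xceiverCount (\d+) exceeds the limit of concurrent xcievers (\d+)"
)


def find_xceiver_overflows(lines):
    """Return (count, limit) pairs for each log line reporting an overflow."""
    hits = []
    for line in lines:
        match = XCEIVER_RE.search(line)
        if match:
            hits.append((int(match.group(1)), int(match.group(2))))
    return hits


sample = [
    "ERROR org.apache.hadoop.dfs.DataNode: DatanodeRegistration(...): "
    "DataXceiver: java.io.IOException: xceiverCount 258 exceeds the limit "
    "of concurrent xcievers 256",
    "INFO org.apache.hadoop.dfs.DFSClient: Abandoning block blk_-546...",
]
print(find_xceiver_overflows(sample))  # [(258, 256)]
```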