Artificial Intelligence and Systems Laboratory (AISys): A Research Overview

A research overview of the AISys lab at the University of South Carolina (USC).

Pooyan Jamshidi

April 15, 2023

Transcript

  1. Artificial Intelligence and Systems Laboratory (AISys) Research Overview

    Pooyan Jamshidi, University of South Carolina, https://pooyanjamshidi.github.io/AISys/
  2. Saeid Ghafouri (PhD student); Fatemeh Ghofrani (PhD student); Abir Hossen (PhD student); Shahriar Iqbal (PhD student); Sonam Kharde (Postdoc); Hamed Damirchi (PhD student), co-advised by Forest Agostinelli; Mehdi Yaghouti (Postdoc); Samuel Whidden (Undergraduate); Rasool Sharifi (PhD student); Kimia Noorbakhsh (Undergraduate)
  3. Building reliable models that produce causal explanations for performance debugging and transfer better to new environments

    [Figure: two plots of Throughput (FPS) against Cache Misses, the second grouped by cache policy (LRU, FIFO, LIFO, MRU); causal graph: Cache Policy → Cache Misses → Throughput.]
    Cache Policy affects Throughput via Cache Misses.
  4. FlexiBO: A multi-objective optimization approach that trades off information gain against the cost of design evaluations

    • FlexiBO is a cost-aware approach for multi-objective optimization that iteratively selects a design and an objective for evaluation.
    • It allows us to trade off the additional information gained through an evaluation against the cost incurred by that evaluation.
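The trade-off described on this slide can be illustrated with a minimal sketch. It assumes a simple gain-per-cost rule for choosing the next (design, objective) pair; that rule, the function name `select_next`, and all numbers below are hypothetical and are not taken from the slides or presented as FlexiBO's actual acquisition function.

```python
from typing import Dict, Tuple

Pair = Tuple[str, str]  # (design, objective)


def select_next(gain: Dict[Pair, float], cost: Dict[str, float]) -> Pair:
    """Return the (design, objective) pair with the highest estimated
    information gain per unit of evaluation cost (an assumed rule)."""
    return max(gain, key=lambda pair: gain[pair] / cost[pair[1]])


# Hypothetical estimates, for illustration only.
estimated_gain = {
    ("design_a", "accuracy"): 0.9,
    ("design_a", "energy"): 0.4,
    ("design_b", "accuracy"): 0.6,
    ("design_b", "energy"): 0.5,
}
# Hypothetical cost of evaluating each objective once.
evaluation_cost = {"accuracy": 10.0, "energy": 1.0}

next_pair = select_next(estimated_gain, evaluation_cost)
print(next_pair)
```

With these made-up numbers the cheap objective wins even though the expensive one promises more information; with equal costs the choice would fall back to the pair with the largest estimated gain.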
  6. Finding root causes of configuration issues in highly-configurable robots

    We discovered that root causes of task failures in robots could be captured by causal effect estimation of task inputs and robot configurations.
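One way to read "causal effect estimation" here is as the change in failure rate when a configuration option is forced to one value versus another. The sketch below illustrates that reading on a toy simulator; the simulator, the option names, and the failure probabilities are all hypothetical and are not the lab's method or data.

```python
import random

# Hypothetical configuration options, for illustration only.
OPTIONS = ["planner_frequency_high", "costmap_inflation_high", "verbose_logging"]


def simulate_task(config, rng):
    """Toy stand-in for a robot simulator: returns True if the task fails."""
    p_fail = 0.05
    if config["planner_frequency_high"]:
        p_fail += 0.40  # assumed root cause in this toy model
    if config["costmap_inflation_high"]:
        p_fail += 0.05  # assumed minor contributor
    return rng.random() < p_fail


def average_causal_effect(option, n_trials, rng):
    """Estimate P(fail | do(option=True)) - P(fail | do(option=False))."""
    rates = {}
    for value in (False, True):
        failures = 0
        for _ in range(n_trials):
            # Randomize the other options, then intervene on the one of interest.
            config = {name: rng.random() < 0.5 for name in OPTIONS}
            config[option] = value
            failures += simulate_task(config, rng)
        rates[value] = failures / n_trials
    return rates[True] - rates[False]


rng = random.Random(0)
effects = {opt: average_causal_effect(opt, 5000, rng) for opt in OPTIONS}
ranked = sorted(effects, key=lambda opt: abs(effects[opt]), reverse=True)
print(ranked[0], round(effects[ranked[0]], 2))
```

Ranking options by the magnitude of their estimated effect puts the assumed root cause first, while an option with no influence on the outcome ends up with an effect close to zero.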
  7. Sim-to-real by enabling causal transfer learning

    Causal models learned in simulation can be transferred to real robots to find the root causes of failures of physical robots.
  9. Looking for winning tickets in over-parametrized networks

    • How robust are the discovered sub-networks (e.g., to adversarial attacks or distributional shift)?
    • Is there any always-winning lottery ticket hidden in a randomly initialized network?
    • Is it possible to train the sparse sub-network efficiently?
  11. Is there anything special about contrastive learning in terms of adversarial robustness?

    • Contrastive learning (CL) without label information is less robust than other learning schemes.
    • Semi-supervised learning (SL-CL or SCL-CL) is more robust than CL.
  12. Is there anything special about contrastive learning in terms of adversarial robustness?

    • Adversarial training causes similar representations between consecutive layers.
    • Fully adversarial fine-tuning can improve clean accuracy (red line) and robustness (blue line) by eliminating these similarities.
    • The lack of differentiated layer-wise representations after adversarial training may hinder neural networks from achieving high clean/adversarial accuracy.
  14. Hardware-aware partitioning and mapping for multi-chiplet and multi-card AI inference systems

    [Figure: framework overview, grouped into framework inputs and framework outputs. Panels: Workload Computation Graph; Heterogeneous System Interconnect Graph; Inter-Chiplet Interconnect Graph; Intra-Chiplet Interconnect Graph (Vendor A, Vendor B); Partitioned Computation Graph; Mapping (Module# - Chiplet#); Pipeline Schedule.]
  15. Hardware-aware partitioning and mapping for multi-chiplet and multi-card AI inference systems

    [Figure: same framework overview as slide 14.]
  16. Hardware-aware partitioning and mapping for multi-chiplet and multi-card AI inference systems

    [Figure: same framework overview as slide 14, with an added "Set of Chiplets" panel.]
  18. Reconciling high accuracy, cost-efficiency, and low latency of inference serving systems

    • Model variants provide different accuracy/latency trade-offs.
    • Models' performance varies under different resource assignments.
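The two observations above suggest treating each (model variant, resource assignment) pair as one candidate with its own accuracy and latency. A minimal sketch of picking among such candidates follows; the selection rule, the `Option` type, and the profile numbers are hypothetical and only illustrate the trade-off, not the lab's system.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Option:
    variant: str       # hypothetical model variant name
    cpu_cores: int     # resource assignment
    accuracy: float    # assumed offline-profiled accuracy
    latency_ms: float  # assumed profiled latency under this assignment


def pick(options: Iterable[Option], latency_slo_ms: float,
         max_cores: int) -> Optional[Option]:
    """Most accurate option that meets the latency SLO within the core
    budget; ties are broken in favor of fewer cores (lower cost)."""
    feasible = [o for o in options
                if o.latency_ms <= latency_slo_ms and o.cpu_cores <= max_cores]
    if not feasible:
        return None
    return max(feasible, key=lambda o: (o.accuracy, -o.cpu_cores))


# Made-up profiles, for illustration only.
profiles = [
    Option("small", 1, 0.70, 40.0),
    Option("small", 2, 0.70, 25.0),
    Option("large", 2, 0.80, 120.0),
    Option("large", 4, 0.80, 70.0),
    Option("large", 8, 0.80, 45.0),
]

relaxed = pick(profiles, latency_slo_ms=80.0, max_cores=4)
tight = pick(profiles, latency_slo_ms=30.0, max_cores=4)
print(relaxed, tight)
```

Under the relaxed latency target the larger variant is chosen with just enough cores to meet it; under the tight target only the smaller variant is feasible, so accuracy is traded for latency.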
  20. A new paradigm that integrates probabilistic model checking with causal inference to enable planning and verification in autonomous systems

    • RQ1: How can structural causal models be integrated with probabilistic model checking to provide a framework for planning tasks in autonomous systems?
    • RQ2: How can counterfactual reasoning be integrated with probabilistic model checking to analyze the effect of interventions that have not been observed in the system's behavior?
  21. Independent modular networks for learning robust and disentangled representations

    • Modular networks can automatically decompose the shapes into different learnable representations.
    • With the introduction of the ID classifier, the decomposition is improved significantly, where a large majority of the images for each shape are passed through one module.
  23. Pretrained language models are symbolic mathematics solvers too!

    • Does this pretrained model help us use less data for fine-tuning?
    • Does the result of this fine-tuning depend on the languages used for pretraining?
    • How robust is this fine-tuned model to a distribution shift between the fine-tuning data and the test data?
  25. Unit-cycle architecture: an intrinsically faster approach

    • Problem:
      • Single/multi-cycle microarchitectures waste time because the clock period is set by the critical path.
      • If the longest instruction takes 1100 ps, then every instruction takes 1100 ps.
    • Solution:
      • Use a timer to measure the time.
      • Set the timer to the duration of the instruction.
      • When the timer runs out, move to the next instruction.

    Benchmark program: Square root

                           Single-Cycle   Multi-Cycle   Unit-Cycle
    Clock Period (ps)             1,100           300          100
    Cycles Executed                 360         1,316        2,748
    Execution Time (ps)         396,000       394,800      274,800

    Unit-Cycle is more than 40% faster than Single-Cycle or Multi-Cycle.
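The benchmark figures on this slide can be checked with a short calculation, taking execution time as clock period times cycles executed and "faster" as the ratio of execution times. The numbers are copied from the slide; the variable and function names are only illustrative.

```python
# Clock period (ps) and cycles executed, copied from the slide
# (benchmark program: square root).
designs = {
    "Single-Cycle": {"clock_period_ps": 1100, "cycles": 360},
    "Multi-Cycle": {"clock_period_ps": 300, "cycles": 1316},
    "Unit-Cycle": {"clock_period_ps": 100, "cycles": 2748},
}


def execution_time_ps(design):
    """Execution time = clock period x cycles executed."""
    return design["clock_period_ps"] * design["cycles"]


times = {name: execution_time_ps(d) for name, d in designs.items()}


def speedup(baseline, candidate):
    """How many times faster `candidate` runs than `baseline`."""
    return times[baseline] / times[candidate]


for name in ("Single-Cycle", "Multi-Cycle"):
    s = speedup(name, "Unit-Cycle")
    print(f"Unit-Cycle vs {name}: {(s - 1) * 100:.1f}% faster")
```

The products match the slide's execution times, and both speedup ratios exceed 1.4, which is consistent with the "more than 40% faster" claim. Read as a reduction in execution time rather than a speedup ratio, the same figures give roughly 30%.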
  26. Artificial Intelligence and Systems Laboratory (AISys) https://pooyanjamshidi.github.io/AISys/

    Research Areas: Causal AI, ML for Systems, Systems for ML, Adversarial ML, Robot Learning, Representation Learning
    Sponsors: [images]
    Collaborators: [images]
    Saeid Ghafouri (PhD student); Fatemeh Ghofrani (PhD student); Abir Hossen (PhD student); Shahriar Iqbal (PhD student); Sonam Kharde (Postdoc); Hamed Damirchi (PhD student); Mehdi Yaghouti (Postdoc); Samuel Whidden (Undergraduate); Rasool Sharifi (PhD student); Kimia Noorbakhsh (Undergraduate)