Machine Learning Applications in Grid Computing

Daniel Bilar
September 27, 1999

Functional validation means that a client presents to a prospective service provider a sequence of challenges. The service provider replies to these challenges with corresponding answers. Only after the client is satisfied that the service provider’s answers are consistent with the client’s expectations is an actual commitment made to using the service.
The client challenges the service provider with test cases x1, x2, ..., xk. The remote service provider R returns the corresponding answers fR(x1), fR(x2), ..., fR(xk). The client C may or may not have independent access to its own answers fC(x1), fC(x2), ..., fC(xk).

Possible situations and machine learning models:

C “knows” fC(x) and R provides fR(x).
PAC learning and Chernoff bounds theory

C “knows” fC(x) and R does not provide fR(x).
Zero-knowledge proof

C does not “know” fC(x) and R provides fR(x).
Simulation-based learning and reinforcement learning

Transcript

  1. Machine Learning Applications in Grid Computing
    George Cybenko, Guofei Jiang and Daniel Bilar
    Thayer School of Engineering, Dartmouth College
    22nd Sept., 1999, 37th Allerton Conference, Urbana-Champaign, Illinois
    Acknowledgements: This work was partially supported by AFOSR grants F49620-97-1-0382, NSF grant CCR-9813744 and DARPA contract F30602-98-2-0107.
  2. Grid vision
    ▪ Grid computing refers to computing in a distributed networked environment in which computing and data resources are located throughout the network.
    ▪ Grid infrastructures provide the basic infrastructure for computations that integrate geographically disparate resources, and create a universal source of computing power that supports dramatically new classes of applications.
    ▪ Several efforts are underway to build computational grids, such as Globus, Infospheres and DARPA CoABS.
  3. Grid services
    ▪ A fundamental capability required in grids is a directory service or broker that dynamically matches user requirements with available resources.
    ▪ Prototype of grid services
    [Diagram of Client, Server and Matchmaker; labels as extracted: Advertise Service Location Request Reply Request Service]
  4. Matching conflicts
    ▪ Brokers and matchmakers use keywords and domain ontologies to specify services.
    ▪ Keywords and ontologies cannot be defined and interpreted precisely enough to make brokering or matchmaking between grid services robust in a truly distributed, heterogeneous computing environment.
    ▪ Matching conflicts exist between the client’s requested functionality and the service provider’s actual functionality.
  5. An example
    ▪ A client requires a three-dimensional FFT. A request is made to a broker or matchmaker for an FFT service based on keywords and possibly parameter lists.
    ▪ The broker or matchmaker uses the keywords to search its catalog of services and returns the candidate remote services.
    ▪ There are literally dozens of different algorithms for FFT computations, with different assumptions, dimensions, accuracy, input-output formats and so on.
    ▪ The client must validate the actual functionality of these remote services before it commits to using one of them.
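The FFT example can be made concrete with a minimal sketch. It assumes the client can compute its own reference answers; for brevity it uses a one-dimensional DFT rather than the three-dimensional FFT on the slide. The names `service_a`, `service_b`, `agrees` and `validate`, the challenge vectors, and the tolerance are all hypothetical and are not taken from the talk.

```python
import cmath

def dft(signal):
    """Client-side reference: unnormalized forward DFT, O(n^2)."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]

def service_a(signal):
    """Hypothetical remote service using the same convention as the client."""
    return dft(signal)

def service_b(signal):
    """Hypothetical remote service that scales its output by 1/n."""
    n = len(signal)
    return [value / n for value in dft(signal)]

def agrees(expected, actual, tol=1e-9):
    """True if both answers have the same length and match within tol."""
    return len(expected) == len(actual) and all(
        abs(e - a) <= tol for e, a in zip(expected, actual)
    )

def validate(service, challenges, tol=1e-9):
    """Accept the service only if every challenge is answered as expected."""
    return all(agrees(dft(x), service(x), tol) for x in challenges)

# Illustrative challenge inputs chosen by the client.
challenges = [[1.0, 0.0, 0.0, 0.0], [1.0, 2.0, 3.0, 4.0], [0.5, -0.5, 0.5, -0.5]]

print(validate(service_a, challenges))  # True
print(validate(service_b, challenges))  # False
```

Both hypothetical services would plausibly be advertised under the keyword "FFT", yet only one matches the client's normalization convention. This is the kind of conflict that keyword matching alone cannot detect.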
  6. Functional validation
    ▪ Functional validation means that a client presents to a prospective service provider a sequence of challenges. The service provider replies to these challenges with corresponding answers. Only after the client is satisfied that the service provider’s answers are consistent with the client’s expectations is an actual commitment made to using the service.
    ▪ Three steps:
      – Service identification and location.
      – Service functional validation.
      – Commitment to the service.
  7. None
  8. Our approach
    ▪ Challenge the service provider with some test cases x1, x2, ..., xk. The remote service provider R offers the corresponding answers fR(x1), fR(x2), ..., fR(xk). The client C may or may not have independent access to the answers fC(x1), fC(x2), ..., fC(xk).
    ▪ Possible situations and machine learning models:
      – C “knows” fC(x) and R provides fR(x).
        • PAC learning and Chernoff bounds theory
      – C “knows” fC(x) and R does not provide fR(x).
        • Zero-knowledge proof
      – C does not “know” fC(x) and R provides fR(x).
        • Simulation-based learning and reinforcement learning
  9. Mathematical framework
    ▪ The goal of PAC learning is to use as few examples and as little computation as possible to pick a hypothesis concept that is a close approximation to the target concept.
    ▪ Define a concept to be a boolean mapping $c : X \to \{0, 1\}$, where $X$ is the input space. $c(x) = 1$ indicates that $x$ is a positive example, i.e. the service provider can offer the “correct” service for challenge $x$.
    ▪ Define an index function
      $$F(x) = \begin{cases} 1 & \text{if } \lVert f_C(x) - f_R(x) \rVert \le \gamma \\ 0 & \text{otherwise} \end{cases}$$
    ▪ Now define the error between the target concept $c$ and the hypothesis $h$ as
      $$\mathrm{error}(h) = \Pr_{x \in P}\left[ c(x) \ne h(x) \right].$$
  10. Mathematical framework (cont’d)
    ▪ The client can randomly pick $m$ samples
      $$S_m = \{ (x_1, F(x_1)), (x_2, F(x_2)), \ldots, (x_m, F(x_m)) \}$$
      to PAC learn a hypothesis $h$ about whether the service provider can offer the “correct” service.
    ▪ Theorem 1 (Blumer et al.): Let $H$ be any hypothesis space of finite VC dimension $d$ contained in $2^X$, $P$ be any probability distribution on $X$, and the target concept $c$ be any Borel set contained in $X$. Then for any $0 < \varepsilon, \delta < 1$, given
      $$m \ge \max\left( \frac{4}{\varepsilon} \log\frac{2}{\delta},\ \frac{8d}{\varepsilon} \log\frac{13}{\varepsilon} \right)$$
      independent random examples of $c$ drawn according to $P$, with probability at least $1 - \delta$, every hypothesis in $H$ that is consistent with all of these examples has error at most $\varepsilon$.
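The sample-size bound in Theorem 1, m ≥ max((4/ε) log(2/δ), (8d/ε) log(13/ε)), is easy to evaluate numerically. The sketch below is only illustrative: the function name `pac_sample_size` is hypothetical, and it assumes base-2 logarithms, since the slide writes only "log" (the base is left as a parameter for that reason).

```python
import math

def pac_sample_size(epsilon, delta, vc_dim, log_base=2.0):
    """Smallest integer m with m >= max(4/eps*log(2/delta), 8d/eps*log(13/eps)).

    log_base=2 is an assumption; the slide does not state the base.
    """
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")
    if vc_dim < 1:
        raise ValueError("vc_dim must be a positive integer")
    confidence_term = (4 / epsilon) * math.log(2 / delta, log_base)
    complexity_term = (8 * vc_dim / epsilon) * math.log(13 / epsilon, log_base)
    return math.ceil(max(confidence_term, complexity_term))

# Example: 10% error, 95% confidence, hypothesis space of VC dimension 1.
print(pac_sample_size(epsilon=0.1, delta=0.05, vc_dim=1))
```

For these illustrative values the VC-dimension term dominates, which suggests why the simplified Chernoff-style results that follow in the talk are attractive when their assumptions hold.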
  11. Simplified results
    ▪ Assume that, with regard to some concept, all test cases have the same probability that the service provider can offer the “correct” service.
    ▪ Theorem 2 (Chernoff bounds): Consider independent identically distributed samples $x_1, x_2, \ldots, x_m$ from a Bernoulli distribution with expectation $p$. Define the empirical estimate of $p$ based on these samples as
      $$\hat{p} = \frac{1}{m} \sum_{i=1}^{m} x_i .$$
      Then for any $0 < \varepsilon, \delta < 1$, if the sample size
      $$m \ge \frac{\ln 2 - \ln \delta}{2 \varepsilon^2},$$
      then the probability $\mathrm{prob}^m\left[ \lvert \hat{p} - p \rvert \ge \varepsilon \right] \le \delta$.
    ▪ Corollary 2.1: For the functional validation problem described above, given any $0 < \varepsilon, \delta < 1$, if the sample size
      $$m \ge \frac{-\ln \delta}{2 \varepsilon^2},$$
      then the probability $\mathrm{prob}^m\left[ (1 - p) \ge \varepsilon \right] \le \delta$.
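As a rough illustration, the two sample-size conditions on this slide, m ≥ (ln 2 − ln δ)/(2ε²) for Theorem 2 and m ≥ −ln δ/(2ε²) for Corollary 2.1, can be evaluated directly. The function names below are hypothetical. The corollary is read here as the one-sided case in which every sampled answer was correct; that is an interpretation of the slide, not something it states.

```python
import math

def _check(epsilon, delta):
    if not (0 < epsilon < 1 and 0 < delta < 1):
        raise ValueError("epsilon and delta must lie strictly between 0 and 1")

def two_sided_sample_size(epsilon, delta):
    """Theorem 2: smallest integer m with m >= (ln 2 - ln delta) / (2 eps^2)."""
    _check(epsilon, delta)
    return math.ceil((math.log(2) - math.log(delta)) / (2 * epsilon ** 2))

def one_sided_sample_size(epsilon, delta):
    """Corollary 2.1: smallest integer m with m >= -ln(delta) / (2 eps^2)."""
    _check(epsilon, delta)
    return math.ceil(-math.log(delta) / (2 * epsilon ** 2))

print(two_sided_sample_size(0.1, 0.05))  # 185
print(one_sided_sample_size(0.1, 0.05))  # 150
```

Unlike the bound in Theorem 1, neither expression depends on a VC dimension, so the number of challenges is determined by ε and δ alone.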
  12. Simplified results (cont’d)
    ▪ Given a target probability $P$, the client needs to know how many consecutive positive samples are required so that the next request to the service will be correct with probability $P$.
    ▪ So the probabilities $\varepsilon$, $\delta$ and $P$ satisfy the following inequality:
      $$P \le (1 - \varepsilon)(1 - \delta)$$
    ▪ Formulate the sample size problem as the following nonlinear optimization problem:
      $$\min_{\varepsilon, \delta}\ m = \frac{-\ln \delta}{2 \varepsilon^2} \quad \text{s.t.} \quad (1 - \varepsilon)(1 - \delta) \ge P \ \text{ and } \ 0 < \varepsilon, \delta < 1$$
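The slide states the inequality P ≤ (1 − ε)(1 − δ) without derivation. One plausible reading, offered here as an assumption rather than as the authors' own argument, is the following. Corollary 2.1 gives, with probability at least 1 − δ over the m samples, a per-request success probability p above 1 − ε.

```latex
\begin{align*}
\Pr[\text{next answer is correct}]
  &\ge \Pr[\,1 - p < \varepsilon\,] \cdot (1 - \varepsilon) \\
  &\ge (1 - \delta)(1 - \varepsilon).
\end{align*}
% Requiring this lower bound to reach the target probability P gives
% the constraint (1 - \varepsilon)(1 - \delta) \ge P.
```

Under this reading, any pair (ε, δ) that satisfies the constraint guarantees the target P, and the optimization simply picks the pair that needs the fewest samples.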
  13. Simplified results (cont’d)
    ▪ From the constraint inequality,
      $$\delta \le 1 - \frac{P}{1 - \varepsilon}$$
    ▪ The two-dimensional optimization problem above then reduces to a one-dimensional one:
      $$\min_{\varepsilon}\ m = \frac{-\ln\left( 1 - \frac{P}{1 - \varepsilon} \right)}{2 \varepsilon^2} \quad \text{s.t.} \quad 0 < \varepsilon < 1 - P$$
    ▪ This can be solved with elementary nonlinear optimization methods.
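The one-dimensional problem, minimizing m(ε) = −ln(1 − P/(1 − ε)) / (2ε²) over 0 < ε < 1 − P, can be explored with a minimal sketch. The slide does not say which optimization method was used; a plain grid search is substituted here for simplicity. The function names and the grid resolution are hypothetical.

```python
import math

def samples_needed(epsilon, target_p):
    """m(eps) = -ln(1 - P/(1 - eps)) / (2 eps^2), with delta at its largest allowed value."""
    delta = 1 - target_p / (1 - epsilon)
    return -math.log(delta) / (2 * epsilon ** 2)

def minimize_sample_size(target_p, grid_points=100_000):
    """Grid search over 0 < eps < 1 - P; returns (eps, delta, ceil(m))."""
    if not 0 < target_p < 1:
        raise ValueError("target_p must lie strictly between 0 and 1")
    upper = 1 - target_p
    best_eps, best_m = None, math.inf
    for i in range(1, grid_points):
        eps = upper * i / grid_points
        if 1 - target_p / (1 - eps) <= 0:
            continue  # guard against floating-point rounding at the boundary
        m = samples_needed(eps, target_p)
        if m < best_m:
            best_eps, best_m = eps, m
    best_delta = 1 - target_p / (1 - best_eps)
    return best_eps, best_delta, math.ceil(best_m)

eps, delta, m = minimize_sample_size(0.9)
print(f"epsilon={eps:.4f}, delta={delta:.4f}, samples={m}")
```

The objective grows without bound at both ends of the interval: as ε approaches 0 because of the 1/ε² factor, and as ε approaches 1 − P because δ is forced toward 0. A coarse search is therefore enough to locate the interior minimum; a bracketing method such as golden-section search would be a natural refinement.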
  14. Mobile Functional Validation Agent
    [Diagram: a User Interface and User Agent create a mobile agent (MA) and send it to Machine A, where an Interface Agent fronts Computing Server A. The MA checks A’s service (Correct / Incorrect), jumps to Machine B (Interface Agent, Computing Server B) to check B’s service (Correct / Incorrect), and continues to Machines C, D, E, ... until it finds a correct service.]
  15. Future work and open questions
    ▪ Integrate functional validation into grid computing infrastructure as a standard grid service.
    ▪ Extend to the other situations described (such as zero-knowledge proofs).
    ▪ Formulate functional validation problems in more appropriate mathematical models.
    ▪ Explore solutions for more difficult and complicated functional validation situations.
    ▪ Thanks!!