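[7] Suppose we observe $E$ environments $\mathcal{E} = \{e_1, \dots, e_E\}$, where $\sigma_e^2 = 1$, $\forall e \in [1, E]$. Then, for any $\epsilon > 1$, there exists a featurizer $\Phi_\epsilon$ which, combined with the ERM-optimal classifier $\hat{\beta} = [\beta_c, \beta_{e;\mathrm{ERM}}, \beta_0]^\top$, satisfies the following:

1. The regularization term of $\Phi_\epsilon, \hat{\beta}$ is bounded as
\[
\frac{1}{E} \sum_{e \in \mathcal{E}} \left\| \nabla_{\hat{\beta}} R^e(\Phi_\epsilon, \hat{\beta}) \right\|_2^2 \in O\!\left( p_\epsilon^2 \left( c_\epsilon d_e + \frac{1}{E} \sum_{e \in \mathcal{E}} \|\mu_e\|_2^2 \right) \right), \tag{13}
\]
for some constants $c_\epsilon$ and $p_\epsilon := \exp\{-d_e \min(\epsilon - 1, (\epsilon - 1)^2/8)\}$.

2. $\Phi_\epsilon, \hat{\beta}$ is equivalent to the ERM-optimal predictor on at least a $1 - q$ fraction of the test distribution, where $q := \frac{2E}{\sqrt{\pi}\,\delta} \exp\{-\delta^2\}$.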
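To get a feel for how quickly the two quantities in the statement decay, the following minimal Python sketch evaluates them numerically. It assumes the readings $p_\epsilon = \exp\{-d_e \min(\epsilon - 1, (\epsilon - 1)^2/8)\}$ and $q = \frac{2E}{\sqrt{\pi}\,\delta} \exp\{-\delta^2\}$; the function names `p_eps` and `q_bound` and the sample values are hypothetical and chosen only for illustration.

```python
import math


def p_eps(eps: float, d_e: int) -> float:
    """Evaluate p_eps = exp(-d_e * min(eps - 1, (eps - 1)^2 / 8)).

    Assumed reading of the theorem; requires eps > 1.
    """
    if eps <= 1:
        raise ValueError("eps must be greater than 1")
    gap = eps - 1.0
    return math.exp(-d_e * min(gap, gap ** 2 / 8.0))


def q_bound(num_envs: int, delta: float) -> float:
    """Evaluate q = 2E / (sqrt(pi) * delta) * exp(-delta^2).

    Assumed reading of the theorem; requires delta > 0.
    """
    if delta <= 0:
        raise ValueError("delta must be positive")
    return 2.0 * num_envs / (math.sqrt(math.pi) * delta) * math.exp(-delta ** 2)


if __name__ == "__main__":
    # Illustrative values only: p_eps shrinks exponentially in d_e.
    for d_e in (10, 100, 1000):
        print(f"d_e={d_e:5d}  p_eps={p_eps(2.0, d_e):.3e}")
    # q grows linearly in the number of environments but decays
    # like exp(-delta^2) in delta.
    for delta in (1.0, 2.0, 3.0):
        print(f"delta={delta:.1f}  q={q_bound(5, delta):.3e}")
```

Under these assumptions, the right-hand side of the gradient bound scales with $p_\epsilon^2$, so it vanishes exponentially as the environmental feature dimension $d_e$ grows, while $q$ is only a meaningful bound once $\delta$ is large enough that $q < 1$.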