systems from observational data. Nature Methods.

• 5,361 variables, sample size 63
• A causal graph is estimated by causal discovery, and (a lower bound on) each causal effect is then estimated from that graph (the lower bound can be loose)
• Checked against the results of actual intervention experiments, the predictions were more accurate than random guessing

Excerpt from the paper as shown on the slide ([...] marks text cut off at the edge):

[...] primary interest in many fields of science. The [...] method for determining such relationships uses randomized controlled perturbation experiments. In many settings, [...] experiments are expensive and time consuming. [...] desirable to obtain causal information from observational data, that is, from data obtained by observing the system [...] without subjecting it to interventions. [...] established methods to estimate causal effects [...] observational data when the possible causal relationships [...] variables are known¹. Many real-world problems, [...] large-scale systems without such information. [...] generally impossible to estimate causal effects in [...] we recently proposed and mathematically justified [...] method to obtain bounds on total causal effects, [...] assumptions (Supplementary Methods). We call this [...] intervention-calculus when the DAG is absent (IDA). [...] been experimentally validated until now, and there [...] experimental validation of causal inference methods [...] here an experimental validation of IDA.

As a first [...] a compendium of gene [...] profiles of Saccharomyces [...] containing 267 full-genome [...] profiles of yeast deletion [...] (interventional data), together [...] genome expression profiles [...] control experiments (observational data) obtained under the same [...]. After initial data cleaning (Supplementary Methods), the interventional data contained expression measurements [...] genes for 234 single-gene [...] strains, and the observational data contained expression measurements [...] the same 5,361 genes for 63 [...]. [...] interventional data as the [...] for estimating the total [...]

[...] yielded only 5 ± 2.1 true positives (10% ± 4.2%). Moreover, IDA improved substantially on Lasso⁴ and Elastic-net⁵, two state-of-the-art high-dimensional regression approaches commonly used to determine variable importance but not designed for causal inference (Fig. 1a, Supplementary Table 1 and Supplementary Methods). For m = 10 and q = 50, these methods yielded 10 (20%) and 8 (16%) true positives, respectively. Finally, we found that the superior performance of IDA compared to that of the other methods was insensitive to the choice of m value for m = 1, ..., 50 (Fig. 1b).

As a second test, we used data from the DREAM4 In Silico Network Challenge⁶, a competition in reverse engineering of gene regulation networks. These data include several types of simulated mRNA expression levels, based on sophisticated biologically motivated simulation methods⁶, for five networks of 10 genes and five networks of 100 genes. We used two types of observational data: (i) steady-state gene expression levels from unknown multifactorial perturbations of the networks and (ii) time series data on gene expression levels from the response and recovery of the networks to unknown external perturba- [...]

[Figure 1: (a) true positives vs. false positives for IDA, Lasso, Elastic-net and Random; (b) pAUC × 10⁵ vs. m values for IDA, Lasso, Elastic-net and Random]

• Extension to the case with unobserved common causes (Malinsky & Spirtes, 2017)
  Code: https://github.com/dmalinsk/lv-ida
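The second bullet (estimate a graph, then bound the causal effect) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear Gaussian model, assumes the causal discovery step has already produced a list of candidate parent sets for the cause variable, and uses hypothetical names (`effect_given_parents`, `ida_lower_bound`) and simulated data. For each candidate parent set, the effect is taken as the coefficient of X in a regression of Y on X and those parents, and the lower bound is the smallest absolute value over all candidates.

```python
import numpy as np


def effect_given_parents(data, x, y, parents):
    """Coefficient of column x when regressing column y on x and `parents`.

    Hypothetical helper: if y is itself a candidate parent of x, x cannot
    cause y in that graph, so the effect is taken to be 0.
    """
    if y in parents:
        return 0.0
    cols = [x, *parents]
    design = np.column_stack([np.ones(len(data)), data[:, cols]])
    coef, *_ = np.linalg.lstsq(design, data[:, y], rcond=None)
    return float(coef[1])  # coef[0] is the intercept, coef[1] belongs to x


def ida_lower_bound(data, x, y, candidate_parent_sets):
    """Smallest absolute effect over all candidate parent sets of x."""
    effects = [effect_given_parents(data, x, y, ps) for ps in candidate_parent_sets]
    return min(abs(e) for e in effects)


# Simulated example (assumption): Z -> X, Z -> Y, X -> Y with true effect 0.5.
rng = np.random.default_rng(0)
n = 5000
z = rng.normal(size=n)
x = 0.8 * z + rng.normal(size=n)
y = 0.5 * x + 0.7 * z + rng.normal(size=n)
data = np.column_stack([z, x, y])  # columns: 0 = Z, 1 = X, 2 = Y

# Assumption: discovery could not orient the Z - X edge, so X has either
# no parents or Z as its only parent.
candidate_parent_sets = [(), (0,)]
bound = ida_lower_bound(data, x=1, y=2, candidate_parent_sets=candidate_parent_sets)
print(f"lower bound on |effect of X on Y|: {bound:.3f}")  # should land near 0.5
```

Without adjusting for Z the regression coefficient is inflated by confounding, so taking the minimum over candidate parent sets gives a conservative value. This is also why the bound can be loose: a single candidate graph with a small (or zero) effect pulls it down.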