
shell-vs-genome


Building the human genome with shell-gei (shell one-liner craft)

Tazro Inutano Ohta

October 18, 2017


Transcript

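  1. Genome science and shell-gei

    Genome: the source code of living things. How does genomic information encode an organism's functions? Genomic polymorphism, gene expression, regulation of gene expression, etc. What would we need to be able to do before we can say we understand the genome? Design the genome of an organism that shows an arbitrary phenotype.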
  2. A one-liner that generates FASTQ

    READLENGTH=100
    seq 400000000 | awk -v rlength="${READLENGTH}" 'BEGIN{base[0]="A";base[1]="T";base[2]="G";base[3]="C";} NR%4==1{print "@read." int(NR/4)+1} NR%4==2{seq=""; for(i=1;i<=rlength;i++){seq = seq sprintf("%c", base[int(rand()*100%4)])}; print seq} NR%4==3{print "+"} NR%4==0{ql=""; for(i=1;i<=rlength;i++){ql = ql sprintf("%c", substr(rand(),3)%93+33)}; print ql}'

    Handy, because it spits out as many reads as you like. If these reads can be mapped to the reference, we will call the human genome done.
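
To try the slide 2 generator at a small scale, here is a minimal sketch that wraps it in a function and checks the shape of its output. The names `gen_fastq` and `check_fastq`, the seed argument, and the `srand` call are assumptions added for illustration and do not appear on the slides. The sketch also picks bases with `int(rand()*4)` rather than the slide's `rand()*100%4`.

```shell
#!/bin/sh
# gen_fastq N LEN [SEED]: print N random FASTQ records with LEN-base reads.
# Hypothetical helper, not from the slides.
gen_fastq() {
  n=$1; len=$2; seed=${3:-42}
  seq "$((n * 4))" | awk -v rlength="$len" -v seed="$seed" '
    BEGIN { srand(seed); split("A T G C", base, " ") }   # base[1..4]
    NR % 4 == 1 { print "@read." int(NR / 4) + 1 }        # header line
    NR % 4 == 2 { s = ""; for (i = 1; i <= rlength; i++) s = s base[int(rand() * 4) + 1]; print s }
    NR % 4 == 3 { print "+" }                             # separator line
    NR % 4 == 0 { q = ""; for (i = 1; i <= rlength; i++) q = q sprintf("%c", int(rand() * 93) + 33); print q }
  '
}

# check_fastq LEN: read FASTQ on stdin and count lines that break the
# four-line layout or the expected length. Exits non-zero if any are found.
check_fastq() {
  awk -v rlength="$1" '
    NR % 4 == 1 && !/^@/ { bad++ }
    NR % 4 == 2 && (length($0) != rlength || /[^ATGC]/) { bad++ }
    NR % 4 == 3 && $0 != "+" { bad++ }
    NR % 4 == 0 && length($0) != rlength { bad++ }
    END { printf "%d records, %d bad lines\n", NR / 4, bad; exit (bad > 0) }
  '
}

gen_fastq 3 10 | check_fastq 10   # prints "3 records, 0 bad lines"
```

Passing a fixed seed makes a run repeatable, which helps when comparing mapping results between attempts.
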
  3. Generating FASTQ forever

    READLENGTH=100
    yes | awk -v rlength="${READLENGTH}" 'BEGIN{base[0]="A";base[1]="T";base[2]="G";base[3]="C";} NR%4==1{print "@read." int(NR/4)+1} NR%4==2{seq=""; for(i=1;i<=rlength;i++){seq = seq sprintf("%c", base[int(rand()*100%4)])}; print seq} NR%4==3{print "+"} NR%4==0{ql=""; for(i=1;i<=rlength;i++){ql = ql sprintf("%c", substr(rand(),3)%93+33)}; print ql}'

    hashtag: #yes
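
The only change on slide 3 is the line source: `yes` never stops, so awk never runs out of input lines to turn into FASTQ. A minimal sketch of how such a stream can still be bounded from the downstream side follows. The function name `endless_fastq` and the fixed seed are assumptions for illustration, not part of the slides.

```shell
#!/bin/sh
# endless_fastq LEN: emit random FASTQ records until the reader goes away.
# Hypothetical helper; `yes` acts only as an endless supply of input lines.
endless_fastq() {
  yes | awk -v rlength="$1" '
    BEGIN { srand(1); split("A T G C", base, " ") }
    NR % 4 == 1 { print "@read." int(NR / 4) + 1 }
    NR % 4 == 2 { s = ""; for (i = 1; i <= rlength; i++) s = s base[int(rand() * 4) + 1]; print s }
    NR % 4 == 3 { print "+" }
    NR % 4 == 0 { q = ""; for (i = 1; i <= rlength; i++) q = q sprintf("%c", int(rand() * 93) + 33); print q }
  '
}

# Take exactly two records (8 lines). Once head exits, the upstream
# stages fail on their next write and the pipeline shuts down.
endless_fastq 8 | head -n 8
```

Because the bound sits in the consumer, the producer itself needs no read count at all.
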
  4. Endlessly mapping the endlessly generated FASTQ

    READLENGTH=100
    yes | awk -v rlength="${READLENGTH}" 'BEGIN{base[0]="A";base[1]="T";base[2]="G";base[3]="C";} NR%4==1{print "@read." int(NR/4)+1} NR%4==2{seq=""; for(i=1;i<=rlength;i++){seq = seq sprintf("%c", base[int(rand()*100%4)])}; print seq} NR%4==3{print "+"} NR%4==0{ql=""; for(i=1;i<=rlength;i++){ql = ql sprintf("%c", substr(rand(),3)%93+33)}; print ql}' | bwa mem genome.fa - > onelinehuman.sam

    The result never comes back, so there is no way to tell whether the genome has been generated.
  5. Mapping FASTQ forever while continuously counting depth

    Sooner or later the human genome will be complete. Let's get some research done while we wait.

    READLENGTH=100
    yes | awk -v rlength="${READLENGTH}" 'BEGIN{base[0]="A";base[
      | awk 'NR%4!=0{printf $0"\t"} NR%4==0{print $0}' \
      | while read -r seq; do
          bwa mem genome.fa <(echo "$seq" | tr '\t' '\n') \
            | samtools view -b - \
            > $(echo "$seq" | awk '{sub("@","");print $1}').bam
          samtools cat *bam | samtools sort \
            | samtools depth -a - \
            | awk '{sum+=$3} END{print sum/NR}'
        done
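
Two small awk steps carry the slide 5 loop: one folds each four-line FASTQ record into a single tab-separated line so that `while read` sees one read per iteration, and one averages the third column of the `samtools depth -a` output (chromosome, position, depth). The sketch below isolates both on toy input so they can be tried without bwa or samtools. The function names `join4`, `split4`, and `mean_depth` and the sample rows are assumptions for illustration.

```shell
#!/bin/sh
# join4: fold each 4-line FASTQ record into one tab-separated line.
# Tab is a safe separator here because the generated quality characters
# lie in the ASCII range 33 to 125.
join4() { awk 'NR % 4 != 0 { printf "%s\t", $0 } NR % 4 == 0 { print $0 }'; }

# split4: inverse of join4, restoring the 4-line layout for the mapper.
split4() { tr '\t' '\n'; }

# mean_depth: average of column 3 over rows shaped like `samtools depth -a`
# output (chromosome, position, depth). Prints nothing on empty input.
mean_depth() { awk '{ sum += $3 } END { if (NR) print sum / NR }'; }

# One record in, one line out, and back again.
printf '@read.1\nACGT\n+\nIIII\n' | join4 | split4

# Three positions with depths 0, 3 and 3 give a mean depth of 2.
printf 'chr1\t1\t0\nchr1\t2\t3\nchr1\t3\t3\n' | mean_depth
```

Since `-a` also reports positions with zero depth, the mean is taken over every position in the output rather than only the covered ones.
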
  6. Bonus: the one-liner, ready to copy and paste

    READLENGTH=100; yes | awk -v rlength="${READLENGTH}" 'BEGIN{base[0]="A";base[1]="T";base[2]="G";base[3]="C"} NR%4==1{print "@read." int(NR/4)+1} NR%4==2{seq=""; for(i=1;i<=rlength;i++){seq = seq sprintf("%c", base[int(rand()*100%4)])}; print seq} NR%4==3{print "+"} NR%4==0{ql=""; for(i=1;i<=rlength;i++){ql = ql sprintf("%c", substr(rand(),3)%93+33)}; print ql}' | awk 'NR%4!=0{printf $0"\t"} NR%4==0{print $0}' | while read -r seq; do bwa mem genome.fa <(echo "$seq" | tr '\t' '\n') | samtools view -b - > $(echo "$seq" | awk '{sub("@","");print $1}').bam; samtools cat *bam | samtools sort | samtools depth -a - | awk '{sum+=$3} END{print sum/NR}'; done