

Priyanka Rajan
February 11, 2025

 Analysis_of_promoter_proximal_pausing_in_the_human_genome.pdf


Transcript

  1. ANALYSIS OF PROMOTER-PROXIMAL PAUSING IN THE HUMAN GENOME

    Presented by: Priyanka Rajan, final-year B.Tech Biotechnology
    Under the supervision of: Prof. V. Ramamurthy, Dept. of Biotechnology, PSG Tech
  2. OBJECTIVE

    • To examine the transcription pattern at gene promoter sequences for signs of RNA polymerase II (Pol II) pausing.
    • To identify and count R-loops and tandem repeats in the promoter sequences of 50 genes sampled randomly from each chromosome.
    • To test for a correlation between the presence of R-loops and tandem repeats and the occurrence of pausing.
  3. METHODOLOGY

    Creation of index files:
    • Human genome reference build used: GRCh38.p2, annotation release 105.
    • The assembly was accessed using NCBI Map Viewer (Entrez Genome Viewer).
    • The gene list of each chromosome was collected and sorted by direction of transcription (+ / - orientation). Genes in the + orientation were chosen for study.
    • Using MS Excel functions, a random number was assigned to each gene and the genes were sorted in ascending order of these numbers (Fig. 1).
    • The first 50 genes in the sorted list were selected for study, and the first 1000 nucleotides of each gene were collected.
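The random selection described above was done in MS Excel. As a rough equivalent, a minimal Python sketch is given here; the function names, the `(name, strand)` input layout and the `seed` parameter are assumptions made for illustration and are not part of the original workflow.

```python
import random


def sample_plus_strand_genes(genes, n=50, seed=None):
    """Pick n genes at random from the + strand.

    genes: list of (gene_name, strand) tuples, strand being "+" or "-".
    Mirrors the spreadsheet approach: give every gene a random number,
    sort in ascending order of that number, keep the first n.
    """
    rng = random.Random(seed)
    plus_genes = [name for name, strand in genes if strand == "+"]
    keyed = [(rng.random(), name) for name in plus_genes]
    keyed.sort()  # ascending order of the random numbers
    return [name for _, name in keyed[:n]]


def first_nucleotides(sequence, length=1000):
    """Return the first `length` nucleotides of a gene sequence."""
    return sequence[:length]
```

Passing a fixed `seed` makes the selection reproducible, which a spreadsheet random function does not offer by default.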
  4. Query files:

    • Global run-on sequencing (GRO-seq) files submitted by L. J. Core were downloaded.
    • Three query files in total were aligned with the index.
    Fig. 2. Hierarchy of the files submitted by Core et al. under GSE13518:
      GSE13518
        SRX003135: srr23.fasta, srr2425.fasta
        SRX003136: srr2627.fasta
  5. Alignment using Bowtie2

    • The set of 50 sequences from each chromosome was built into an index. The indexes were named chr1, chr2, ..., chrY.
    • Each of the 3 query files was aligned against each index. Alignment results were obtained in SAM format.
    • This gave 3 SAM files for each chromosome.
    Quantification of read density using Integrative Genomics Viewer (IGV)
    • IGV supports the BAM format, so SAM files were converted to BAM using samtools.
    • The two query files srr23.fasta and srr2425.fasta were merged into a single file representing the library SRX003135.
    • The files were sorted and an index file was created for each BAM file.
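The alignment and conversion steps above could be scripted roughly as follows. This is only a sketch: the exact commands and options used in the study are not given in the slides, the file name `chr1_promoters.fa` and the output naming scheme are hypothetical, and each BAM file is sorted before merging here because `samtools merge` expects sorted inputs. The function only builds the command lists; it does not run them.

```python
def build_commands(chrom, index_fasta, library_reads):
    """Build bowtie2/samtools command lines for one chromosome index.

    chrom:         index base name, e.g. "chr1"
    index_fasta:   FASTA file with the 50 promoter sequences (hypothetical name)
    library_reads: dict mapping a library ID to its FASTA read files
    """
    cmds = [["bowtie2-build", index_fasta, chrom]]
    for library, read_files in library_reads.items():
        sorted_bams = []
        for reads in read_files:
            stem = f"{chrom}_{reads.rsplit('.', 1)[0]}"
            # -f: reads are FASTA; -x: index base name; -U: unpaired reads; -S: SAM output
            cmds.append(["bowtie2", "-f", "-x", chrom, "-U", reads, "-S", f"{stem}.sam"])
            # SAM -> BAM, then coordinate sort
            cmds.append(["samtools", "view", "-b", "-o", f"{stem}.bam", f"{stem}.sam"])
            cmds.append(["samtools", "sort", "-o", f"{stem}.sorted.bam", f"{stem}.bam"])
            sorted_bams.append(f"{stem}.sorted.bam")
        if len(sorted_bams) > 1:
            # several runs belong to one library: merge them into a single BAM
            final_bam = f"{chrom}_{library}.bam"
            cmds.append(["samtools", "merge", final_bam, *sorted_bams])
        else:
            final_bam = sorted_bams[0]
        # IGV needs a .bai index next to each BAM file
        cmds.append(["samtools", "index", final_bam])
    return cmds


# To execute, each command could be passed to subprocess.run(cmd, check=True).
```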
  6. Classification of genes according to read density

    • Each gene was characterized under 3 heads: transcription status, bidirectional transcription, and consistency.
    • Criteria for transcription status:
      Transcriptionally elongated: six or more reads in each query file, with a consistent read pattern in both query files.
      Transcriptionally paused: 5X more reads than the count downstream.
      Transcriptionally silent: fewer than five reads in both query files.
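The criteria above were applied by inspecting read density in IGV. One way they might be encoded is sketched below. Several points are assumptions rather than the study's stated rules: each query file is summarized by a promoter-proximal count and a downstream count, the read thresholds are applied to their sum, "5X more" is read as "at least five times the downstream count", the checks run in the order silent, paused, elongated, and a gene whose calls differ between query files is labelled inconsistent.

```python
def classify_library(proximal, downstream):
    """Classify one gene in one query file from two read counts (assumed inputs)."""
    total = proximal + downstream
    if total < 5:
        return "silent"          # fewer than five reads
    if proximal >= 5 * downstream:
        return "paused"          # proximal peak at least 5X the downstream count
    if total >= 6:
        return "elongated"       # six or more reads, no proximal peak
    return "unclassified"        # exactly five reads: not covered by the stated criteria


def classify_gene(counts_by_library):
    """counts_by_library: dict mapping library ID -> (proximal, downstream)."""
    calls = {classify_library(p, d) for p, d in counts_by_library.values()}
    # the same call must appear in every query file, otherwise flag the gene
    return calls.pop() if len(calls) == 1 else "inconsistent"
```

For example, `classify_gene({"SRX003135": (30, 4), "SRX003136": (25, 5)})` returns `"paused"`, since the proximal count is at least five times the downstream count in both libraries.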
  7. Fig. 12. Pattern seen after +500 has been added

    This shows pausing in both directions.
  8. Consolidation of results

    • Total number of genes analysed in each chromosome = 50.
    • Genes classed as inconsistent, elongated in - only, or paused in - only were deducted from the total to give the number of genes to be analysed.
    • Number of genes to be analysed = (elongated in +) + (paused in +) + (elongated in both directions) + (paused in both directions) + (no transcription).
    • Number transcribed = (number of genes to be analysed) - (no transcription).
    • From the number transcribed, the number elongated and the number paused were calculated.
    • From the number elongated, the number elongated in + only was calculated; likewise for paused in +.
    • Number of tandem repeats (TR) and R-loops found.
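The bookkeeping above can be written out as a small function. This is a sketch under stated assumptions: the parameter names are invented for illustration, and "number elongated" and "number paused" are taken to be the + only count plus the both-directions count, which the slides imply but do not state outright.

```python
def consolidate(total, inconsistent, minus_elongated, minus_paused,
                plus_elongated, plus_paused, both_elongated, both_paused,
                no_transcription):
    """Per-chromosome tallies following the consolidation steps."""
    # genes left after deducting inconsistent, - elongated and - paused
    analysed = total - inconsistent - minus_elongated - minus_paused
    by_category = (plus_elongated + plus_paused + both_elongated
                   + both_paused + no_transcription)
    if analysed != by_category:
        raise ValueError("category counts do not add up to the genes analysed")
    transcribed = analysed - no_transcription
    return {
        "analysed": analysed,
        "transcribed": transcribed,
        "elongated": plus_elongated + both_elongated,  # assumption: + only and both
        "paused": plus_paused + both_paused,           # assumption: + only and both
        "elongated_plus_only": plus_elongated,
        "paused_plus_only": plus_paused,
    }
```

With hypothetical counts of 50 genes, 5 inconsistent, 3 elongated in - and 2 paused in -, 40 genes remain to be analysed; if 25 of those show no transcription, 15 count as transcribed.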
  9. Table 1. Average number of genes elongated in + and paused in +

    The average number of elongated in + and paused in + genes found with R-loops and tandem repeats (TR) is also given.

                            Average number of genes     Average number of genes
                            elongated in + direction    paused in + direction
    Overall                 9.25                        1.25
    With R-loop             4.29                        0.7
    With tandem repeats     1.1                         2