Mutational Analysis of Cancer using BioInformatics

Mayank Khanna
May 01, 2009

Transcript

  1. What is Bioinformatics? Bioinformatics is the application of Information Technology to the field of Molecular Biology. The term “Bioinformatics” was coined by Paulien Hogeweg and Ben Hesper in 1970. It involves the creation and development of Databases, Algorithms, Computational & Statistical Techniques and theory to solve formal and practical problems arising from the analysis of Biological data.
  2. Origin of Bioinformatics: Research on the Human Genome sequence generated records covering around 20,500 human genes and high-resolution maps of Chromosomes, including billions of base pairs of DNA-sequence information. Laboratory information & Database management systems and Graphical User Interfaces were the computing tools required to help researchers decipher this pool of data. These computing challenges were taken up by a new field of research, Bioinformatics. Researchers have developed Public Databases connected to the internet to make genome data available worldwide, along with analytical software to facilitate research on this data. Scientists now have a more detailed blueprint of the Human Genetic code.
  3. What is a Sequence Database? A Database is usually a structured set of data stored in a computer. In the field of Bioinformatics, a Sequence Database is a large collection of DNA, Protein or other sequences from several organisms. Sequence Databases are of two types: Primary Database and Secondary Database.
  4. Primary Database: It is the database containing the primary sequence data, compiled from gene sequences that researchers have determined over many years. Primary Databases are further classified as:
    Nucleic Acid Database: GenBank, EMBL, DDBJ
    Protein Database: SWISS-PROT, TrEMBL
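  5. Secondary Database: Secondary Database is the database of sequence information derived from the data in Primary databases. Examples of Secondary databases, each with the Primary Database it is derived from, are as follows:
    Secondary Database: PROSITE, Primary Database: SWISS-PROT
    Secondary Database: PRINTS, Primary Database: OWL
    Secondary Database: BLOCKS, Primary Database: PROSITE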
  6. GenBank: The GenBank Primary Sequence Database is an open-access, annotated collection of Nucleotide Sequences & their protein translations. It is maintained by NCBI, USA as a part of the INSDC. It is built from direct submissions by individual laboratories, as well as bulk submissions from large-scale sequencing centers. Each sequence is assigned a unique accession number by the GenBank staff.
  7. By August 2006, GenBank comprised over 65 billion Nucleotide bases in more than 61 million sequences. [Screenshots: WWW.ncbi.nlm.nih.gov/, Searched Sequence, Sequence in Fasta Format]
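The FASTA format named in the screenshot captions above stores each record as a header line starting with `>` followed by one or more lines of sequence. As a minimal sketch (the record names and sequences below are made up for illustration and are not taken from GenBank), such text could be read in Python like this:

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a dict mapping header -> sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            header = line[1:]  # drop the leading '>'
            records[header] = []
        elif header is not None:
            records[header].append(line)  # sequence may span several lines
    return {name: "".join(parts) for name, parts in records.items()}


# Hypothetical two-record example, not real database entries.
example = """>seq1 hypothetical example record
ATGGCC
TTAGGA
>seq2 another made-up record
GGGTTTAAA
"""

records = parse_fasta(example)
for name, sequence in records.items():
    print(name, len(sequence))
```

Real GenBank downloads carry an accession number in the header line; this sketch keeps the whole header as the key rather than guessing at its layout.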
  8. BLAST: BLAST or Basic Local Alignment Search Tool is an algorithm for comparing primary biological sequence information, such as the amino-acid sequences of different proteins or the nucleotides of DNA sequences. It compares a query sequence with a database of sequences and identifies those which are similar to the query sequence. It was developed by Eugene Myers, Stephen Altschul, Warren Gish, David J. Lipman and Webb Miller at the NIH. [Screenshots: WWW.ncbi.nlm.nih.gov/BLAST, Output of BLAST]
  9. Phylogram & Cladogram: Phylogenetic trees are used to show evolutionary relationships, wherein nodes represent organisms and links show lines of descent. Each tree is binary, so the evolution of species is represented as a series of bifurcations; such a tree is also termed a Cladogram. A cladogram which conveys a sense of evolutionary time using branch lengths is called a Phylogram. [Figure: example tree with leaves labelled H, C, G, O]
  10. What is Sequence Alignment? In Bioinformatics, Sequence Alignment is a method of arranging sequences of DNA, RNA or Protein to identify regions of similarity that may be a consequence of functional, structural or evolutionary relationships between the sequences. Aligned sequences are represented as rows within a matrix. Gaps are inserted between the residues so that identical or similar characters are aligned in successive columns. The two types of Alignment are: Local Alignment and Global Alignment.
  11. Types Of Alignment: Local Alignment: An alignment that covers only the part of the sequences where they are similar, rather than their whole length. It is done using the Smith-Waterman algorithm. Global Alignment: An alignment that spans the maximum possible length of the sequences. It is done using the Needleman-Wunsch algorithm.
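The difference between the two alignment types can be seen in how their dynamic programming scores are filled in. The sketch below computes only the optimal score (not the alignment itself) and assumes a simple scoring scheme of +1 for a match, -1 for a mismatch and -1 per gap; these values are illustrative assumptions, not parameters given in the slides.

```python
def needleman_wunsch_score(a, b, match=1, mismatch=-1, gap=-1):
    """Global alignment score: both sequences are aligned end to end."""
    # First row: aligning a prefix of b against nothing costs one gap per residue.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # first column: prefix of a against nothing
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr.append(max(diag, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


def smith_waterman_score(a, b, match=1, mismatch=-1, gap=-1):
    """Local alignment score: best-scoring pair of substrings."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Scores never drop below zero, so an alignment can restart anywhere.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best


# Two made-up sequences that share only the middle region GATTACA.
print(needleman_wunsch_score("TTTGATTACATTT", "CCCGATTACACCC"))
print(smith_waterman_score("TTTGATTACATTT", "CCCGATTACACCC"))
```

The only structural differences are the zero floor and taking the best cell anywhere in the matrix, which is what lets the local score ignore the dissimilar flanks.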
  12. Pairwise Sequence Alignment: Pairwise Sequence Alignment is a method to find the best-matching piecewise alignments of two query sequences. It can only be applied to two sequences at a time and is typically used when extreme precision is not required. The three primary methods of producing pairwise alignments are the Dot-matrix method, Dynamic programming and Word methods. Software used for Pairwise Alignment: EMBOSS
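Of the three methods listed above, the Dot-matrix method is the simplest to illustrate: one sequence is written along the rows, the other along the columns, and a dot is placed wherever the two residues are identical, so that similar regions show up as diagonal runs. A minimal sketch, using made-up sequences and no window filtering:

```python
def dot_matrix(a, b):
    """Return a grid where cell [i][j] is True when a[i] equals b[j]."""
    return [[a_char == b_char for b_char in b] for a_char in a]


def render(a, b):
    """Draw the dot matrix as text: '*' for a match, '.' otherwise."""
    lines = ["  " + " ".join(b)]
    for a_char, row in zip(a, dot_matrix(a, b)):
        lines.append(a_char + " " + " ".join("*" if hit else "." for hit in row))
    return "\n".join(lines)


# Hypothetical sequences; a shared stretch appears as a diagonal of '*'.
print(render("GATTACA", "GATCACA"))
```

Practical dot-plot tools usually compare short windows rather than single residues to suppress random matches; that refinement is left out here.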
  13. EMBOSS: EMBOSS or European Molecular Biology Open Software Suite is a free, open-source software analysis package specially developed for the needs of Molecular biology. It has a varied set of uses: for example, it can merge two sequences to make a consensus and can find the differences between two sequences. It integrates a range of currently available packages and tools for sequence analysis. [Screenshots: EMBOSS Homepage, EMBOSS Output (NEEDLE), EMBOSS Output (WATER)]
  14. Multiple Sequence Alignment: Multiple Sequence Alignment is an extension of Pairwise alignment to incorporate more than two sequences at a time. It aligns all the sequences in a given query set. It is used to identify conserved sequence regions across a group of sequences which may be evolutionarily related. Software to implement Multiple Alignment: ClustalW
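Once a multiple alignment exists, finding conserved regions amounts to scanning its columns. The sketch below assumes the sequences have already been aligned by some tool (so they all have the same length, with `-` marking gaps) and reports the columns where every sequence carries the same residue. The three aligned rows are invented for illustration.

```python
def conserved_columns(aligned):
    """Return 0-based indices of columns that are identical and gap-free."""
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [
        index
        for index, column in enumerate(zip(*aligned))
        if len(set(column)) == 1 and column[0] != "-"
    ]


# Hypothetical pre-aligned sequences ('-' is a gap).
alignment = [
    "ACG-T",
    "ACGAT",
    "TCGAT",
]
print(conserved_columns(alignment))
```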
  15. ClustalW: Clustal is a widely used Multiple Sequence Alignment computer program. Its two main variations are: ClustalW, which has a Command Line Interface, and ClustalX, which has a Graphical User Interface. [Screenshots: ClustalW Homepage, ClustalW Output]
  16. Application Of Bioinformatics: Recent development and advancement in the field of Bioinformatics has led to its application in various fields:
    Gene Therapy
    Molecular Medicine
    Microbial Genome Application
    Climate change Studies
    Forensic Analysis of Microbes
    Bio-Weapon creation
    Crop Improvement & Insect Resistance
    It has found immense use in medicine, wherein it is being used in Mutational Analysis of Cancer.
  17. Genetics & Cancer: As cancer develops, the diseased cells undergo a series of genetic changes that drastically alter their metabolism. Many genes are lost and others take on new roles in promoting tumor growth, invasion, etc. Genetic variation between tumors of the same type in different individuals may result in metabolic differences between one tissue and another that can affect disease progression or therapy response. The role of Bioinformatics in this process is to analyze sequence and molecular data in order to spot these differences. It is used to collect information on all human genes & proteins, including the history of evolutionary rearrangements & gene duplications that are a source of Genetic Variation.
  18. Analysis of Mutations in Cancer: Massive Sequencing efforts are used to identify previously unknown point mutations in a variety of genes. Bioinformaticians have been creating specialized automated systems to manage the sheer volume of sequence data produced, along with advanced algorithms and software to compare the sequencing results. Recent physical detection techniques such as “Single Nucleotide Polymorphism” arrays are employed to detect point mutations. The data generated per experiment runs to terabytes and often contains variability, noise or other disturbances, so the useful data has to be extracted using other special techniques.
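At its core, comparing sequencing results for point mutations means walking two aligned sequences position by position and reporting single-base substitutions. The following is a minimal sketch under strong simplifying assumptions: the reference and sample sequences are already aligned to equal length, gaps are marked with `-` and skipped, and no sequencing noise is modeled. The sequences are invented and do not come from any real tumor data.

```python
def point_mutations(reference, sample):
    """List (position, reference_base, sample_base) for each substitution.

    Positions are 1-based. Columns containing a gap ('-') are skipped,
    since they indicate insertions or deletions rather than point mutations.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (position, ref_base, alt_base)
        for position, (ref_base, alt_base) in enumerate(zip(reference, sample), start=1)
        if ref_base != alt_base and "-" not in (ref_base, alt_base)
    ]


# Hypothetical aligned reference and sample sequences.
for position, ref_base, alt_base in point_mutations("ACGTACGT", "ACGAACGG"):
    print(f"position {position}: {ref_base} -> {alt_base}")
```

Production pipelines work from many overlapping reads per position and use statistical models to separate true mutations from noise; this sketch only shows the comparison step.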
  19. Future Bioinformatics Tools & Resources: The above approaches in cancer research can only be realized to their fullest potential with continued development of Bioinformatics Tools and Resources. Future development includes efficient management and organization of biological and medical information into well-defined data objects. These technologies will greatly assist in the development of new Anti-Cancer drugs. The cancer Biomedical Informatics Grid is being developed to allow sharing of such data objects between cancer centers.
  20. Future Prospects of Bioinformatics: The value of the Bioinformatics market is expected to increase from $1.02 Billion in 2002 to $3.0 Billion in 2010 at an avg. annual growth rate of 15.8%. The fastest growing market would be Analysis Software & Services, estimated to grow from $444.7 Million in 2005 to $1.2 Billion in 2010. In Drug Discovery & Development, Bioinformatics is expected to reduce the annual cost of developing a new drug by 33% & the time for Drug Discovery by 30%.
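For readers who want to relate the endpoint figures above to a yearly rate, the compound annual growth rate implied by a start value, an end value and a number of years can be computed as follows. This is only an arithmetic sketch; the compound rate implied by the endpoints need not match a quoted average annual rate, depending on how the original forecast averaged its yearly figures.

```python
def compound_annual_growth_rate(start_value, end_value, years):
    """Yearly rate r such that start_value * (1 + r) ** years == end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Endpoint figures as quoted in the slide (overall market in $ Billion,
# Analysis Software & Services in $ Million).
overall = compound_annual_growth_rate(1.02, 3.0, 2010 - 2002)
software = compound_annual_growth_rate(444.7, 1200.0, 2010 - 2005)
print(f"overall market: {overall:.1%} per year")
print(f"analysis software & services: {software:.1%} per year")
```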
  21. In future, a major portion of the R&D expenditures of Pharmaceutical companies is expected to go into Bioinformatics. Growth of Bioinformatics