DNA

DNA (deoxyribonucleic acid) is the molecule that carries the genetic instructions for the development, functioning, growth, and reproduction of all known organisms and many viruses. Often called the "blueprint of life," DNA contains the information needed to build and maintain an organism and is passed from parents to offspring.

Structure

The Double Helix

The double-helix structure of DNA was proposed by James Watson and Francis Crick in 1953, building on Rosalind Franklin's X-ray crystallography work. Its main features:

  • Two polynucleotide strands wound around each other
  • Right-handed helix (B-form DNA)
  • Diameter: approximately 2 nanometers
  • One complete turn: approximately 3.4 nanometers (about 10 base pairs; closer to 10.5 in solution)
  • Strands are antiparallel (run in opposite directions)
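
The dimensions above imply a rise of about 0.34 nm per base pair (3.4 nm per turn divided by 10 base pairs). A minimal sketch of that arithmetic, assuming idealized B-form geometry; the function and constant names are illustrative only:

```python
# Idealized B-form geometry, taken from the figures listed above.
NM_PER_TURN = 3.4   # helix length per complete turn, in nanometers
BP_PER_TURN = 10    # base pairs per complete turn

def helix_length_nm(base_pairs: int) -> float:
    """Return the approximate end-to-end length of a B-form helix in nm."""
    rise_per_bp = NM_PER_TURN / BP_PER_TURN   # about 0.34 nm per base pair
    return base_pairs * rise_per_bp

# A genome of ~3 billion base pairs stretched end to end:
length_m = helix_length_nm(3_000_000_000) * 1e-9   # convert nm to meters
print(f"{length_m:.2f} m")
```

Under these assumptions, about 3 billion base pairs span roughly one meter, which is why the packaging described under Organization is needed.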

Nucleotides

DNA is built from four types of nucleotides, each containing:

  1. Phosphate group: Links the sugars into a backbone and gives it a negative charge
  2. Deoxyribose sugar: Five-carbon sugar
  3. Nitrogenous base: One of four types

The Four Bases

| Base | Type | Pairs With | Rings |
| --- | --- | --- | --- |
| Adenine (A) | Purine | Thymine | 2 |
| Guanine (G) | Purine | Cytosine | 2 |
| Thymine (T) | Pyrimidine | Adenine | 1 |
| Cytosine (C) | Pyrimidine | Guanine | 1 |

Base Pairing

Complementary (Watson-Crick) base pairing:

  • A-T: Two hydrogen bonds
  • G-C: Three hydrogen bonds
  • This pairing ensures accurate replication
  • Chargaff's rules: in double-stranded DNA, the amount of A equals T and the amount of G equals C
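
The pairing rules can be expressed as a small lookup table. The sketch below is illustrative (the function names are not from any particular library): it builds the partner strand, counts hydrogen bonds, and checks that base counts across both strands are balanced.

```python
# Watson-Crick partners and the hydrogen bonds each base forms with its partner.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}

def reverse_complement(strand: str) -> str:
    """Return the partner strand, written 5' to 3' (the strands are antiparallel)."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))

def hydrogen_bonds(strand: str) -> int:
    """Count the hydrogen bonds holding a strand to its complement."""
    return sum(H_BONDS[base] for base in strand)

top = "ATGCGC"
bottom = reverse_complement(top)
duplex = top + bottom

print(bottom)                 # GCGCAT
print(hydrogen_bonds(top))    # 16
# Across both strands, A matches T and G matches C in number.
print(duplex.count("A") == duplex.count("T"), duplex.count("G") == duplex.count("C"))
```

Because G-C pairs contribute three bonds rather than two, GC-rich sequences are held together by more hydrogen bonds per base pair.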

Major and Minor Grooves

The double helix has two grooves:

  • Major groove: Wider, more accessible
  • Minor groove: Narrower
  • Proteins bind to grooves to read DNA sequence

Organization

In Prokaryotes

Bacterial DNA organization:

  • Single circular chromosome
  • Located in nucleoid region (no membrane)
  • Plasmids: Additional small circular DNA
  • Supercoiled to fit in cell

In Eukaryotes

Complex packaging hierarchy:

  1. Nucleosomes: DNA wrapped around histone proteins
  2. Chromatin fiber: String of nucleosomes
  3. Looped domains: 30 nm fiber loops
  4. Chromosomes: Condensed during cell division

Chromosomes

Human genome organization:

  • 23 pairs of chromosomes (46 total)
  • ~3 billion base pairs
  • ~20,000 protein-coding genes
  • 98% non-coding DNA

Telomeres and Centromeres

Telomeres:

  • Protective caps at chromosome ends
  • TTAGGG repeats in humans
  • Shorten with each cell division
  • Associated with aging

Centromeres:

  • Attachment point for spindle fibers
  • Essential for chromosome separation
  • Highly repetitive sequences

DNA Replication

Overview

DNA replication is semi-conservative:

  • Each strand serves as template
  • Produces two identical DNA molecules
  • Each new molecule has one old and one new strand
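
A toy model can make the semi-conservative pattern concrete. In this sketch a duplex is a pair of strings, and `replicate` is a hypothetical helper that keeps each old strand and builds a new partner for it; real replication involves the enzymes described below.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand)

def replicate(duplex: tuple[str, str]) -> list[tuple[str, str]]:
    """Split a duplex and pair each old strand with a newly built partner."""
    old_top, old_bottom = duplex
    new_bottom = complement(old_top)      # synthesized against the old top strand
    new_top = complement(old_bottom)      # synthesized against the old bottom strand
    return [(old_top, new_bottom), (new_top, old_bottom)]

parent = ("ATGCCG", "TACGGC")             # bottom strand shown 3' to 5'
daughters = replicate(parent)
print(daughters)
```

Both daughter duplexes carry the same sequence as the parent, and each contains exactly one of the parent's original strands.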

The Replication Process

Initiation

  1. Origins of replication identified
  2. Helicase unwinds the double helix
  3. Single-strand binding proteins stabilize the separated strands
  4. Topoisomerase relieves the torsional strain created by unwinding

Elongation

Key enzymes:

| Enzyme | Function |
| --- | --- |
| Primase | Synthesizes RNA primer |
| DNA Polymerase III | Main replication enzyme |
| DNA Polymerase I | Removes primers, fills gaps |
| Ligase | Joins Okazaki fragments |

Leading strand vs. lagging strand:

  • Leading strand: Continuous synthesis (5' to 3') toward the replication fork
  • Lagging strand: Discontinuous synthesis in short Okazaki fragments, each also made 5' to 3'
  • Both strands synthesized simultaneously

Termination

  • Replication forks meet
  • RNA primers removed
  • Gaps filled and ligated
  • Proofreading and repair

Accuracy

DNA replication is highly accurate:

  • Error rate: ~1 per 10⁹-10¹⁰ base pairs
  • 3' to 5' exonuclease proofreading
  • Mismatch repair systems
  • Essential for genetic stability
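
To put the error rate in perspective, the expected number of uncorrected errors per round of replication is simply genome size times error rate. A rough sketch, assuming a genome of about 3 billion base pairs and a uniform per-base error rate (both simplifications):

```python
def expected_errors(genome_bp: float, error_rate: float) -> float:
    """Expected number of uncorrected errors in one round of replication."""
    return genome_bp * error_rate

GENOME_BP = 3e9   # approximate human genome size in base pairs (assumed)

for rate in (1e-9, 1e-10):
    errors = expected_errors(GENOME_BP, rate)
    print(f"rate {rate:.0e}: about {errors:.1f} errors per genome copy")
```

At these rates, copying billions of base pairs introduces only a handful of errors at most.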

Gene Expression

The Central Dogma

Information flow in biology:

DNA → RNA → Protein

Transcription

DNA to RNA:

  1. Initiation: RNA polymerase binds promoter
  2. Elongation: RNA synthesized 5' to 3'
  3. Termination: Polymerase releases the completed transcript

In eukaryotes, additional processing:

  • 5' cap added
  • Introns spliced out
  • 3' poly-A tail added
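
The steps above can be sketched with strings. This is a simplified illustration: `transcribe` and `splice` are hypothetical helpers, the intron coordinates are made up, and capping and polyadenylation are not modeled.

```python
# RNA partner for each DNA template base (uracil replaces thymine in RNA).
COMPLEMENT_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_3_to_5: str) -> str:
    """Build the RNA 5' to 3' by pairing each template base with its RNA partner."""
    return "".join(COMPLEMENT_RNA[base] for base in template_3_to_5)

def splice(pre_mrna: str, introns: list[tuple[int, int]]) -> str:
    """Remove introns given as (start, end) index pairs, end exclusive."""
    exons, position = [], 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])
        position = end
    exons.append(pre_mrna[position:])
    return "".join(exons)

template = "TACCTCGGGATT"            # template strand, written 3' to 5'
pre_mrna = transcribe(template)
mrna = splice(pre_mrna, [(6, 9)])    # hypothetical intron at indices 6 to 8

print(pre_mrna)   # AUGGAGCCCUAA
print(mrna)       # AUGGAGUAA
```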

Translation

RNA to protein:

  • mRNA carries genetic code
  • tRNA brings amino acids
  • Ribosomes catalyze protein synthesis
  • Genetic code: 64 codons, of which 61 specify the 20 amino acids and 3 are stop signals
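
The standard genetic code fits in a 64-entry lookup table. The sketch below builds that table compactly and translates a short mRNA; `translate` is an illustrative helper that starts at the first AUG and ignores details such as alternative start codons.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code: codon -> one-letter amino acid; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

stops = sorted(codon for codon, aa in CODON_TABLE.items() if aa == "*")
print(translate("AUGGAGUAA"))     # ME
print(len(CODON_TABLE), stops)    # 64 ['UAA', 'UAG', 'UGA']
```

Because 61 codons encode only 20 amino acids, most amino acids are specified by more than one codon, which is what makes silent mutations possible.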

Regulation

Gene expression control:

  • Transcription factors: Activate/repress genes
  • Enhancers/silencers: Regulatory DNA sequences
  • Epigenetics: DNA methylation, histone modifications
  • Post-transcriptional: RNA processing, stability

Mutations and Repair

Types of Mutations

Point Mutations

| Type | Effect | Example |
| --- | --- | --- |
| Silent | No amino acid change | AAA→AAG (both = Lys) |
| Missense | Different amino acid | GAG→GTG (Glu→Val) |
| Nonsense | Premature stop codon | TAT→TAA (Tyr→Stop) |
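
The three categories can be distinguished by comparing the amino acid encoded before and after the change. A minimal sketch using the standard genetic code over DNA codons; `classify_point_mutation` is an illustrative name, and it assumes the original codon is not itself a stop codon.

```python
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code over DNA codons; '*' marks a stop codon.
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def classify_point_mutation(original: str, mutated: str) -> str:
    """Label a single-codon substitution as silent, missense, or nonsense."""
    before, after = CODON_TABLE[original], CODON_TABLE[mutated]
    if before == after:
        return "silent"
    if after == "*":
        return "nonsense"
    return "missense"

# The three examples from the table above.
for original, mutated in [("AAA", "AAG"), ("GAG", "GTG"), ("TAT", "TAA")]:
    print(original, "->", mutated, classify_point_mutation(original, mutated))
```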

Larger Mutations

  • Insertions: Addition of nucleotides
  • Deletions: Removal of nucleotides
  • Frameshift: Insertions or deletions whose length is not a multiple of 3, shifting the reading frame
  • Duplications: Segment repeated
  • Inversions: Segment reversed
  • Translocations: Movement between chromosomes
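
The effect of a frameshift is easiest to see by splitting a sequence into codons before and after an insertion. The example sequence and helper names below are made up for illustration:

```python
def is_frameshift(indel_length: int) -> bool:
    """An insertion or deletion shifts the frame unless its length is a multiple of 3."""
    return indel_length % 3 != 0

def codons(sequence: str) -> list[str]:
    """Split a sequence into complete codons, reading from position 0."""
    return [sequence[i:i + 3] for i in range(0, len(sequence) - 2, 3)]

gene = "ATGGAGCCCTAA"
one_base_insertion = gene[:3] + "A" + gene[3:]        # 1 nucleotide added after ATG
three_base_insertion = gene[:3] + "AAA" + gene[3:]    # 3 nucleotides added after ATG

print(codons(gene))                   # ['ATG', 'GAG', 'CCC', 'TAA']
print(codons(one_base_insertion))     # ['ATG', 'AGA', 'GCC', 'CTA']
print(codons(three_base_insertion))   # ['ATG', 'AAA', 'GAG', 'CCC', 'TAA']
```

The single-base insertion changes every downstream codon and loses the stop codon, while the three-base insertion adds one codon and leaves the rest intact.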

Causes of Mutations

  • Spontaneous: Replication errors, tautomeric shifts
  • Induced: Radiation, chemicals, viruses
  • Environmental: UV light, carcinogens

DNA Repair Mechanisms

| Mechanism | Damage Type | Process |
| --- | --- | --- |
| Direct repair | UV damage | Photolyase reverses |
| Base excision repair | Single base damage | Remove and replace |
| Nucleotide excision repair | Bulky lesions | Remove segment |
| Mismatch repair | Replication errors | Correct mismatches |
| Double-strand break repair | Breaks | NHEJ or homologous recombination |

Genetic Variation

Sources of Variation

  • Mutation: Ultimate source of new alleles
  • Recombination: New combinations during meiosis
  • Sexual reproduction: Combines parental genomes

Single Nucleotide Polymorphisms (SNPs)

  • Single base differences between individuals
  • ~4-5 million SNPs per person
  • Used in genetic studies
  • Some associated with disease risk

Structural Variation

  • Copy number variants (CNVs)
  • Insertions and deletions
  • Inversions
  • Translocations

DNA Technology

Polymerase Chain Reaction (PCR)

Amplifies specific DNA sequences:

  1. Denaturation: Heat separates strands (95°C)
  2. Annealing: Primers bind (50-65°C)
  3. Extension: Polymerase synthesizes (72°C)
  4. Repeat 25-35 cycles
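
In the ideal case each cycle doubles the number of target molecules, so the yield grows exponentially with cycle number. A minimal sketch of that growth; the `efficiency` parameter and the 90% value are assumptions added for illustration, since real reactions fall short of perfect doubling and eventually plateau.

```python
def pcr_copies(initial_copies: int, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds; efficiency 1.0 means perfect doubling."""
    return initial_copies * (1 + efficiency) ** cycles

for cycles in (25, 30, 35):
    print(cycles, f"{pcr_copies(1, cycles):.2e}")

# Hypothetical 90% per-cycle efficiency, for comparison.
print(f"{pcr_copies(1, 30, efficiency=0.9):.2e}")
```

Starting from a single template, 30 ideal cycles give 2^30, or roughly a billion, copies.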

Applications:

  • Forensics
  • Diagnosis
  • Cloning
  • Research

DNA Sequencing

Sanger Sequencing

  • Chain termination method
  • Uses dideoxynucleotides
  • Gold standard for accuracy
  • Limited throughput

Next-Generation Sequencing

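| Platform | Read Length | Output | Application |
| --- | --- | --- | --- |
| Illumina | 150-300 bp | High | General purpose |
| PacBio | 10-25 kb | Medium | Long reads |
| Oxford Nanopore | No fixed limit (set by fragment length) | Variable | Portable, real-time |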

CRISPR-Cas9

Revolutionary gene editing:

  • Cas9 protein cuts DNA at specific location
  • Guide RNA directs to target
  • Enables precise editing
  • Applications: Disease treatment, research, agriculture
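
Target selection can be sketched as a pattern search. The example below assumes the commonly used Streptococcus pyogenes Cas9, which requires a 20-nucleotide target immediately followed by an NGG motif (the PAM); these details are not stated above, and the sequence and function name are made up for illustration.

```python
import re

def find_cas9_targets(dna: str, guide_length: int = 20) -> list[tuple[int, str, str]]:
    """Return (start, protospacer, PAM) for every site followed by an NGG PAM.

    Only the given strand is scanned; a fuller search would also scan the
    reverse complement.
    """
    # A lookahead lets overlapping candidate sites all be reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_length}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(dna)]

sequence = "TTGACCTGAAGCTAGCTAGGATCCATGGCATTAGC"   # made-up example sequence
targets = find_cas9_targets(sequence)
for start, protospacer, pam in targets:
    print(start, protospacer, pam)
```

The guide RNA would be designed to match the reported protospacer. Practical guide design also screens for off-target matches elsewhere in the genome, which this sketch does not attempt.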

DNA Fingerprinting

Individual identification:

  • Analyzes variable regions (STRs)
  • Pattern is effectively unique to each person (identical twins excepted)
  • Used in forensics, paternity testing
  • Probability of random match: ~1 in billions
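
The very small match probability comes from multiplying genotype frequencies across many loci. A rough sketch, assuming the loci are inherited independently; the number of loci and the frequencies are hypothetical values chosen for illustration, not real population data.

```python
from math import prod

def random_match_probability(genotype_frequencies: list[float]) -> float:
    """Multiply per-locus genotype frequencies, assuming loci are independent."""
    return prod(genotype_frequencies)

# Hypothetical genotype frequencies at 13 STR loci (illustrative values only).
frequencies = [0.10, 0.08, 0.12, 0.05, 0.09, 0.11, 0.07,
               0.10, 0.06, 0.08, 0.12, 0.09, 0.10]

rmp = random_match_probability(frequencies)
print(f"about 1 in {1 / rmp:.2e}")
```

Even though each individual genotype is fairly common, the combined profile across all loci becomes extremely rare.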

The Human Genome

Human Genome Project (1990-2003)

Landmark achievement:

  • Sequenced nearly all of the human genome (the remaining gaps were closed in 2022)
  • International collaboration
  • Cost: ~$3 billion
  • Revealed ~20,000 genes

Genome Composition

| Component | Percentage | Description |
| --- | --- | --- |
| Protein-coding | ~1.5% | Genes |
| Regulatory | ~8% | Controls gene expression |
| Introns | ~25% | Non-coding gene portions |
| Repetitive | ~45% | Transposons, satellites |
| Other non-coding | ~20% | Various functions |

Personal Genomics

Modern DNA analysis:

  • Direct-to-consumer testing
  • Ancestry information
  • Health risk assessment
  • Pharmacogenomics
  • Ethical considerations

Evolution and DNA

Molecular Evolution

DNA provides evolutionary evidence:

  • Sequence comparison reveals relationships
  • Molecular clock estimates divergence times
  • Common genetic code supports common ancestry
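
Sequence comparison and the molecular clock both reduce to simple calculations once sequences are aligned. The sketch below uses made-up sequences from hypothetical species and a strict-clock formula (time = distance / (2 × rate)); real analyses use proper alignment tools and substitution models.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of matching positions between two pre-aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100 * matches / len(seq_a)

def divergence_time(differences_per_site: float, rate_per_site_per_year: float) -> float:
    """Strict molecular clock: both lineages accumulate changes, hence the factor of 2."""
    return differences_per_site / (2 * rate_per_site_per_year)

# Made-up aligned sequences from three hypothetical species.
species_a = "ATGGAGCCCTAAGCTT"
species_b = "ATGGAGCCGTAAGCTT"
species_c = "ATGCAGCAGTAAGTTT"

print(percent_identity(species_a, species_b))   # 93.75
print(percent_identity(species_a, species_c))   # 75.0
```

Higher identity suggests a more recent common ancestor: here species A would be placed closer to B than to C.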

Comparative Genomics

| Organism | Genes Shared with Humans |
| --- | --- |
| Chimpanzee | ~99% |
| Mouse | ~85% |
| Fruit fly | ~60% |
| Yeast | ~30% |
| Bacteria | Core metabolic genes |
