Pair-wise Sequence Alignment MCQs With Answers

This collection of pair-wise sequence alignment MCQs is tailored for M.Pharm students to strengthen both theoretical and practical understanding of sequence comparison used in drug discovery and molecular biology. The questions focus on core algorithms (Needleman–Wunsch, Smith–Waterman, Gotoh, Hirschberg), scoring systems (PAM, BLOSUM, log-odds), gap models including affine penalties, and statistical evaluation (E-value, bit score). Emphasis is placed on algorithmic complexity, traceback interpretation, alignment strategies for proteins versus nucleotides, and interpreting alignment outputs in a pharmacological research context. Use these targeted MCQs to evaluate, revise, and deepen your competence in pair-wise alignment concepts and applications.

Q1. In the Needleman–Wunsch global alignment algorithm, how should the first row and first column of the dynamic programming matrix be initialized for a linear gap penalty?

  • All zeros
  • With cumulative gap penalties proportional to position
  • Random small negative values
  • With high positive values to favor gaps

Correct Answer: With cumulative gap penalties proportional to position
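
As a minimal sketch of this initialization, the Python snippet below fills the first row and first column with multiples of a linear gap penalty. The function name `init_nw_matrix` and the penalty value of -2 are illustrative assumptions, not taken from any particular tool.

```python
def init_nw_matrix(n, m, gap=-2):
    """Return an (n+1) x (m+1) matrix whose first row and column hold
    cumulative linear gap penalties (i * gap and j * gap)."""
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap  # i residues of the first sequence aligned to gaps
    for j in range(1, m + 1):
        F[0][j] = j * gap  # j residues of the second sequence aligned to gaps
    return F


F = init_nw_matrix(3, 4)
print(F[0])                   # [0, -2, -4, -6, -8]
print([row[0] for row in F])  # [0, -2, -4, -6]
```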

Q2. What is the key conceptual difference between Smith–Waterman and Needleman–Wunsch algorithms?

  • Smith–Waterman performs global alignment; Needleman–Wunsch performs local alignment
  • Smith–Waterman finds local optimal alignments and allows initialization with zeros; Needleman–Wunsch computes a global optimal alignment with cumulative gap penalties
  • Smith–Waterman uses affine gap penalties only; Needleman–Wunsch uses linear gap penalties only
  • Smith–Waterman uses substitution matrices; Needleman–Wunsch does not

Correct Answer: Smith–Waterman finds local optimal alignments and allows initialization with zeros; Needleman–Wunsch computes a global optimal alignment with cumulative gap penalties

Q3. What is the time complexity of the standard dynamic programming algorithm for exact pair-wise alignment of two sequences of lengths n and m?

  • O(n + m)
  • O(nm)
  • O(n log m)
  • O(max(n,m))

Correct Answer: O(nm)
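
The O(nm) bound follows from filling each interior cell of the table once with constant work per cell. The sketch below assumes simple match/mismatch scoring and a linear gap penalty; the name `nw_score` and the cell counter are hypothetical additions that make the cost visible.

```python
def nw_score(x, y, match=1, mismatch=-1, gap=-2):
    """Global alignment score plus the number of interior cells filled."""
    n, m = len(x), len(y)
    F = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        F[i][0] = i * gap
    for j in range(1, m + 1):
        F[0][j] = j * gap
    cells = 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            F[i][j] = max(F[i - 1][j - 1] + s,  # align x[i-1] with y[j-1]
                          F[i - 1][j] + gap,    # gap in y
                          F[i][j - 1] + gap)    # gap in x
            cells += 1
    return F[n][m], cells


print(nw_score("ACGT", "AGT"))  # (1, 12): 4 * 3 interior cells
```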

Q4. In an affine gap penalty model, the cost of a gap of length k is usually expressed as:

  • gap_open + k × gap_extend
  • k × gap_open
  • gap_open × gap_extend × k
  • gap_open + gap_extend (independent of k)

Correct Answer: gap_open + k × gap_extend
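
A small sketch of this formula is shown below, next to a linear penalty for comparison. The parameter values are illustrative assumptions. Some programs use the alternative convention gap_open + (k - 1) × gap_extend, so it is worth checking a tool's documentation before comparing scores.

```python
def affine_gap_cost(k, gap_open=10, gap_extend=1):
    """Cost of one gap of length k under gap_open + k * gap_extend."""
    if k <= 0:
        return 0
    return gap_open + k * gap_extend


def linear_gap_cost(k, gap=4):
    """Cost of one gap of length k under a linear penalty."""
    return max(k, 0) * gap


for k in (1, 5, 10):
    print(k, affine_gap_cost(k), linear_gap_cost(k))
```

With these assumed values, one long gap is cheaper under the affine model than under the linear one, which reflects the idea that a single insertion or deletion event often spans several residues.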

Q5. Gotoh’s algorithm for affine gap penalties requires how many DP matrices to compute optimal alignment scores efficiently?

  • One matrix
  • Two matrices
  • Three matrices
  • Four matrices

Correct Answer: Three matrices

Q6. Hirschberg’s algorithm is used to compute an optimal global alignment with reduced memory. What is the space complexity achieved by Hirschberg’s algorithm for sequences of lengths n and m?

  • O(nm)
  • O(log(n + m))
  • O(min(n, m))
  • O(max(n, m))

Correct Answer: O(min(n, m))
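
The sketch below is not the full divide-and-conquer Hirschberg algorithm. It shows only the linear-space scoring pass that Hirschberg's method relies on: two rows are kept, laid out over the shorter sequence. The name `nw_score_linear_space` and the scoring values are assumptions.

```python
def nw_score_linear_space(x, y, match=1, mismatch=-1, gap=-2):
    """Global alignment score using O(min(n, m)) memory (score only)."""
    if len(y) > len(x):
        x, y = y, x  # scoring is symmetric, so keep rows over the shorter one
    prev = [j * gap for j in range(len(y) + 1)]
    for i in range(1, len(x) + 1):
        curr = [i * gap] + [0] * len(y)
        for j in range(1, len(y) + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
        prev = curr  # the older row is no longer needed
    return prev[-1]


print(nw_score_linear_space("ACGT", "AGT"))  # 1
```

Recovering the alignment itself in linear space requires Hirschberg's recursive split at the midpoint, which this sketch omits.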

Q7. Which statement correctly distinguishes BLOSUM matrices from PAM matrices?

  • BLOSUM matrices are derived from global alignments of closely related sequences; PAM matrices are from conserved local blocks
  • BLOSUM matrices are log-odds scores computed from local conserved blocks at given identity thresholds; PAM matrices are extrapolated from an evolutionary model fitted to global alignments of closely related proteins
  • BLOSUM matrices are linear gap models; PAM matrices are affine gap models
  • BLOSUM matrices are only for nucleotides; PAM matrices are only for proteins

Correct Answer: BLOSUM matrices are log-odds scores computed from local conserved blocks at given identity thresholds; PAM matrices are extrapolated from an evolutionary model fitted to global alignments of closely related proteins

Q8. A positive log-odds substitution score in a substitution matrix indicates what about that residue pair?

  • The substitution is less likely than expected by chance
  • The substitution is more likely than expected by chance
  • The substitution will always be a functional change
  • The substitution represents a gap opening event

Correct Answer: The substitution is more likely than expected by chance
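
A log-odds score compares the observed frequency of an aligned pair with the frequency expected if the two residues occurred independently. The sketch below uses made-up frequencies, and the scale factor of 2 (half-bit units) is an assumption for illustration.

```python
import math


def log_odds(q_ab, p_a, p_b, scale=2.0):
    """scale * log2(observed pair frequency / frequency expected by chance).

    q_ab is the observed frequency of the ordered pair (a, b); p_a and p_b
    are background residue frequencies. All inputs here are hypothetical.
    """
    return scale * math.log2(q_ab / (p_a * p_b))


print(log_odds(0.02, 0.1, 0.1))   # pair seen twice as often as chance: positive
print(log_odds(0.005, 0.1, 0.1))  # pair seen half as often as chance: negative
```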

Q9. According to Karlin–Altschul statistics, the E-value for a sequence alignment represents:

  • The exact probability that the observed alignment is biologically meaningful
  • The expected number of alignments with score at least S in a random database of given size
  • The raw alignment score without normalization
  • The minimum number of mismatches allowed

Correct Answer: The expected number of alignments with score at least S in a random database of given size
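
In the Karlin–Altschul framework the E-value is commonly written as E = K · m · n · e^(−λS), where m is the query length, n the database length, and S the raw score. The sketch below evaluates that formula; the default values of `lam` and `K` are illustrative assumptions, since the real values depend on the scoring matrix and gap penalties.

```python
import math


def e_value(raw_score, query_len, db_len, lam=0.318, K=0.134):
    """Expected number of chance alignments scoring at least raw_score."""
    return K * query_len * db_len * math.exp(-lam * raw_score)


small_db = e_value(50, 100, 1_000_000)
large_db = e_value(50, 100, 2_000_000)
print(large_db / small_db)  # doubling the search space doubles the E-value
```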

Q10. During Smith–Waterman traceback for local alignment, when should the traceback stop?

  • When the start of either sequence is reached
  • When a cell with score zero is encountered
  • After a fixed number of steps equal to the alignment length
  • When a negative score is encountered

Correct Answer: When a cell with score zero is encountered
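
The stopping rule can be sketched as follows: the traceback starts at the highest-scoring cell and walks back until it reaches a cell whose score is zero. This is a minimal illustration with a linear gap penalty; the function name and scoring values are assumptions.

```python
def smith_waterman(x, y, match=2, mismatch=-1, gap=-2):
    """Return (best score, aligned x segment, aligned y segment)."""
    n, m = len(x), len(y)
    H = [[0] * (m + 1) for _ in range(n + 1)]  # first row/column stay zero
    best, bi, bj = 0, 0, 0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s,
                          H[i - 1][j] + gap, H[i][j - 1] + gap)
            if H[i][j] > best:
                best, bi, bj = H[i][j], i, j
    ax, ay = [], []
    i, j = bi, bj
    while i > 0 and j > 0 and H[i][j] > 0:  # stop at the first zero cell
        s = match if x[i - 1] == y[j - 1] else mismatch
        if H[i][j] == H[i - 1][j - 1] + s:
            ax.append(x[i - 1]); ay.append(y[j - 1]); i -= 1; j -= 1
        elif H[i][j] == H[i - 1][j] + gap:
            ax.append(x[i - 1]); ay.append("-"); i -= 1
        else:
            ax.append("-"); ay.append(y[j - 1]); j -= 1
    return best, "".join(reversed(ax)), "".join(reversed(ay))


print(smith_waterman("TTTACGTTT", "GGGACGGGG"))  # (6, 'ACG', 'ACG')
```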

Q11. Which substitution matrix is generally recommended when aligning very distantly related protein sequences (low sequence identity)?

  • BLOSUM80
  • PAM30
  • PAM250
  • BLOSUM90

Correct Answer: PAM250

Q12. What is the purpose of applying a banded (diagonal-restricted) dynamic programming approach in pair-wise alignment?

  • To guarantee finding the global optimal alignment regardless of similarity
  • To reduce time and memory when sequences are expected to be similar by restricting the DP to a diagonal band
  • To enforce local alignment instead of global alignment
  • To avoid using substitution matrices

Correct Answer: To reduce time and memory when sequences are expected to be similar by restricting the DP to a diagonal band
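
A banded variant of global alignment can be sketched as below: only cells with |i − j| ≤ `band` are computed and stored, so work and memory grow with n × band rather than n × m. The function name, the dictionary storage, and the scoring values are assumptions made for brevity. The result is optimal only if the true optimal path stays inside the band.

```python
NEG_INF = float("-inf")


def banded_nw_score(x, y, band=2, match=1, mismatch=-1, gap=-2):
    """Global alignment score restricted to a diagonal band of half-width band."""
    n, m = len(x), len(y)
    if abs(n - m) > band:
        raise ValueError("band too narrow to reach the final cell")
    F = {(0, 0): 0}  # only in-band cells are ever stored
    for i in range(n + 1):
        for j in range(max(0, i - band), min(m, i + band) + 1):
            if i == 0 and j == 0:
                continue
            candidates = []
            if i > 0 and j > 0:
                s = match if x[i - 1] == y[j - 1] else mismatch
                candidates.append(F.get((i - 1, j - 1), NEG_INF) + s)
            if i > 0:
                candidates.append(F.get((i - 1, j), NEG_INF) + gap)
            if j > 0:
                candidates.append(F.get((i, j - 1), NEG_INF) + gap)
            F[(i, j)] = max(candidates)
    return F[(n, m)]


print(banded_nw_score("ACGT", "AGT", band=1))  # 1
```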

Q13. When aligning a short read or fragment against a full-length reference, which alignment style is most appropriate to avoid penalizing terminal overhangs?

  • Global alignment with standard end-gap penalties
  • End-gap free (semi-global) alignment that does not penalize terminal gaps of the fragment
  • Smith–Waterman algorithm with strict end penalties
  • Pairing only exact matches without gaps

Correct Answer: End-gap free (semi-global) alignment that does not penalize terminal gaps of the fragment
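
One way to sketch end-gap free alignment is to change only the boundary conditions of the global recurrence: the first row is all zeros (the reference may overhang on the left at no cost) and the score is read as the maximum of the last row (the reference may overhang on the right at no cost). The function name and scoring values below are assumptions.

```python
def semiglobal_score(fragment, reference, match=1, mismatch=-1, gap=-2):
    """Align the whole fragment to any stretch of the reference."""
    n, m = len(fragment), len(reference)
    prev = [0] * (m + 1)  # free leading overhang of the reference
    for i in range(1, n + 1):
        curr = [i * gap] + [0] * m  # the fragment itself must be fully aligned
        for j in range(1, m + 1):
            s = match if fragment[i - 1] == reference[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + s, prev[j] + gap, curr[j - 1] + gap)
        prev = curr
    return max(prev)  # free trailing overhang of the reference


print(semiglobal_score("ACGT", "TTTTACGTTTTT"))  # 4
```

With the same scoring values, a strict global alignment of these two sequences would score 4 − 8 × 2 = −12, because the eight overhanging reference bases would each be charged as a gap.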

Q14. If you increase the size of the sequence database used for similarity searches, how does it generally affect the reported E-value for the same raw alignment score?

  • E-value decreases (better significance)
  • E-value remains constant
  • E-value increases (worse significance)
  • E-value becomes negative

Correct Answer: E-value increases (worse significance)

Q15. What does the BLAST bit score represent in sequence alignments?

  • The raw alignment score without normalization
  • A normalized score (log-scaled) that is independent of the scoring system and database size
  • The E-value multiplied by database size
  • The number of gaps in the alignment

Correct Answer: A normalized score (log-scaled) that is independent of the scoring system and database size
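
The normalization is usually written as S' = (λS − ln K) / ln 2, after which the E-value takes the form E = m · n · 2^(−S'). The sketch below applies both formulas; the λ and K values passed in are illustrative assumptions, since real values depend on the scoring system.

```python
import math


def bit_score(raw_score, lam, K):
    """Convert a raw score to bits using the scoring system's lambda and K."""
    return (lam * raw_score - math.log(K)) / math.log(2)


def e_value_from_bits(bits, query_len, db_len):
    """E-value from a bit score; lambda and K are no longer needed here."""
    return query_len * db_len * 2 ** (-bits)


bits = bit_score(50, lam=0.318, K=0.134)
print(round(bits, 1), e_value_from_bits(bits, 100, 1_000_000))
```

The bit score itself does not change with database size. The database size enters only when the bit score is converted to an E-value.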

Q16. In protein pair-wise alignments, how do “identity” and “similarity” differ when reporting alignment quality?

  • Identity counts chemically similar substitutions; similarity counts exact matches
  • Identity counts exact residue matches; similarity counts conservative substitutions as well as identical residues
  • They are equivalent terms used interchangeably
  • Identity measures gap frequency; similarity measures alignment length

Correct Answer: Identity counts exact residue matches; similarity counts conservative substitutions as well as identical residues
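
The difference can be illustrated with a short sketch that computes both values from two aligned strings. The residue groups below are a simplified, hypothetical stand-in; alignment programs typically count a pair as similar when its substitution-matrix score is positive. Both fractions here use the full alignment length (gaps included) as the denominator, which is one of several conventions in use.

```python
# Simplified, hypothetical groups of chemically similar amino acids.
CONSERVATIVE_GROUPS = [set("ILVM"), set("FWY"), set("KRH"),
                       set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (fraction identical, fraction similar) over alignment columns."""
    if len(aligned_a) != len(aligned_b) or not aligned_a:
        raise ValueError("aligned sequences must be non-empty and equal length")
    identical = similar = 0
    for p, q in zip(aligned_a, aligned_b):
        if "-" in (p, q):
            continue  # gap columns count toward length only
        if p == q:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(p in g and q in g for g in CONSERVATIVE_GROUPS):
            similar += 1
    length = len(aligned_a)
    return identical / length, similar / length


print(identity_and_similarity("ILKDE-", "IVRDQA"))  # 2/6 identical, 4/6 similar
```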

Q17. In affine-gap dynamic programming implementations, the three matrices M, Ix, and Iy typically represent:

  • M: match/mismatch; Ix: insertion in sequence X; Iy: insertion in sequence Y (gaps in the other sequence)
  • M: gap-only alignment; Ix: match-only alignment; Iy: mismatch-only alignment
  • M: local scores; Ix: global scores; Iy: semiglobal scores
  • M: matrix for nucleotides; Ix: matrix for proteins; Iy: matrix for codons

Correct Answer: M: match/mismatch; Ix: insertion in sequence X; Iy: insertion in sequence Y (gaps in the other sequence)
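
A compact sketch of the three-matrix recurrence is given below for global alignment. It uses the convention that a gap of length k costs gap_open + k × gap_extend, with Ix holding alignments that end with a residue of X opposite a gap and Iy the mirror case. The function name and parameter values are assumptions, and only the score is computed (no traceback).

```python
NEG_INF = float("-inf")


def gotoh_score(x, y, match=1, mismatch=-1, gap_open=2, gap_extend=1):
    """Global alignment score with affine gaps using matrices M, Ix, Iy."""
    n, m = len(x), len(y)
    M = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Ix = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    Iy = [[NEG_INF] * (m + 1) for _ in range(n + 1)]
    M[0][0] = 0
    for i in range(1, n + 1):
        Ix[i][0] = -(gap_open + i * gap_extend)
    for j in range(1, m + 1):
        Iy[0][j] = -(gap_open + j * gap_extend)
    first = gap_open + gap_extend  # cost of the first gapped position
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if x[i - 1] == y[j - 1] else mismatch
            M[i][j] = max(M[i - 1][j - 1], Ix[i - 1][j - 1],
                          Iy[i - 1][j - 1]) + s
            Ix[i][j] = max(M[i - 1][j] - first,        # open a new gap
                           Ix[i - 1][j] - gap_extend,  # extend an existing gap
                           Iy[i - 1][j] - first)
            Iy[i][j] = max(M[i][j - 1] - first,
                           Iy[i][j - 1] - gap_extend,
                           Ix[i][j - 1] - first)
    return max(M[n][m], Ix[n][m], Iy[n][m])


print(gotoh_score("ACGTACGT", "ACGT"))  # -2: four matches, one gap of length 4
```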

Q18. Which alignment characteristic is most informative when the goal is to infer conserved biochemical function between two proteins?

  • High percent identity only
  • High percent similarity (conservative substitutions) and conserved motifs
  • Long alignment length regardless of substitutions
  • Many gaps indicating structural rearrangements

Correct Answer: High percent similarity (conservative substitutions) and conserved motifs

Q19. For nucleotide pair-wise alignment, why might one choose to weight transitions (A↔G, C↔T) differently from transversions?

  • Transitions are less frequent and should be penalized more heavily
  • Transitions are more frequent and often more tolerated, so they may be scored less negatively than transversions
  • Transversions do not affect protein coding, so they are ignored
  • Transitions always create stop codons, so they require special handling

Correct Answer: Transitions are more frequent and often more tolerated, so they may be scored less negatively than transversions
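
A scoring function that separates the two classes of mismatch could look like the sketch below. The specific score values are arbitrary assumptions chosen only to show transitions being penalized less than transversions.

```python
PURINES = set("AG")
PYRIMIDINES = set("CT")


def nucleotide_score(a, b, match=1, transition=-1, transversion=-2):
    """Score a pair of DNA bases, penalizing transversions more heavily."""
    if a == b:
        return match
    pair = {a, b}
    if pair <= PURINES or pair <= PYRIMIDINES:
        return transition  # purine<->purine or pyrimidine<->pyrimidine
    return transversion    # purine<->pyrimidine


print(nucleotide_score("A", "G"), nucleotide_score("A", "C"))  # -1 -2
```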

Q20. Which statement best describes how BLAST approximates exact local pair-wise alignment?

  • BLAST computes full dynamic programming over the entire matrix to guarantee optimal local alignments
  • BLAST uses k-mer (word) matches to seed alignments and extends these seeds heuristically to find high-scoring local alignments quickly
  • BLAST only aligns sequences globally using Needleman–Wunsch
  • BLAST uses random sampling of sequence positions to estimate alignment scores

Correct Answer: BLAST uses k-mer (word) matches to seed alignments and extends these seeds heuristically to find high-scoring local alignments quickly
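
The seed-and-extend idea can be sketched in a few lines. This is a simplified illustration and not BLAST's actual implementation: it uses exact k-mer matches and a one-directional ungapped extension with a basic drop-off threshold, whereas BLAST adds refinements such as neighborhood words for proteins and gapped extension. All names and parameter values are assumptions.

```python
from collections import defaultdict


def find_seeds(query, subject, k=3):
    """Return (query_pos, subject_pos) pairs where an exact k-mer is shared."""
    index = defaultdict(list)
    for j in range(len(subject) - k + 1):
        index[subject[j:j + k]].append(j)
    seeds = []
    for i in range(len(query) - k + 1):
        for j in index.get(query[i:i + k], []):
            seeds.append((i, j))
    return seeds


def extend_right(query, subject, i, j, k, match=1, mismatch=-1, x_drop=2):
    """Extend a seed to the right without gaps; return (best score, length)."""
    score = best = k * match
    length = best_len = k
    qi, sj = i + k, j + k
    while qi < len(query) and sj < len(subject):
        score += match if query[qi] == subject[sj] else mismatch
        length += 1
        if score > best:
            best, best_len = score, length
        elif best - score > x_drop:
            break  # score fell too far below the best seen so far
        qi += 1
        sj += 1
    return best, best_len


seeds = find_seeds("GGACGTACC", "TTACGTAGG")
print(seeds)                                             # [(2, 2), (3, 3), (4, 4), (5, 1)]
print(extend_right("GGACGTACC", "TTACGTAGG", 2, 2, 3))   # (5, 5)
```

Because only regions around seeds are examined, most of the dynamic programming matrix is never computed, which is where the speed comes from and also why an optimal local alignment is not guaranteed.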
