In-silico gene expression and microarrays MCQs With Answers

Introduction

This quiz collection focuses on in-silico gene expression and microarray analysis tailored for M.Pharm students. It covers core computational concepts used in preprocessing, normalization, differential expression, quality control, and downstream functional interpretation. Questions target practical knowledge of microarray platforms, statistical methods (e.g., limma, SAM), normalization strategies (quantile, loess, RMA), batch correction, and visualization tools such as MA/volcano plots and PCA. Emphasis is placed on interpretation for pharmaceutical research: biomarker discovery, target validation, and pathway analysis. These MCQs reinforce understanding of experimental design, common pitfalls, and bioinformatics workflows essential for reliable in-silico gene expression studies.

Q1. Which normalization method equalizes the empirical distribution of probe intensities across arrays by making their quantiles identical?

  • Log2 transformation
  • Quantile normalization
  • Background correction
  • Median polish

Correct Answer: Quantile normalization
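
The idea behind this answer can be illustrated with a minimal sketch: sort each array, average the sorted values rank by rank, then give every probe the average that corresponds to its rank within its own array. The function name and the toy intensities are illustrative assumptions, and ties are broken by position here, whereas production implementations usually average tied ranks.

```python
def quantile_normalize(arrays):
    """Quantile-normalize equal-length intensity lists (one list per array)."""
    n_probes = len(arrays[0])
    # Sort each array, then average across arrays at each rank.
    sorted_arrays = [sorted(a) for a in arrays]
    rank_means = [
        sum(s[i] for s in sorted_arrays) / len(arrays) for i in range(n_probes)
    ]
    normalized = []
    for a in arrays:
        # Probe indices ordered from lowest to highest intensity
        # (ties broken by position; a simplification).
        order = sorted(range(n_probes), key=lambda i: a[i])
        out = [0.0] * n_probes
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Toy data: three arrays, four probes each (made-up values).
arrays = [[5.0, 2.0, 3.0, 4.0], [4.0, 1.0, 4.5, 2.0], [3.0, 4.0, 6.0, 8.0]]
normalized = quantile_normalize(arrays)
```

After normalization every array contains the same set of values, so their empirical distributions are identical; only the assignment of values to probes differs.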

Q2. Which algorithm is commonly used for summarizing probe-level Affymetrix data to produce gene-level expression estimates with background adjustment and quantile normalization?

  • MOSER
  • RMA (Robust Multi-array Average)
  • MANCOVA
  • k-means summarization

Correct Answer: RMA (Robust Multi-array Average)

Q3. In two-color microarray experiments, what design element helps correct for dye-specific bias?

  • Using only a single dye for all samples
  • Dye-swap replicates
  • Increasing array scan resolution
  • Using housekeeping genes only

Correct Answer: Dye-swap replicates
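
Why dye-swap replicates help can be sketched numerically. Assuming the dye bias adds the same offset to the log-ratio on both arrays (a simplification, since real dye bias can depend on gene and intensity), half the difference between the forward and swapped log-ratios recovers the true log fold change. The function name and numbers are illustrative only.

```python
def dye_swap_estimate(m_forward, m_swapped):
    """Combine log2(Cy5/Cy3) ratios from a forward array and its dye-swapped pair.

    Forward array: treated = Cy5, control = Cy3  ->  M =  true_lfc + dye_bias
    Swapped array: control = Cy5, treated = Cy3  ->  M = -true_lfc + dye_bias
    Half the difference cancels the shared bias term.
    """
    return (m_forward - m_swapped) / 2.0


# Made-up values: a true log2 fold change of 1.0 and an additive dye bias of 0.3.
true_lfc, dye_bias = 1.0, 0.3
m_forward = true_lfc + dye_bias
m_swapped = -true_lfc + dye_bias
estimate = dye_swap_estimate(m_forward, m_swapped)
```

Under the same assumption, half the sum of the two log-ratios estimates the dye bias itself.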

Q4. Which statistical approach in microarray analysis uses empirical Bayes moderation to improve variance estimates for small sample sizes?

  • SAM (Significance Analysis of Microarrays)
  • ANOVA without moderation
  • limma (linear models for microarray data)
  • Student’s t-test without pooling

Correct Answer: limma (linear models for microarray data)
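
The moderation step can be sketched as a weighted average of each gene's sample variance and a prior variance shared across genes. In limma the prior variance and prior degrees of freedom are estimated from all genes on the array; in this sketch they are simply assumed values, and the function names are hypothetical rather than limma's API.

```python
import math


def moderated_variance(s2, df, prior_s2, prior_df):
    """Shrink a gene-wise sample variance s2 (df degrees of freedom)
    toward a prior variance prior_s2 (prior_df degrees of freedom)."""
    return (prior_df * prior_s2 + df * s2) / (prior_df + df)


def moderated_t(diff, s2, df, prior_s2, prior_df, n1, n2):
    """Two-group moderated t-statistic using the shrunken variance."""
    s2_tilde = moderated_variance(s2, df, prior_s2, prior_df)
    return diff / math.sqrt(s2_tilde * (1.0 / n1 + 1.0 / n2))


# Made-up example: 3 vs 3 samples (df = 4) and a gene whose sample
# variance happens to be very small. The prior values are assumptions.
diff, s2, df, n1, n2 = 0.5, 0.01, 4, 3, 3
prior_s2, prior_df = 0.05, 4

s2_tilde = moderated_variance(s2, df, prior_s2, prior_df)
t_ordinary = diff / math.sqrt(s2 * (1.0 / n1 + 1.0 / n2))
t_moderated = moderated_t(diff, s2, df, prior_s2, prior_df, n1, n2)
```

Because the tiny sample variance is pulled toward the prior, the moderated statistic is smaller than the ordinary t-statistic, which guards against genes that look significant only because their variance was underestimated by chance.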

Q5. Which plot displays log-intensity ratios (M) versus average log-intensity (A) and is useful for assessing intensity-dependent biases?

  • Heatmap
  • MA plot
  • PCA biplot
  • Volcano plot

Correct Answer: MA plot
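
The two coordinates of an MA plot can be computed as follows. This is a minimal sketch for one two-color array; the channel names and intensities are made up, and plotting is left out so the example runs with the standard library alone.

```python
import math


def ma_values(red, green):
    """Return (M, A) lists for paired red/green channel intensities.

    M = log2(R) - log2(G)          (log ratio)
    A = (log2(R) + log2(G)) / 2    (average log intensity)
    """
    m = [math.log2(r) - math.log2(g) for r, g in zip(red, green)]
    a = [0.5 * (math.log2(r) + math.log2(g)) for r, g in zip(red, green)]
    return m, a


# Toy intensities chosen as powers of two so the logs are whole numbers.
red = [1024.0, 256.0, 64.0]
green = [256.0, 256.0, 256.0]
m, a = ma_values(red, green)
```

Plotting M against A and checking whether the point cloud bends away from M = 0 at low or high A is how intensity-dependent bias is usually spotted.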

Q6. When testing thousands of genes for differential expression, which correction controls the expected proportion of false discoveries?

  • Bonferroni correction
  • Family-wise error rate
  • False discovery rate (Benjamini-Hochberg)
  • No correction

Correct Answer: False discovery rate (Benjamini-Hochberg)
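
The Benjamini-Hochberg adjustment itself is short enough to sketch. Each p-value is multiplied by n/rank, and the results are then made monotone from the largest p-value downward. The function name and the p-values are illustrative.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvalues)
    # Indices ordered from smallest to largest p-value.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values never increase as the raw p-value decreases.
    for rank in range(n, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * n / rank)
        adjusted[idx] = running_min
    return adjusted


pvals = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(pvals)
```

Genes whose adjusted value falls below a chosen threshold (for example 0.05) are reported, with the expectation that at most that proportion of the reported genes are false discoveries.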

Q7. Which technique identifies modules of highly co-expressed genes across samples to infer gene networks and potential regulatory modules?

  • Principal Component Analysis (PCA)
  • Hierarchical clustering only
  • WGCNA (Weighted Gene Co-expression Network Analysis)
  • DESeq normalization

Correct Answer: WGCNA (Weighted Gene Co-expression Network Analysis)

Q8. Which quality control metric summarizes array-wide hybridization performance as the fraction of probe sets whose signal is detected above background?

  • Normalized enrichment score (NES)
  • Percent present/call rate
  • Gene ontology p-value
  • Fold change threshold

Correct Answer: Percent present/call rate

Q9. Which method is best for correcting batch effects when samples were processed in different batches and batch labels are known?

  • Quantile normalization only
  • ComBat (empirical Bayes batch correction)
  • PCR amplification
  • Log transformation

Correct Answer: ComBat (empirical Bayes batch correction)

Q10. In microarray probe annotation, why is remapping probes to the latest genome/transcriptome important?

  • To increase raw signal intensity
  • To correct probe-gene assignments due to updated gene models and avoid cross-hybridization errors
  • To change fluorescence wavelengths
  • To reduce the number of probes on the array

Correct Answer: To correct probe-gene assignments due to updated gene models and avoid cross-hybridization errors

Q11. Which downstream analysis evaluates whether a set of differentially expressed genes is over-represented in predefined biological categories?

  • Normalization
  • Pathway or Gene Ontology (GO) enrichment analysis
  • MA plotting
  • Background correction

Correct Answer: Pathway or Gene Ontology (GO) enrichment analysis
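
One common way to test over-representation is a one-sided hypergeometric test, which is equivalent to a one-sided Fisher exact test. The sketch below computes the probability of seeing at least the observed overlap by chance; the function name, argument names, and counts are illustrative assumptions, not tied to any particular enrichment tool.

```python
from math import comb


def hypergeom_enrichment_p(universe, in_category, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `universe` genes, of which `in_category` belong to the category."""
    total = comb(universe, selected)
    upper = min(in_category, selected)
    favourable = sum(
        comb(in_category, i) * comb(universe - in_category, selected - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Made-up counts: 20 genes measured, 5 annotated to the pathway,
# 4 differentially expressed, 3 of those in the pathway.
p_value = hypergeom_enrichment_p(universe=20, in_category=5, selected=4, overlap=3)
```

In practice the test is repeated for many categories, so the resulting p-values need multiple-testing correction as well.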

Q12. Significance Analysis of Microarrays (SAM) introduces a tuning parameter “s0” primarily to:

  • Adjust intensity scaling across arrays
  • Stabilize variance estimates by adding a small constant to the denominator of the test statistic
  • Increase fold-change thresholds arbitrarily
  • Filter out low-intensity probes

Correct Answer: Stabilize variance estimates by adding a small constant to the denominator of the test statistic
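
The effect of s0 can be seen in a small sketch of the SAM-style statistic d = (mean difference) / (s + s0), where s is the gene's pooled standard error. SAM chooses s0 from the data; the value used here is simply an assumed constant, and the function name and expression values are made up.

```python
import math


def sam_statistic(group1, group2, s0):
    """SAM-style relative difference for one gene: (mean1 - mean2) / (s + s0)."""
    n1, n2 = len(group1), len(group2)
    mean1, mean2 = sum(group1) / n1, sum(group2) / n2
    # Pooled sum of squares and the resulting standard error of the difference.
    ss = sum((x - mean1) ** 2 for x in group1) + sum((x - mean2) ** 2 for x in group2)
    s = math.sqrt((1.0 / n1 + 1.0 / n2) * ss / (n1 + n2 - 2))
    return (mean1 - mean2) / (s + s0)


# A gene with a tiny difference but an even tinier variance.
treated = [8.01, 8.00, 8.02]
control = [7.99, 8.00, 7.98]

d_plain = sam_statistic(treated, control, s0=0.0)  # ordinary t-statistic
d_sam = sam_statistic(treated, control, s0=0.1)    # s0 = 0.1 is an assumed value
```

With s0 = 0 the statistic reduces to the ordinary pooled t-statistic; adding s0 keeps genes with near-zero variance from producing large scores out of negligible differences.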

Q13. Which visualization is most appropriate for simultaneously displaying fold change and statistical significance of thousands of genes?

  • MA plot
  • Volcano plot
  • Boxplot
  • Pearson correlation matrix

Correct Answer: Volcano plot

Q14. What is the principal difference between microarray and RNA-Seq for in-silico gene expression profiling?

  • Microarrays measure sequence counts directly; RNA-Seq relies on hybridization intensities
  • RNA-Seq provides digital (count-based) measurement and greater dynamic range and sensitivity compared to hybridization-based microarrays
  • Microarrays need no normalization, and neither does RNA-Seq
  • RNA-Seq cannot detect splice variants while microarrays can

Correct Answer: RNA-Seq provides digital (count-based) measurement and greater dynamic range and sensitivity compared to hybridization-based microarrays

Q15. During preprocessing, which step aims to remove systematic non-biological background noise from probe intensities?

  • Background correction
  • Gene set enrichment
  • Differential expression testing
  • Pathway mapping

Correct Answer: Background correction

Q16. Which multivariate method reduces dimensionality of expression data to identify major sources of variance and potential outliers?

  • Hierarchical clustering with Euclidean distance
  • Principal Component Analysis (PCA)
  • Fisher exact test
  • Benjamini-Hochberg correction

Correct Answer: Principal Component Analysis (PCA)

Q17. In probe-level microarray analysis, “probe summarization” refers to:

  • Calculating the GC content of each probe
  • Combining multiple probe intensities that target the same transcript into a single expression value
  • Annotating probes with GO terms
  • Converting intensities to fold changes only

Correct Answer: Combining multiple probe intensities that target the same transcript into a single expression value
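
RMA performs this summarization with Tukey's median polish on log2 probe intensities. A minimal sketch of that idea follows, for one probe set with probes as rows and arrays as columns. The function name, the fixed iteration count, and the toy matrix are assumptions made for illustration.

```python
from statistics import median


def median_polish_summary(log_intensities, n_iter=10):
    """Summarize one probe set (rows = probes, columns = arrays)
    into a single expression value per array via median polish."""
    resid = [row[:] for row in log_intensities]
    n_rows, n_cols = len(resid), len(resid[0])
    overall = 0.0
    row_eff = [0.0] * n_rows   # probe affinity effects
    col_eff = [0.0] * n_cols   # array (expression) effects
    for _ in range(n_iter):
        # Sweep row medians out of the residuals.
        for i in range(n_rows):
            rm = median(resid[i])
            row_eff[i] += rm
            resid[i] = [x - rm for x in resid[i]]
        delta = median(col_eff)
        col_eff = [c - delta for c in col_eff]
        overall += delta
        # Sweep column medians out of the residuals.
        for j in range(n_cols):
            cm = median(resid[i][j] for i in range(n_rows))
            col_eff[j] += cm
            for i in range(n_rows):
                resid[i][j] -= cm
        delta = median(row_eff)
        row_eff = [r - delta for r in row_eff]
        overall += delta
    return [overall + c for c in col_eff]


# Toy log2 data: three probes with different affinities, three arrays.
probe_set = [[8.0, 9.0, 10.0], [9.0, 10.0, 11.0], [7.0, 8.0, 9.0]]
summary = median_polish_summary(probe_set)
```

Using medians rather than means makes the per-array value resistant to a single outlying probe, which is the main reason this approach is preferred over simple averaging.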

Q18. Which approach is commonly used to identify statistically significant differentially expressed genes while accounting for multiple groups or factors in the experimental design?

  • Pairwise t-tests without design consideration
  • Linear modeling with contrasts (e.g., limma with design matrix)
  • Single-sample normalization
  • Random guessing

Correct Answer: Linear modeling with contrasts (e.g., limma with design matrix)

Q19. What is a key advantage of using technical replicates in microarray experiments?

  • They substitute for biological replicates entirely
  • They help assess platform reproducibility and reduce measurement noise
  • They remove the need for normalization
  • They increase the number of genes on the array

Correct Answer: They help assess platform reproducibility and reduce measurement noise

Q20. Which validation strategy strengthens confidence in candidate biomarkers discovered from microarray data?

  • Ignoring independent datasets and using only discovery data
  • Validating candidates in independent datasets or cohorts, or with orthogonal techniques such as qPCR
  • Relying solely on fold-change without statistical testing
  • Using only housekeeping genes for validation

Correct Answer: Validating candidates in independent datasets or cohorts, or with orthogonal techniques such as qPCR
