Introduction: In pharmaceutical research, regression modeling is a vital statistical tool for exploring relationships between drug responses and predictors. Simple regression examines how one independent variable (for example, dose) affects an outcome (for example, plasma concentration), while multiple regression evaluates several predictors (dose, age, formulation, and co-medication) simultaneously. Hypothesis testing in regression—using t-tests for individual coefficients and F-tests for overall model fit—helps determine whether predictors significantly influence outcomes. Understanding core assumptions (linearity, independence, homoscedasticity, normality) and diagnostics (residual analysis, variance inflation factor, Durbin–Watson test) ensures valid conclusions in formulation development and pharmacokinetics. Now let’s test your knowledge with 30 MCQs on this topic.
Q1. What is the primary difference between simple regression and multiple regression?
- Simple regression uses categorical predictors while multiple regression uses continuous predictors
- Simple regression models one predictor variable; multiple regression models two or more predictor variables
- Simple regression uses hypothesis testing; multiple regression does not
- Simple regression requires normality; multiple regression does not require any assumptions
Correct Answer: Simple regression models one predictor variable; multiple regression models two or more predictor variables
Q2. In testing the slope coefficient (β1) in simple linear regression, the null hypothesis is usually:
- β1 = 1
- β1 ≠ 0
- β1 = 0
- β1 > 0
Correct Answer: β1 = 0
Q3. Which test is commonly used to assess whether an individual regression coefficient differs significantly from zero?
- Chi-square test
- t-test
- Fisher’s exact test
- Z-test
Correct Answer: t-test
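To make Q2 and Q3 concrete, here is a minimal Python sketch (statsmodels) that fits a simple linear regression on synthetic dose and concentration data and reads off the t-test of H0: β1 = 0. All numbers are hypothetical, for illustration only.

```python
# Minimal sketch: t-test for the slope in simple linear regression.
# The dose/concentration values below are synthetic, for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
dose = np.linspace(10, 100, 30)                   # hypothetical doses (mg)
conc = 0.8 * dose + rng.normal(0, 5, size=30)     # hypothetical plasma conc.

X = sm.add_constant(dose)          # adds the intercept column
model = sm.OLS(conc, X).fit()

# The t statistic and p-value test H0: beta1 = 0 for the slope
print(model.params[1])                         # estimated slope beta1
print(model.tvalues[1], model.pvalues[1])      # t statistic and p-value
```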
Q4. What does the F-test assess in multiple regression?
- Whether the residuals are normally distributed
- Whether at least one predictor variable is significantly related to the response
- The presence of multicollinearity
- Whether the slope is equal to 1
Correct Answer: Whether at least one predictor variable is significantly related to the response
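A minimal sketch of the overall F-test, again on synthetic data, with two hypothetical predictors (dose and age):

```python
# Minimal sketch: overall F-test in multiple regression (synthetic data).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)
n = 60
dose = rng.uniform(10, 100, n)                 # hypothetical predictors
age = rng.uniform(20, 70, n)
conc = 0.5 * dose - 0.1 * age + rng.normal(0, 4, n)

X = sm.add_constant(np.column_stack([dose, age]))
fit = sm.OLS(conc, X).fit()

# F-test of H0: all slope coefficients are zero
print(fit.fvalue, fit.f_pvalue)
```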
Q5. Which statement best describes R-squared (R²)?
- Proportion of variance in predictors explained by the response
- Proportion of variance in the response explained by the model
- Average squared residual
- Standardized regression coefficient
Correct Answer: Proportion of variance in the response explained by the model
Q6. Why is adjusted R-squared often preferred to R-squared in multiple regression?
- Adjusted R-squared increases with every added predictor regardless of relevance
- Adjusted R-squared penalizes adding non-informative predictors and adjusts for number of predictors
- Adjusted R-squared is always equal to R-squared
- Adjusted R-squared measures multicollinearity directly
Correct Answer: Adjusted R-squared penalizes adding non-informative predictors and adjusts for number of predictors
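The sketch below computes R² and adjusted R² on a synthetic fit and verifies the adjusted value by hand; the data, including a pure-noise predictor, are made up for illustration:

```python
# Minimal sketch: R-squared vs. adjusted R-squared on a synthetic fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))                       # 3 hypothetical predictors
y = X @ [1.0, 0.5, 0.0] + rng.normal(0, 1, 50)     # third predictor is noise
fit = sm.OLS(y, sm.add_constant(X)).fit()

print(fit.rsquared, fit.rsquared_adj)

# Adjusted R^2 by hand: 1 - (1 - R^2) * (n - 1) / (n - k - 1)
n, k = int(fit.nobs), int(fit.df_model)
print(1 - (1 - fit.rsquared) * (n - 1) / (n - k - 1))
```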
Q7. A variance inflation factor (VIF) greater than which value is commonly taken as evidence of problematic multicollinearity?
- 1
- 2
- 5
- 10
Correct Answer: 10
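A short sketch of VIF in practice, using statsmodels' variance_inflation_factor on synthetic predictors where x2 is deliberately made nearly collinear with x1:

```python
# Minimal sketch: computing VIFs (synthetic, deliberately collinear data).
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(3)
x1 = rng.normal(size=100)
x2 = x1 + rng.normal(0, 0.1, 100)        # nearly collinear with x1
x3 = rng.normal(size=100)
X = sm.add_constant(np.column_stack([x1, x2, x3]))

# VIF_j = 1 / (1 - R_j^2); values above ~10 are often flagged
for j in range(1, X.shape[1]):           # skip the intercept column
    print(j, variance_inflation_factor(X, j))
```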
Q8. Which diagnostic test is commonly used to detect heteroscedasticity (non-constant variance of residuals)?
- Durbin-Watson test
- Breusch-Pagan test
- Levene’s test for equality of variances between two groups only
- Kruskal-Wallis test
Correct Answer: Breusch-Pagan test
Q9. The Durbin–Watson statistic is used to detect which issue in regression residuals?
- Heteroscedasticity
- Non-linearity
- Autocorrelation (serial correlation)
- Multicollinearity
Correct Answer: Autocorrelation (serial correlation)
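Since Q8 and Q9 pair naturally, one sketch can show both diagnostics on the residuals of a synthetic fit whose error variance is made to grow with x:

```python
# Minimal sketch: Breusch-Pagan (heteroscedasticity) and Durbin-Watson
# (autocorrelation) on the residuals of a synthetic fit.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(4)
x = np.linspace(1, 10, 80)
y = 2 * x + rng.normal(0, 0.5 * x)       # error variance grows with x
X = sm.add_constant(x)
fit = sm.OLS(y, X).fit()

lm_stat, lm_pvalue, _, _ = het_breuschpagan(fit.resid, X)
print(lm_pvalue)                         # small p-value -> heteroscedasticity

print(durbin_watson(fit.resid))          # ~2 suggests no serial correlation
```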
Q10. How are categorical predictors typically included in a regression model?
- As continuous variables scaled 0–1 without change
- Using dummy (indicator) variables
- They cannot be included in regression
- By converting them to z-scores
Correct Answer: Using dummy (indicator) variables
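A minimal sketch of dummy coding via a model formula; the formulation variable and the data frame are hypothetical:

```python
# Minimal sketch: categorical predictor via dummy coding in a formula.
# The 'formulation' levels and all data are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
df = pd.DataFrame({
    "formulation": rng.choice(["tablet", "capsule", "syrup"], size=90),
    "dose": rng.uniform(10, 100, 90),
})
df["conc"] = 0.5 * df["dose"] + rng.normal(0, 3, 90)

# C(...) expands the category into indicator (dummy) variables,
# dropping one level as the reference.
fit = smf.ols("conc ~ dose + C(formulation)", data=df).fit()
print(fit.params)
```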
Q11. What does an interaction term in multiple regression represent?
- The sum of two predictors
- The multiplicative effect showing that the effect of one predictor depends on the level of another predictor
- An error in the model
- A method to remove collinearity
Correct Answer: The multiplicative effect showing that the effect of one predictor depends on the level of another predictor
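A minimal sketch of an interaction term in a formula; dose, age, and all data are hypothetical:

```python
# Minimal sketch: an interaction term in a formula. 'dose * age' expands to
# dose + age + dose:age; the dose:age coefficient is the interaction.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(6)
df = pd.DataFrame({"dose": rng.uniform(10, 100, 80),
                   "age": rng.uniform(20, 70, 80)})
df["conc"] = (0.5 * df["dose"] - 0.005 * df["dose"] * df["age"]
              + rng.normal(0, 2, 80))

fit = smf.ols("conc ~ dose * age", data=df).fit()
print(fit.params["dose:age"])    # the effect of dose changes with age
```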
Q12. Which test is commonly used to check residuals for normality in small to moderate sample sizes?
- Shapiro-Wilk test
- Kolmogorov–Smirnov test without the Lilliefors correction
- ANOVA
- Chi-square goodness-of-fit
Correct Answer: Shapiro-Wilk test
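A one-liner in context: Shapiro-Wilk applied to the residuals of a synthetic fit:

```python
# Minimal sketch: Shapiro-Wilk test on regression residuals (synthetic fit).
import numpy as np
import statsmodels.api as sm
from scipy.stats import shapiro

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, 40)
y = 1.5 * x + rng.normal(0, 1, 40)
fit = sm.OLS(y, sm.add_constant(x)).fit()

stat, p = shapiro(fit.resid)
print(p)    # a large p-value: no evidence against normality of residuals
```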
Q13. A 95% confidence interval for a regression slope β1 that does not include zero implies:
- The slope estimate is biased
- The corresponding predictor is statistically significant in a two-sided test at α = 0.05
- The predictor has no practical significance
- The residuals are heteroscedastic
Correct Answer: The corresponding predictor is statistically significant in a two-sided test at α = 0.05
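A sketch of reading the 95% confidence interval for the slope from a synthetic fit:

```python
# Minimal sketch: 95% confidence intervals for the coefficients. If the
# slope's interval excludes zero, the two-sided test at alpha = 0.05
# rejects H0: beta1 = 0.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
x = rng.uniform(0, 10, 50)
y = 2.0 * x + rng.normal(0, 1, 50)
fit = sm.OLS(y, sm.add_constant(x)).fit()

print(fit.conf_int(alpha=0.05))  # rows: intercept, slope; cols: lower, upper
```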
Q14. What is the purpose of calculating standardized (beta) coefficients in regression?
- To test normality of residuals
- To compare the relative importance of predictors measured on different scales
- To produce categorical predictors
- To reduce heteroscedasticity
Correct Answer: To compare the relative importance of predictors measured on different scales
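A minimal sketch of standardized coefficients, obtained here by z-scoring the response and the predictors before fitting; the variables and their units are hypothetical:

```python
# Minimal sketch: standardized (beta) coefficients via z-scoring (synthetic).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(9)
dose = rng.uniform(10, 100, 60)           # hypothetical mg
age = rng.uniform(20, 70, 60)             # hypothetical years
y = 0.5 * dose - 0.2 * age + rng.normal(0, 3, 60)

def z(v):
    return (v - v.mean()) / v.std(ddof=1)

X = sm.add_constant(np.column_stack([z(dose), z(age)]))
fit = sm.OLS(z(y), X).fit()
print(fit.params[1:])   # comparable across predictors despite different units
```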
Q15. Which measure identifies influential observations that can unduly affect regression estimates?
- Cook’s distance
- VIF
- R-squared
- Adjusted R-squared
Correct Answer: Cook’s distance
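A sketch of Cook's distance, with one outlier planted so it should show up as influential:

```python
# Minimal sketch: Cook's distance to flag influential observations.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(10)
x = rng.uniform(0, 10, 30)
y = 2 * x + rng.normal(0, 1, 30)
y[0] += 15                                 # plant one influential outlier
fit = sm.OLS(y, sm.add_constant(x)).fit()

cooks_d, _ = fit.get_influence().cooks_distance
print(np.argmax(cooks_d), cooks_d.max())   # observation 0 should stand out
```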
Q16. In pharmacokinetic modeling, applying a log transformation to concentration data is often useful because:
- It always makes data categorical
- It stabilizes variance and linearizes exponential decay relationships
- It increases heteroscedasticity
- It removes the need to check assumptions
Correct Answer: It stabilizes variance and linearizes exponential decay relationships
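To see why the log transformation helps, note that first-order elimination C(t) = C0·e^(−kt) becomes the straight line ln C = ln C0 − k·t. A sketch with a hypothetical elimination rate of k = 0.3 h⁻¹:

```python
# Minimal sketch: log-linearizing first-order elimination. If
# C(t) = C0 * exp(-k*t), then ln C = ln C0 - k*t, a straight line in t.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(11)
t = np.linspace(0.5, 12, 24)                       # hypothetical hours
conc = 100 * np.exp(-0.3 * t) * np.exp(rng.normal(0, 0.1, 24))

fit = sm.OLS(np.log(conc), sm.add_constant(t)).fit()
print(-fit.params[1])      # slope estimates -k, so this recovers k ~ 0.3
```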
Q17. One criticism of stepwise variable selection methods is that they:
- Always find the true causal predictors
- Can produce models that capitalize on random noise and lack reproducibility
- Are guaranteed to minimize prediction error on new data
- Do not require hypothesis testing
Correct Answer: Can produce models that capitalize on random noise and lack reproducibility
Q18. Which statement best distinguishes prediction from causation in regression?
- Good predictive performance implies causation
- Regression coefficients always indicate causal effects
- Prediction focuses on accurate forecasts; causal inference requires design or assumptions to support cause-effect claims
- There is no difference between prediction and causation
Correct Answer: Prediction focuses on accurate forecasts; causal inference requires design or assumptions to support cause-effect claims
Q19. A p-value associated with a regression coefficient indicates:
- The probability that the null hypothesis is true
- The probability of observing data at least as extreme as those observed, assuming the null hypothesis is true
- The magnitude of the effect
- The sample size required
Correct Answer: The probability of observing data at least as extreme as those observed, assuming the null hypothesis is true
Q20. A Type I error in regression hypothesis testing means:
- Failing to detect a true effect
- Incorrectly concluding a predictor has an effect when it does not
- The model has perfect fit
- Residuals are normally distributed
Correct Answer: Incorrectly concluding a predictor has an effect when it does not
Q21. For a multiple regression with n observations and k predictors (excluding intercept), the residual degrees of freedom are:
- n
- n – 1
- n – k – 1
- k
Correct Answer: n – k – 1
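A quick check that statsmodels reports exactly n − k − 1 residual degrees of freedom (synthetic fit with n = 50 and k = 3):

```python
# Minimal sketch: residual degrees of freedom equal n - k - 1.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(12)
X = rng.normal(size=(50, 3))                      # n = 50, k = 3
y = X @ [1.0, -0.5, 0.2] + rng.normal(0, 1, 50)
fit = sm.OLS(y, sm.add_constant(X)).fit()

print(fit.df_resid)    # 50 - 3 - 1 = 46
```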
Q22. Which effect is expected when multicollinearity among predictors increases?
- Standard errors of coefficient estimates increase, making inference less precise
- R-squared decreases dramatically
- Model always becomes more accurate for prediction
- Residual variance becomes zero
Correct Answer: Standard errors of coefficient estimates increase, making inference less precise
Q23. Which pattern suggests overfitting when comparing training and test performance?
- High R-squared on training data but low R-squared on test data
- Low training error and low test error
- Equal performance on training and test sets
- High bias and low variance on training data
Correct Answer: High R-squared on training data but low R-squared on test data
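A sketch of the train/test gap using scikit-learn, with many noise predictors deliberately included so the training R² is inflated; all data are synthetic:

```python
# Minimal sketch: overfitting shows up as a train/test R^2 gap. Here many
# noise predictors inflate training R^2 but not test R^2.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(13)
X = rng.normal(size=(60, 30))             # 30 mostly useless predictors
y = X[:, 0] + rng.normal(0, 1, 60)        # only the first one matters

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
m = LinearRegression().fit(X_tr, y_tr)
print(m.score(X_tr, y_tr), m.score(X_te, y_te))  # high train, lower test R^2
```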
Q24. Why is centering continuous predictors (subtracting the mean) useful when including interaction terms?
- It eliminates the need for dummy variables
- It reduces multicollinearity between main effects and interaction terms and eases interpretation
- It guarantees residual normality
- It increases VIF values
Correct Answer: It reduces multicollinearity between main effects and interaction terms and eases interpretation
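A sketch comparing VIFs for the raw and the mean-centered versions of a hypothetical dose-by-age interaction; centering typically shrinks the VIFs markedly:

```python
# Minimal sketch: centering dose and age before forming the interaction
# typically lowers the VIFs of the main effects and the product term.
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(14)
dose = rng.uniform(10, 100, 100)          # hypothetical predictors
age = rng.uniform(20, 70, 100)

def vifs(a, b):
    X = sm.add_constant(np.column_stack([a, b, a * b]))
    return [variance_inflation_factor(X, j) for j in range(1, 4)]

print(vifs(dose, age))                              # raw: large VIFs
print(vifs(dose - dose.mean(), age - age.mean()))   # centered: much smaller
```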
Q25. The null hypothesis for the overall F-test in multiple regression is:
- All residuals are normally distributed
- All regression coefficients (except intercept) are zero
- At least one coefficient is non-zero
- The model explains 100% variance
Correct Answer: All regression coefficients (except intercept) are zero
Q26. In regression decomposition, the relationship SST = SSR + SSE means:
- Total sum of squares equals explained sum of squares plus unexplained sum of squares
- Standard sum of terms equals square root of SSR times SSE
- Sample size equals sum of squares
- SST is always less than SSR
Correct Answer: Total sum of squares equals explained sum of squares plus unexplained sum of squares
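A sketch verifying the decomposition numerically on a synthetic fit:

```python
# Minimal sketch: verifying SST = SSR + SSE on a synthetic fit.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(15)
x = rng.uniform(0, 10, 40)
y = 3 * x + rng.normal(0, 2, 40)
fit = sm.OLS(y, sm.add_constant(x)).fit()

sst = np.sum((y - y.mean()) ** 2)                  # total
sse = np.sum(fit.resid ** 2)                       # unexplained (error)
ssr = np.sum((fit.fittedvalues - y.mean()) ** 2)   # explained (regression)
print(np.isclose(sst, ssr + sse))                  # True
```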
Q27. When is multiple regression particularly preferable to simple regression in pharmaceutical studies?
- When only one predictor is available
- When researchers want to control for confounding variables and assess independent effects of several predictors
- When assumptions of linearity are violated
- When sample size is extremely small (n < 10)
Correct Answer: When researchers want to control for confounding variables and assess independent effects of several predictors
Q28. Which graphical diagnostic is most useful to assess the linearity assumption between predictors and response?
- Histogram of predictors
- Residuals vs. fitted values plot
- Bar chart of categorical counts
- Pareto chart
Correct Answer: Residuals vs. fitted values plot
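A sketch of the plot itself, with a deliberately mis-specified straight-line fit to quadratic data so the tell-tale curved pattern appears:

```python
# Minimal sketch: residuals vs. fitted values plot. A curved pattern suggests
# non-linearity; a funnel shape suggests heteroscedasticity.
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

rng = np.random.default_rng(16)
x = np.linspace(0, 10, 60)
y = 0.5 * x**2 + rng.normal(0, 1, 60)      # true relation is quadratic
fit = sm.OLS(y, sm.add_constant(x)).fit()  # but we fit a straight line

plt.scatter(fit.fittedvalues, fit.resid)
plt.axhline(0, linestyle="--")
plt.xlabel("Fitted values")
plt.ylabel("Residuals")
plt.show()                                 # expect a U-shaped pattern
```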
Q29. In a multiple regression model, the coefficient for a predictor represents:
- The unadjusted correlation between predictor and response
- The expected change in the response for a one-unit change in the predictor, holding other predictors constant
- The variance explained by that predictor alone
- The p-value of the predictor
Correct Answer: The expected change in the response for a one-unit change in the predictor, holding other predictors constant
Q30. Between AIC and BIC for model selection, which statement is true?
- AIC penalizes model complexity more strongly than BIC
- BIC penalizes model complexity more strongly than AIC, favoring simpler models as sample size increases
- Both criteria always select the same model
- Lower BIC indicates worse model fit
Correct Answer: BIC penalizes model complexity more strongly than AIC, favoring simpler models as sample size increases
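A closing sketch comparing AIC and BIC across nested models on synthetic data where only the first two predictors matter; both criteria should bottom out near k = 2, with BIC penalizing the larger models harder:

```python
# Minimal sketch: AIC vs. BIC across nested models; lower is better, and
# BIC's log(n) penalty punishes extra predictors more than AIC's 2.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(17)
X = rng.normal(size=(100, 5))
y = X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 100)  # 2 real predictors

for k in range(1, 6):
    fit = sm.OLS(y, sm.add_constant(X[:, :k])).fit()
    print(k, round(fit.aic, 1), round(fit.bic, 1))
```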

I am a Registered Pharmacist under the Pharmacy Act, 1948, and the founder of PharmacyFreak.com. I hold a Bachelor of Pharmacy degree from Rungta College of Pharmaceutical Science and Research. With a strong academic foundation and practical knowledge, I am committed to providing accurate, easy-to-understand content to support pharmacy students and professionals. My aim is to make complex pharmaceutical concepts accessible and useful for real-world application.
Mail: Sachin@pharmacyfreak.com
