Regression modeling – simple and multiple regression hypothesis testing MCQs with Answers

Introduction: In pharmaceutical research, regression modeling is a vital statistical tool for exploring relationships between drug responses and predictors. Simple regression examines how one independent variable (for example, dose) affects an outcome (for example, plasma concentration), while multiple regression evaluates several predictors (dose, age, formulation, and co-medication) simultaneously. Hypothesis testing in regression—using t-tests for individual coefficients and F-tests for overall model fit—helps determine whether predictors significantly influence outcomes. Understanding core assumptions (linearity, independence, homoscedasticity, normality) and diagnostics (residual analysis, variance inflation factor, Durbin–Watson test) ensures valid conclusions in formulation development and pharmacokinetics. Now let’s test your knowledge with 30 MCQs on this topic.

Q1. What is the primary difference between simple regression and multiple regression?

  • Simple regression uses categorical predictors while multiple regression uses continuous predictors
  • Simple regression models one predictor variable; multiple regression models two or more predictor variables
  • Simple regression uses hypothesis testing; multiple regression does not
  • Simple regression requires normality; multiple regression does not require any assumptions

Correct Answer: Simple regression models one predictor variable; multiple regression models two or more predictor variables

Q2. In testing the slope coefficient (β1) in simple linear regression, the null hypothesis is usually:

  • β1 = 1
  • β1 ≠ 0
  • β1 = 0
  • β1 > 0

Correct Answer: β1 = 0

Q3. Which test is commonly used to assess whether an individual regression coefficient differs significantly from zero?

  • Chi-square test
  • t-test
  • Fisher’s exact test
  • Z-test

Correct Answer: t-test
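As a quick numerical illustration of that t-test, here is a pure-Python sketch with made-up dose/response values: the t-statistic is simply the slope estimate divided by its standard error.

```python
import math

x = [1, 2, 3, 4, 5]                      # hypothetical dose levels
y = [2.1, 3.9, 6.2, 7.8, 10.1]           # hypothetical responses
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n

sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx                           # slope estimate
b0 = ybar - b1 * xbar                    # intercept estimate

# residual variance on n - 2 degrees of freedom (intercept and slope fitted)
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(sse / (n - 2) / sxx)   # standard error of b1

t_stat = b1 / se_b1                      # large |t| -> reject H0: beta1 = 0
```

The t-statistic is then compared with a t distribution on n − 2 degrees of freedom to obtain a p-value.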

Q4. What does the F-test assess in multiple regression?

  • Whether the residuals are normally distributed
  • Whether at least one predictor variable is significantly related to the response
  • The presence of multicollinearity
  • Whether the slope is equal to 1

Correct Answer: Whether at least one predictor variable is significantly related to the response

Q5. Which statement best describes R-squared (R²)?

  • Proportion of variance in predictors explained by the response
  • Proportion of variance in the response explained by the model
  • Average squared residual
  • Standardized regression coefficient

Correct Answer: Proportion of variance in the response explained by the model

Q6. Why is adjusted R-squared often preferred to R-squared in multiple regression?

  • Adjusted R-squared increases with every added predictor regardless of relevance
  • Adjusted R-squared penalizes adding non-informative predictors and adjusts for number of predictors
  • Adjusted R-squared is always equal to R-squared
  • Adjusted R-squared measures multicollinearity directly

Correct Answer: Adjusted R-squared penalizes adding non-informative predictors and adjusts for number of predictors
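The two definitions above can be checked numerically; this sketch (same made-up data, one predictor, so k = 1) computes both quantities from the sums of squares.

```python
# Simple one-predictor fit on illustrative data, then R^2 and adjusted R^2.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n, k = len(x), 1
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))  # unexplained
sst = sum((yi - ybar) ** 2 for yi in y)                        # total

r2 = 1 - sse / sst
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)  # penalizes extra predictors
```

Adjusted R² is always at most R², and the gap widens as uninformative predictors are added.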

Q7. A variance inflation factor (VIF) greater than which value is commonly taken as evidence of problematic multicollinearity?

  • 1
  • 2
  • 5
  • 10

Correct Answer: 10
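A sketch of how VIF is computed: VIFⱼ = 1 / (1 − Rⱼ²), where Rⱼ² comes from regressing predictor j on the remaining predictors. With two deliberately collinear made-up predictors, that auxiliary regression is just a simple regression of x2 on x1.

```python
x1 = [1, 2, 3, 4, 5]
x2 = [2.1, 4.0, 6.1, 7.9, 10.0]          # roughly 2 * x1: a near-duplicate predictor
n = len(x1)
m1, m2 = sum(x1) / n, sum(x2) / n

# auxiliary regression of x2 on x1
b = (sum((a - m1) * (c - m2) for a, c in zip(x1, x2))
     / sum((a - m1) ** 2 for a in x1))
a0 = m2 - b * m1

sse = sum((c - (a0 + b * a)) ** 2 for a, c in zip(x1, x2))
sst = sum((c - m2) ** 2 for c in x2)
r2_aux = 1 - sse / sst

vif = 1 / (1 - r2_aux)                   # far above the common cutoff of 10
```

Here the auxiliary R² is close to 1, so the VIF explodes, flagging x2 as redundant given x1.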

Q8. Which diagnostic test is commonly used to detect heteroscedasticity (non-constant variance of residuals)?

  • Durbin-Watson test
  • Breusch-Pagan test
  • Levene’s test for equality of variances between groups
  • Kruskal-Wallis test

Correct Answer: Breusch-Pagan test

Q9. The Durbin–Watson statistic is used to detect which issue in regression residuals?

  • Heteroscedasticity
  • Non-linearity
  • Autocorrelation (serial correlation)
  • Multicollinearity

Correct Answer: Autocorrelation (serial correlation)
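The statistic itself is simple to compute from a residual series; the residuals below are illustrative, alternating in sign to mimic negative autocorrelation.

```python
# DW = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2.
# Values near 2 suggest no first-order autocorrelation;
# near 0, positive autocorrelation; near 4, negative.
resid = [0.5, -0.3, 0.4, -0.2, 0.3, -0.4]

num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
den = sum(e ** 2 for e in resid)
dw = num / den                           # > 2 here, hinting at negative autocorrelation
```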

Q10. How are categorical predictors typically included in a regression model?

  • As continuous variables scaled 0–1 without change
  • Using dummy (indicator) variables
  • They cannot be included in regression
  • By converting them to z-scores

Correct Answer: Using dummy (indicator) variables
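A minimal sketch of dummy coding, using hypothetical formulation labels: one indicator column per non-reference level, with the reference level ('A' here) coded as all zeros.

```python
levels = ["B", "C"]                      # non-reference levels
formulation = ["A", "B", "C", "B", "A"]  # hypothetical categorical predictor

# 'A' -> [0, 0], 'B' -> [1, 0], 'C' -> [0, 1]
dummies = [[1 if f == lvl else 0 for lvl in levels] for f in formulation]
```

Each dummy coefficient is then interpreted as the difference from the reference level, holding other predictors constant.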

Q11. What does an interaction term in multiple regression represent?

  • The sum of two predictors
  • The multiplicative effect showing that the effect of one predictor depends on the level of another predictor
  • An error in the model
  • A method to remove collinearity

Correct Answer: The multiplicative effect showing that the effect of one predictor depends on the level of another predictor

Q12. Which test is commonly used to check residuals for normality in small to moderate sample sizes?

  • Shapiro-Wilk test
  • Kolmogorov-Smirnov test with no parameters
  • ANOVA
  • Chi-square goodness-of-fit

Correct Answer: Shapiro-Wilk test

Q13. A 95% confidence interval for a regression slope β1 that does not include zero implies:

  • The slope estimate is biased
  • The corresponding predictor is statistically significant at approximately α = 0.05
  • The predictor has no practical significance
  • The residuals are heteroscedastic

Correct Answer: The corresponding predictor is statistically significant at approximately α = 0.05
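The interval is b₁ ± t·SE(b₁); a sketch with the same made-up data (the critical value 3.182 is the tabulated t(0.975) for df = 3):

```python
import math

x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
se_b1 = math.sqrt(sse / (n - 2) / sxx)

t_crit = 3.182                           # tabulated t(0.975, df = n - 2 = 3)
lo, hi = b1 - t_crit * se_b1, b1 + t_crit * se_b1
significant = not (lo <= 0 <= hi)        # interval excludes 0 -> significant at ~5%
```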

Q14. What is the purpose of calculating standardized (beta) coefficients in regression?

  • To test normality of residuals
  • To compare the relative importance of predictors measured on different scales
  • To produce categorical predictors
  • To reduce heteroscedasticity

Correct Answer: To compare the relative importance of predictors measured on different scales

Q15. Which measure identifies influential observations that can unduly affect regression estimates?

  • Cook’s distance
  • VIF
  • R-squared
  • Adjusted R-squared

Correct Answer: Cook’s distance

Q16. In pharmacokinetic modeling, applying a log transformation to concentration data is often useful because:

  • It always makes data categorical
  • It stabilizes variance and linearizes exponential decay relationships
  • It increases heteroscedasticity
  • It removes the need to check assumptions

Correct Answer: It stabilizes variance and linearizes exponential decay relationships
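The linearization is easy to see numerically: for first-order decay C = C₀·e^(−kt), ln C = ln C₀ − kt, so a least-squares fit on the log scale recovers −k. The parameter values below are hypothetical.

```python
import math

C0, k = 100.0, 0.5                       # hypothetical PK parameters
t = [0, 1, 2, 3, 4]                      # sampling times
conc = [C0 * math.exp(-k * ti) for ti in t]

logc = [math.log(c) for c in conc]       # exactly linear in t
tbar, lbar = sum(t) / len(t), sum(logc) / len(logc)
slope = (sum((ti - tbar) * (li - lbar) for ti, li in zip(t, logc))
         / sum((ti - tbar) ** 2 for ti in t))
# least-squares slope on the log scale recovers -k
```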

Q17. One criticism of stepwise variable selection methods is that they:

  • Always find the true causal predictors
  • Can produce models that capitalize on random noise and lack reproducibility
  • Are guaranteed to minimize prediction error on new data
  • Do not require hypothesis testing

Correct Answer: Can produce models that capitalize on random noise and lack reproducibility

Q18. Which statement best distinguishes prediction from causation in regression?

  • Good predictive performance implies causation
  • Regression coefficients always indicate causal effects
  • Prediction focuses on accurate forecasts; causal inference requires design or assumptions to support cause-effect claims
  • There is no difference between prediction and causation

Correct Answer: Prediction focuses on accurate forecasts; causal inference requires design or assumptions to support cause-effect claims

Q19. A p-value associated with a regression coefficient indicates:

  • The probability that the null hypothesis is true
  • The probability of observing data at least as extreme as that observed, assuming the null hypothesis is true
  • The magnitude of the effect
  • The sample size required

Correct Answer: The probability of observing data at least as extreme as that observed, assuming the null hypothesis is true

Q20. A Type I error in regression hypothesis testing means:

  • Failing to detect a true effect
  • Incorrectly concluding a predictor has an effect when it does not
  • The model has perfect fit
  • Residuals are normally distributed

Correct Answer: Incorrectly concluding a predictor has an effect when it does not

Q21. For a multiple regression with n observations and k predictors (excluding intercept), the residual degrees of freedom is:

  • n
  • n – 1
  • n – k – 1
  • k

Correct Answer: n – k – 1

Q22. Which effect is expected when multicollinearity among predictors increases?

  • Standard errors of coefficient estimates increase, making inference less precise
  • R-squared decreases dramatically
  • Model always becomes more accurate for prediction
  • Residual variance becomes zero

Correct Answer: Standard errors of coefficient estimates increase, making inference less precise

Q23. Which pattern suggests overfitting when comparing training and test performance?

  • High R-squared on training data but low R-squared on test data
  • Low training error and low test error
  • Equal performance on training and test sets
  • High bias and low variance on training data

Correct Answer: High R-squared on training data but low R-squared on test data

Q24. Why is centering continuous predictors (subtracting the mean) useful when including interaction terms?

  • It eliminates the need for dummy variables
  • It reduces multicollinearity between main effects and interaction terms and eases interpretation
  • It guarantees residual normality
  • It increases VIF values

Correct Answer: It reduces multicollinearity between main effects and interaction terms and eases interpretation
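The collinearity-reduction effect can be demonstrated with a quadratic term, the simplest case of a product term (made-up values; a two-variable interaction behaves analogously):

```python
import math

def corr(u, v):
    """Pearson correlation between two equal-length lists."""
    n = len(u)
    ub, vb = sum(u) / n, sum(v) / n
    cov = sum((a - ub) * (b - vb) for a, b in zip(u, v))
    return cov / math.sqrt(sum((a - ub) ** 2 for a in u)
                           * sum((b - vb) ** 2 for b in v))

x = [1, 2, 3, 4, 5]
raw_corr = corr(x, [xi * xi for xi in x])          # main effect vs. product term

xbar = sum(x) / len(x)
xc = [xi - xbar for xi in x]                       # centered predictor
centered_corr = corr(xc, [xi * xi for xi in xc])   # collinearity largely removed
```

On the raw scale the main effect and the product term are almost perfectly correlated; after centering (with these symmetric values) the correlation drops to zero.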

Q25. The null hypothesis for the overall F-test in multiple regression is:

  • All residuals are normally distributed
  • All regression coefficients (except intercept) are zero
  • At least one coefficient is non-zero
  • The model explains 100% variance

Correct Answer: All regression coefficients (except intercept) are zero

Q26. In regression decomposition, the relationship SST = SSR + SSE means:

  • Total sum of squares equals explained sum of squares plus unexplained sum of squares
  • Standard sum of terms equals square root of SSR times SSE
  • Sample size equals sum of squares
  • SST is always less than SSR

Correct Answer: Total sum of squares equals explained sum of squares plus unexplained sum of squares
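The decomposition can be verified numerically on the same made-up simple fit:

```python
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
b1 = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
      / sum((xi - xbar) ** 2 for xi in x))
b0 = ybar - b1 * xbar
yhat = [b0 + b1 * xi for xi in x]

sst = sum((yi - ybar) ** 2 for yi in y)               # total
ssr = sum((yh - ybar) ** 2 for yh in yhat)            # explained (regression)
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, yhat))  # unexplained (error)

assert abs(sst - (ssr + sse)) < 1e-9                  # SST = SSR + SSE holds
```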

Q27. When is multiple regression particularly preferable to simple regression in pharmaceutical studies?

  • When only one predictor is available
  • When researchers want to control for confounding variables and assess independent effects of several predictors
  • When assumptions of linearity are violated
  • When sample size is extremely small (n < 10)

Correct Answer: When researchers want to control for confounding variables and assess independent effects of several predictors

Q28. Which graphical diagnostic is most useful to assess the linearity assumption between predictors and response?

  • Histogram of predictors
  • Residuals vs. fitted values plot
  • Bar chart of categorical counts
  • Pareto chart

Correct Answer: Residuals vs. fitted values plot

Q29. In a multiple regression model, the coefficient for a predictor represents:

  • The unadjusted correlation between predictor and response
  • The expected change in the response for a one-unit change in the predictor, holding other predictors constant
  • The variance explained by that predictor alone
  • The p-value of the predictor

Correct Answer: The expected change in the response for a one-unit change in the predictor, holding other predictors constant

Q30. Between AIC and BIC for model selection, which statement is true?

  • AIC penalizes model complexity more strongly than BIC
  • BIC penalizes model complexity more strongly than AIC, favoring simpler models as sample size increases
  • Both criteria always select the same model
  • Lower BIC indicates worse model fit

Correct Answer: BIC penalizes model complexity more strongly than AIC, favoring simpler models as sample size increases
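For Gaussian OLS the two criteria differ only in the complexity penalty, which this sketch makes explicit (SSE and n are illustrative values):

```python
import math

# Up to an additive constant:
#   AIC = n * ln(SSE / n) + 2 * p
#   BIC = n * ln(SSE / n) + p * ln(n)
# BIC's per-parameter penalty ln(n) exceeds AIC's 2 once n > e^2 (about 7.4),
# so BIC leans toward simpler models as the sample grows.
n, sse, p = 100, 25.0, 4                 # illustrative sample size, fit, parameters
aic = n * math.log(sse / n) + 2 * p
bic = n * math.log(sse / n) + p * math.log(n)
```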
