Multiple correlation is a fundamental statistical tool in pharmaceutics that quantifies the combined relationship between several independent variables (formulation or process factors) and a single dependent outcome (e.g., dissolution rate, bioavailability, stability). B. Pharm students should understand the multiple correlation coefficient (R), the coefficient of determination (R²), adjusted R², the assumptions of multiple regression, and multicollinearity diagnostics such as the variance inflation factor (VIF). Practical pharmaceutical applications include predicting drug release, optimizing formulations under Quality by Design (QbD), and modeling pharmacokinetic parameters. Mastering their calculation, interpretation, and validation, together with an awareness of their limitations, enables robust experimental design and data-driven decision making in formulation development and stability studies. Now let’s test your knowledge with 30 MCQs on this topic.
Q1. What does the multiple correlation coefficient (R) measure in multiple regression?
- The slope of the regression line
- The combined linear association between all predictors and the dependent variable
- The error variance in the model
- The number of predictors in the model
Correct Answer: The combined linear association between all predictors and the dependent variable
Q2. How is the coefficient of determination (R²) interpreted in a pharmaceutical regression model?
- Proportion of total variance in predictors explained by the dependent variable
- Proportion of variance in the dependent variable explained by the predictors
- Average prediction error of the model
- Significance level of the regression coefficients
Correct Answer: Proportion of variance in the dependent variable explained by the predictors
Q3. Which of the following formulas relates R and R²?
- R = R² + 1
- R = sqrt(R²)
- R² = 1 / R
- R² = R – 1
Correct Answer: R = sqrt(R²)
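To see the R–R² relationship numerically, here is a minimal Python sketch using simulated data (the factors and response are invented purely for illustration):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 2))                                # two hypothetical formulation factors
y = 2 * X[:, 0] - X[:, 1] + rng.normal(scale=0.5, size=30)  # simulated response

r2 = LinearRegression().fit(X, y).score(X, y)  # coefficient of determination
R = np.sqrt(r2)                                # multiple correlation coefficient
print(f"R² = {r2:.3f}, R = {R:.3f}")
```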
Q4. Which assumption is NOT required for valid inference in multiple linear regression?
- Linearity between predictors and response
- Homoscedasticity of residuals
- Normality of predictor variables
- Independence of errors
Correct Answer: Normality of predictor variables
Q5. What does a high Variance Inflation Factor (VIF) indicate in a regression model?
- Strong predictive power of a variable
- Severe multicollinearity among predictors
- Large residuals for that predictor
- High measurement precision
Correct Answer: Severe multicollinearity among predictors
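A minimal sketch of a VIF check with statsmodels, using two deliberately near-collinear simulated predictors:

```python
import numpy as np
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
x2 = x1 + rng.normal(scale=0.1, size=50)        # nearly collinear with x1
X = sm.add_constant(np.column_stack([x1, x2]))

# VIF for each predictor column (index 0 is the constant, so skip it)
for i in (1, 2):
    print(f"VIF for x{i}: {variance_inflation_factor(X, i):.1f}")
```

Values above roughly 5–10 are commonly taken as a warning sign.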
Q6. Why is adjusted R² preferred over R² when comparing models with different numbers of predictors?
- Adjusted R² always increases with added predictors
- Adjusted R² penalizes model complexity and avoids overfitting
- Adjusted R² is unaffected by sample size
- Adjusted R² equals the p-value of the model
Correct Answer: Adjusted R² penalizes model complexity and avoids overfitting
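The adjustment most packages use is R²adj = 1 − (1 − R²)(n − 1)/(n − p − 1), where n is the sample size and p the number of predictors. A small Python helper shows how the same raw R² is penalized as p grows:

```python
def adjusted_r2(r2: float, n: int, p: int) -> float:
    """Adjusted R² = 1 - (1 - R²) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - p - 1)

# The same R² of 0.85 looks weaker once model size is penalized:
print(adjusted_r2(0.85, n=20, p=3))    # ≈ 0.822
print(adjusted_r2(0.85, n=20, p=10))   # ≈ 0.683
```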
Q7. In multiple correlation, what is a partial correlation?
- Correlation between two variables ignoring all other variables
- Correlation between two variables while controlling for one or more additional variables
- Correlation computed only for categorical variables
- Correlation of residuals with predictors
Correct Answer: Correlation between two variables while controlling for one or more additional variables
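A partial correlation can be computed from residuals: regress each of the two variables on the control variable(s), then correlate the residuals. A minimal sketch with simulated data, where a common confounder z drives both x and y:

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation between x and y after removing the linear effect of z."""
    Z = np.column_stack([np.ones_like(z), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]  # residuals of x given z
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]  # residuals of y given z
    return np.corrcoef(rx, ry)[0, 1]

rng = np.random.default_rng(2)
z = rng.normal(size=100)
x = z + rng.normal(scale=0.5, size=100)
y = z + rng.normal(scale=0.5, size=100)
print(np.corrcoef(x, y)[0, 1], partial_corr(x, y, z))  # raw vs. partial
```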
Q8. Which test assesses the overall significance of a multiple regression model?
- t-test for individual coefficients
- ANOVA F-test for regression
- Chi-square goodness-of-fit
- Kruskal-Wallis test
Correct Answer: ANOVA F-test for regression
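statsmodels reports the overall F-test directly after an OLS fit; a minimal sketch with simulated data:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(3)
X = rng.normal(size=(40, 3))                    # three hypothetical factors
y = X @ [1.5, -0.8, 0.3] + rng.normal(size=40)  # simulated response

fit = sm.OLS(y, sm.add_constant(X)).fit()
print(f"F = {fit.fvalue:.2f}, p = {fit.f_pvalue:.4g}")  # overall significance
```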
Q9. How can multicollinearity affect interpretation of regression coefficients in formulation studies?
- It makes coefficients more precise and stable
- It inflates standard errors and makes coefficient estimates unreliable
- It reduces R² to zero
- It changes categorical predictors into continuous ones
Correct Answer: It inflates standard errors and makes coefficient estimates unreliable
Q10. In a model predicting dissolution from polymer concentration and tablet hardness, adding a new relevant predictor typically does what to R²?
- Always decreases R²
- Never changes R²
- Cannot be determined without p-values
- Usually increases or leaves R² unchanged
Correct Answer: Usually increases or leaves R² unchanged
Q11. Which method is useful for reducing multicollinearity by creating uncorrelated components?
- Stepwise regression
- Principal Component Analysis (PCA)
- K-means clustering
- Kaplan-Meier estimation
Correct Answer: Principal Component Analysis (PCA)
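A minimal sketch of principal component regression with scikit-learn, where a deliberately collinear pair of simulated factors is replaced by one uncorrelated component before fitting (the choice of a single component is arbitrary here):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(4)
x1 = rng.normal(size=60)
X = np.column_stack([x1, x1 + rng.normal(scale=0.05, size=60)])  # collinear pair
y = x1 + rng.normal(scale=0.3, size=60)

# Fit on uncorrelated components instead of the raw, collinear predictors
pcr = make_pipeline(StandardScaler(), PCA(n_components=1), LinearRegression())
print(pcr.fit(X, y).score(X, y))   # R² of the reduced model
```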
Q12. For prediction of bioavailability, which validation approach checks model performance on unseen data?
- Leave-one-out cross-validation
- Using the same training data only
- Replacing predictors with random noise
- Calculating VIF values
Correct Answer: Leave-one-out cross-validation
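A minimal leave-one-out cross-validation sketch with scikit-learn on simulated data; each fold trains on n − 1 samples and predicts the single held-out one:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import LeaveOneOut, cross_val_score

rng = np.random.default_rng(5)
X = rng.normal(size=(25, 2))
y = X @ [0.7, 1.2] + rng.normal(scale=0.4, size=25)

scores = cross_val_score(LinearRegression(), X, y,
                         cv=LeaveOneOut(), scoring="neg_mean_squared_error")
print(f"LOOCV RMSE: {np.sqrt(-scores.mean()):.3f}")
```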
Q13. Which statistic indicates the proportion of variance explained by a single predictor in the presence of others?
- Tolerance
- Partial R²
- Cook’s distance
- Durbin-Watson
Correct Answer: Partial R²
Q14. What is the impact of outliers on the multiple correlation coefficient?
- Outliers always reduce R
- Outliers can disproportionately increase or decrease R depending on their leverage
- Outliers have no effect on correlation
- Outliers convert R into R²
Correct Answer: Outliers can disproportionately increase or decrease R depending on their leverage
Q15. Which diagnostic measures influence of an observation on regression estimates?
- VIF
- Cook’s distance
- Adjusted R²
- Pearson correlation
Correct Answer: Cook’s distance
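statsmodels exposes Cook’s distance through the influence diagnostics of a fitted OLS model. A minimal sketch where one deliberately planted outlier stands out:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(6)
X = sm.add_constant(rng.normal(size=(30, 2)))
y = X @ [1.0, 0.5, -0.5] + rng.normal(scale=0.3, size=30)
y[0] += 5                                   # plant one influential outlier

fit = sm.OLS(y, X).fit()
cooks_d = fit.get_influence().cooks_distance[0]
print(np.argmax(cooks_d), cooks_d.max())    # the planted point is flagged
```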
Q16. In a formulation study, an R² of 0.85 indicates what?
- 85% of predictor variables are significant
- 85% of variance in response is explained by the model
- Model has 15 predictors
- Model residuals explain 85% variation
Correct Answer: 85% of variance in response is explained by the model
Q17. Which procedure helps select a parsimonious set of predictors based on statistical criteria?
- Ridge regression without penalty
- Stepwise regression (forward/backward)
- Random imputation
- Wilcoxon signed-rank test
Correct Answer: Stepwise regression (forward/backward)
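scikit-learn’s SequentialFeatureSelector performs forward or backward selection, though it ranks candidates by cross-validated score rather than the classical p-value criterion. A minimal sketch in which only two of five simulated factors truly matter:

```python
import numpy as np
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(7)
X = rng.normal(size=(80, 5))                              # five candidate factors
y = X @ [2.0, 0.0, -1.5, 0.0, 0.0] + rng.normal(size=80)  # only two matter

sfs = SequentialFeatureSelector(LinearRegression(),
                                n_features_to_select=2, direction="forward")
print(sfs.fit(X, y).get_support())   # boolean mask of selected predictors
```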
Q18. What does a negative coefficient for a predictor indicate in the multiple regression model?
- Predictor increases the dependent variable
- Predictor has no effect on the dependent variable
- Predictor is associated with a decrease in the dependent variable, holding others constant
- Model is invalid
Correct Answer: Predictor is associated with a decrease in the dependent variable, holding others constant
Q19. Which adjustment accounts for sample size and the number of predictors?
- Adjusted R²
- Unadjusted R²
- Pearson r
- Spearman rho
Correct Answer: Adjusted R²
Q20. In pharmaceutical QbD, multiple regression helps primarily with what task?
- Randomizing clinical trials
- Quantifying relationships between critical process parameters (CPPs) and critical quality attributes (CQAs) to support optimization
- Manufacturing sterile products
- Calculating expiration dates from TGA curves
Correct Answer: Quantifying relationships between critical process parameters (CPPs) and critical quality attributes (CQAs) to support optimization
Q21. How is the standardized regression coefficient (beta) useful?
- It measures the absolute unit change in response
- It enables comparison of predictor importance by removing units
- It increases multicollinearity
- It is only used for categorical predictors
Correct Answer: It enables comparison of predictor importance by removing units
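Standardized betas can be obtained by z-scoring the predictors and the response before fitting. A minimal sketch with two simulated predictors on very different scales:

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(8)
X = rng.normal(size=(40, 2)) * [1, 100]     # predictors on very different scales
y = X @ [5.0, 0.02] + rng.normal(size=40)

Xs = StandardScaler().fit_transform(X)      # z-score the predictors
ys = (y - y.mean()) / y.std()               # z-score the response
print(LinearRegression().fit(Xs, ys).coef_) # unitless, comparable betas
```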
Q22. Which of the following is a remedy for severe multicollinearity?
- Increasing the number of highly correlated predictors
- Dropping one of the correlated predictors or using regularization (ridge)
- Ignoring VIF values
- Using simple correlation instead of regression
Correct Answer: Dropping one of the correlated predictors or using regularization (ridge)
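A minimal ridge regression sketch with scikit-learn, using a deliberately collinear pair of simulated predictors (alpha = 1.0 is an arbitrary penalty strength):

```python
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(9)
x1 = rng.normal(size=50)
X = np.column_stack([x1, x1 + rng.normal(scale=0.05, size=50)])  # collinear
y = 3 * x1 + rng.normal(scale=0.5, size=50)

# The L2 penalty shrinks and stabilizes the correlated coefficients
print(Ridge(alpha=1.0).fit(X, y).coef_)
```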
Q23. What does a low R combined with statistically significant predictors suggest?
- Model fits perfectly
- Predictors are significant but explain little of outcome variability
- Data are homoscedastic
- There is no linear relationship at all
Correct Answer: Predictors are significant but explain little of outcome variability
Q24. When including categorical formulation factors in multiple regression, which technique is applied?
- Log-transform categories directly
- Create dummy (indicator) variables
- Exclude categorical factors always
- Use Pearson correlation only
Correct Answer: Create dummy (indicator) variables
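With pandas this is a single call; a minimal sketch using made-up polymer levels (drop_first avoids the dummy-variable trap of perfectly collinear indicators):

```python
import pandas as pd

df = pd.DataFrame({"polymer": ["HPMC", "EC", "HPMC", "PVP"],
                   "hardness": [5.1, 6.3, 5.8, 4.9]})

# One indicator column per polymer level, with one level as the reference
X = pd.get_dummies(df, columns=["polymer"], drop_first=True)
print(X)
```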
Q25. What is the effect of adding an irrelevant predictor to a large sample regression model?
- Substantially reduces R²
- May slightly increase R² but adjusted R² may decrease
- Always makes all coefficients significant
- Removes multicollinearity
Correct Answer: May slightly increase R² but adjusted R² may decrease
Q26. Which statistic helps detect serial correlation in residuals from batch process modeling?
- Shapiro-Wilk test
- Durbin-Watson statistic
- VIF
- Cronbach’s alpha
Correct Answer: Durbin-Watson statistic
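statsmodels provides the statistic directly; a minimal sketch with deliberately autocorrelated residuals (a random walk):

```python
import numpy as np
from statsmodels.stats.stattools import durbin_watson

rng = np.random.default_rng(10)
resid = np.cumsum(rng.normal(size=50))   # strongly autocorrelated residuals

# Values near 2 suggest no serial correlation; near 0 or 4 flag it
print(f"Durbin-Watson: {durbin_watson(resid):.2f}")
```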
Q27. How is R² used in in vitro–in vivo correlation (IVIVC) development?
- To prove causation between formulation and toxicity
- To quantify how well in vitro predictors explain in vivo outcomes
- To schedule clinical visits
- To derive dissolution medium composition
Correct Answer: To quantify how well in vitro predictors explain in vivo outcomes
Q28. Which cross-validation metric best summarizes prediction error for continuous outcomes?
- Confusion matrix
- Root Mean Squared Error (RMSE)
- Cohen’s kappa
- Log-rank statistic
Correct Answer: Root Mean Squared Error (RMSE)
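RMSE is simply the square root of the mean squared residual, reported in the units of the response; a minimal sketch with hypothetical % dissolved values:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the response."""
    return np.sqrt(np.mean((np.asarray(y_true) - np.asarray(y_pred)) ** 2))

print(rmse([78, 85, 92], [80, 83, 95]))   # hypothetical % dissolved values
```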
Q29. If two predictors are perfectly collinear, what happens to the multiple regression solution?
- Unique least-squares estimates cannot be obtained (singular matrix)
- Model automatically removes one predictor
- R becomes negative
- Adjusted R² equals zero
Correct Answer: Unique least-squares estimates cannot be obtained (singular matrix)
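The singularity is easy to demonstrate: when one column is an exact multiple of another, the design matrix loses rank and the normal equations cannot be inverted. A minimal sketch:

```python
import numpy as np

x1 = np.arange(5.0)
X = np.column_stack([np.ones(5), x1, 2 * x1])  # third column = 2 × second
print(np.linalg.matrix_rank(X))                # rank 2 < 3 columns: singular
# X'X is not invertible, so no unique least-squares solution exists
```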
Q30. In stability prediction using multiple regression, which practice improves model generalizability?
- Overfitting by including all interaction terms without validation
- Using external validation and reducing unnecessary predictors
- Reporting only R² from the training set
- Ignoring heteroscedasticity
Correct Answer: Using external validation and reducing unnecessary predictors

