000639: Unable to estimate the model due to multicollinearity (data redundancy).


The regression model cannot solve (the design matrix cannot be inverted) in the presence of multicollinearity. Multicollinearity occurs when two or more variables are redundant (meaning that they are telling the same "story" or almost the same story). An effective model will have explanatory variables that each gets at a different facet of the dependent variable you are trying to predict/understand.


  1. Remove any redundant fields from the set of explanatory variables.
  2. Identify and remove any explanatory variables that have the same value for all features (for example, a field containing all zeros).
  3. Create a scatterplot matrix for your explanatory variables and assess whether there are near perfect correlations. If so, consider dropping one of the corresponding variables from your model.