In regression modeling, the principle of effect hierarchy maintains that main (first-order) effects tend to account for the largest amounts of variation in the response. Second-order effects, that is, two-factor interactions and quadratic terms, account for the next largest share of variation. Higher-order terms follow, in decreasing order of importance.
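As an illustration (not taken from the source), a full second-order model in two factors, written in hierarchical order, groups its terms as follows:

```latex
% Terms grouped in hierarchical order for two factors x_1 and x_2
y = \beta_0
    + \beta_1 x_1 + \beta_2 x_2                                  % main (first-order) effects
    + \beta_{12} x_1 x_2 + \beta_{11} x_1^2 + \beta_{22} x_2^2   % second-order effects
    + \cdots + \varepsilon                                       % higher-order terms, then error
```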
The principle of effect heredity concerns which lower-order components of a higher-order effect should accompany it in the model. The motivation for this principle is empirical evidence that factors with small main effects tend not to have significant interaction effects.
Strong effect heredity requires that all lower-order components of a model effect be included in the model. Suppose that a three-way interaction (ABC) is in the model. Then all of its component main effects and two-way interactions (A, B, C, AB, AC, BC) must also be in the model.
Weak effect heredity requires only that a chain of lower-order components of a model effect be included. If a three-way interaction is in the model, then the model must contain at least one of the factors involved and a two-way interaction involving that factor. For example, if the three-way interaction ABC is in the model along with B and BC, the model satisfies weak effect heredity.
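The following Python sketch (not from the source; the function names are illustrative) checks a candidate model, with terms written as strings of factor letters, against the two heredity rules just described:

```python
from itertools import combinations

def lower_order_components(term):
    """All proper sub-terms of a term, e.g. 'ABC' -> A, B, C, AB, AC, BC."""
    return {"".join(c) for r in range(1, len(term))
            for c in combinations(term, r)}

def satisfies_strong_heredity(model_terms):
    """Strong heredity: every lower-order component of every term is in the model."""
    terms = set(model_terms)
    return all(lower_order_components(t) <= terms for t in terms)

def satisfies_weak_heredity(model_terms):
    """Weak heredity: each term is linked to a main effect through a chain of
    included lower-order terms, e.g. ABC is allowed if BC and B are present."""
    terms = set(model_terms)
    def has_chain(term):
        if len(term) == 1:
            return True
        return any(len(s) == len(term) - 1 and has_chain(s)
                   for s in lower_order_components(term) & terms)
    return all(has_chain(t) for t in terms)

# The examples from the text:
print(satisfies_strong_heredity(["A", "B", "C", "AB", "AC", "BC", "ABC"]))  # True
print(satisfies_weak_heredity(["B", "BC", "ABC"]))                          # True
print(satisfies_strong_heredity(["B", "BC", "ABC"]))                        # False
```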
The principle of effect sparsity asserts that most of the variation in the response is explained by a relatively small number of effects. Screening designs, where many effects are studied, rely heavily on effect sparsity. Experience shows that the number of runs used in a screening design should be at least twice the number of effects that are likely to be significant.
Designed experiments are typically constructed to require as few runs as possible, consistent with the goals of the experiment. With too few runs, only extremely large effects can be detected. For example, for a given effect, the t-test statistic is the ratio of the estimated effect (the change in response means) to its standard error. If there is only one error degree of freedom (df), then the two-sided 0.05 critical value is about 12.7. So, for such a nearly saturated design to detect an effect, the effect must be very large relative to its standard error.
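A brief sketch of this point, assuming scipy is available (an assumption, not part of the source), shows how quickly the critical value shrinks as error df increase:

```python
from scipy import stats

# Two-sided 0.05 critical t values for small numbers of error df
for df in (1, 2, 5, 10):
    t_crit = stats.t.ppf(1 - 0.025, df)
    print(f"error df = {df:2d}: |t| must exceed {t_crit:.2f}")
# With 1 error df the critical value is about 12.71, so only an effect many
# times larger than its standard error can be declared significant.
```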
A similar observation applies to the lack-of-fit test. The power of this test to detect lack of fit depends on the number of degrees of freedom in the numerator and denominator. If you have only 1 df of each kind, you need an F value that exceeds about 161 to declare significance at the 0.05 level. If you have 2 df of each kind, then the F value must exceed 19. For the test to be significant in this second case, the lack-of-fit mean square must be 19 times as large as the pure error mean square. The lack-of-fit test is also sensitive to outliers.
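These critical values can be verified with a short sketch (again assuming scipy, which is not part of the source):

```python
from scipy import stats

# 0.05 critical F values with equal lack-of-fit and pure-error df
for df in (1, 2, 3, 5):
    f_crit = stats.f.ppf(0.95, df, df)
    print(f"lack-of-fit df = pure-error df = {df}: F must exceed {f_crit:.1f}")
# df = 1 gives about 161.4 and df = 2 gives 19.0, matching the values above.
```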