A very helpful post on the Stata Blog (here) discusses the problems posed by two distinct (though too often conflated) phenomena that can distort samples: truncation and censoring. As the post describes: "Data are left-truncated when individuals below a threshold are not present in the sample. For example, if we want to study the size of certain fish based on the specimens captured with a net, fish smaller than the net grid won’t be present in our sample." In contrast, data are "left-censored at κ if every individual with a value below κ is present in the sample, but the actual value is unknown. This happens, for example, when we have a measuring instrument that cannot detect values below a certain level." The examples in the post illustrate both issues vividly. Finally, having reminded us of a potential data problem, the post goes on to identify a potential solution: Stata commands now permit estimating the underlying population parameters for truncated data ("truncreg") as well as for censored data ("intreg" or "tobit").
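To make the distinction concrete, here is a minimal simulation sketch in Python (not Stata, and not taken from the post). It assumes a normally distributed population and a threshold κ of -0.5; the function name `simulate` and the parameters `mu`, `sigma`, `kappa`, and `seed` are illustrative choices, not anything the post defines. It only shows how each mechanism changes the sample; it does not perform the corrected estimation that "truncreg", "intreg", or "tobit" provide.

```python
import random
import statistics


def simulate(n=10_000, mu=0.0, sigma=1.0, kappa=-0.5, seed=42):
    """Draw a population, then build left-truncated and left-censored samples.

    All parameter values are assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    population = [rng.gauss(mu, sigma) for _ in range(n)]

    # Left-truncation: individuals below kappa never enter the sample.
    truncated = [y for y in population if y >= kappa]

    # Left-censoring: every individual is in the sample, but values
    # below kappa are recorded as kappa because the true value is unknown.
    censored = [max(y, kappa) for y in population]

    return population, truncated, censored


population, truncated, censored = simulate()

print("sample sizes:", len(population), len(truncated), len(censored))
print("population mean:", statistics.fmean(population))
print("truncated mean: ", statistics.fmean(truncated))
print("censored mean:  ", statistics.fmean(censored))
```

The truncated sample is smaller than the population, while the censored sample keeps every observation but piles the low values up at κ. In both cases the naive sample mean overstates the population mean, which is why estimators designed for these situations are needed.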