Sunday, July 13, 2014

You say continuous, I say discrete

Data types are a critical factor in quality statistical analysis. Knowing the type of data you have tells you what kind of statistical analysis you can use to make decisions. Making decisions is the goal of data. If data does not draw you closer to a decision, it is worthless. There are many considerations that ensure your data will be useful, but one of the first is the data type. There are two primary data types that will affect the future analysis of your data.

Continuous is the most common type and the most useful. Continuous data is data on a continuum. This does not mean that the extremes of the data can be infinite; it means that the data can be subdivided into an infinite number of sections. Consider the weight of my dog, say 10.1 lbs. She could be 10.2 lbs tomorrow and 9.8 lbs next week, or any value in between.

Discrete data is data in buckets. Consider the weight data again, this time for 3 dogs: a doberman, a bichon, and a corgi. If we were classifying these dogs using discrete data, we would say that the doberman is heavy, the corgi is medium, and the bichon is light. One way to think of it is that continuous data gives us more information, whereas discrete data is less specific and carries less information.

One area that is often confused is count data. Consider counting the number of ears of corn harvested in a season. It could be 12,145 ears of corn. This seems like continuous data, but don't be fooled by the high numbers; it's actually discrete. It is still in buckets: 1, 2, 3, 4, and so on. If we took the weight instead, it would provide much more information. If you were harvesting your crop this fall and could choose between the count and the weight in grams, which would you choose? Which would give you a better estimate of the value of your crop?
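To make the dog example concrete, here is a minimal Python sketch of what happens when a continuous weight is turned into a discrete bucket. The cutoffs, the `weight_class` function, and the corgi and doberman weights are made up for illustration; only the 10.1, 10.2, and 9.8 lbs figures come from the example above.

```python
# Hypothetical cutoffs in pounds, chosen only for illustration.
LIGHT_MAX_LBS = 20.0
MEDIUM_MAX_LBS = 40.0


def weight_class(weight_lbs: float) -> str:
    """Map a continuous weight onto one of three discrete buckets."""
    if weight_lbs < LIGHT_MAX_LBS:
        return "light"
    if weight_lbs < MEDIUM_MAX_LBS:
        return "medium"
    return "heavy"


# Made-up weights for the three dogs.
dog_weights_lbs = {"bichon": 10.1, "corgi": 27.0, "doberman": 75.0}
dog_classes = {name: weight_class(w) for name, w in dog_weights_lbs.items()}

# Three different continuous readings collapse into a single bucket,
# so the day-to-day change in weight is invisible in the discrete data.
same_bucket = weight_class(10.1) == weight_class(10.2) == weight_class(9.8)

print(dog_classes)
print(same_bucket)
```

The bucketed version can be rebuilt from the weights at any time, but the weights cannot be recovered from the buckets. That is the sense in which the discrete data carries less information.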


To give you an idea of the amount of information, or value, in each type of data, let's use some sample size calculations to show how much data would be needed for analysis. Check out this site, http://www.fulcruminquiry.com/calculating_sample_size.htm, where you can determine the sample size for discrete and continuous data sets. For equivalent confidence, a discrete data set of about 2,000 observations would be needed to match a continuous data set of about 30.
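If you want to see where a gap like that comes from, the sketch below uses the common textbook sample size formulas for estimating a proportion (discrete) and a mean (continuous). This is not necessarily the method the linked calculator uses, and the function names, margins of error, and standard deviation are assumptions picked for illustration, so do not expect it to reproduce the 2,000 and 30 figures.

```python
import math
from statistics import NormalDist


def z_score(confidence: float) -> float:
    """Two-sided z value for a confidence level such as 0.95."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)


def sample_size_proportion(margin: float, p: float = 0.5,
                           confidence: float = 0.95) -> int:
    """Discrete (yes/no) data: n = z^2 * p * (1 - p) / margin^2.

    p = 0.5 is the worst case and is assumed here.
    """
    z = z_score(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


def sample_size_mean(sigma: float, margin: float,
                     confidence: float = 0.95) -> int:
    """Continuous data: n = (z * sigma / margin)^2.

    sigma is the assumed standard deviation of the measurement.
    """
    z = z_score(confidence)
    return math.ceil((z * sigma / margin) ** 2)


# Assumed inputs: a +/- 5 point margin on a proportion, and a margin of
# half a standard deviation on a mean.
n_discrete = sample_size_proportion(margin=0.05)
n_continuous = sample_size_mean(sigma=1.0, margin=0.5)

print(n_discrete, n_continuous)
```

With these assumed inputs the discrete case needs 385 observations and the continuous case needs 16. The two margins are in different units, so the exact ratio is not meaningful by itself, but the pattern matches the point above: pass/fail style data usually takes many more observations to support the same confidence.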

Data types are just the tip of the iceberg for quality statistical analysis, but if you get them wrong, your results will be wrong too.
