More Normal than Normal

  1. What is the overriding concern of the HOT-based approach? (section 1)
  2. What is a subexponential distribution? Are subexponential distributions scaling? (section 2)
  3. Give practical examples (from the article) of the scaling distributions' invariance properties: aggregation, mixture, and maximization. (section 2.3)
  4. What are the tradeoffs between assuming a Gaussian distribution for low-variability data and assuming a scaling distribution for high-variability data?
  5. What is the four-step approach to conventional model fitting? Why does this approach fail to select the right models for scaling distributions? (section 3)
  6. If we assume that a single process generates a data set, what two effects should we see when we employ a dynamic (nested) approach to examining the data? (section 3)
  7. How does figure 4 illustrate that the second moment (and hence the standard deviation) does not exist?
  8. How does figure 5 show that the lognormal fit is "certifiably wrong"?
  9. How does figure 6 allow us to conclude that the Pareto fit is a good one?
  10. Why are scaling distributions more robust than Gaussian distributions? (section 4.1)
  11. What is the "bad news" and "good news" in the new treatment of internet traffic that relies on robust control? (section 4.3)
  12. Why are the popular "scale-free network" models not appropriate for internet router traffic models? (section 4.3)
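
As a study aid for questions 6 and 7, the following is a minimal sketch of the "dynamic" idea of watching a sample statistic as more data are included. It is not taken from the article: the helper names `running_std` and `pareto_sample`, the tail index of 1.5, and the sample size are all assumptions chosen for illustration. For Gaussian data the running standard deviation settles down; for a Pareto sample with tail index below 2 (infinite variance) it tends to keep jumping as new extreme values arrive.

```python
import numpy as np


def running_std(x):
    """Population standard deviation of x[:k] for k = 1..len(x)."""
    n = np.arange(1, len(x) + 1)
    mean = np.cumsum(x) / n
    mean_sq = np.cumsum(x ** 2) / n
    # Clip tiny negative values caused by floating-point rounding.
    var = np.maximum(mean_sq - mean ** 2, 0.0)
    return np.sqrt(var)


def pareto_sample(rng, tail_index, n):
    """Draw n Pareto values with P[X > x] = x**(-tail_index) for x >= 1.

    Uses inverse-transform sampling; tail_index is the exponent of the
    complementary CDF, not of the density.
    """
    u = rng.random(n)  # uniform on [0, 1)
    return (1.0 - u) ** (-1.0 / tail_index)


if __name__ == "__main__":
    rng = np.random.default_rng(0)  # fixed seed, illustrative only
    n = 100_000
    gauss = running_std(rng.normal(size=n))
    heavy = running_std(pareto_sample(rng, 1.5, n))  # infinite variance
    for k in (100, 1_000, 10_000, 100_000):
        print(k, gauss[k - 1], heavy[k - 1])
```

Plotting both running curves against the sample size on a logarithmic axis makes the contrast easier to see than the printed checkpoints.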

Clauset et al. paper

  1. In equation 1.1 the exponents of interest are said to lie in the range (2,3). How is this consistent with LATDW's range of (1,2)?
  2. Assuming x_min is known, be able to explain equations (3.1) and (3.7).
  3. How does table 3.1 support the use of equations (3.1) and (3.7)?
  4. What minimum size data set do the authors recommend? Why are small data sets problematic?
  5. How does figure 3.3 illustrate the importance of correctly estimating x_min?
  6. How do subjective approaches estimate x_min? (section 3.3)
  7. Describe the second objective approach for estimating x_min.
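
As a study aid for questions 2 and 7, here is a hedged sketch under two assumptions that should be checked against the paper: that the estimator in question is the standard continuous maximum-likelihood estimator, alpha_hat = 1 + n / sum(ln(x_i / x_min)), and that the objective approach meant is the one that picks the x_min minimizing the Kolmogorov-Smirnov distance between the tail data and the fitted power law. The function names and the `min_tail` cutoff are hypothetical choices for illustration, not the authors' code.

```python
import numpy as np


def alpha_mle(x, x_min):
    """Continuous power-law MLE of the density exponent for x >= x_min."""
    tail = x[x >= x_min]
    return 1.0 + len(tail) / np.sum(np.log(tail / x_min))


def ks_distance(x, x_min, alpha):
    """Largest gap between the empirical and fitted CDF on the tail."""
    tail = np.sort(x[x >= x_min])
    n = len(tail)
    # CDF of a density proportional to x**(-alpha) on [x_min, inf).
    model = 1.0 - (tail / x_min) ** (1.0 - alpha)
    upper = np.arange(1, n + 1) / n  # empirical CDF just after each point
    lower = np.arange(0, n) / n      # empirical CDF just before each point
    return max(np.max(np.abs(upper - model)), np.max(np.abs(model - lower)))


def choose_x_min(x, min_tail=50):
    """Scan candidate x_min values; return (x_min, alpha, KS distance).

    min_tail is an assumed guard that keeps at least that many points
    in the tail so the fit is not driven by a handful of observations.
    """
    best = None
    for xm in np.unique(x):  # candidates in ascending order
        if np.sum(x >= xm) < min_tail:
            break
        a = alpha_mle(x, xm)
        d = ks_distance(x, xm, a)
        if best is None or d < best[2]:
            best = (xm, a, d)
    return best


if __name__ == "__main__":
    rng = np.random.default_rng(1)  # fixed seed, illustrative only
    alpha, x_min, n = 2.5, 1.0, 2_000
    # Inverse-transform sample from the pure power law.
    x = x_min * (1.0 - rng.random(n)) ** (-1.0 / (alpha - 1.0))
    print(alpha_mle(x, x_min))
    print(choose_x_min(x))
```

Here `alpha` is the exponent of the density; the exponent of the complementary CDF is `alpha - 1`, which is worth keeping in mind when comparing exponent ranges quoted under different conventions.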