More Normal than Normal
- What is the overriding concern of the HOT-based approach? (section 1)
- What is a subexponential distribution? Are they scaling? (section 2)
- Give practical examples (from the article) of the invariance properties of scaling
distributions: aggregation, mixture, and maximization. (section 2.3)
- What are the tradeoffs between assuming Gaussian for low variability data and
assuming scaling for high variability data?
- What is the four-step approach to conventional model fitting? Why does this
approach fail to select the right models for scaling distributions? (section 3)
- If we assume that a single process generates a data set, then what two effects should
we see if we employ a dynamic (nested) approach to examining the data? (section 3)
- How does figure 4 illustrate that the second moment (and hence the variance) does not exist?
- How does figure 5 show that the Lognormal fit is "certifiably wrong"?
- How does figure 6 allow us to conclude that the Pareto fit is a good one?
- Why are scaling distributions more robust than Gaussian distributions? (4.1)
- What are the "bad news" and the "good news" in the new treatment of internet traffic
that relies on robust control? (section 4.3)
- Why are the popular "scale-free network" models not appropriate for internet router
traffic models? (section 4.3)
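
The dynamic (nested) idea from section 3 can be tried on synthetic data. The following is a minimal sketch, not the article's own procedure or data: it computes the sample standard deviation on growing prefixes of a Gaussian sample and of a Pareto sample. The tail index 1.5, the sample sizes, the seed, and the helper names `sample_std` and `nested_std` are all assumptions chosen for illustration.

```python
import math
import random


def sample_std(xs):
    """Unbiased sample standard deviation of a list of floats."""
    n = len(xs)
    mean = math.fsum(xs) / n
    return math.sqrt(math.fsum((x - mean) ** 2 for x in xs) / (n - 1))


def nested_std(xs, sizes):
    """Sample std of the first n points, for each n in sizes (nested subsets)."""
    return [(n, sample_std(xs[:n])) for n in sizes]


rng = random.Random(0)  # fixed seed so runs are repeatable
N = 100_000
sizes = [100, 1_000, 10_000, 100_000]

# Low-variability reference: standard Gaussian, true std = 1.
gaussian = [rng.gauss(0.0, 1.0) for _ in range(N)]
# High-variability case: Pareto with tail index 1.5 (assumed), infinite variance.
scaling = [rng.paretovariate(1.5) for _ in range(N)]

for name, data in [("gaussian", gaussian), ("scaling", scaling)]:
    for n, s in nested_std(data, sizes):
        print(f"{name:9s} n={n:>7d}  sample std = {s:.3f}")
```

Comparing how the two columns of estimates behave as n grows is one way to approach the questions on the nested approach and on figure 4.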
Clauset et al. paper
- In equation 1.1 the range for the powers of interest is said to lie in (2,3).
How is this consistent with LATDW's range of (1,2)?
- Assuming x_min is known, be able to explain equations (3.1) and (3.7).
- How does table 3.1 support the use of equations (3.1) and (3.7)?
- What minimum data set size do the authors recommend? Why are small data sets problematic?
- How does figure 3.3 illustrate the importance of correctly estimating x_min?
- How do subjective approaches estimate x_min? (section 3.3)
- Describe the second objective approach for estimating x_min.
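
For the question on equations (3.1) and (3.7), a small numerical check may help. The sketch below assumes that (3.1) is the continuous maximum likelihood estimator, alpha = 1 + n / sum(ln(x_i / x_min)), and that (3.7) is its discrete approximation with x_min replaced by x_min - 1/2. Verify both forms against the paper before relying on them. The function names, the seed, and the synthetic data (true alpha = 2.5) are assumptions for illustration. The discrete sample is produced by rounding a continuous one, which is itself only an approximation.

```python
import math
import random


def sample_power_law(n, alpha, x_min, rng):
    """Continuous power law p(x) ~ x^(-alpha) for x >= x_min, by inverse transform."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / (alpha - 1.0)) for _ in range(n)]


def alpha_mle_continuous(xs, x_min):
    """Assumed form of eq. (3.1): 1 + n / sum(ln(x_i / x_min)) over x_i >= x_min."""
    tail = [x for x in xs if x >= x_min]
    return 1.0 + len(tail) / math.fsum(math.log(x / x_min) for x in tail)


def alpha_mle_discrete_approx(xs, x_min):
    """Assumed form of eq. (3.7): as above, with x_min replaced by x_min - 1/2."""
    tail = [x for x in xs if x >= x_min]
    return 1.0 + len(tail) / math.fsum(math.log(x / (x_min - 0.5)) for x in tail)


rng = random.Random(1)  # fixed seed so runs are repeatable
true_alpha = 2.5

# Continuous data with x_min = 1.
continuous = sample_power_law(10_000, true_alpha, 1.0, rng)
est_cont = alpha_mle_continuous(continuous, 1.0)

# Approximate discrete data with x_min = 10: round a continuous sample
# that starts at x_min - 1/2 to the nearest integer.
raw = sample_power_law(10_000, true_alpha, 9.5, rng)
discrete = [math.floor(x + 0.5) for x in raw]
est_disc = alpha_mle_discrete_approx(discrete, 10)

print(f"continuous estimate: {est_cont:.3f}")
print(f"discrete estimate:   {est_disc:.3f}")
```

Both estimates should land close to the true exponent when x_min is known. Changing the `x_min` argument passed to the estimators shows how sensitive the result is to that choice.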