The Bernoulli Gene Code traced in Lesson 54.

*If you find this useful, please like, share and subscribe to my data analysis classroom.*

*You can also follow me on Twitter **@realDevineni** for updates on new lessons.*

# A summary of the major probability distributions

# Chi-square distribution

# Lognormal distribution

# Celebrating One Year

# Central Limit Theorem

# Convolution

# Trends and Attributions

# Gamma distribution — Time to ‘r’th arrival

# Memoryless Property of Exponential Distribution

# Exponential distributions: Waiting Time

The Chi-square distribution is the distribution of a sum of squares of independent standard normal variables. Learn about it in lesson 53. Bring a tequila.

Lesson 53 – Sum of squares: The language of Chi-square distribution
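As a quick illustration of the sum-of-squares idea, here is a minimal simulation sketch. The seed, the sample size and the choice of 3 degrees of freedom are assumptions made for illustration; they do not come from the lesson.

```python
import numpy as np

rng = np.random.default_rng(53)   # arbitrary seed (assumption)
k = 3                             # degrees of freedom = number of squared normals summed
n_samples = 200_000

# Each row holds k independent standard normal draws.
z = rng.standard_normal((n_samples, k))

# Squaring and summing across each row gives one Chi-square(k) draw per row.
chi_sq = (z ** 2).sum(axis=1)

sample_mean = chi_sq.mean()   # theory: k
sample_var = chi_sq.var()     # theory: 2k
print(f"mean = {sample_mean:.3f}, variance = {sample_var:.3f}")
```

The sample mean and variance should land close to k and 2k, the theoretical moments of a Chi-square distribution with k degrees of freedom.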

The lognormal distribution is explained in lesson 52. Mumble transforms the data.

Lesson 52 – Transformation: The language of lognormal distribution
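The transformation idea can be sketched in a few lines: if the data are lognormal, their logarithm is normal. The parameter values, seed and the small `skewness` helper below are illustrative assumptions, not part of the lesson.

```python
import numpy as np

def skewness(a):
    """Sample skewness: third central moment divided by the cubed standard deviation."""
    centered = a - a.mean()
    return (centered ** 3).mean() / a.std() ** 3

rng = np.random.default_rng(52)   # arbitrary seed (assumption)
mu, sigma = 1.0, 0.5              # mean and standard deviation of log(X)

x = rng.lognormal(mean=mu, sigma=sigma, size=100_000)   # right-skewed data
log_x = np.log(x)                                        # transformed data

print(f"skewness of x      = {skewness(x):.3f}")
print(f"skewness of log(x) = {skewness(log_x):.3f}")
```

The raw data show strong positive skew, while the log-transformed data are close to symmetric with mean `mu` and standard deviation `sigma`.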

Today, we celebrate the one-year milestone for our data science blog. Prof. Upmanu Lall from Columbia University and Water Center joins us to teach kernel density in lesson 51.

Learn about the Central Limit Theorem, “central” for probability theory.

The full derivation of how the Binomial distribution converges to the Normal distribution in the limit is also provided.
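A numerical check can complement the derivation. The sketch below compares the Binomial probability mass function with a Normal density that has the same mean and variance, for growing n. The value p = 0.3 and the chosen values of n are assumptions for illustration.

```python
import math

def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def normal_pdf(x, mu, sigma):
    """Density of Normal(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

p = 0.3
max_gap = {}
for n in (10, 100, 1000):
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    # Largest pointwise difference between the two curves over k = 0..n.
    max_gap[n] = max(
        abs(binom_pmf(k, n, p) - normal_pdf(k, mu, sigma)) for k in range(n + 1)
    )
    print(n, max_gap[n])
```

The largest gap between the two curves shrinks as n grows, which is the convergence the derivation establishes analytically.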

After last week’s conversation with Devine about the Gamma distribution, an inspired Joe wanted to derive the probability density function of the Gamma distribution from the exponential distribution using the idea of convolution.

But first, he has to understand convolution. So he called upon Devine for his usual dialog.

**J**: Hello D, I can wait no longer, nor can I move on to a different topic when this idea of convolution is not clear to me. I feel anxious to know at least the basics that relate to our lesson last week.

**D**: It is a good anxiety to have. Will keep you focused on the mission. Where do we start?

**J**: We have been having a form of dialog since the time we met. Why don’t you provide the underlying reasoning, and I will knit the weave from there.

**Their discussion continues in Lesson 46**. Devine explains the basics of convolution and Joe applies it to derive the pdf of the Gamma distribution.
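To see the idea numerically before reading the derivation, here is a minimal sketch that convolves an exponential density with itself on a grid and compares the result with the Gamma density for r = 2, which is lambda^2 * t * exp(-lambda * t). The rate, grid spacing and grid length are assumptions chosen for illustration.

```python
import numpy as np

lam = 0.5                    # arrival rate (assumption)
dt = 0.01                    # grid spacing (assumption)
t = np.arange(0, 40, dt)

# Density of one exponential waiting time.
f_exp = lam * np.exp(-lam * t)

# Density of the sum of two independent waiting times:
# a discrete (Riemann-sum) approximation of the convolution integral.
f_sum = np.convolve(f_exp, f_exp)[: t.size] * dt

# Gamma density with shape r = 2 and rate lam.
f_gamma2 = lam ** 2 * t * np.exp(-lam * t)

max_err = np.max(np.abs(f_sum - f_gamma2))
print(f"largest absolute difference: {max_err:.5f}")
```

The difference is of the order of the grid spacing and shrinks as `dt` is reduced, consistent with the convolution of two exponentials being a Gamma density with r = 2.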

In 2017, we did 46 lessons!

Look at this chart.

On the x-axis, I am showing 80 years starting with some reference. On the y-axis, I am plotting the counts or the number of extreme rainfall days each year.

Do you think there is an increasing trend in this data? Is there an overall increase in the number of extreme rainfall days?

Perhaps. Let’s add a trend line and see.

Once we superimpose a linear trend line on the plot, it becomes apparent. We still see variability from year to year, but in the long run, we see that the number of extreme rainfall days each year is increasing.

Can you guess why there is a trend in the data? In other words, can you think of some reasons for this trend?

Wait. Hold on to your reasons till I unveil how I created this data.

Look at this chart. It is **time vs. a constant number of extreme rainfall events**, 10, in each year.

Now, look at this chart which is a **pure sine wave with a periodicity of 5 years** and amplitude of 1.

Another **sine wave with a periodicity of 10 years** and amplitude of 1.

This one is with a **periodicity of 20 years** and amplitude of 1.

And, finally, take a look at this **sine wave with a periodicity of 100 years** and amplitude of 1.

If we add these sine waves and the constant, plus some noise, we get the original data.
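The construction can be sketched as follows. The text gives the constant (10), the four periodicities and the amplitude of 1; the phases, the noise level and the seed are not stated, so the values below are assumptions. In particular, the 100-year wave is shifted so that it rises through the 80-year window.

```python
import numpy as np

rng = np.random.default_rng(2017)   # arbitrary seed (assumption)
years = np.arange(80)               # 80 years from some reference year

baseline = 10.0                     # constant number of extreme events per year
periods = [5, 10, 20, 100]          # periodicities in years, amplitude 1 each

# Assumption: phase shift (in years) of each wave; not given in the text.
phase_years = {5: 0.0, 10: 0.0, 20: 0.0, 100: 40.0}

cycles = sum(np.sin(2 * np.pi * (years - phase_years[p]) / p) for p in periods)
noise = rng.normal(0.0, 0.5, size=years.size)   # noise level is an assumption
events = baseline + cycles + noise

# Fit a straight line to the series, as in the trend plot.
slope, intercept = np.polyfit(years, events, deg=1)
print(f"fitted trend: {slope:.4f} events per year")
```

With these assumed phases the fitted slope is positive even though no monotonic trend was put into the series. A different phase for the 100-year wave could just as well produce a decreasing line.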

Did you see the pattern? **The trend we initially observed is due to a combination of four different periodic sine waves**. Were these periodic oscillations in your reasons?

*If not, why not?*

Saman Armal, a Ph.D. student in the Department of Civil Engineering and NOAA CREST at the City College of New York, CUNY, working on extreme rainfall events, was also asking this question.

“We find trends in the data. What can we attribute these trends to?”

We started with **anthropogenic influence**, but anthropogenic forcings alone cannot explain the trend. Climate has a **cyclical nature**. In a particular region, its manifestation can be entirely different for a given decade or century.

For instance, if we suppose that rainfall in a given area is influenced by interannual to decadal to multidecadal climate oscillations (like the periodic sine waves we saw before), any given decade or a block of time can manifest as runs of wet or dry years.

If the region has observed records long enough to capture these cycles, periods of wet years will be offset by periods of dry years, and climate cycles alone will produce no long-term time trend in rainfall. Conversely, if the region has limited observed records, one can detect a long-term increasing or decreasing trend in the data, depending on whether that period manifests as a run of wet or dry years.
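The record-length effect can be illustrated with a noise-free toy example. The sketch below fits a straight line to a single 100-year cycle observed over a short and a long record; the period and the two record lengths are assumptions chosen for illustration, and `fitted_slope` is a hypothetical helper.

```python
import numpy as np

def fitted_slope(n_years, period=100.0):
    """Least-squares slope of a noise-free unit sine cycle observed for n_years."""
    t = np.arange(n_years)
    y = np.sin(2 * np.pi * t / period)
    return np.polyfit(t, y, deg=1)[0]

short_record = fitted_slope(40)     # less than half a cycle is observed
long_record = fitted_slope(1000)    # ten full cycles are observed

print(f"slope over   40 years: {short_record:.5f}")
print(f"slope over 1000 years: {long_record:.5f}")
```

The short record shows a clear apparent trend, while over the long record the fitted slope is close to zero because the wet and dry phases cancel.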

The effect of natural climate variability in rainfall patterns including the impact of El Niño–Southern Oscillation (**ENSO**), the interdecadal Pacific oscillation (**IPO**), the Pacific decadal oscillation (**PDO**), the North Atlantic Oscillation (**NAO**), and the Atlantic multidecadal oscillation (**AMO**) is well documented. Hence, **we wanted to understand the influence of anthropogenic forcing and natural climate variability on the occurrence of extreme events in an integrated framework**.

This objective motivated Armal’s recent work, which was published in the Journal of Climate. The paper provides a hypothesis-driven methodology to understand the association of trends in extreme rainfall event frequency to anthropogenic forcing and natural climate variability over the contiguous United States.

In our analysis, we consider two hypotheses:

- The monotonic trend in the annual frequency of extreme rainfall events is solely attributed to anthropogenic forcing, and
- The monotonic trend in the annual frequency of extreme rainfall events is attributed to anthropogenic forcing and cyclical climate variability.

The models get information from global near-surface temperature and climate indices, and the residual trends for each hypothesis are examined. The choice of the best alternative hypothesis is made based on the Watanabe–Akaike information criterion, a Bayesian pointwise predictive accuracy measure.
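For readers unfamiliar with the criterion, here is a minimal sketch of how WAIC can be computed from posterior draws, using the common definition on the deviance scale (lower is better). It is not the paper's model: the toy example assumes Normal data with known unit variance and a flat prior on the mean, and the function name `waic` and all values are illustrative.

```python
import numpy as np

def waic(log_lik):
    """WAIC from an (S draws x N observations) matrix of pointwise log-likelihoods."""
    n_draws = log_lik.shape[0]
    # Log pointwise predictive density: log of the posterior-mean likelihood per point.
    lppd = np.sum(np.logaddexp.reduce(log_lik, axis=0) - np.log(n_draws))
    # Effective number of parameters: posterior variance of the log-likelihood per point.
    p_waic = np.sum(np.var(log_lik, axis=0, ddof=1))
    return -2.0 * (lppd - p_waic), p_waic

rng = np.random.default_rng(0)          # arbitrary seed (assumption)
y = rng.normal(0.0, 1.0, size=50)       # toy data, not rainfall records

# Toy posterior for the mean of Normal(mu, 1) under a flat prior:
# mu | y ~ Normal(mean(y), 1/n).
mu_draws = rng.normal(y.mean(), 1.0 / np.sqrt(y.size), size=4000)

# Pointwise log-likelihood of every observation under every posterior draw.
log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - mu_draws[:, None]) ** 2

waic_value, p_waic = waic(log_lik)
print(f"WAIC = {waic_value:.2f}, effective parameters = {p_waic:.2f}")
```

When comparing competing hypotheses fitted to the same data, the model with the lower WAIC is the one preferred on predictive accuracy.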

Statistically significant time trends are observed in 742 of the 1244 stations in the continental United States. Trends in 409 of these stations, predominantly found in the U.S. Southeast and Northeast climate regions, can be attributed to changes in global surface temperature anomalies. The trends in 274 of these stations, mainly found in the U.S. Northwest, West and Southwest climate regions, can be attributed to El Niño–Southern Oscillation, the North Atlantic Oscillation, the Pacific decadal oscillation, and the Atlantic multidecadal oscillation along with changes in global surface temperature anomalies.

**Please read the paper and let us know what you think**. You can get the paper from the AMS website here. If you need a copy of it, please write to me. I will be happy to share. We welcome any comments and critiques.

In lesson 45, Joe and Devine meet again, for the eighth time, to discuss the Gamma distribution.

Joe summarizes how to derive the probability density function for the exponential distribution. He identifies that it is the continuous analog of the Geometric distribution.

Being a curious kid, he asks the right question.

Does the exponential distribution also have a related distribution that measures the wait time till the ‘r’th arrival?

Devine says that there is a related distribution that can be used to estimate the time to the ‘r’th arrival. It is called the **Gamma distribution**.

They both discuss how to derive the probability density function for the Gamma distribution using convolution.

The Gamma distribution has two control parameters, the rate parameter (lambda, the reciprocal of the scale) and the shape parameter (r).

The Gamma distribution is frequently used to fit data with significant skewness, such as rainfall and insurance claims data.
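The link between the two distributions can be checked with a short simulation: the time to the ‘r’th arrival is the sum of r independent exponential waiting times. The rate, r, sample size and seed below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(45)   # arbitrary seed (assumption)
lam = 2.0                         # arrival rate (assumption)
r = 3                             # wait for the 3rd arrival (assumption)
n_samples = 200_000

# Sum r independent exponential waiting times, each with mean 1/lam.
waits = rng.exponential(scale=1.0 / lam, size=(n_samples, r)).sum(axis=1)

# Direct draws from a Gamma distribution with shape r and scale 1/lam.
gamma_draws = rng.gamma(shape=r, scale=1.0 / lam, size=n_samples)

print(f"summed exponentials: mean = {waits.mean():.3f}, var = {waits.var():.3f}")
print(f"gamma draws:         mean = {gamma_draws.mean():.3f}, var = {gamma_draws.var():.3f}")
```

Both samples should have a mean near r/lambda and a variance near r/lambda^2, the theoretical moments of the Gamma distribution.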

Read the full lesson here.

Lesson 45 – Time to ‘r’th arrival: The language of Gamma distribution

The exponential distribution has an interesting memoryless property.

The probability distribution of the remaining time until the event occurs is always the same regardless of the time that passed.

There is no memory in the process. The history is not relevant. The time to next arrival is not influenced by when the last event or arrival occurred.

This property is unique to the exponential and the geometric distributions.
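In symbols, the property says P(T > s + t | T > s) = P(T > t). Here is a minimal sketch that checks it both from the survival function exp(-lambda * t) and by simulation. The rate, the times s and t, and the seed are assumptions for illustration.

```python
import math
import numpy as np

def survival(t, lam):
    """P(T > t) for an exponential waiting time with rate lam."""
    return math.exp(-lam * t)

lam = 0.2    # arrival rate (assumption)
s = 5.0      # time already waited
t = 3.0      # additional waiting time

# Exact check: the conditional probability equals the unconditional one.
conditional = survival(s + t, lam) / survival(s, lam)
unconditional = survival(t, lam)

# Simulation check: keep only the waits longer than s, then ask how many exceed s + t.
rng = np.random.default_rng(44)   # arbitrary seed (assumption)
waits = rng.exponential(scale=1.0 / lam, size=500_000)
still_waiting = waits[waits > s]
simulated = np.mean(still_waiting > s + t)

print(conditional, unconditional, simulated)
```

All three numbers agree up to simulation noise: having already waited `s` time units does not change the distribution of the remaining wait.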

Learn about this in detail in lesson 44. I promise you will never forget it.

Lesson 44 – Keep waiting: The memoryless property of exponential distribution

Your everyday “waiting for” experiences follow exponential distributions. I explain how in lesson 43. You will find at least one thing in common.

You will also learn how to derive the probability functions of exponential distribution from first principles.
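As a preview, one standard route to these functions is sketched below. It assumes that arrivals follow a Poisson process with rate $\lambda$, where $N(t)$ denotes the number of arrivals in $[0, t]$ and $T$ the waiting time to the first arrival; the lesson itself may take a different path.

$$
\begin{aligned}
P(T > t) &= P\big(N(t) = 0\big) = \frac{(\lambda t)^{0} e^{-\lambda t}}{0!} = e^{-\lambda t}, \\
F(t) &= P(T \le t) = 1 - e^{-\lambda t}, \qquad t \ge 0, \\
f(t) &= \frac{dF}{dt} = \lambda e^{-\lambda t}.
\end{aligned}
$$

Waiting longer than $t$ is the same event as seeing no arrivals up to time $t$, which is why the Poisson probability of zero events gives the survival function directly.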

It is a fun and educational read. Trust me.

Lesson 43 – Wait time: The language of exponential distribution
