The basics of continuous probability distributions

The characteristic feature in all the discrete distributions is that the random variable X is discrete. The possible outcomes are distinct numbers, which is why we called them discrete probability distributions.

Have you asked yourself, “what if the random variable X is continuous?” What is the probability that X takes any particular value x on the real number line, which has infinitely many possibilities?

For a continuous random variable, the possible outcomes form a continuum, so no single value can carry positive probability; hence,

P(X = x) = 0.

For continuous random variables, probability is defined over an interval between two values. It is computed by integrating a continuous probability distribution function (the density) over that interval.
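As a minimal sketch of this idea, the snippet below computes the probability over an interval as F(b) - F(a), where F is the cumulative distribution function. The standard normal distribution is an assumption picked only for illustration (the text above does not name a specific distribution), and the function names are hypothetical. The sketch is in Python using only the standard library.

```python
import math

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for a Normal(mu, sigma) random variable (illustrative choice)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def interval_probability(a, b, mu=0.0, sigma=1.0):
    """P(a <= X <= b), computed as F(b) - F(a)."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)

# Probability over an interval of positive width
print(interval_probability(-1.0, 1.0))

# An "interval" that shrinks to a single point has zero width,
# so the probability is zero: P(X = x) = 0
print(interval_probability(0.5, 0.5))
```

The second call shows why a single point gets zero probability: the interval has no width, so F(b) - F(a) vanishes.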

Learn more about these fundamentals in Lesson 41.

Lesson 41 – Struck by a smooth function

If you find this useful, please like, share and subscribe to my data analysis classroom.
You can also follow me on Twitter @realDevineni for updates on new lessons.

R coding for Geometric, Negative Binomial and Poisson distributions

Today’s temperature in New York is below 30F — a cold November day.

  1. Do you want to know what the probability of a cold November day is?
  2. Do you want to know what the return period of such an event is?
  3. Do you want to know how many such events happened in the last five years?

Get yourself some warm tea. Let the room heater crackle. We are diving into the rest of the discrete distributions in R. The lesson with complete code is here. Happy coding.

Lesson 40 – Discrete distributions in R: Part II


Learn discrete distributions and how to create Animations in R

Today’s lesson includes a journey through Bernoulli trials and Binomial distribution in R.

I use data from New York City’s parking violations. Since we are learning discrete probability distributions, the violation tickets data can serve as a neat example.

We also learn how to create GIFs in R. We first save the plots as “.png” files and then combine them into a GIF using the “animation” and “magick” packages.

The lesson with complete code is here. Happy coding.

Lesson 39 – Discrete distributions in R: Part I


What is Hypergeometric Distribution?

Suppose there are R Pepsi cans among a total of N cans (the other N − R are Cokes), and we are asked to pick out the R Pepsi cans. Our selection of R cans may contain k = 0, 1, 2, …, R actual Pepsi cans. The number of correct guesses, k, follows a Hypergeometric distribution, which gives the probability of correctly selecting exactly k Pepsi cans.
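A minimal sketch of that probability, assuming the standard Hypergeometric formula P(X = k) = C(R, k) C(N − R, R − k) / C(N, R). The values N = 8 and R = 4 are hypothetical, chosen only to have something to compute, and the function name is an illustration rather than anything from the lesson.

```python
from math import comb

def hypergeom_pmf(k, N, R):
    """P(exactly k of the R selected cans are Pepsi), with R Pepsi among N cans."""
    if k < 0 or k > R or (R - k) > (N - R):
        return 0.0  # impossible outcome
    # ways to pick k Pepsi * ways to pick the remaining R - k from the Cokes,
    # divided by all ways to pick R cans out of N
    return comb(R, k) * comb(N - R, R - k) / comb(N, R)

N, R = 8, 4  # hypothetical: 4 Pepsi among 8 cans
probs = [hypergeom_pmf(k, N, R) for k in range(R + 1)]
for k, prob in enumerate(probs):
    print(k, round(prob, 4))
```

The probabilities over k = 0, 1, …, R add up to one, which is a quick sanity check on the formula.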

Hypergeometric distribution is typically used in quality control analysis for estimating the probability of finding a given number of defective items in a sample drawn from a lot.

The Pepsi-Coke marketing analysis is another example application. Companies can analyze the preference for one product over another among a subset of customers in their region.

Learn more about Hypergeometric distribution and how to derive the probability from the ground up in lesson 38 of our data analysis classroom.

Lesson 38 – Correct guesses: The language of Hypergeometric distribution


Poisson distribution

If we assume events are independent (the occurrence of one event does not affect the probability that a second event will occur), then the count per unit interval can be treated as a random variable that follows a probability distribution. Counts, i.e., the number of times an event occurs in an interval, follow a Poisson distribution.

Poisson distribution has one control parameter: the rate of occurrence, i.e., the average number of events per unit interval.
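To make the role of the rate concrete, here is a minimal sketch assuming the standard Poisson formula P(X = k) = e^(−λ) λ^k / k!, where λ is the rate of occurrence. The rate of 2 events per interval is a hypothetical value, and the function name is only for illustration.

```python
import math

def poisson_pmf(k, lam):
    """P(exactly k events in one interval) when the average rate is lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

lam = 2.0  # hypothetical rate: two events per unit interval on average
for k in range(6):
    print(k, round(poisson_pmf(k, lam), 4))

# The mean of the counts, sum of k * P(X = k), recovers the rate itself
mean_count = sum(k * poisson_pmf(k, lam) for k in range(100))
print(mean_count)
```

Changing the single parameter lam shifts the whole distribution, which is why it is the only control we need.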

In lesson 36, we learn the fundamentals of Poisson distribution. You will also meet Able and Mumble, two of my friends.

Lesson 36 – Counts: The language of Poisson distribution

What is Return Period?

Courtesy of mainstream media and news outlets, your recent vocabulary may include phrases such as “100-year event” (said to be happening more often) and “10-year storm” (as in a drainage system designed for one).

Does a 10-year return period event occur diligently every ten years? Can a 100-year event occur three times in a row?

If we define T as a random variable that measures the time between the events (wait time or time to the next event or time to the first event since the previous event), the return period of the event is the expected value of T, E[T]: the average of T measured over a large number of such occurrences.
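A minimal sketch of E[T], under the assumption that each year is an independent trial in which the event occurs with the same probability p, so that the wait time T is Geometrically distributed and E[T] works out to 1/p. The value p = 0.01 (a "100-year event") and the function names are hypothetical.

```python
def wait_time_pmf(t, p):
    """P(T = t): the event is absent for t - 1 years, then occurs in year t."""
    return (1.0 - p) ** (t - 1) * p

def return_period(p, horizon=5000):
    """E[T], approximated by summing t * P(T = t) up to a long horizon."""
    return sum(t * wait_time_pmf(t, p) for t in range(1, horizon + 1))

def prob_in_a_row(p, n):
    """Probability that the event occurs in n consecutive years."""
    return p ** n

p = 0.01  # hypothetical yearly probability of a "100-year event"
print(return_period(p))      # close to 1 / p
print(prob_in_a_row(p, 3))   # small, but not zero
```

Under these assumptions, a 100-year event does not arrive on schedule every hundred years, and three in a row is unlikely but possible.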

In lesson 34, we learn about return period through Bob and his reappearance. Bob’s time of occurrence also relates to Geometric distribution.

Lesson 34 – I’ll be back: The language of Return Period


Geometric distribution and its basics

Try, try and try again till you succeed. That is Geometric distribution.

If we consider independent Bernoulli trials of 0s and 1s, each with probability of success p, and let X be a random variable that measures the number of trials it takes to see the first success, then X is said to be Geometrically distributed.
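A minimal sketch of this definition, assuming the standard formula P(X = k) = (1 − p)^(k − 1) p: the first k − 1 trials fail and the k-th succeeds. The value p = 0.2 and the function names are hypothetical.

```python
def geometric_pmf(k, p):
    """P(first success happens on trial k): k - 1 failures, then one success."""
    return (1.0 - p) ** (k - 1) * p

def geometric_cdf(k, p):
    """P(first success happens on or before trial k)."""
    return 1.0 - (1.0 - p) ** k

p = 0.2  # hypothetical probability of success on each trial
for k in range(1, 6):
    print(k, round(geometric_pmf(k, p), 4), round(geometric_cdf(k, p), 4))
```

The probabilities shrink by a factor of (1 − p) with every extra trial, so success on an early trial is always the most likely single outcome.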

In lesson 33, we learn the basics of Geometric distribution.

Lesson 33 – Trials to first success: The language of Geometric distribution

Binomial Distribution Explained

In lesson 31, we learned the idea of Bernoulli sequence. In lesson 32, we take this idea as the basis to understand Binomial distribution. When the random variable of interest is the number of successes in a fixed number of trials, it follows a Binomial distribution. “Exactly k successes” is the language of Binomial distribution.
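A minimal sketch of "exactly k successes", assuming the standard Binomial formula P(X = k) = C(n, k) p^k (1 − p)^(n − k) for n independent trials with success probability p. The values n = 10 and p = 0.5 are hypothetical, and the function name is only for illustration.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    # comb(n, k) counts the ways to place the k successes among the n trials
    return comb(n, k) * p**k * (1.0 - p) ** (n - k)

n, p = 10, 0.5  # hypothetical: ten trials, even odds on each
for k in range(n + 1):
    print(k, round(binomial_pmf(k, n, p), 4))
```

With n = 1 the formula reduces to a single Bernoulli trial, which is the building block from lesson 31.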

Full lesson here.

Lesson 32 – Exactly k successes: The language of Binomial distribution