## METHODOLOGY

1. Poisson Distribution

The Poisson distribution is a discrete probability distribution used to describe the number of events occurring within a given time interval, length, volume or area. In many practical situations we are interested in how many times a certain event occurs in a specific time interval, length, volume or area.

For instance:

· The number of phone calls received at a call centre in an hour,

· The number of cases of a disease in a week,

· The number of flaws on a length of cable,

· The number of cases of a disease in different towns,

· The number of defects per square yard, etc.

The Poisson distribution is based on four assumptions [1]:

1. The probability of observing a single event over a small interval is approximately proportional to the size of that interval.

2. The probability of two events occurring in the same narrow interval is negligible.

3. The probability of an event within a certain interval does not change over different intervals.

4. The probability of an event in one interval is independent of the probability of an event in any other non-overlapping interval.

Under the above assumptions, let λ be the rate at which events occur, t the length of a time interval, and Y the total number of events that occur in that interval. Then Y is called a Poisson random variable, and the probability distribution of Y is called the Poisson distribution.

The probability mass function of Y is:

$$P(Y = y) = \frac{e^{-\lambda t} (\lambda t)^{y}}{y!}, \qquad y = 0, 1, 2, \ldots$$

The mean and variance of the Poisson distribution are both equal to λt, which reduces to λ for an interval of unit length (t = 1):

E(Y) = λt and Var(Y) = σ² = λt
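The equality of mean and variance can be checked numerically. The following is a minimal sketch, assuming a hypothetical rate of 3 events per hour observed over 2 hours (these numbers and the function name `poisson_pmf` are illustrative, not taken from the text); it evaluates the Poisson probability mass function on a truncated support and computes the mean and variance from it.

```python
import math

def poisson_pmf(y: int, mean: float) -> float:
    """P(Y = y) for a Poisson variable whose mean is rate * interval length."""
    if y < 0:
        return 0.0
    # Work on the log scale to avoid overflow in mean**y and y!
    return math.exp(-mean + y * math.log(mean) - math.lgamma(y + 1))

# Hypothetical values: 3 events per hour on average, observed for 2 hours.
rate, t = 3.0, 2.0
mean = rate * t

# The support is infinite; truncate where the remaining tail is negligible.
support = range(0, 100)
total = sum(poisson_pmf(y, mean) for y in support)
ex = sum(y * poisson_pmf(y, mean) for y in support)
var = sum((y - ex) ** 2 * poisson_pmf(y, mean) for y in support)

print(f"sum of pmf = {total:.6f}")
print(f"mean       = {ex:.6f}")
print(f"variance   = {var:.6f}")
```

Under these assumptions the probabilities sum to approximately 1, and the computed mean and variance both come out at approximately 6, the product of the rate and the interval length.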

The Poisson distribution can be identified as a limiting case of the binomial distribution. If the binomial distribution Bin(n, p) meets the following conditions, then Bin(n, p) is well approximated by the Poisson distribution Poi(λ):

· the number of trials n becomes large while the probability of success p becomes small;

· the distributions have the same mean, i.e. λ = np.
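The quality of this approximation can be illustrated with a short comparison. This is a sketch only: the mean is held at a hypothetical value of 2 while n grows, and the helper `max_abs_diff` is an illustrative name introduced here. It reports the largest pointwise gap between the Bin(n, p) and Poi(np) probabilities.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(K = k) for K ~ Bin(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k: int, lam: float) -> float:
    """P(K = k) for K ~ Poi(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def max_abs_diff(n: int, p: float, k_max: int = 20) -> float:
    """Largest pointwise gap between Bin(n, p) and Poi(n * p) for k = 0..k_max."""
    lam = n * p
    upper = min(n, k_max)
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(upper + 1)
    )

# Keep the mean n * p fixed at 2 while n grows and p shrinks.
for n in (10, 100, 1000):
    p = 2.0 / n
    print(f"n = {n:5d}, p = {p:.4f}, max |Bin - Poi| = {max_abs_diff(n, p):.6f}")
```

The gap shrinks as n increases with np held fixed, which is the sense in which the Poisson distribution is a limiting case of the binomial.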

References

1. http://www.pmean.com/definitions/poisson.htm

2. Negative Binomial Distribution

The negative binomial distribution is a discrete probability distribution used to model a discrete random variable. It is also known as the Pascal distribution or as Polya’s distribution.

The negative binomial random variable and distribution are based on the following conditions [1]:

1. The experiment consists of a sequence of independent trials.

2. Each trial can result in either a success (S) or a failure (F).

3. The probability of success is constant from trial to trial, so P(S on trial i) = p for i = 1, 2, 3, …

4. The experiment continues (trials are performed) until a total of r successes have been observed, where r is a specified positive integer.

Under the above conditions, the experiment is a sequence of Bernoulli trials with probability of success p, where r is a fixed integer. Let X be the number of trials needed to obtain the rth success. Then X is called a negative binomial random variable, and the probability distribution of X is called the negative binomial distribution with parameters r and p.

The probability mass function of X is:

$$P(X = x) = \binom{x-1}{r-1} p^{r} (1-p)^{x-r},$$

where x = r, r+1, r+2, ….
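As an illustration, the probability that the rth success occurs on trial x can be computed directly. This is a minimal sketch, assuming the standard form in which the last trial is a success and the preceding x − 1 trials contain exactly r − 1 successes; the values r = 3 and p = 0.4 and the function name are hypothetical choices for the example.

```python
import math

def neg_binomial_trials_pmf(x: int, r: int, p: float) -> float:
    """P(X = x): the r-th success occurs exactly on trial x."""
    if x < r:
        # At least r trials are needed to see r successes.
        return 0.0
    # Choose which r - 1 of the first x - 1 trials are successes;
    # trial x itself must be the r-th success.
    return math.comb(x - 1, r - 1) * p**r * (1 - p) ** (x - r)

# Hypothetical parameters for illustration.
r, p = 3, 0.4

# Truncate the infinite support where the tail is negligible.
probs = {x: neg_binomial_trials_pmf(x, r, p) for x in range(r, 200)}
total = sum(probs.values())
mean_trials = sum(x * q for x, q in probs.items())

print(f"P(X = r)       = {probs[r]:.6f}")   # equals p**r: r successes in a row
print(f"sum of pmf     = {total:.6f}")
print(f"mean of X      = {mean_trials:.6f}")
print(f"r / p          = {r / p:.6f}")
```

The smallest possible value of X is r, which happens only when the first r trials are all successes.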

Alternative form of the negative binomial distribution

Let Y be the number of failures before the rth success. The negative binomial distribution is sometimes defined in terms of this random variable, Y = X − r.

The probability mass function of Y is:

$$P(Y = y) = \binom{y+r-1}{y} p^{r} (1-p)^{y},$$

where y = 0, 1, 2, ….

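For the number of failures Y, the negative binomial distribution with parameters r and p has mean (µ) and variance (σ²):

$$\mu = \frac{r(1-p)}{p}$$

and

$$\sigma^{2} = \frac{r(1-p)}{p^{2}}$$

For the number of trials X = Y + r, the mean is r/p and the variance is unchanged.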
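These moments can be checked numerically for the failure-count form. The sketch below assumes hypothetical parameters r = 3 and p = 0.4 (not from the text) and compares the mean and variance computed from the probability mass function of Y with the closed forms r(1 − p)/p and r(1 − p)/p².

```python
import math

def neg_binomial_failures_pmf(y: int, r: int, p: float) -> float:
    """P(Y = y): exactly y failures occur before the r-th success."""
    if y < 0:
        return 0.0
    return math.comb(y + r - 1, y) * p**r * (1 - p) ** y

# Hypothetical parameters for illustration.
r, p = 3, 0.4

# Truncate the infinite support where the tail is negligible.
support = range(0, 400)
mean_y = sum(y * neg_binomial_failures_pmf(y, r, p) for y in support)
var_y = sum((y - mean_y) ** 2 * neg_binomial_failures_pmf(y, r, p) for y in support)

print(f"numerical mean     = {mean_y:.6f}, r(1-p)/p   = {r * (1 - p) / p:.6f}")
print(f"numerical variance = {var_y:.6f}, r(1-p)/p^2 = {r * (1 - p) / p**2:.6f}")
```

Since 0 < p < 1, dividing the mean by p to obtain the variance always makes the variance larger than the mean.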

3. The Negative Binomial distribution as a Gamma–Poisson distribution

One of the most important applications of the negative binomial distribution is as a mixture of a family of Poisson distributions with a Gamma mixing distribution. The negative binomial distribution can then be viewed as a Poisson distribution whose parameter λ is itself a random variable following a Gamma distribution.

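Let Y be the number of events occurring in a given time interval. Given the rate λ, the conditional probability mass function of Y is the Poisson distribution defined by

$$P(Y = y \mid \lambda) = \frac{e^{-\lambda} \lambda^{y}}{y!}, \qquad y = 0, 1, 2, \ldots$$

Suppose λ has a gamma distribution with scale parameter θ and shape parameter α. Then the probability density function of λ is given by

$$f(\lambda) = \frac{\lambda^{\alpha-1} e^{-\lambda/\theta}}{\Gamma(\alpha)\, \theta^{\alpha}}, \qquad \lambda > 0$$

The unconditional distribution of Y is obtained by integrating out λ:

$$P(Y = y) = \int_{0}^{\infty} P(Y = y \mid \lambda)\, f(\lambda)\, d\lambda = \frac{\Gamma(y+\alpha)}{\Gamma(\alpha)\, y!} \left(\frac{1}{1+\theta}\right)^{\alpha} \left(\frac{\theta}{1+\theta}\right)^{y}$$

This has the form of a negative binomial distribution. Y is called a negative binomial random variable and follows the negative binomial distribution with parameters r = α and p = 1/(1+θ).

The negative binomial distribution with parameters α and θ has mean (µ) and variance (σ²):

$$\mu = \alpha\theta$$

and

$$\sigma^{2} = \alpha\theta(1+\theta) = \mu + \frac{\mu^{2}}{\alpha}$$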
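The mixture result can be verified numerically. The sketch below makes several assumptions that are not in the text: the Gamma distribution is parameterised by a shape (`shape`) and a scale (`scale`), the hypothetical values 2.5 and 1.5 are used, and the integral over the rate is approximated with a simple midpoint rule. Under that parameterisation the mixture should coincide with a negative binomial distribution with r equal to the shape and p = 1/(1 + scale).

```python
import math

def poisson_pmf(y: int, lam: float) -> float:
    """Conditional pmf of Y given the rate lam."""
    return math.exp(-lam + y * math.log(lam) - math.lgamma(y + 1))

def gamma_pdf(lam: float, shape: float, scale: float) -> float:
    """Gamma density of the rate (shape/scale parameterisation)."""
    log_pdf = (
        (shape - 1) * math.log(lam)
        - lam / scale
        - math.lgamma(shape)
        - shape * math.log(scale)
    )
    return math.exp(log_pdf)

def mixture_pmf(y: int, shape: float, scale: float,
                upper: float = 60.0, steps: int = 20_000) -> float:
    """Integrate the rate out of Poisson(y | lam) * Gamma(lam) by the midpoint rule."""
    h = upper / steps
    total = 0.0
    for i in range(steps):
        lam = (i + 0.5) * h  # midpoint of the i-th sub-interval
        total += poisson_pmf(y, lam) * gamma_pdf(lam, shape, scale)
    return total * h

def neg_binomial_pmf(y: int, r: float, p: float) -> float:
    """Negative binomial pmf for y failures, allowing non-integer r via gamma functions."""
    log_coef = math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
    return math.exp(log_coef + r * math.log(p) + y * math.log(1 - p))

# Hypothetical Gamma parameters for illustration.
shape, scale = 2.5, 1.5
p = 1.0 / (1.0 + scale)

for y in range(5):
    mixed = mixture_pmf(y, shape, scale)
    closed = neg_binomial_pmf(y, shape, p)
    print(f"y = {y}: mixture = {mixed:.6f}, negative binomial = {closed:.6f}")
```

The binomial coefficient is written with gamma functions here because the shape parameter of the Gamma distribution need not be an integer.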

This Poisson-gamma mixture distribution was developed to account for the over-dispersion that is commonly observed in real-life discrete or count data.

Because the Poisson distribution requires the mean and variance to be equal, it is unsuitable for data whose variance is larger than the mean. For the negative binomial distribution, the conditional variance is always larger than the conditional mean. The negative binomial distribution is therefore appropriate in such settings.
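A quick way to screen count data for over-dispersion is the ratio of the sample variance to the sample mean. The sketch below is illustrative only: the function name `dispersion_index` and the counts are invented for the example and are not data from this study.

```python
from statistics import mean, variance

def dispersion_index(counts: list[int]) -> float:
    """Sample variance divided by sample mean.

    Values near 1 are consistent with a Poisson model; values well
    above 1 suggest over-dispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("dispersion index is undefined when the mean is zero")
    return variance(counts) / m

# Hypothetical counts, invented purely for illustration.
counts = [0, 1, 1, 2, 6]
print(f"mean = {mean(counts)}, variance = {variance(counts)}")
print(f"dispersion index = {dispersion_index(counts):.2f}")
```

A ratio well above 1, as with these invented counts, is the situation in which a negative binomial model may be preferred to a Poisson model. With small samples the ratio is noisy, so it is a screening aid, not a formal test.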

Reference

1. “Hypergeometric and Negative Binomial Distributions”, online: http://www.stat.purdue.edu/~zhanghao/STAT511/handout/Stt511%20Sec3.5.pdf

4. Generalized Linear Models (GLM)

Generalized linear models are a class of non-linear regression models that can be used in certain cases where linear models are not appropriate. They are a powerful generalization of linear regression to responses from the more general exponential family. In a generalized linear model, the mean of the dependent variable is related to a linear combination of the factors and covariates via a specified link function. Further, the model allows the dependent variable to have a non-normal distribution. Linear regression, ANOVA, logistic models, log-linear models, Poisson regression and multinomial response models are the most common generalized linear models.

A generalized linear model is specified by three components: the random component, which is the response together with an associated probability distribution; the systematic component, which comprises the explanatory variables and the relationships among them; and the link component, which provides the relationship between the systematic component and the random component.

i. The random component

The random component consists of independent observations Y1, Y2, …, Yn whose distribution belongs to the exponential family. Examples of distributions belonging to the exponential family include the exponential, Poisson, binomial, gamma, normal and negative binomial (with fixed r) distributions.

ii. The systematic component

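The systematic component specifies the explanatory variables (X1, X2, …, Xk) as a linear predictor (η) in the model. In a generalized linear model, this is always done via

$$\eta = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \cdots + \beta_{k} X_{k}$$

where β0, β1, …, βk are the model parameters and X1, X2, …, Xk are the explanatory variables.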

iii. The link component

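This component of a GLM is the link between the random component and the systematic component. Suppose E(Y) = µ; then µ is linked to the linear predictor η by η = g(µ), where g(·) is any monotonic differentiable function, known as the link function. The generalized linear model takes the form

$$g(\mu) = \beta_{0} + \beta_{1} X_{1} + \beta_{2} X_{2} + \cdots + \beta_{k} X_{k}$$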
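To make the three components concrete, the following is a minimal sketch of fitting a Poisson GLM with a log link and a single explanatory variable. It is not the procedure used in this study: the function name `fit_poisson_glm`, the data and the stopping rule are assumptions made for illustration, and a statistical package would normally be used in practice. The fit uses Newton-Raphson on the Poisson log-likelihood, which for the log link coincides with Fisher scoring.

```python
import math

def fit_poisson_glm(x: list[float], y: list[float], iterations: int = 25):
    """Fit log(mu_i) = b0 + b1 * x_i by Newton-Raphson on the Poisson log-likelihood."""
    # Start from the intercept-only fit: mu_i = mean(y) for every i.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(iterations):
        # Inverse link: the mean is the exponential of the linear predictor.
        mu = [math.exp(b0 + b1 * xi) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - mi for yi, mi in zip(y, mu))
        g1 = sum((yi - mi) * xi for xi, yi, mi in zip(x, y, mu))
        # Information matrix entries (2 x 2, symmetric).
        h00 = sum(mu)
        h01 = sum(mi * xi for xi, mi in zip(x, mu))
        h11 = sum(mi * xi * xi for xi, mi in zip(x, mu))
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2 x 2 linear system in closed form.
        d0 = (h11 * g0 - h01 * g1) / det
        d1 = (h00 * g1 - h01 * g0) / det
        b0, b1 = b0 + d0, b1 + d1
        if abs(d0) + abs(d1) < 1e-10:
            break
    return b0, b1

# Hypothetical data, invented purely for illustration.
x = [0, 1, 2, 3, 4, 5]
y = [1, 3, 2, 5, 6, 9]
b0, b1 = fit_poisson_glm(x, y)
print(f"intercept = {b0:.4f}, slope = {b1:.4f}")
print("fitted means:", [round(math.exp(b0 + b1 * xi), 2) for xi in x])
```

Here the random component is the Poisson distribution of each count, the systematic component is the linear predictor b0 + b1·x, and the link function is the logarithm.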

The link function you choose will depend on which exponential family distribution you choose for the dependent variable. Some examples of link functions are the identity link, which can be used with any distribution; the log link, which can also be used with any distribution; and the logit link, which is used only with the binomial distribution.
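The three link functions mentioned above, together with their inverses, can be written down directly. This is a small illustrative sketch; the dictionary name `LINKS` and the value of the linear predictor are hypothetical.

```python
import math

# Each entry maps a link name to (g, g_inverse):
# g takes the mean mu to the linear predictor eta, g_inverse goes back.
LINKS = {
    "identity": (lambda mu: mu, lambda eta: eta),
    "log": (lambda mu: math.log(mu), lambda eta: math.exp(eta)),
    "logit": (
        lambda mu: math.log(mu / (1.0 - mu)),
        lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    ),
}

# Hypothetical value of the linear predictor.
eta = 0.8
for name, (g, g_inv) in LINKS.items():
    mu = g_inv(eta)  # mean implied by the linear predictor
    print(f"{name:8s}: mu = {mu:.4f}, g(mu) = {g(mu):.4f}")
```

The inverse log link always returns a positive mean and the inverse logit always returns a value strictly between 0 and 1, which is why these links suit counts and proportions respectively.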

Reference

1. James K. Lindsey, “Applying Generalized Linear Models”