Probability and Statistics Seminar

Past sessions


16/05/2012, 14:30 — 15:30 — Room P3.10, Mathematics Building
Verena Hagspiel, CentER, Department of Econometrics and Operations Research, Tilburg University, The Netherlands

Optimal Technology Adoption when the Arrival Rate of New Technologies Changes

Our paper contributes to the literature on technology adoption. Most models in this literature assume that, after the arrival of a new technology, the probability of the next arrival is constant. We extend this approach by allowing the probability of a new arrival to change after the last technology jump. Right after the arrival of a new technology the intensity equals a specific value, which switches if no new technology has arrived within a certain period after the last arrival. We look at different scenarios, depending on whether the firm is threatened by a drop in the arrival rate after a certain time period or expects the rate of new arrivals to rise. We analyze the effect of the variance of the time between two consecutive arrivals on the optimal investment timing and show that larger variance accelerates investment in a new technology. We find that firms often adopt a new technology with a time lag after its introduction, a phenomenon frequently observed in practice. Regarding a firm's technology release strategy, we explain why the clear signals sent by a regular and steady release of new product generations stimulate customers' buying behavior. The optimal technology adoption timing changes significantly depending on whether the arrival rate is assumed to change or to be constant over time. In a further step we add another source of uncertainty to the problem and assume that the length of the time period after which the arrival intensity changes is not known to the firm in advance. Here, we find that increasing uncertainty accelerates investment, a result that is opposite to standard real options theory.

02/05/2012, 14:30 — 15:30 — Room P3.10, Mathematics Building
, Departamento de Matemática - CEMAT - IST

On the Aging Properties of the Run Length of Markov-Type Control Charts

A change in a production process must be detected quickly so that a corrective action can be taken. Thus, it comes as no surprise that the run length (RL) is usually used to describe the performance of a quality control chart.

This popular performance measure has a phase-type distribution when dealing with Markov-type charts, namely, cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) charts, as opposed to a geometric distribution, when standard Shewhart charts are in use.

In this talk, we briefly discuss sufficient conditions on the associated probability transition matrix to deal with run lengths with aging properties such as new better than used in expectation, new better than used, and increasing hazard rate.

We also explore the implications of these aging properties of the run lengths, namely when we compare the in-control and out-of-control variances of the run lengths of matched in-control Shewhart and Markov-type control charts.
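The contrast between geometric and phase-type run lengths can be sketched numerically. The snippet below is only an illustration under stated assumptions: the sub-stochastic matrices are hypothetical toy values (not taken from the talk or the paper), and the helper names `run_length_pmf` and `hazard` are introduced here for the example. It computes the run-length distribution from a sub-stochastic transition matrix and its hazard rate, which is constant for a one-state (Shewhart-like) chart and increasing for this particular two-state Markov-type example.

```python
def run_length_pmf(Q, start, kmax):
    """P(RL = k), k = 1..kmax, for a chart whose non-signalling states
    evolve according to the sub-stochastic matrix Q. Each row may sum to
    less than 1; the deficit is the probability of a signal."""
    m = len(Q)
    alive = [0.0] * m
    alive[start] = 1.0  # probability mass still in non-signalling states
    pmf = []
    for _ in range(kmax):
        nxt = [sum(alive[i] * Q[i][j] for i in range(m)) for j in range(m)]
        pmf.append(sum(alive) - sum(nxt))  # mass absorbed (signal) at this step
        alive = nxt
    return pmf


def hazard(pmf):
    """Hazard rate h(k) = P(RL = k) / P(RL >= k)."""
    surv, out = 1.0, []
    for p in pmf:
        out.append(p / surv)
        surv -= p
    return out


# Hypothetical Shewhart-like chart: a single state, signal probability 0.01.
shewhart_hazard = hazard(run_length_pmf([[0.99]], 0, 15))

# Hypothetical two-state Markov-type chart (toy numbers only):
# state 0 signals with probability 0.01, state 1 with probability 0.10.
markov_hazard = hazard(run_length_pmf([[0.6, 0.39], [0.3, 0.6]], 0, 15))
```

With these toy values the Shewhart hazard stays at 0.01, while the Markov-type hazard starts at 0.01 and rises, which is the kind of increasing-hazard-rate behaviour the abstract refers to.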


Keywords: Phase-type distributions; Run length; Statistical process control; Stochastic ordering.


Morais, M.C. and Pacheco, A. (2012). A note on the aging properties of the run length of Markov-type control charts. Sequential Analysis 31, 88-98.

16/04/2012, 15:00 — 16:00 — Room P3.10, Mathematics Building
, Matemática/DCEB, ISA/UTL e CEAUL/UL

Variable space: where statistics and geometry marry. The case of Mahalanobis distances.

The usual way of conceptualizing the graphical representation of a data matrix $X_{n\times p}$ of individuals $\times$ variables is to associate an axis with each variable and, in that Cartesian frame, to represent each individual by a point whose coordinates are given by the row of $X$ corresponding to that individual. The popularity of this representation in the space of individuals ($\mathbb{R}^p$) results, to a large extent, from the fact that it can be visualized for bivariate or trivariate data. However, for a larger number of variables ($p \gt 3$) that advantage no longer exists.

An alternative representation is important in data analysis and modelling. In variable space, each axis corresponds to an individual and each variable is represented by a vector from the origin, defined by the $n$ coordinates of the corresponding matrix column. This representation of the variables in $\mathbb{R}^n$ has the enormous advantage of marrying statistical and geometric concepts, allowing a better understanding of the former. It has solid roots in the French school of data analysis, but its potential is not always exploited.

This talk begins by recalling the geometric concepts corresponding to fundamental indicators of univariate and bivariate statistics (mean, standard deviation, coefficient of variation or correlation coefficient) or of multivariate statistics (exemplified by principal component analysis). The discussion is then deepened in the context of multiple linear regression, whose fundamental concepts (coefficient of determination, the three sums of squares and their fundamental relation) have a geometric interpretation in variable space.

Next, we discuss the usefulness of this geometric representation in the study of Mahalanobis distances, which play a leading role in multivariate statistics. We show how (squared) Mahalanobis distances measure the inclination of the subspace of $\mathbb{R}^n$ spanned by the columns of the centred data matrix, the subspace $\mathcal{C}(X_c)$, relative to the system of axes. In particular, we show that the Mahalanobis distances to the centre, \[D^2_{x_i,\overline{x}}=(x_i-\overline{x})^t S^{-1} (x_i-\overline{x}),\] are a function only of $n$ and of the angle $\theta_i$ between the axis corresponding to individual $i$ and $\mathcal{C}(X_c)$, while the (squared) Mahalanobis distance between two individuals, \[D^2_{x_i,x_j}=(x_i-x_j)^t S^{-1} (x_i-x_j),\] is likewise a function only of $n$ and of the angle between $\mathcal{C}(X_c)$ and the bisector spanned by $e_i-e_j$, where $e_i$ and $e_j$ are the canonical vectors of $\mathbb{R}^n$ associated with the two individuals. Some recent upper bounds and other important properties of these distances (Gath & Hayes, 2006 and Branco & Pires, 2011) are a direct expression of these geometric relations. Although Mahalanobis distances concern the individuals, the geometric concepts associated with them in variable space can be exploited to deepen and extend those results.
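The relation between the distance to the centre and the angle $\theta_i$ can be checked numerically. The sketch below is an illustration only, under two assumptions not stated in the abstract: $S$ is taken to be the sample covariance matrix with divisor $n-1$, and the data are bivariate so that the $2\times 2$ inverse can be written in closed form. Under those assumptions the check is that $D^2_{x_i,\overline{x}} = (n-1)\cos^2\theta_i$, where $\cos\theta_i$ is the length of the orthogonal projection of $e_i$ onto $\mathcal{C}(X_c)$. The function name and the toy data are hypothetical.

```python
import math


def mahalanobis_vs_angle(data):
    """For bivariate data (a list of (x, y) rows) return, for each
    individual i, the pair (D2_i, (n - 1) * cos^2(theta_i))."""
    n = len(data)
    mx = sum(r[0] for r in data) / n
    my = sum(r[1] for r in data) / n
    xc = [(r[0] - mx, r[1] - my) for r in data]  # centred matrix X_c

    # Gram matrix G = X_c^t X_c and its closed-form 2x2 inverse.
    g11 = sum(a * a for a, _ in xc)
    g12 = sum(a * b for a, b in xc)
    g22 = sum(b * b for _, b in xc)
    det = g11 * g22 - g12 * g12
    gi = ((g22 / det, -g12 / det), (-g12 / det, g11 / det))

    out = []
    for a, b in xc:
        # Coefficients c = G^{-1} X_c^t e_i = G^{-1} (centred row i).
        c1 = gi[0][0] * a + gi[0][1] * b
        c2 = gi[1][0] * a + gi[1][1] * b
        # Squared Mahalanobis distance to the centre, with S = G / (n - 1).
        d2 = (n - 1) * (a * c1 + b * c2)
        # Orthogonal projection of e_i onto C(X_c) is X_c c; since
        # ||e_i|| = 1, its length equals cos(theta_i).
        proj = [u * c1 + v * c2 for u, v in xc]
        cos_theta = math.sqrt(sum(p * p for p in proj))
        out.append((d2, (n - 1) * cos_theta ** 2))
    return out


toy_data = [(1, 2), (2, 1), (3, 5), (4, 3), (6, 4)]  # hypothetical values
pairs = mahalanobis_vs_angle(toy_data)
```

Each pair in `pairs` should agree up to rounding error. As a side check, the squared distances sum to $(n-1)p$, here $4 \times 2 = 8$, because the projection matrix onto $\mathcal{C}(X_c)$ has trace $p$.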

26/03/2012, 14:30 — 15:30 — Room P3.10, Mathematics Building
Russell Alpizar-Jara, Research Center in Mathematics and Applications (CIMA-U.E.), Department of Mathematics, University of Évora

An overview of capture-recapture models

Capture-recapture methods have been widely used in the Biological Sciences to estimate population abundance and related demographic parameters (births, deaths, immigration, or emigration). More recently, these models have been used to estimate community dynamics parameters such as species richness, rates of extinction, colonization and turnover, and other metrics that require presence/absence data of species counts. In this presentation, we will use the latter application to illustrate some of the concepts and the underlying theory of capture-recapture models. In particular, we will review basic closed-population models, open-population models, and combinations of the two. We will briefly mention other applications of these models in the Medical, Social and Computer Sciences.

Keywords: Capture-recapture experiments; multinomial and mixture distributions; non-parametric and maximum likelihood estimation; population size estimation.
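As a small illustration of population size estimation in the simplest closed-population setting, the sketch below implements two standard two-sample estimators, Lincoln-Petersen and Chapman's bias-corrected variant. The abstract does not name specific estimators, so this choice is an assumption made for illustration, and the counts in the example are invented.

```python
def lincoln_petersen(n1, n2, m2):
    """Two-sample closed-population estimate of abundance.

    n1: animals caught and marked on the first occasion
    n2: animals caught on the second occasion
    m2: marked animals among the second catch (recaptures)
    """
    if m2 == 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return n1 * n2 / m2


def chapman(n1, n2, m2):
    """Chapman's bias-corrected variant; defined even when m2 == 0."""
    return (n1 + 1) * (n2 + 1) / (m2 + 1) - 1


# Invented example: 100 marked, 80 caught later, 20 of them marked.
n_lp = lincoln_petersen(100, 80, 20)
n_ch = chapman(100, 80, 20)
```

The reasoning behind the first estimator is that the marked fraction in the second sample, `m2 / n2`, should be close to the marked fraction in the population, `n1 / N`, if the population is closed and all animals are equally catchable.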

07/03/2012, 14:30 — 15:30 — Room P3.10, Mathematics Building
, CEAUL - DEIO - FCUL - University of Lisbon

Why we need non-linear time series models and why we are not using them so often

The Wold Decomposition theorem says that, under fairly general conditions, a stationary time series $X_t$ has a unique linear causal representation in terms of uncorrelated random variables. However, the Wold Decomposition theorem gives us a representation, not a model, for $X_t$, in the sense that we can only recover uniquely the moments of $X_t$ up to second order from this representation, unless the input series is a Gaussian sequence. If we look for models for $X_t$, then we should look for such models within the class of convergent Volterra series expansions. If we have to go beyond second order properties, and many real data sets from financial and environmental sciences indicate that we should, then linear models with iid Gaussian input are a very tiny, insignificant fraction of possible models for a stationary time series, corresponding to the first term of the infinite order Volterra expansion. On the other hand, Volterra series expansions are not particularly useful as a possible class of models, as conditions of stationarity and invertibility are hard, if not impossible, to check; therefore they have very limited use as models for time series, unless the input series is observable. From a prediction point of view, the Projection Theorem for Hilbert spaces tells us how to obtain the best linear predictor for $X_{t+k}$ within the linear span of $\{X_t, X_{t-1}, \ldots\}$, but when linear predictors are not sufficiently good, it is not straightforward to find, if possible at all, the best predictor within richer subspaces constructed over $\{X_t, X_{t-1}, \ldots\}$. It is therefore important to look for classes of nonlinear models to improve upon the linear predictor, which are sufficiently general, but at the same time are sufficiently flexible to work with. There are many ways a time series can be nonlinear. As a consequence, there are many classes of nonlinear models to explain such nonlinearities, but their probabilistic characteristics are difficult to study, not to mention the difficulties associated with modeling issues. Likelihood based inference is a particularly difficult issue since, for most nonlinear processes, we cannot even write the likelihood. However, recently there have been very exciting advances in simulation based inferential methods such as sequential Markov Chain Monte Carlo, particle filters and Approximate Bayesian Computation methods for generalized state space models, which we will mention briefly.

22/02/2012, 14:30 — 15:30 — Room P3.10, Mathematics Building
, CEAUL-DEIO- FC - Universidade de Lisboa

How far can M(m)an go?

This seminar addresses the question "What is the longest long jump within the reach of M(m)an, given the current state of the art?" To answer it, we use the crème de la crème, i.e., data collected from the best Olympic athletes in the event, taken from the World Athletics Competitions - Long Jump Men Outdoors database. The approach is based on Extreme Value Theory and its statistical techniques. Only the best performances from the World top lists are used. The final estimate of the potential record, i.e., the upper bound of the long jump event, allows inference about the best possible individual mark under current conditions, both in terms of knowledge of the phenomenon and with respect to the conditions and rules for registering marks in the sport. The current record of 8.95 m is held by Mike Powell (USA), set in Tokyo on 30/08/1991. In Extreme Value terms, the problem is one of estimating the upper endpoint of the support of a distribution in the Gumbel max-domain of attraction.

Keywords: Extreme Values in Sport, Extreme Value Theory, Estimation of the Upper Endpoint of the Support in the Gumbel Domain, Semi-parametric Approach to Statistics of Extremes.

10/02/2012, 11:00 — 12:00 — Room P3.10, Mathematics Building
Patrícia Ferreira, CEMAT - Departamento de Matemática - IST

Misleading signals in joint schemes for the process mean and variance

When the aim is to control the mean and the variance of a process simultaneously, it is common to use a joint scheme. Such a scheme consists of two control charts operating simultaneously, one controlling the process mean and the other the process variance. Using this type of scheme can lead to misleading signals, associated, for example, with the following situations:

  • the process mean is out of control, yet the chart for the variance signals before the chart used to control the mean;
  • the process variance is out of control, but the chart for the mean is the first to signal.

Misleading signals are valid signals that can lead the quality control operator to trigger inappropriate actions to correct a nonexistent cause. It is therefore important to consider the frequency with which these signals occur as a performance measure of joint schemes. This work analyses the performance of joint schemes from the point of view of the probability of a misleading signal, with special emphasis on joint schemes for univariate i.i.d. and autocorrelated processes.
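The probability of a misleading signal has a simple closed form in one special case, sketched below. The assumptions are ours, not the speaker's: both charts are Shewhart-type, and the two chart statistics are independent across and within samples (as for the sample mean and sample variance of i.i.d. normal output), so the two run lengths are independent geometric variables. If the chart that should signal does so with probability `p_right` per sample and the other chart with probability `p_wrong`, the probability that the wrong chart signals strictly first is `p_wrong * (1 - p_right) / (1 - (1 - p_wrong) * (1 - p_right))`. The function name and the numerical values are hypothetical.

```python
def prob_misleading_signal(p_wrong, p_right):
    """P(the 'wrong' chart signals strictly before the 'right' one) for two
    independent Shewhart-type charts with per-sample signal probabilities
    p_wrong and p_right (independent geometric run lengths)."""
    no_signal = (1 - p_wrong) * (1 - p_right)  # neither chart signals
    return p_wrong * (1 - p_right) / (1 - no_signal)


# Hypothetical case: the mean has shifted (its chart signals w.p. 0.10 per
# sample) while the variance chart only gives false alarms (w.p. 0.002).
pms = prob_misleading_signal(p_wrong=0.002, p_right=0.10)

# Sanity check against the defining series
# sum_k P(RL_wrong = k) * P(RL_right > k).
series = sum(
    (1 - 0.002) ** (k - 1) * 0.002 * (1 - 0.10) ** k for k in range(1, 2000)
)
```

For autocorrelated output or Markov-type charts the run lengths are no longer independent geometric variables, so this formula does not apply and the probability has to be computed by other means.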

19/01/2012, 11:00 — 12:00 — Room P3.10, Mathematics Building
Peter Kort, Tilburg University

Strategic Capacity Investment Under Uncertainty

In this talk we consider investment decisions within an uncertain, dynamic and competitive framework. Each investment decision involves determining the timing and the capacity level. In this way we extend the main bulk of real options theory, where the capacity level is given. We consider a monopoly setting as well as a duopoly setting. Our main results are the following. In the duopoly setting we provide a fully dynamic analysis of entry deterrence/accommodation strategies. Contrary to the seminal industrial organization analyses that are based on static models, we find that entry can only be deterred temporarily. To keep its monopoly position as long as possible, the first investor overinvests in capacity. In very uncertain economic environments the first investor eventually ends up being the largest firm in the market. If uncertainty is moderate, a reduced value of waiting implies that the preemption mechanism forces the first investor to invest so soon that a large capacity cannot be afforded. It then ends up with a capacity level lower than that of the second investor.

04/05/2011, 14:00 — 15:00 — Room P4.35, Mathematics Building
Verena Hagspiel, Tilburg University, Netherlands

Production Flexibility and Capacity Investment under Demand Uncertainty

The paper takes a real option approach to consider optimal capacity investment decisions under uncertainty. Besides the timing of the investment, the firm also has to decide on the capacity level. Concerning the production decision, we study a flexible and an inflexible scenario. The flexible firm can costlessly adjust production over time with the capacity level as the upper bound, while the inflexible firm fixes production at capacity level from the moment of investment onwards. We find that the flexible firm invests in higher capacity than the inflexible firm, where the capacity difference increases with uncertainty. For the flexible firm the initial occupation rate can be quite low, especially when investment costs are concave and the economic environment is uncertain. As to the timing of the investment, there are two contrary effects. First, the flexible firm has an incentive to invest earlier, because flexibility raises the project value. Second, the flexible firm has an incentive to invest later, because costs are larger due to the higher capacity level. The latter effect dominates in highly uncertain economic environments.

01/03/2011, 11:00 — 12:00 — Room P3.10, Mathematics Building
Christine Fricker, INRIA, France

Performance of passive optical networks

We introduce PONs (Passive Optical Networks), which are designed to provide high speed access to users via fiber links. The problem for the OLT (Optical Line Terminal) is to dynamically share the wavelength bandwidth among the ONUs (Optical Network Units). With an optimal algorithm, the system can be modeled as a relatively standard polling system. Due to technological constraints, the number of servers that can visit one queue at the same time in the polling system is limited. The performance of the system is directly related to the stability condition of the polling model, which is unknown in general. A mean field approach provides a limit stability condition when the system gets large.

07/10/2010, 16:30 — 17:30 — Room P3.10, Mathematics Building
Magnus Fontes, Lund University

Mathematics - A Catalyst for Innovation - Giving European Industry an Edge

We will discuss the role of Mathematics in Industry and in innovation processes. The focus will be European and we will look at good examples provided, e.g., by the experiences of the network European Consortium for Mathematics in Industry (ECMI). I will also present the ongoing ESF Forward Look "Mathematics and Industry" and discuss possible future developments on a European scale.

21/07/2010, 15:00 — 16:00 — Room P3.10, Mathematics Building
Graciela Boente, Universidad de Buenos Aires and CONICET, Argentina

Robust inference in generalized linear models with missing responses

The generalized linear model (GLM) (McCullagh and Nelder, 1989) is a popular technique for modelling a wide variety of data. It assumes that the observations are independent and that the conditional distribution of $y|x$ belongs to the canonical exponential family. In this situation, the mean $E(y|x)$ is modelled linearly through a known link function. Robust procedures for generalized linear models have been considered, among others, by Stefanski et al. (1986), Künsch et al. (1989), Bianco and Yohai (1996), Cantoni and Ronchetti (2001), Croux and Haesbroeck (2002) and Bianco et al. (2005). Recently, robust tests for the regression parameter under a logistic model were considered by Bianco and Martínez (2009).

In practice, some response variables may be missing, by design (as in two-stage studies) or by happenstance. As is well known, the methods described above are designed for complete data sets, and problems arise when responses may be missing while covariates are completely observed. Even though there are many situations in which both the response and the explanatory variables are missing, we will focus only on the case where missing data occur in the responses. Indeed, missingness of responses is very common in opinion polls, market research surveys, mail enquiries, socio-economic investigations, medical studies and other scientific experiments where the explanatory variables can be controlled. This pattern is common, for example, in the scheme of double sampling proposed by Neyman (1938). Hence, we will be interested in robust inference when the response variable may have missing observations but the covariate $x$ is totally observed.

In the regression setting with missing data, a common method is to impute the incomplete observations and then estimate the conditional or unconditional mean of the response variable with the completed sample. The methods considered include linear regression (Yates, 1933), kernel smoothing (Cheng, 1994; Chu and Cheng, 1995), nearest neighbor imputation (Chen and Shao, 2000), semiparametric estimation (Wang et al., 2004; Wang and Sun, 2007), nonparametric multiple imputation (Aerts et al., 2002; González-Manteiga and Pérez-Gonzalez, 2004) and empirical likelihood over the imputed values (Wang and Rao, 2002), among others. All these proposals are very sensitive to anomalous observations since they are based on least squares approaches.

In this talk, we introduce a robust procedure to estimate the regression parameter under a GLM, which includes, when there are no missing data, the family of estimators previously studied. It is shown that the robust estimates of the regression parameter are root-$n$ consistent and asymptotically normally distributed. A robust procedure to test simple hypotheses on the regression parameter is also considered. The finite sample properties of the proposed procedure are investigated through a Monte Carlo study in which the robust test is also compared with nonrobust alternatives.

01/06/2010, 16:00 — 17:00 — Room P4.35, Mathematics Building
Ana Pires, Universidade Técnica de Lisboa - Instituto Superior Técnico and CEMAT

CSI: are Mendel's data "Too Good to be True?"

Gregor Mendel (1822-1884) is almost unanimously recognized as the founder of modern genetics. However, long ago, a shadow of doubt was cast on his integrity by another eminent scientist, the statistician and geneticist, Sir Ronald Fisher (1890-1962), who questioned the honesty of the data that form the core of Mendel's work. This issue, nowadays called "the Mendel-Fisher controversy", can be traced back to 1911, when Fisher first presented his doubts about Mendel's results, though he only published a paper with his analysis of Mendel's data in 1936.

A large number of papers have been published about this controversy culminating with the publication in 2008 of a book (Franklin et al., "Ending the Mendel-Fisher controversy"), aiming at ending the issue, definitely rehabilitating Mendel's image. However, quoting from Franklin et al., "the issue of the `too good to be true' aspect of Mendel's data found by Fisher still stands".

We have submitted Mendel's data and Fisher's statistical analysis to extensive computations and simulations, attempting to discover a hidden explanation or hint that could help find an answer to the questions: is Fisher right or wrong, and, if Fisher is right, is there any reasonable explanation for the "too good to be true" other than deliberate fraud? In this talk some results of this investigation and the conclusions obtained will be presented.
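The basic ingredient of a "too good to be true" argument is a chi-square goodness-of-fit statistic that is suspiciously small, so that the left tail probability (rather than the usual right tail) is of interest. The sketch below shows that computation for a single experiment with one degree of freedom. The counts used, 5474 round and 1850 wrinkled seeds against a 3:1 ratio, are Mendel's frequently quoted seed-shape figures, brought in here as an assumption for illustration and not taken from the talk. The function names are hypothetical.

```python
import math


def chi2_goodness_of_fit(observed, ratios):
    """Pearson chi-square statistic for observed counts against the
    expected proportions implied by `ratios` (e.g. [3, 1])."""
    total = sum(observed)
    weight = sum(ratios)
    expected = [total * r / weight for r in ratios]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_cdf_1df(x):
    """P(chi-square with 1 d.f. <= x), which equals erf(sqrt(x / 2))."""
    return math.erf(math.sqrt(x / 2))


stat = chi2_goodness_of_fit([5474, 1850], [3, 1])
left_tail = chi2_cdf_1df(stat)  # probability of a fit at least this close
```

A single experiment like this one is unremarkable on its own. The controversy concerns the aggregate over many experiments, where the combined statistic is compared with a chi-square distribution with correspondingly more degrees of freedom.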

18/05/2010, 16:00 — 17:00 — Room P4.35, Mathematics Building
Alex Trindade, Texas Tech University

Fast and Accurate Inference for the Smoothing Parameter in Semiparametric Models

We adapt the method developed in Paige, Trindade, and Fernando (2009) in order to make approximate inference on optimal smoothing parameters for penalized spline and partially linear models. The method is akin to a parametric bootstrap where Monte Carlo simulation is replaced by saddlepoint approximation, and is applicable whenever the underlying estimator can be expressed as the root of an estimating equation that is a quadratic form in normal random variables. This is the case under a variety of common optimality criteria such as ML, REML, GCV, and AIC. We apply the method to some well-known datasets in the literature, and find that under the ML and REML criteria it delivers a performance that is nearly exact, with computational speeds that are at least an order of magnitude faster than exact methods. Perhaps most importantly, the proposed method also offers a computationally feasible alternative where no known exact methods exist, e.g. GCV and AIC.

04/05/2010, 16:00 — 17:00 — Room P3.31, Mathematics Building
Rui Santos, Instituto Politécnico de Leiria

Probability Calculus - the construction of Pacheco D’Amorim in 1914

At the end of the 19th century, the classical definition of Probability and its extension to the continuous case were too restrictive, and some geometrical applications, based on ingenious interpretations of the Bernoulli-Laplace principle of insufficient reason, led to several paradoxes. David Hilbert, in his celebrated address at the International Congress of Mathematicians of 1900, included the axiomatization of Probability in his list of 23 important unsolved problems. Only in 1933 did Kolmogorov lay down a rigorous setup for Probability, inspired by Fréchet’s idea of using Measure Theory. But before this, some other efforts to build up a proper axiomatization of Probability deserve to be more widely credited. Among those, the construction of Diogo Pacheco d’Amorim, in his 1914 doctoral thesis, is one of the most interesting. His discussion of a standard model, based on the idea of random choice instead of the concept of probability itself, seems limited. However, his final discussion on how to use the law of large numbers and the central limit theorem to obtain an objective appraisal of whether sampling made by others, or even by a mechanical device, is indistinguishable from a random choice made by oneself is impressive, since it anticipates the ideas of Monte Carlo by almost 30 years.

29/03/2010, 16:00 — 17:00 — Room P4.35, Mathematics Building
Maria Eduarda Silva, Universidade do Porto

Integer-valued AR models

During the last decades there has been considerable interest in integer-valued time series models, and a large volume of work is now available in specialized monographs. Motivation to study discrete data models comes from the need to account for the discrete nature of certain data sets, often counts of events, objects or individuals. Examples of applications can be found in the analysis of time series of count data in many areas. Among the most successful integer-valued time series models proposed in the literature is the INteger-valued AutoRegressive model of order 1 (INAR(1)). In this talk the statistical and probabilistic properties of INAR(1) models are reviewed.
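A minimal simulation sketch of an INAR(1) process may help fix ideas. It assumes the common formulation $X_t = \alpha \circ X_{t-1} + \varepsilon_t$, where $\alpha \circ X$ denotes binomial thinning (each of the $X$ individuals survives independently with probability $\alpha$) and the innovations $\varepsilon_t$ are Poisson with mean $\lambda$. The Poisson choice, the parameter values and the function names are assumptions made for illustration; the abstract does not fix them. Under these assumptions the stationary mean is $\lambda/(1-\alpha)$.

```python
import math
import random


def poisson(rng, lam):
    """Poisson draw via Knuth's multiplication method (fine for small lam)."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate_inar1(alpha, lam, n, seed=0):
    """Simulate n steps of X_t = alpha o X_{t-1} + eps_t, eps_t ~ Poisson(lam)."""
    rng = random.Random(seed)
    x = poisson(rng, lam / (1 - alpha))  # start from the stationary marginal
    path = []
    for _ in range(n):
        # Binomial thinning: each of the x individuals survives w.p. alpha.
        survivors = sum(rng.random() < alpha for _ in range(x))
        x = survivors + poisson(rng, lam)
        path.append(x)
    return path


path = simulate_inar1(alpha=0.5, lam=2.0, n=20000, seed=1)
sample_mean = sum(path) / len(path)  # should be near lam / (1 - alpha) = 4
```

The path consists of non-negative integers by construction, which is the point of replacing the scalar multiplication of a Gaussian AR(1) model with thinning.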

11/03/2010, 16:00 — 17:00 — Amphitheatre Pa2, Mathematics Building
Sujit Samanta, Universidade Técnica de Lisboa - Instituto Superior Técnico e CEMAT

Analysis of stationary discrete-time GI/D-MSP/1 queue with finite and infinite buffers

This paper considers a single-server queueing model with finite and infinite buffers in which customers arrive according to a discrete-time renewal process. The customers are served one at a time under a discrete-time Markovian service process (D-MSP). This service process is similar to the discrete-time Markovian arrival process (D-MAP), with arrivals replaced by service completions. Using the imbedded Markov chain technique and the matrix-geometric method, we obtain the system-length distribution at a prearrival epoch. We also provide the steady-state system-length distribution at an arbitrary epoch by using the supplementary variable technique and the classical argument based on renewal theory. The distribution of the actual waiting time in the queue (measured in slots) is also analyzed. Further, we derive the coefficient of correlation of the lagged interdeparture intervals. Moreover, computational experience with a variety of numerical results, in the form of tables and graphs, is discussed.

11/02/2010, 14:00 — 15:00 — Room P2, Mathematics Building, IST
Paulo Rodrigues, Banco de Portugal and Universidade Nova de Lisboa

Robust Inference in Predictive Regressions

In this paper we discuss new tests for predictability which are inspired by the work of Vogelsang (1998) on testing for trend. The proposed tests use, by design, the same critical values irrespective of whether the predictor is I(0) or I(1), and are therefore capable of detecting a more general set of alternatives than is presently possible with available procedures (exceptions being the tests of Deo and Chen, 2008 and Maynard and Shimotsu, 2009). Numerical evidence suggests that our proposed procedures have good finite sample performance, which, coupled with their simplicity of application, makes them appealing approaches for empirical research and useful alternatives to available procedures.

21/01/2010, 14:00 — 15:00 — Room P3.10, Mathematics Building
Daniela Rodriguez, Universidad Buenos Aires

Nonparametric estimation on Riemannian manifolds

In many situations, the random variables take values in a Riemannian manifold $(M, g)$ instead of $\mathbb{R}^d$, and this structure needs to be taken into account when we generate estimation procedures. For the nonparametric regression model, we study two families of robust estimators for the regression function when the explanatory variables take values in a Riemannian manifold.

In this talk, we will give a brief introduction to the geometric objects needed to define nonparametric estimators adapted to a manifold. We discuss the classical proposals and introduce two families of robust estimators for the regression function. We show the asymptotic properties obtained for both proposals. Finally, through a simulation study, we compare the behavior of the robust estimators against the classical alternatives. This is joint work with Guillermo Henry.

15/01/2010, 14:00 — 15:00 — Room P12, Mathematics Building
Kuno Huisman, Tilburg University

Strategic Capacity Investment Under Uncertainty

Contrary to most papers in the literature on investment under uncertainty, we study models that capture not only the timing but also the size of the investment. We consider a monopoly setting as well as a duopoly setting and compare the results with the standard models in which the firms do not have the capacity choice. Our main results are the following. First, for low uncertainty values the follower chooses a higher capacity than the leader, and for high uncertainty values the leader chooses a higher capacity. Second, compared to the model without capacity choice, the monopolist and the follower invest later in a higher capacity for higher values of uncertainty. However, the leader will invest earlier in a higher capacity for higher values of uncertainty. The reverse results apply for lower values of uncertainty.
