###
10/12/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Ana Luísa Papoila, *CEAUL, Faculdade de Ciências Médicas da UNL*

###
Statistical methods in cancer research

Understanding trends, especially long-term trends, in the incidence of diseases, particularly cancer, is a major concern of epidemiologists. Several statistical methodologies are available to study cancer incidence rates. Age-Period-Cohort (APC) models may be used to study the variation of incidence rates through time. They analyse age-specific incidence according to three time scales: age at diagnosis (age), date of diagnosis (period) and date of birth (cohort). Classic and Bayesian APC models are available. Understanding geographical variations in health, particularly in small areas, has also become extremely important. Several types of spatial epidemiology studies are available, such as disease mapping, usually used in ecological studies. The geographic mapping of diseases is very important in the definition of policies in oncology, namely on the allocation of resources, and in the identification of clusters with high incidence of disease. Geographical association studies, which allow the identification of risk factors associated with the spatial variation of a disease, are also indispensable and deserve special attention in disease incidence studies. For this purpose, Bayesian hierarchical models are a common choice.

To quantify cancer survival in the absence of other causes of death, relative survival is also considered in cancer population-based studies. Several approaches to estimate regression models for relative survival using the method of maximum likelihood are available.

Finally, having an idea of the future burden of cancer is also of the utmost importance, namely for planning health services. This is why projections of cancer incidence are so important. Several projection models that differ according to cancer incidence trends are available.

The aim of this study is to investigate spatial and temporal trends in the incidence of colorectal cancer, to estimate relative survival and to make projections. It is a retrospective population-based study that considers data on all colorectal cancers registered by the Southern Portuguese Cancer Registry (ROR Sul) between 1998 and 2006.

###
03/12/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Ana Freitas, *Instituto Superior de Educação e Ciências; CEMAT*

###
Hypothesis tests for comparing recombination probabilities between two organisms with a common set of markers

Genetic recombination, although it has long been studied, remains a highly topical subject because it is closely related to the evolution and diversification of species. It is a process that generates new combinations of genes on which natural selection then acts. Moreover, estimates of recombination probabilities are the basis for constructing the genetic maps that give us a picture of the makeup of the various chromosomes in each species. The aim of this work is to present a method, exact if possible, for estimating the variance-covariance structure of the joint estimators of the genetic recombination probabilities, and to propose hypothesis tests for comparing recombination probabilities between two groups of organisms that share a set of markers. The proposed methodologies are applied to a data set on a species of eucalyptus (*Eucalyptus globulus*) resulting from a backcross. The results allow us to conclude that the proposed methodologies are appropriate for comparing recombination probabilities between the two sexes of a species and may provide an alternative to the tests usually used to address this type of question.

###
19/11/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Mário A. T. Figueiredo, *Instituto de Telecomunicações, IST*

###
Network Inference from Co-Occurrences

Inferring network structures is a central problem arising in many fields of science and technology, including communication systems, biology, sociology, and neuroscience. In this talk, after briefly reviewing several network inference problems, we will focus on that of inferring network structure from "co-occurrence" observations. These observations identify which network components (e.g., switches, routers, genes) co-occur in a path, but do not indicate the order in which they occur in that path. Without order information, the number of structures that are data-consistent grows exponentially with the network size. Yet, the basic engineering/evolutionary principles underlying most networks strongly suggest that not all data-consistent structures are equally likely. In particular, nodes that often co-occur are probably closer than nodes that rarely co-occur. This observation suggests modeling co-occurrence observations as independent realizations of a random walk on the network, subjected to random permutations. Treating these permutations as missing data allows deriving an expectation–maximization (EM) algorithm for estimating the random walk parameters. The model and EM algorithm significantly simplify the problem, but the computational complexity still grows exponentially in the length of each path. We thus propose a polynomial-time Monte Carlo EM algorithm based on importance sampling and derive conditions that ensure convergence of the algorithm with high probability. Finally, we report simulations and experiments with Internet measurements and inference of biological networks that demonstrate the performance of this approach. The work reported in this talk was done in collaboration with Michael Rabbat (McGill University, Canada) and Robert D. Nowak (University of Wisconsin, USA).
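
The observation model described in the abstract (a random walk whose visiting order is hidden by a random permutation) can be sketched as follows. This is only an illustration of how such data might be generated, not the speakers' code: the four-node network, its transition probabilities and the function name `sample_cooccurrence` are hypothetical.

```python
import random


def sample_cooccurrence(transition, start, length, rng):
    """Walk `length` nodes on the network, then hide the visiting order."""
    path = [start]
    while len(path) < length:
        # Pick the next node according to the current node's transition row.
        neighbours, weights = zip(*transition[path[-1]].items())
        path.append(rng.choices(neighbours, weights=weights)[0])
    observation = path[:]
    rng.shuffle(observation)  # the random permutation is the missing data
    return path, observation


# Hypothetical network: each row lists the transition probabilities of a node.
transition = {
    "a": {"b": 0.7, "c": 0.3},
    "b": {"c": 0.6, "d": 0.4},
    "c": {"d": 1.0},
    "d": {"a": 1.0},
}

rng = random.Random(0)
path, observation = sample_cooccurrence(transition, "a", 4, rng)
print("hidden ordered path:", path)
print("co-occurrence seen :", observation)
```

An inference method only sees `observation`; recovering `transition` from many such unordered sets is the estimation problem the talk addresses.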

###
05/11/2014, 14:30 — 15:30 — Sala P3.10, Pavilhão de Matemática

José Brázio, *Instituto de Telecomunicações*

###
A Stochastic Model for Throughput in Wireless Data Networks with Single-Frequency Operation and Partial Connectivity under Contention-Based Multiple Access Modes

Wireless data (packet) networks operating in a common frequency channel, as happens for example with a Basic Service Set in the IEEE 802.11 (WiFi) system, are subject to intrinsic impairments. One such major impairment results from the broadcast nature of the radio channel: if two different transmissions arrive at a receiver with any time overlap, they will interfere destructively and thus one or both of the corresponding packets will not be correctly received (packet collision), thus wasting radio channel transmission time and possibly requiring a retransmission of the original packet(s).

In order to achieve a better utilization of the scarce radio channel resource, stations in wireless networks use multiple access algorithms to attempt to usefully coordinate their radio transmissions. One example is given by Carrier Sense Multiple Access (CSMA), used as a basis for the sharing of the radio channel in the WiFi system, which establishes that a station should not start a new packet transmission if it can hear any other station transmitting. In a network with radio connectivity between any pair of stations (fully connected) and negligible propagation delays, such an algorithm succeeds at completely preventing collisions. That is, however, not the case if there exist pairs of stations that cannot directly hear each other (partial connectivity). Many other multiple access algorithms have been proposed and studied.

This talk presents a stochastic model for the study of throughput (i.e., the long-term fraction of time that the radio channel is occupied with successful packet transmissions) in the class of networks described above. It starts with a short description of the communication functions and structure of the system under study and of the class of multiple access algorithms considered. It then presents a Markovian model for the time evolution of the packet transmissions taking place in the network, together with a result on the existence of a product form for its stationary probabilities. The next step is to show how the desired throughputs can be obtained from the steady-state probabilities of this process and the average durations of the successful packet transmissions. The latter are obtained from the times to absorption in a set of auxiliary (absorbing) derived Markov chains. Finally, time permitting, reference will be made to results concerning the insensitivity of product-form steady-state solutions, when they exist, to the distribution of packet lengths and retransmission time intervals, by means of a Generalized Semi-Markov Process representation.

###
24/10/2014, 11:30 — 12:30 — Sala P4.35, Pavilhão de Matemática

Patrícia Ferreira Ramos, *CEMAT-IST, UAL*

###
On the misleading signals in simultaneous schemes for the mean vector and covariance matrix of multivariate i.i.d. output

The performance of a product often depends on several quality characteristics. Simultaneous schemes for the process mean vector and the covariance matrix are essential to determine if unusual variation in the location and dispersion of a multivariate normal vector of quality characteristics has occurred.

Misleading signals (MS) are likely to happen while using such simultaneous schemes and correspond to valid signals that lead to a misinterpretation of a shift in mean vector (resp. covariance matrix) as a shift in covariance matrix (resp. mean vector).

This paper focuses on numerical illustrations that show that MS are fairly frequent, and on the use of stochastic ordering to qualitatively assess the impact of changes in the mean vector and covariance matrix on the probabilities of misleading signals in simultaneous schemes for these parameters while dealing with multivariate normal i.i.d. output.

(Joint work with: Manuel Cabral Morais, António Pacheco, CEMAT-IST; Wolfgang Schmid, European University Viadrina.)

###
08/10/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Manuel Cabral Morais, *CEMAT-IST*

###
On hitting times for Markov time series of counts with applications to quality control

Examples of time series of counts arise in several areas, for instance in epidemiology, industry, insurance and network analysis. Several time series models for these counts have been proposed and some are based on the binomial thinning operation, namely the integer-valued autoregressive (INAR) model, which mimics the structure and the autocorrelation function of the autoregressive (AR) model.

The detection of shifts in the mean of an INAR process is a recent research subject and it can be done by using quality control charts. Underlying the performance analysis of these charts, there is an indisputably popular measure: the run length (RL), the number of samples until a signal is triggered by the chart. Since a signal is given as soon as the control statistic falls outside the control limits, the RL is nothing but a hitting time.
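
The thinning construction and the hitting-time view of the run length can be illustrated with a small simulation. The sketch below assumes a Poisson INAR(1) process, $X_t = \alpha \circ X_{t-1} + \varepsilon_t$ with Poisson($\lambda$) innovations, and a hypothetical one-sided chart that signals as soon as $X_t$ exceeds an upper control limit `ucl`. The function names and parameter values are illustrative and are not taken from the paper.

```python
import math
import random


def poisson(lam, rng):
    """Draw a Poisson(lam) variate with Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, rng.random()
    while p > threshold:
        k += 1
        p *= rng.random()
    return k


def thin(count, alpha, rng):
    """Binomial thinning: each of `count` units survives with probability alpha."""
    return sum(rng.random() < alpha for _ in range(count))


def run_length(alpha, lam, ucl, rng, max_steps=100_000):
    """Number of samples until X_t first exceeds ucl (a hitting time)."""
    # Start from the stationary marginal, which is Poisson(lam / (1 - alpha)).
    x = poisson(lam / (1 - alpha), rng)
    for t in range(1, max_steps + 1):
        x = thin(x, alpha, rng) + poisson(lam, rng)
        if x > ucl:
            return t
    return max_steps  # censored: no signal within max_steps


rng = random.Random(1)
in_control = [run_length(0.5, 2.0, 9, rng) for _ in range(200)]
shifted = [run_length(0.5, 3.0, 9, rng) for _ in range(200)]
print("average RL, in control:", sum(in_control) / len(in_control))
print("average RL, after an upward shift in lam:", sum(shifted) / len(shifted))
```

Averaging many simulated run lengths gives a crude estimate of the average run length; the talk studies the full distribution of this hitting time through stochastic ordering rather than by simulation.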

In this paper, we use stochastic ordering to assess: the ageing properties of the RL of charts for the process mean of Poisson INAR(1) output; the impact of shifts in model parameters on this RL. We also explore the implications of all these properties, thus casting interesting light on this hitting time for a Markov time series of counts.

(Joint work with António Pacheco, CEMAT-IST.)

###
24/09/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Nuno Moreira, *Instituto Português do Mar e da Atmosfera (IPMA)*

###
Weather forecasts: more than probabilities?

Weather forecasting depends on the availability of observations of physical variables (air temperature, relative humidity, wind, …) and on the ability to predict how these variables evolve over time. It also depends on other meteorological parameters (convergence, vorticity, stability indices, …) which, although computed indirectly rather than measured directly, make it possible to build the future scenario of the atmosphere over a horizon of days to weeks.

The forecasting of meteorological parameters relies on running numerical prediction models on high-performance supercomputers, available both internationally and nationally. At the European level, this work is carried out by the European Centre for Medium-Range Weather Forecasts (ECMWF), which provides, twice a day and for the whole globe, deterministic results with a spatial resolution of 16 km up to 10 days ahead and probabilistic results with a spatial resolution of 64 km up to 15 days ahead. In Portugal, limited-area numerical models are run with a spatial resolution of 9 km up to 72 hours ahead (the ALADIN model) and of 2.5 km up to 48 hours ahead (the AROME model), the latter over 3 separate domains: mainland Portugal, Madeira and the Azores.

Whereas deterministic forecasts give a single solution for a future instant, probabilistic forecasts make it possible to consider ranges of variation of the various meteorological parameters, which is particularly important for parameters that are harder to predict, such as precipitation. The variations in the meteorological parameters result from perturbations of the initial conditions of the numerical models, which seek to represent the influence of the errors present in the meteorological observations.

Accordingly, weather forecasts in probabilistic format draw on operationally available parameters and products such as: probability of occurrence, *ensemble* mean, spread, *shift of tail*, meteogram, *Extreme Forecast Index*, *Spaghetti*, *Clusters*, … For longer lead times (weeks to months), and partly on an experimental basis, anomalies (relative to past periods, i.e. climatology) are also used for some parameters, such as precipitation, air temperature, sea water temperature and atmospheric pressure at mean sea level.

###
17/09/2014, 12:00 — 13:00 — Sala P3.10, Pavilhão de Matemática

Cláudia Pascoal, *CEMAT-IST*

###
Contributions to Variable Selection and Robust Anomaly Detection in Telecommunications

Over the years, we have witnessed an incredibly high level of technological development in which the Internet plays the leading role. The Internet has brought not only benefits but also new threats, expressed as anomalies/outliers. Consequently, new and improved outlier detection methodologies need to be developed. Accordingly, we propose an anomaly detection method that combines a robust variable selection method and a robust outlier detection procedure based on Principal Component Analysis.

Our method was evaluated using a data set obtained from a network scenario capable of producing a perfect ground-truth under real (but controlled) traffic conditions. The robust variable selection step was essential to eliminate redundant and irrelevant variables that were deteriorating the performance of the anomaly detector. The variable selection methods we considered use a filter strategy based on Mutual Information and Entropy for which we have developed robust estimators. The filter methods incorporate a redundancy component which tries to capture overlaps among variables.

The performance of eight variable selection methods was studied under a theoretical framework that allows reliable comparisons among them, determining the true/theoretical variable ordering under specific evaluation scenarios, and unveiled problems in the construction of the associated objective functions. Our proposal, maxMIFS, which is associated with a simple objective function, proved to be unaffected by these problems and achieved outstanding results. For these reasons, it was chosen for the preprocessing step. With this approach, the results improved substantially and the main objective of this work was fulfilled: improving the detection of anomalies in Internet traffic flows.

###
28/05/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Eunice Carrasquinha, *CEMAT, IST*

###
Seeing, hearing, doing multivariate statistics

Signal processing is an important task nowadays
that arises in various areas such as engineering and applied
mathematics. A signal represents time-varying or spatially varying
physical quantities. Signals of importance can include sound,
electromagnetic radiation, images, telecommunication transmission
signals, and many others. A signal carries information, and the
objective of signal processing is to extract useful information
carried by the signal. The received signal is usually disturbed by
electrical, atmospheric or deliberate interferences. Due to the
random nature of the signal, statistical techniques play an
important role in analyzing the signal.

There are many techniques used to analyze
these types of data, depending on the focus or research question of
the study. Some of these techniques are Principal Component
Analysis (PCA) and Fourier transform, in particular discrete
Fourier transform (DFT). The main goal in this work is to explore
the relations between PCA and other mathematical transforms, based
on Toeplitz and circulant matrices. In this sense, the proposed
method relates the theory behind the Fourier transform through the
Toeplitz and circulant matrices and the PCA. To illustrate the
methodology we will consider sounds and images.
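
One standard fact behind the link between the DFT and circulant matrices is that the Fourier vectors are eigenvectors of every circulant matrix, with eigenvalues given by the DFT of its first column. Since a symmetric circulant matrix can play the role of a covariance matrix, its principal components are then Fourier modes. The sketch below checks this numerically on a small hypothetical example; it illustrates the general fact only and is not the method proposed in the talk.

```python
import cmath


def circulant(first_column):
    """Circulant matrix with entries C[j][k] = c[(j - k) mod n]."""
    n = len(first_column)
    return [[first_column[(j - k) % n] for k in range(n)] for j in range(n)]


def dft(values):
    """Discrete Fourier transform: sum over l of c[l] * exp(-2*pi*i*m*l/n)."""
    n = len(values)
    return [
        sum(values[l] * cmath.exp(-2j * cmath.pi * m * l / n) for l in range(n))
        for m in range(n)
    ]


def fourier_vector(m, n):
    """m-th Fourier vector, with entries exp(2*pi*i*m*k/n)."""
    return [cmath.exp(2j * cmath.pi * m * k / n) for k in range(n)]


def matvec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


# Hypothetical symmetric first column (c[1] == c[3]), so eigenvalues are real.
c = [2.0, 1.0, 0.0, 1.0]
C = circulant(c)
eigenvalues = dft(c)

# For each Fourier vector v_m, check that C v_m equals eigenvalue_m * v_m.
residuals = []
for m in range(len(c)):
    v = fourier_vector(m, len(c))
    Cv = matvec(C, v)
    residuals.append(max(abs(y - eigenvalues[m] * x) for y, x in zip(Cv, v)))

print("largest eigen-equation residual:", max(residuals))
```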

Keywords: Circulant Matrix, Fourier Transform,
Principal Component Analysis, Signal Processing, Toeplitz
Matrix.

###
14/05/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Tiago Matos, *Celfocus: Telco & Media*

###
Mathematics in consultancy

Major telecommunications companies currently hold complex
systems, with varying interdependencies and still remarkable
reliability. The constant development and changes in their core,
along with different types of users involved in the use of these
systems, make them extremely interesting. The maintenance, the
service assurance and the positive evolution of this network
depend on a deep understanding and control of these complex
systems. This knowledge, which refers to different interactions
between internal and exogenous systems, allows you, for instance,
to control and increase the speed of problem solving, to correctly
approach trouble tickets and to detect the root of the
problems.

To address these problems, the deconstruction of an error
safeguarding and fuzzy logic has been implemented, emphasizing the
control, the precision and the planning of the systems' behaviour.
Statistics and Quality Control brought another perspective to
approach these issues. The first results, obtained from field
operations and with a direct influence on the user's experience,
are the focus of this presentation.

###
30/04/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Maria Kulikova, *CEMAT, Instituto Superior Técnico, Universidade de Lisboa*

###
Estimating Adaptive Market Efficiency Using the Kalman Filter

This paper addresses the adaptive market hypothesis (AMH), which
suggests that market efficiency is not a stable property, but
rather that it evolves with time. The test of evolving efficiency
(TEE) investigates the efficiency of a particular market by using a
multi-factor model with time-varying coefficients and GARCH errors.
The model is a variant of the stochastic GARCH in Mean (GARCH-M)
proposed in 1990, which tests for market efficiency in an absolute
sense, i.e. by assuming that market efficiency is unchanged over
time. To resolve this problem, the TEE extends all previous tests
and provides a mechanism for observing the market learning process
by estimating the changes in market efficiency over time. Both
stochastic GARCH-M and TEE models are estimated using Kalman
filtering techniques. The contribution of this paper is
two-fold:

- we explain in detail the quasi-maximum likelihood estimation
(QMLE) procedure based on the standard Kalman filter applied to the
stochastic GARCH-M and TEE models;
- we estimate the changes in the level of market efficiency in
three markets over a period that includes the financial markets
crisis of 2007/2008.

The three markets are specifically chosen to reflect a developed
(London LSE), mature emerging (Johannesburg JSE) and immature
emerging market (Nairobi NSE) perspective. Our empirical study
suggests that, in spite of the financial crisis, all three markets
maintained their pre-crisis level of weak-form efficiency.

###
09/04/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Nélson Antunes, *Universidade do Algarve and CEMAT*

###
Stochastic modeling in Communications Networks

Research in stochastic modeling is strongly influenced by
applications in diverse domains. Communication networks constitute
a lively field that motivates the study of new stochastic models and
sometimes the development of new methods of analysis to understand
the overall functioning of these complex systems. This talk
presents some stochastic models found in communication
networks from the perspective of the speaker's work over the last
decade.

###
02/04/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Gonçalo Jacinto, *FCT/DMAT of Évora University and CIMA-UE*

###
Traffic Estimation of an M/G/1 Queue Using Probes

The huge growth of the Internet, together with the appearance of
new multimedia applications with high traffic demands, gives an
important role to the monitoring of Internet traffic
for quality of service assessment.

In this context, Internet probing has been a subject of great
interest for researchers, since it allows Internet performance to be
measured by sending controlled probe packets into the network,
whose observed performance can be used to estimate the
characteristics of the original traffic.

In this work we consider the estimation of the arrival rate and
the service time moments of an Internet router modelled as an M/G/1
queue with probing. The probe inter-arrival times are i.i.d. and
probe service times follow a general positive distribution. The
only observations used are the arrival times, service times and
departure times of probes. We derive the main equations from which
the quantities of interest can be estimated. Two particular probe
arrivals, deterministic and Poisson, are investigated.
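
The observation setting can be sketched with a short simulation: background packets arrive as a Poisson process, probes are injected at fixed intervals (the deterministic case), and only the probes' arrival, service and departure times are recorded. This is a hypothetical illustration of the data available to an estimator, not the estimation procedure derived in the work. It assumes a first-in-first-out single server and, as one possible instance of the general service distribution, exponential background service times; all names and parameter values are illustrative.

```python
import random


def simulate_probes(rate, mean_service, probe_gap, probe_service, horizon, rng):
    """Return (arrival, service, departure) for each probe in a FIFO queue."""
    events = []
    # Background traffic: Poisson arrivals with exponential service times.
    t = rng.expovariate(rate)
    while t < horizon:
        events.append((t, rng.expovariate(1.0 / mean_service), False))
        t += rng.expovariate(rate)
    # Probe traffic: deterministic inter-arrival times, fixed service time.
    t = probe_gap
    while t < horizon:
        events.append((t, probe_service, True))
        t += probe_gap
    events.sort()

    probes = []
    last_departure = 0.0
    for arrival, service, is_probe in events:
        # A packet starts service when it arrives or when the server frees up.
        departure = max(arrival, last_departure) + service
        last_departure = departure
        if is_probe:
            probes.append((arrival, service, departure))
    return probes


rng = random.Random(42)
probes = simulate_probes(rate=0.5, mean_service=1.0, probe_gap=1.0,
                         probe_service=0.1, horizon=100.0, rng=rng)
delays = [departure - arrival for arrival, _, departure in probes]
print("probes observed:", len(probes))
print("mean probe sojourn time:", sum(delays) / len(delays))
```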

Joint work with Nélson Antunes (FCT/DMAT of Algarve
University and CEMAT) and António Pacheco (Instituto Superior
Técnico and CEMAT)

###
26/03/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

João F. D. Rodrigues, *IN+, Center for Innovation, Technology and Policy Research, Instituto Superior Técnico*

###
The Prior Uncertainty and Correlation of Statistical Economic Data

Empirical estimates of source statistical economic data such as
transaction flows, greenhouse gas emissions or employment are
always subject to measurement errors but empirical estimates of
source data errors are often missing. This paper uses concepts from
Bayesian inference and the Maximum Entropy Principle to estimate
the prior probability distribution, uncertainty and correlations of
source data when such information is not explicitly provided. In
the absence of additional information, an isolated datum is
described by a truncated Gaussian distribution, and if an
uncertainty estimate is missing, its prior equals the best guess.
When the sum of a set of disaggregate data is constrained to match
an aggregate datum, it is possible to determine the prior
correlations among disaggregate data. If aggregate uncertainty is
missing, all prior correlations are positive. If aggregate
uncertainty is available, prior correlations can be either all
positive, all negative, or a mix of both. An empirical example is
presented, which reports uncertainty and correlation priors for the
County Business Patterns database.

###
19/03/2014, 11:30 — 12:30 — Sala P4.35, Pavilhão de Matemática

Ana Ferreira, *Instituto Superior de Agronomia and CEAUL*

###
High quantile estimation and spatial aggregation applied to
precipitation extremes

We shall address the problem of high quantile estimation in
univariate and spatial Extreme Value Theory. Univariate methods are
well known under the Maximum Domain of Attraction Condition and the
Pareto tail approximation is the basis for many estimators. It
turns out that the Pareto tail approximation is also valid under
spatial aggregation but a spatial effect comes out. We shall
address the problem both theoretically and in practice, by
presenting a case study on 100-year return value estimation for
precipitation data collected at rain-gauge stations.

###
05/03/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

António Pacheco, *CEMAT and Instituto Superior Técnico*

###
Level Crossing Ordering of Stochastic Processes

Stochastic Ordering is an important area of Applied Probability
tailored for qualitative comparisons of random variables, random
vectors, and stochastic processes. In particular, it may be used to
investigate the impact of parameter changes in important
performance measures of stochastic systems, avoiding exact
computation of those performance measures. In this respect, the
great diversity of performance measures used in applied sciences to
characterize stochastic systems has inspired the proposal of many
types of stochastic orderings.

In this talk we address the level crossing ordering, proposed by
A. Irle and J. Gani in 2001, that compares stochastic processes in
terms of the times they take to reach high levels (states). After
introducing some motivation for the use of the level crossing
ordering, we present tailored sufficient conditions for the level
crossing ordering of (univariate and multivariate) Markov and
semi-Markov processes. These conditions are applied to the
comparison of birth-and-death processes with catastrophes,
queueing networks, and particle systems.

Our analysis highlights the benefits of properly using the
sample path approach, which compares directly trajectories of the
compared processes defined on a common probability space. This
approach provides, as a by-product, the basis for the construction
of algorithms for the simulation of stochastic processes ordered in
the level crossing ordering sense. In the case of continuous-time Markov
chains, we resort additionally to the powerful uniformization
technique, which uniformizes the rates at which transitions take
place in the processes being compared.

*Joint work with Fátima Ferreira (CM-UTAD and Universidade
de Trás-os-Montes e Alto Douro).*

###
19/02/2014, 11:30 — 12:30 — Sala P3.10, Pavilhão de Matemática

Francisco Macedo, *IST/EPFL, Portugal/Switzerland*

###
A low-rank tensor method for structured large-scale Markov Chains

A number of practical applications lead to Markov Chains with
extremely large state spaces. Such an instance arises from the
class of Queuing Networks, which lead to a number of applications
of interest including, for instance, the analysis of the well-known
tandem networks. The state space of a Markov process describing
these interactions typically grows exponentially with the number of
queues. More generally, Stochastic Automata Networks (SANs) are
networks of interacting stochastic automata. The dimension of the
resulting state space grows exponentially with the number of
involved automata. Several techniques have been established to
arrive at a formulation such that the transition matrix has
Kronecker product structure. This allows, for example, for
efficient matrix-vector multiplications. However, the number of
possible automata is still severely limited by the need of
representing a single vector (e.g., the stationary vector)
explicitly. We propose the use of low-rank tensor techniques to
avoid this barrier. More specifically, we present an algorithm
that approximates the solution of certain SANs very efficiently
in a low-rank tensor format.
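
The efficiency gain from Kronecker product structure can be seen in the identity $(A \otimes B)\,\mathrm{vec}(X) = \mathrm{vec}(A X B^{\top})$ (with row-major vectorization), which multiplies by $A \otimes B$ without ever forming it. The sketch below checks this on a small hypothetical example with two factors; in practice such transition matrices are typically sums of Kronecker products over several automata, and the helper names here are illustrative only. Note that the vector itself is still stored explicitly, which is the limitation the low-rank tensor approach targets.

```python
def kron(A, B):
    """Dense Kronecker product, used here only as a reference."""
    return [[a * b for a in row_a for b in row_b] for row_a in A for row_b in B]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(row) for row in zip(*A)]


def kron_matvec(A, B, x):
    """Compute (A kron B) x without forming the Kronecker product."""
    n_b = len(B[0])
    # Reshape x row by row into a matrix X with one row per column of A.
    X = [x[j * n_b:(j + 1) * n_b] for j in range(len(A[0]))]
    Y = matmul(matmul(A, X), transpose(B))
    return [value for row in Y for value in row]


# Hypothetical small factors and vector.
A = [[1, 2], [3, 4]]
B = [[0, 1], [1, 1]]
x = [1, 2, 3, 4]

dense = [sum(k * v for k, v in zip(row, x)) for row in kron(A, B)]
structured = kron_matvec(A, B, x)
print(dense == structured)
```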

###
20/11/2013, 11:00 — 16:00 — Sala P3.10, Pavilhão de Matemática

Isabel Rodrigues, *CEMAT and DM-IST*

###
Multiple linear regression with some correlated errors: classical and robust methods

The main motivation for this work was the analysis of a medical data set. The data came from an observational study designed to identify the factors that influence the outcome of a particular surgical treatment for scoliosis (abnormal lateral curvature of the spine). The data set has 392 observations, but some of them refer to the same patient (double curves). The multiple linear regression model is a natural first candidate for analysing these data but, since the double curves should not be ignored, the assumption of uncorrelated errors is clearly violated, which was indeed confirmed by the residual diagnostics of the usual model (Durbin-Watson test and runs test). A more appropriate model is obtained by keeping the linear structure while allowing non-zero correlations between the errors relating to the same patient. Two estimation methods are considered for this model: generalized least squares with iteratively estimated correlation parameters (which can be seen as an adaptation of the Cochrane-Orcutt method for autocorrelated errors) and maximum likelihood estimation of all the parameters. Finally, robust versions of these two methods and of the Durbin-Watson test are presented; these were the ones that allowed the most satisfactory analysis of the data.

###
06/11/2013, 11:00 — 16:00 — Sala P3.10, Pavilhão de Matemática

Isabel Natário, *Faculdade de Ciências e Tecnologia/UNL and CEAUL*

###
Modelling of road accidents with casualties in Lisbon using spatial point processes

The locations of road accidents, the ignition points of forest fires, and the distribution of corals in the Caribbean Sea are examples of events that typically occur at random locations. If our interest lies in this feature of the data, spatial point processes are the most suitable statistical models. In a problem of this kind, the study begins by trying to establish whether the observed locations satisfy the hypothesis of complete randomness, corresponding to a situation of uniformity in the distribution of the locations (regularity), or whether, on the contrary, the observed spatial pattern can be considered aggregated. In the first case, the modelling is based on Poisson processes with rate proportional to the area under study; in the second, the aggregated spatial pattern has to be modelled with models that take into account the spatial dependence shown by such patterns. The approach taken in the context of spatial point processes is traditionally non-parametric, relying on summary statistics, and has more recently evolved towards flexible parametric models, possibly including covariates, with inference carried out by maximum likelihood assisted by numerical methods. This seminar gives a brief description of the models most commonly used in this kind of modelling and for this kind of data, from an essentially applied perspective, using a data set of road accidents with casualties in the city of Lisbon (SACRA project, PTDC/TRA/66161/2006) and the spatstat statistical package from the R project.