CSDS2022

4th Conference on Statistics and Data Science

December 1-3, 2022 Salvador, Brazil (Virtual conference - 100% free)

All sessions will be broadcast on the YouTube channel of the Department of Statistics of the Federal University of Bahia

youtube.com/@DEst_UFBA

The CSDS-2022

The 4th Conference on Statistics and Data Science will be organized in collaboration with the Department of Statistics at the Federal University of Bahia, Brazil. The purpose of CSDS 2022 is to bring together researchers and practitioners, from academia and industry, who develop and apply statistical and computational methods for data science. The conference will provide a forum to share and discuss ways to improve access to knowledge and to promote interdisciplinary collaborations. The scientific program is aimed at statisticians and data scientists interested in quantitative methods for decision making and will include plenary talks, invited sessions, short courses, round tables, and contributed posters.

Important dates

Abstract Submission:
Until November 06, 2022.

Decision on the acceptance of the abstracts: Until November 08, 2022.

Submission of the 3-minute video and 4-slide poster: Until November 20, 2022.

Registration for the papers to be included in the scientific program: Until November 20, 2022.

Registration for non-presenting participants: Until November 27, 2022.

NOTE: For a paper to be included in the scientific program, its abstract must be approved by the Scientific Program Committee and the authors must have submitted the 4-slide poster by November 20, 2022.

Our Speakers

Alexandra M. Schmidt

Bio:

Alexandra M. Schmidt is Professor of Biostatistics and holds the endowed University Chair in the Department of Epidemiology, Biostatistics and Occupational Health (EBOH) at McGill University. She is an Elected Fellow of the American Statistical Association (2020) and an Elected Member of the International Statistical Institute (2010). She was awarded the Distinguished Achievement Medal (2017) from the American Statistical Association’s Section on Statistics and the Environment and the Abdel El-Shaarawi Young Investigator Award (2008) from The International Environmetrics Society. Her main area of research is the development of flexible spatial and spatio-temporal models.

Dalton Andrade

Bio:

Degree in Mathematics and Master in Statistics, University of São Paulo/Brazil, and PhD in Biostatistics, University of North Carolina at Chapel Hill/USA. Professor at Federal University of Santa Catarina, working in the Graduate Programs: PPGEP, Department of Production Engineering, and PPGMGA, Department of Informatics and Statistics. Associate Researcher at Vunesp Foundation and Consultant at INEP/MEC in Quantitative Methods for Educational Assessment. He has experience in the areas of Probability and Statistics, with an emphasis on Data Analysis, working mainly on the following topics: Item Response Theory, Educational Assessment, Latent Variable Models, Longitudinal Data and Linear and Nonlinear Hierarchical/Multilevel Models.

Genevera Allen

Bio:

Genevera Allen is an Associate Professor of Electrical and Computer Engineering, Statistics, and Computer Science at Rice University and an investigator at the Jan and Dan Duncan Neurological Research Institute at Texas Children’s Hospital and Baylor College of Medicine. She is also the Founding Director of the Rice Center for Transforming Data to Knowledge, informally called the Rice D2K Lab.
Dr. Allen’s research develops new statistical machine learning tools to help people make reproducible data-driven discoveries. She is known for her methods and theory work in the areas of unsupervised learning, interpretable machine learning, data integration, graphical models, and high-dimensional statistics. Her work is often motivated by solving real scientific problems, especially in the areas of neuroscience and bioinformatics. Dr. Allen is also a leader in data science education. In 2018, she founded the Rice D2K Lab, a campus hub for experiential learning and data science education. Through her leadership of the D2K Lab, Dr. Allen developed new interdisciplinary data science degree programs, established a novel capstone program in data science and machine learning, and led Rice’s engagement with corporate and community partners in data science.
Dr. Allen is the recipient of several honors for both her research and educational efforts, including a National Science Foundation Career Award, Rice University’s Duncan Achievement Award for Outstanding Faculty, the Curriculum Innovation Award, and the School of Engineering’s Research and Teaching Excellence Award. In 2014, she was named to the “Forbes ‘30 under 30’: Science and Healthcare” list. She is also an elected fellow of the International Statistical Institute and the American Statistical Association. Dr. Allen currently serves as an Action Editor for the Journal of Machine Learning Research and a Series Editor for Springer Texts in Statistics. Dr. Allen received her Ph.D. in statistics from Stanford University, under the mentorship of Prof. Robert Tibshirani, and her bachelor’s, also in statistics, from Rice University.

Our Schedule

BRT Time (GMT-3)

Opening Ceremony

Local Organizing Committee
IME/UFBA

8:30 - 9:00 Virtual room

Keynote Speaker 1

Alexandra M. Schmidt
McGill University, Canada

American Statistical Association - ASA

9:00 - 10:00 Virtual room

Coupled Markov switching count models for monitoring the spread of infectious diseases

Spatio-temporal counts of infectious disease cases often contain an excess of zeros. It is important for decision makers to identify periods of persistence (presence to presence) and reemergence (absence to presence) of a disease. Similarly, when modelling hospital admissions it is of interest to identify epidemic or endemic periods to predict hospital capacity. In this talk I will discuss a class of coupled nonhomogeneous Markov switching models that addresses these issues. Inference and prediction are performed under the Bayesian paradigm. To showcase the ability of the proposed models in addressing the above issues we analyze spatio-temporal counts of dengue fever cases in Rio de Janeiro and COVID-19 hospital admissions in the 30 largest Quebec hospitals. This is joint work with Dirk Douwes-Schultz, PhD student in the Program of Biostatistics, McGill University.

Round Table 1: The job market in statistics and data science

Chair
Federal University of Bahia, Brazil.

10:00 - 11:30 Virtual room

In this round-table, three widely experienced scientists will share some of their views about the current job market in the field of Statistics and Data Science.

Roger W. Hoerl
Union College in Schenectady, NY, USA.

Bio:

Dr. Roger W. Hoerl is the Brate-Peschel Associate Professor of Statistics at Union College in Schenectady, NY. Previously, he led the Applied Statistics Lab at GE Global Research. While at GE, Dr. Hoerl led a team of statisticians, applied mathematicians, and computational financial analysts who worked on some of GE’s most challenging research problems, such as developing personalized medicine protocols, enhancing the reliability of aircraft engines, and management of risk for a half-trillion dollar portfolio.
Dr. Hoerl has been named a Fellow of the American Statistical Association and the American Society for Quality, and has been elected to the International Statistical Institute and the International Academy for Quality. He has received the Brumbaugh and Hunter Awards, as well as the Shewhart Medal, from the American Society for Quality, and the Founders Award and Deming Lectureship Award from the American Statistical Association. While at GE Global Research, he received the Coolidge Fellowship, honoring one scientist a year from among the four global GE Research and Development sites for lifetime technical achievement. His book with Ron Snee, Statistical Thinking: Improving Business Performance, now in its 3rd edition, was called “the most practical introductory statistics textbook ever published in a business context” by the journal Technometrics.

S. Ejaz Ahmed
Faculty of Math and Science at Brock University, Canada.

Bio:

Professor S. Ejaz Ahmed is Professor of Statistics and Dean of the Faculty of Math and Science at Brock University, Canada. Previously, he was Professor and Head of the Mathematics and Statistics Department at the University of Windsor, Canada, and the University of Regina, Canada, as well as Assistant Professor at the University of Western Ontario, Canada. He holds adjunct professorship positions at many Canadian and international universities. He has supervised more than 20 Ph.D. students and organized several international workshops and conferences around the globe. He is a Fellow of the American Statistical Association and has held the prestigious ASEAN Chair Professorship. His areas of expertise include big data analysis, statistical learning, and shrinkage estimation strategy. He has authored several books and has edited and co-edited several volumes and special issues of scientific journals. He has been the Technometrics Review Editor for the past ten years, and he is editor and associate editor of many statistical journals. Overall, he has published more than 200 articles in scientific journals and reviewed more than 100 books. He has served on the Board of Directors of the Statistical Society of Canada and was Chairman of its Education Committee. Moreover, he was Vice President of Communications for The International Society for Business and Industrial Statistics (ISBIS) as well as a member of the "Discovery Grants Evaluation Group" and the "Grant Selection Committee" of the Natural Sciences and Engineering Research Council of Canada.

Frederico Zanqueta Poleto
SERASA EXPERIAN, Brazil.

Bio:

Frederico is passionate about learning and about supporting decision making with data science approaches appropriate to the scenario at hand. After a BSc, MSc and PhD in Statistics from the University of São Paulo, he gained a mix of experiences ranging from banking (Citi and HSBC), consulting (freelance and Moody’s Analytics), and insurance (LexisNexis and MAPFRE) to credit risk bureaux (Boa Vista Serviços and Serasa Experian).

Short course 1

Julio Trecenti
Secretary General of ABJ, Brazil

15:15 - 18:15 Virtual room

R, Zero to Hero: Reproducible report

Have you ever thought about writing a report that updates automatically? Imagine being able to use the same document format to generate reports, exercise lists, books, presentations and websites, mixing text and code in a fully reproducible way. In this short course, participants will build a report from scratch using R and Quarto, which was launched in 2022. Click here to access the reference materials.

Bio:

Ph.D. student in Statistics at the University of São Paulo (IME-USP). Secretary General of the Brazilian Association of Jurimetrics (ABJ). Fellow of Terranova consulting and of R course training. Assistant Professor of Data Science and Data Decision at INSPER.

Short course 2

Paula Brito
University of Porto

14:00 - 17:00 Virtual room

Pedro Duarte Silva
The Catholic Porto Business School

Symbolic Data Analysis: Parametric Multivariate Analysis of Interval Data

Symbolic Data Analysis is concerned with analysing data with intrinsic variability, which is to be taken into account. In Data Mining, Multivariate Data Analysis and classical Statistics, the elements under analysis are generally individual entities for which a single value is recorded for each variable - e.g., individuals, described by age, salary, education level, etc. But when the elements of interest are classes or groups of some kind - the citizens living in given towns; car models, rather than specific vehicles - then there is variability inherent to the data. Symbolic data goes beyond the usual data representation model, considering variables whose observed values for each element are no longer necessarily single real values or categories, but may assume the form of sets, intervals, or, more generally, distributions. In this tutorial we focus on the analysis of interval data, i.e., when the variables’ values are intervals of ℝ, adopting a parametric approach. The proposed modelling allows for multivariate parametric analysis; in particular, (M)ANOVA, discriminant analysis, model-based clustering, robust estimation and outlier detection are addressed. The modelling and methods discussed are implemented in the R package MAINT.Data, available on CRAN.

Bio:

Paula Brito is Associate Professor at the Faculty of Economics of the University of Porto, and a member of the Artificial Intelligence and Decision Support Research Group (LIAAD) of INESC TEC, Portugal. She holds a doctorate degree in Applied Mathematics from the University Paris Dauphine, and a Habilitation in Applied Mathematics from the University of Porto. Her current research focuses on the analysis of multidimensional complex data, known as symbolic data, for which she develops statistical approaches and multivariate analysis methodologies. She has been involved in two European research projects and coordinated the Portuguese participation in the H2020 FinTech project. Paula Brito was president of the International Association for Statistical Computing (IASC-ISI) in 2013-2015. She has authored a large number of papers in highly ranked journals in her field, has been an invited speaker at several international conferences, is regularly a member of international program committees, and has been chair of the international conferences COMPSTAT 2008 and IFCS 2022.

Pedro Duarte Silva is an Associate Professor at the Catholic Porto Business School, and a member of its research center (CEGE). He holds a doctorate degree in Business Administration from the Terry College of Business of the University of Georgia. His research focuses on the intersection between Data Analysis and Machine Learning, Multivariate Statistics and Operations Research, with a particular focus on the development of novel methodologies for the analysis of big and complex data. He is the author of numerous communications at reputed scientific conferences and his research has been published in highly ranked scientific journals such as the European Journal of Operational Research, Computational Statistics and Data Analysis, Decision Sciences, Computational Statistics, and the Journal of Multivariate Analysis.

Keynote Speaker 2

Dalton Andrade
Federal University of Santa Catarina, Brazil

9:00 - 10:00 Virtual room

Statistical Methods in Educational Assessment: Theory, Applications, Computational Aspects and Challenges.

In this lecture we will be presenting and discussing the main models and concepts of the Item Response Theory - IRT, for the measurement of the latent trait proficiency, and the hierarchical/multilevel modeling, for the study of factors associated with proficiency. Theoretical, applied and computational aspects will be dealt with, seeking to point out some research challenges/topics. Applications of IRT in other areas, such as Psychiatry, Nutrition, Physiotherapy and Engineering will also be presented.

Round Table 2 – The future of education in statistics and data science

Chair
Federal University of Bahia, Brazil.

10:00 - 11:30 Virtual room

In this round-table, three statisticians and data scientists will discuss the present and future of statistical education in the era of data science.

Wagner Hugo Bonat
Paraná Federal University (UFPR), Brazil.

Bio:

Wagner Hugo Bonat is a Researcher and Lecturer in the Department of Statistics at Paraná Federal University (UFPR), where he has been since 2010. He is the Head of the Data Science and Big Data program (DSBD) and a member of the Laboratory of Statistics and Geoinformation (LEG). He received a B.S. from Paraná Federal University in 2008, and an M.S. from Paraná Federal University in 2010. He received his Ph.D. in Mathematics and Computer Science from the University of Southern Denmark in 2016. His research focuses on statistical modelling and estimating functions. Much of his work has been on extending the generalized linear model class to deal with multiple response variables. His main contribution is a new class of multivariate regression models called Multivariate Covariance Generalized Linear Models (McGLMs) and the associated R package (mcglm).

Sastry G. Pantula
California State University, San Bernardino (CSUSB), USA.

Bio:

Sastry G. Pantula, Dean of the College of Natural Sciences at California State University, San Bernardino (CSUSB), is nationally and internationally recognized as a leader in statistical sciences. Most recently, he served as the Director of Data Analytics programs at Oregon State University and a Professor of Statistics. He served as the dean of the College of Science at Oregon State University for four years, from August 2013 to August 2017, after serving a three-year term as Director for the Division of Mathematical Sciences at the National Science Foundation. Pantula spent more than 30 years as a statistics professor at North Carolina State University (NCSU), where he began his academic career in 1982. At NCSU, he also served as the Director of Graduate Programs (1994-2002) and the Head of the Department of Statistics (2002-2010). In all of his administrative roles, he has focused on enhancing the quality, quantity and diversity within the department, the division and the college. His core values are excellence, diversity and harmony: strive for excellence, enhance diversity and foster harmony.
He is a Fellow of the American Association for the Advancement of Science (AAAS) and the American Statistical Association (ASA). He served as ASA president in 2010 and received the ASA Founders Award in 2014. Pantula is a member of the honor societies Phi Kappa Phi, Sigma Xi and Mu Sigma Rho. He is also a member of the NCSU Academy of Outstanding Teachers. Pantula received bachelor’s and master’s degrees in statistics from the Indian Statistical Institute in Kolkata, India, and a Ph.D. in statistics from Iowa State University.

Patrick J.F. Groenen
Erasmus School of Economics (ESE), Netherlands.

Bio:

Patrick J.F. Groenen is a professor of statistics at the Erasmus School of Economics (ESE). He currently is also dean of that school. Professor Groenen's work focuses on data science techniques and their numerical algorithms. He is the co-author of several textbooks on multidimensional scaling published by Springer and has published articles in the top peer-reviewed journals including, among others, the Journal of Machine Learning Research, the Journal of Marketing Research, Psychological Methods, Psychometrika, the Journal of Classification, Computational Statistics and Data Analysis, the British Journal of Mathematical and Statistical Psychology, and the Journal of Empirical Finance.

Invited Paper Session on Statistical Learning

Chair
UFBA, Brazil

Virtual room

Rafael Izbicki
Federal University of São Carlos (UFSCar), Brazil.

9:00 - 9:30

Uncertainty Quantification in Machine Learning

Machine learning methods have an increasing ability to create models with good predictive power. However, even the best models make mistakes. To mitigate the effect of these errors on subsequent decision-making, it is essential to be able to quantify the uncertainty associated with each prediction. In this seminar, I will discuss uncertainty quantification methods that I have recently developed.

Bio:

Rafael is an Assistant Professor at the Department of Statistics of the Federal University of São Carlos (UFSCar), Brazil. He obtained his PhD degree in the Department of Statistics & Data Science at Carnegie Mellon University (CMU), USA. Prior to that, he received his bachelor’s and master’s degrees at the University of São Paulo. He is a CNPq Research Fellow and is interested in theory, methodology, applications, and foundations of statistics and machine learning.

Luciano Rebouças
UFBA/Brazil.

9:30 - 10:00

Active learning: A way to cope with large unlabelled data sets

In an age where people produce large amounts of data, labeling such data often becomes extremely costly. This annotation problem occurs mainly in fields that require specialists who are hard to reach or who have little time to dedicate to curating a data set. One strategy to get around this problem is active learning, which uses machine learning to learn from a small amount of annotated data and then annotate large volumes of unlabelled data with the help of an oracle. In this talk, I will discuss these issues and how the active learning field helps to solve them.

Bio:

Luciano Rebouças holds a Ph.D. in Electrical and Computer Engineering from the Institute of Systems and Robotics, University of Coimbra, a master's degree in Mechatronics, and a bachelor’s in Computer Science from the Federal University of Bahia (UFBA). He is an Associate Professor at the Dept. of Computer Science, Institute of Computing, UFBA, and head of the Intelligent Vision Research Lab (http://ivisionlab.ufba.br). He is a specialist in the field of Computer Vision and Machine Learning, and his applied research is focused mainly on robotics, smart cities, biometric systems, and biomedicine.

Anderson Ara
Federal University of Paraná, Brazil.

10:00 - 10:30

Convolutional Support Vector Model: prediction of coronavirus disease using chest x-rays

The disease caused by the coronavirus (COVID-19) has been plaguing the world for the last two years. In this paper, a complete applied study of convolutional support vector machines is presented to classify patients infected with COVID-19 using chest X-ray data, and the results are compared with those of traditional convolutional neural networks (CNN). Based on the fitted models, the proposed convolutional support vector machine with a polynomial kernel showed better predictive performance. In addition to the results on real images, the behavior of the models was studied on simulated images, which made it possible to observe the advantages of support vector machine (SVM) models.

Bio:

Bachelor's degree (2009) and Master's degree (2011) in Statistics, both obtained at the Federal University of São Carlos (UFSCar). PhD in Statistics (2016) through the Graduate Programs in Statistics (PPGEst-UFSCar) and Graduate Studies in Computer Science (PPG-CC-UFSCar). Lecturer in the Specialization in Data Science & Big Data (DSBD-UFPR), MBA in Financial Analytics (DAAGE-UTFPR) and in the Specialization in Data Science and Big Data (ECD-UFBA). Since August 2021, Assistant Professor at the Department of Statistics, Federal University of Paraná (DEST-UFPR), Curitiba-PR, Brazil. Assistant Professor at the Department of Statistics, Federal University of Bahia (DEST-UFBA), Salvador-BA, Brazil (2017-2021). Lecturer at the Faculty of Technology SENAI-SP, São Carlos-SP, Brazil (2009-2015). His research areas include statistical machine learning, statistical inference, computational methods and big data analytics.

Keynote Speaker 3

Genevera Allen
Rice University, USA.

10:30 - 11:30 Virtual room

Fast Minipatch Ensemble Strategies for Discovery and Inference

Enormous quantities of data are collected in many industries and disciplines; this data holds the key to solving critical societal and scientific problems. Yet, fitting models to make discoveries from this huge data often poses both computational and statistical challenges. In this talk, we propose a new ensemble learning strategy primed for fast, distributed, and memory-efficient computation that also has many statistical advantages. Inspired by random forests, stability selection, and stochastic optimization, we propose to build ensembles based on tiny subsamples of both observations and features that we term minipatches. While minipatch learning can easily be applied to prediction tasks similarly to random forests, this talk focuses on using minipatch ensemble approaches in unconventional ways: making data-driven discoveries and for statistical inference. Specifically, we will discuss using this ensemble strategy for feature selection, clustering, and graph learning as well as for distribution-free and model-agnostic inference for both predictions and important features. Through huge real data examples from neuroscience, genomics and biomedicine, we illustrate the computational and statistical advantages of our minipatch ensemble learning approaches.
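As a rough illustration of the minipatch idea described in the abstract (tiny random subsamples of both observations and features), the sketch below draws a few minipatches from a toy data matrix in Python. It is not the speaker's implementation: the function names `draw_minipatch` and `extract`, the patch sizes and the toy data are all assumptions made for this example, and no model is fitted to the patches.

```python
import random

def draw_minipatch(n_obs, n_feat, m, k, rng):
    """Pick m observation indices and k feature indices, each without replacement."""
    rows = sorted(rng.sample(range(n_obs), m))
    cols = sorted(rng.sample(range(n_feat), k))
    return rows, cols

def extract(data, rows, cols):
    """Return the sub-matrix of data restricted to the chosen rows and columns."""
    return [[data[i][j] for j in cols] for i in rows]

# Toy data invented for illustration: 20 observations, 8 features.
rng = random.Random(0)
data = [[i * 10 + j for j in range(8)] for i in range(20)]

# Draw four minipatches of 5 observations by 3 features each.
patches = [draw_minipatch(20, 8, 5, 3, rng) for _ in range(4)]
for rows, cols in patches:
    patch = extract(data, rows, cols)
    print(rows, cols, len(patch), "x", len(patch[0]))
```

In an ensemble, a small model would be fitted on each such patch and the results aggregated; that step is left out here because the abstract does not specify it.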

Closing Ceremony

Local Organizing Committee
IME/UFBA

11:30 - 11:45 Virtual room

Short course 3

David Banks
Duke University, USA.

14:00 - 17:00 Virtual room

Introduction of Some of Data Science

We review some of the applications of data science in classification, cluster analysis, and text analytic applications. Topics include support vector machines, random forests, boosting, hierarchical agglomerative clustering, mixture models, and latent Dirichlet allocation.

Bio:

David Banks is a professor of statistics at Duke University and a fellow of the ASA, IMS and AAAS. He is a past editor of the Journal of the American Statistical Association and founding editor of Statistics and Public Policy. His research areas include agent-based models, adversarial risk analysis, dynamic networks, text data, and human rights statistics.

Short course 4

Paulo H. Ferreira
Federal University of Bahia, Brazil.

14:00 - 17:00 Virtual room

Statistical Process Control (SPC) for overdispersed count and unit data using R

Strong competition in today's market makes the search for excellence a necessity for companies. In this context, Statistical Process Control (SPC) is a very important and widely used alternative. One of its main tools is the control chart, which shows whether or not the process is out of statistical control. Two types of data that have been receiving a lot of attention in the SPC literature are: (i) count data, which are present in various everyday situations (such as the number of nonconforming/defective items in a production line, the number of COVID-19 cases or deaths per epidemiological week, etc.) and often exhibit overdispersion (that is, variance greater than the mean); and (ii) continuous data in the interval (0,1), or unit data, with applications in a wide range of areas, such as ecology, economics and industry, among others (relative air humidity and inflation rate are some examples of unit variables). Therefore, the main objective of this short course is to discuss SPC in this framework and to provide R functions capable of generating control charts for the statistical monitoring of non-normal processes (e.g., count and unit data) via classical and Bayesian inferential approaches.
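To make the notion of overdispersion mentioned above concrete (variance greater than the mean), here is a minimal sketch that computes the variance-to-mean ratio of a count sample. It is written in Python rather than R and is not part of the course material; the function name `dispersion_index` and the weekly counts are hypothetical, invented only for this example.

```python
from statistics import mean, variance

def dispersion_index(counts):
    """Return the sample variance divided by the sample mean."""
    m = mean(counts)
    if m == 0:
        raise ValueError("mean is zero; the index is undefined")
    return variance(counts) / m

# Hypothetical weekly counts, invented for illustration only.
weekly_counts = [0, 2, 1, 9, 0, 14, 3, 0, 7, 4]

index = dispersion_index(weekly_counts)
print(f"dispersion index: {index:.2f}")
# A ratio above 1 suggests more variability than a Poisson model would imply.
print("overdispersed" if index > 1 else "not overdispersed")
```

A ratio close to 1 is what a Poisson model would suggest, so values well above 1 are a common informal signal that a count model allowing extra variability may be needed.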

Bio:

Paulo Henrique Ferreira da Silva received the B.Sc., M.Sc. and Ph.D. degrees in statistics from the Federal University of São Carlos (UFSCar), Brazil, in 2009, 2011 and 2015, respectively. He is currently a Professor of Statistics at the Institute of Mathematics and Statistics, Federal University of Bahia (UFBA), Brazil. He completed postdoctoral training at the University of São Paulo (USP), Brazil, in 2019. His main research interests include survival and reliability analysis, data mining, and statistical process control.

Poster Session

December 1, 2022

Group 1

Chair
UFBA/Brazil

18:00 - 19:00 Virtual room

CP1: Akalu Banbeta Tereda, Emmanuel Lesaffre and Joost van Rosmalen
Bayesian methods for borrowing historical information: The power prior for the linear regression model
[Poster] [Video ] [Link for the poster presentation]

CP2: Alexsandra Gomes de Lima, Raydonal Ospina Martinez and Cristiano Ferraz
Analysis of predictive model applied to public safety: a case study
[Poster] [Video ] [Link for the poster presentation]

CP3: Camilo Rengifo Gutiérrez, Sergio Arciniegas Alarcón, Marisol García Peña and Wojtek J. Krzanowski
The robust singular value decomposition and the problem of incomplete two-way data.
[Poster] [Video ] [Link for the poster presentation]

CP4: Carla Santos, C. Nunes, C. Dias and J. T. Mexia
On the role of U-matrices in the commutativity condition defining a special class of models with orthogonal block structure
[Poster] [Video ] [Link for the poster presentation]

CP5: Dionisio Alves da Silva Neto and Héliton Ribeiro Tavares
Item Response Theory and Computerized Adaptive Testing in a collective environment
[Poster] [Video ] [Link for the poster presentation]

CP6: Eliardo Costa and Rachel Tarini
Bayesian inference for the Net Promoter Score
[Poster] [Video ] [Link for the poster presentation]

CP7: Gladys Choque Ulloa and Guilherme Pumi
Estimation of Long-Dependency Processes in the Presence of Many Missing Data
[Poster] [Video ] [Link for the poster presentation]

CP8: Inácio Nascimento, Raydonal Ospina Martinez and Getúlio José Amorim do Amaral
Bagging algorithm for grouping improvement in the context of three-dimensional shapes
[Poster] [Video ] [Link for the poster presentation]

CP9: Junar Lingo and Milburn O. Macalos
Gamma Mixture of Chi-Square Distribution: Properties, Estimation and Simulation
[Poster] [Video ] [Link for the poster presentation]

CP10: Luiz Eduardo Silva Gomes and Thais C. O. Fonseca
A Bayesian network approach to food security modeling in Brazil
[Poster] [Video ] [Link for the poster presentation]

CP11: Mariana Thais Almeida, Vinícyus Araujo, Lilia Carolina Carneiro da Costa and Anderson Ara
Incidence relationship of COVID-19 among Brazilian regions using static and dynamic Bayesian network
[Poster] [Video ] [Link for the poster presentation]

CP12: Marta Ferreira
Resampling methods on inference of the smoothness of a time series
[Poster] [Video ] [Link for the poster presentation]

CP13: Márcio Henrique Matos de Freitas
Classification of images of fruits and vegetables with Deep learning
[Poster] [Video ] [Link for the poster presentation]

CP14: Md Shariful Islam
Big Data opportunities arising from the new data ecosystem
[Poster] [Video ] [Link for the poster presentation]

CP15: Muhammad Auwal and Nura Muhammad
Comparative study on the effect of organic and inorganic fertilizer on maize yield
[Poster] [Video ] [Link for the poster presentation]

CP16: Nicolas Mathias Hahn
Automatic Music Composition using Recurrent Neural Networks
[Poster] [Video ] [Link for the poster presentation]

CP17: Nixon Jerez-Lillo, Pedro L. Ramos, Francisco S. Godoy, Osafu A. Egbon and Francisco Louzada
The piecewise power-law model
[Poster] [Video ] [Link for the poster presentation]

CP18: Oluchukwu Asogwa, N. M. Eze and C. M. Eze
On the application of statistical quality control on Nigeria malt drink
[Poster] [Video ] [Link for the poster presentation]

CP19: Osuolale Peter Popoola
Fourth Industrial Revolution and Data Science: General Overview
[Poster] [Video ] [Link for the poster presentation]

CP20: Portia Kuzivakwashe Mafukidze, Samuel Musili Mwalili and Thomas Mageto
A modification to the Fuzzy Regression Discontinuity model
[Poster] [Video ] [Link for the poster presentation]

CP21: Renan Regis, Raydonal Ospina Martinez and Wilton Bernardino da Silva
Asset Pricing: An Estimation Alternative to the Five-Factor Model
[Poster] [Video ] [Link for the poster presentation]

CP22: Sandro Lins Lopes de Lucena and Jalmar M F Carrasco
Residual analysis for joint modelling of longitudinal binary data and survival data
[Poster] [Video ] [Link for the poster presentation]

CP23: Umar Ahmad Isyaku and Nura Muhammad
Body mass index and its influence on HIV positive patients
[Poster] [Video ] [Link for the poster presentation]

CP24: Vikas Barnwal, Chandra Prakash Yadav and M S Panwar
Objective Bayesian Analysis of Recall-based Observations with Application to Breastfeeding Data
[Poster] [Video ] [Link for the poster presentation]

CP25: Zaida Quiroz and Marcos Prates
Bayesian spatio-temporal modelling of anchovy abundance through the SPDE Approach
[Poster] [Video ] [Link for the poster presentation]

Group 2

Chair
UFBA/Brazil)

19:00 - 20:00 Virtual room

CP26: Anderson Fonseca, Paulo Henrique Ferreira da Silva, Diego Carvalho Nascimento, Estefania Bonnail and Francisco Louzada
Unraveling water monitoring association towards weather attributes for response proportions data: A unit-Lindley learning
[Poster] [Video ] [Link for the poster presentation]

CP27: Abdulmuahymin Abiola Sanusi, S.I.S Doguwa, I. Yahaya and Y. M. Baraya
Burr X exponential Weibull distribution: Properties and applications
[Poster] [Video ] [Link for the poster presentation]

CP28: Ankita Dey, Diganta Mukherjee and Sugata Sen Roy
An improved model of latent class analysis in a multiple group set-up with a parameter of social influence
[Poster] [Video ] [Link for the poster presentation]

CP29: Carla Martinho and Cláudia Silvestre
Structural equation models to explain the consumption of social networks in public and private places
[Poster] [Video ] [Link for the poster presentation]

CP30: Carlos Tadeu Pagani Zanini, Helio dos Santos Migon and Ronaldo Dias
Variational Inference for Bayesian Bridge Regression
[Poster] [Video ] [Link for the poster presentation]

CP31: Cláudia Silvestre, Júlia Barros, Carina Ferreira and Pedro Frazão
Analysing newspaper articles using Python
[Poster] [Video ] [Link for the poster presentation]

CP32: Chijioke Nweke and Akaninyene Udo Udom
Approximation results for stochastic multiple-sets split feasibility and split equality problems in Hilbert space
[Poster] [Video ] [Link for the poster presentation]

CP33: Eddy Johanna Fajardo and Héctor Romero
Human capital and internationalization of manufacturing firms in Colombia: an application of logistic regression analysis
[Poster] [Video ] [Link for the poster presentation]

CP34: Gabriel Gomes Ribeiro, Lilia Carolina Carneiro da Costa and Paulo Henrique Ferreira
Forecasting Results of Matches of the 2021 Brazilian Championship Through Bayesian Inference
[Poster] [Video ] [Link for the poster presentation]

CP35: Gianpaolo Zammarchi, Maurizio Romano and Claudio Conversano
Auto-labeling topics with word embedding
[Poster] [Video ] [Link for the poster presentation]

CP36: Gustavo Oliveira, Michelle P. Vale dos Passos, Leila Denise A. F. Amorim and Marcelo Taddeo
Estimation of indirect effects in structural equation models under violation of model assumptions
[Poster] [Video ] [Link for the poster presentation]

CP37: Hamel Elhadj
Prediction of nonparametric regression models with long memory data
[Poster] [Video ] [Link for the poster presentation]

CP38: Igor Patricio, Caio Zava Ferreira and Carlos Tadeu Pagani Zanini
Ecological equilibrium and coral reefs: deep neural networks for detection of Acanthaster planci in images and videos
[Poster] [Video ] [Link for the poster presentation]

CP39: Jaciele de Jesus Oliveira, Raydonal Ospina Martínez and Cristiano Ferraz
SIR models and ensemble-type algorithm with an application to COVID-19
[Poster] [Video ] [Link for the poster presentation]

CP40: Louiza Soltane and Abdelhakim Necir
A new kernel estimator for the tail index under random censoring
[Poster] [Video ] [Link for the poster presentation]

CP41: Márcio Luis Lanfredi Viola
Using VLMC model and Boosting to classify
[Poster] [Video ] [Link for the poster presentation]

CP42: Natan Hilário da Silva and Adriano Kamimura Suzuki
A new bivariate survival model using FGM copulas: Modeling, Inference and Influence Analysis
[Poster] [Video ] [Link for the poster presentation]

CP43: Nicky Yungco and Daisy Lou L. Polestico
On Dashboard and Frequency Model Development for Terrorism in Southern Philippines
[Poster] [Video ] [Link for the poster presentation]

CP44: Osama Hussien
The exploratory data analysis approach for the statistical consulting process, with emphasis on big data
[Poster] [Video ] [Link for the poster presentation]

CP45: Pedro Rici, André Luiz Carvalho Ottoni and Marcela Silva Novo
Deep Learning approach to Covid-19 detection using HyperTuningSK algorithm for hyperparameter tuning
[Poster] [Video ] [Link for the poster presentation]

CP46: Pragya Kumari, Atanu Bhattacharjee and Gajendra K. Vishwakarma
Cutoff value for prominent biomarkers of breast cancer on overall survival using CART analysis
[Poster] [Video ] [Link for the poster presentation]

CP47: Rabiu Isah
Development of an uncertainty-aware anomaly detection in video data using a normalizing flow-Bayesian variational autoencoder
[Poster] [Video ] [Link for the poster presentation]

CP48: Rabiu Isah
An Optical Flow computation based on Shearlet Transform: A mathematical Formulation
[Poster] [Video ] [Link for the poster presentation]

CP49: Samer Kharroubi
Modeling the Spread of COVID-19 in Lebanon: A Bayesian Perspective
[Poster] [Video ] [Link for the poster presentation]

CP50: Silvia Noemí-Pérez, Mónica Giuliano and Luis Fernández
Detection of Parkinson's disease by selection of acoustic variables
[Poster] [Video ] [Link for the poster presentation]

CP51: Yan Barros
Tropical Cyclones forecasting with Physics Informed Neural Networks
[Poster] [Video ] [Link for the poster presentation]

Group 3

Chair
UFBA/Brazil)

20:00 - 21:00 Virtual room

CP52: Bishal Diyali
Selecting between the generalized inverted Rayleigh and the generalized inverted half logistic distributions
[Poster] [Video ] [Link for the poster presentation]

CP53: Breno Gabriel da Silva, Clarice Garcia Borges Demétrio, Renata Alcarde Sermarini, Kauana Engel, Clebson Lima Cerqueira and Alexandre Behling
Gamma and Weibull distributions applied to tree biomass data of the species black wattle (Acacia mearnsii de Wild.)
[Poster] [Video ] [Link for the poster presentation]

CP54: Carlo Corrêa Solci, Valdério Anselmo Reisen and Paulo Canas Rodrigues
Robust Local Bootstrap for Stationary Time Series with Missing Data
[Poster] [Video ] [Link for the poster presentation]

CP55: Douglas Farias Cordeiro, Leandro Rodrigues da Silva Souza and Núbia Rosa Da Silva
Brazilian public agreements proposals clustering model based on BERT and kMeans
[Poster] [Video ] [Link for the poster presentation]

CP56: Fabiano de Moraes Domingues and Carlos Tadeu Pagani Zanini
Deep learning based classification model to detect fake websites used in phishing attacks
[Poster] [Video ] [Link for the poster presentation]

CP57: Francisco F. Queiroz and Silvia L.P. Ferrari
PLreg: an R package for modeling bounded continuous data
[Poster] [Video ] [Link for the poster presentation]

CP58: Getulio Amaral and Jhonnata Bezerra de Carvalho
Support Vector Machines In Statistical Shape Analysis
[Poster] [Video ] [Link for the poster presentation]

CP59: Heverton Anunciação
Conflicts between the Data Scientist, Marketing, CRM and Customer Experience
[Poster] [Video ] [Link for the poster presentation]

CP60: Hugo Carvalho and Carlos Almada
Probabilistic Modelling of Antonio Carlos Jobim's Music
[Poster] [Video ] [Link for the poster presentation]

CP61: Ibrahim Lawal Kane
Dynamic Linear State Space modeling Approach with Application to Annual Rainfall Data
[Poster] [Video ] [Link for the poster presentation]

CP62: João Vítor Rocha da Silva and Paulo Canas Rodrigues
Different faces of defense: Studying the National Basketball Association's (NBA) defensive positions
[Poster] [Video ] [Link for the poster presentation]

CP63: Jonatha Pimentel, Rodrigo S. Bulhões and Paulo Canas Rodrigues
How do human and meteorological variables affect the spatio-temporal behavior of Brazilian wildfires?
[Poster] [Video ] [Link for the poster presentation]

CP64: Juan Ruiz Otondo
Machine Learning (ML) algorithms: Prediction of SARS-CoV-2 cases in Bolivia
[Poster] [Video ] [Link for the poster presentation]

CP65: Kim Silva, Crysttian Arantes Paixão and Paulo Canas Rodrigues
Application of Machine Learning Techniques for Fake News Classification
[Poster] [Video ] [Link for the poster presentation]

CP66: Ludmila Cavalcanti, Alex Dias Ramos and Caliteia Santana de Sousa
A Biological Model with a Substitution Operator
[Poster] [Video ] [Link for the poster presentation]

CP67: Marcelo Fonseca, Vanda Lourenço and Paulo Canas Rodrigues
Robust modeling in statistical genetics
[Poster] [Video ] [Link for the poster presentation]

CP68: Miguel Pérez, Omar Mejía, Juan Carlos Meneses, Cesar Serrano and Francisco León
Data analytics, a good practice of educational innovation in decision making under the Lean - Six Sigma approach
[Poster] [Video ] [Link for the poster presentation]

CP69: Michela Sheryl Noven, Winita Sulandari and Respatiwulan
Implementation of Neural Network Autoregression (NNAR) and Double Exponential Smoothing Holt Method in forecasting PT Telkom Indonesia Stock Price
[Poster] [Video ] [Link for the poster presentation]

CP70: Matheus Saldanha and Adriano K. Suzuki
Parameter‑Varying Support for Maximum Likelihood on Data with Unknown Left Endpoint
[Poster] [Video ] [Link for the poster presentation]

CP71: Natiele de Almeida Gonzaga, Wélson Antônio de Oliveira, Rafaela de Carvalho Salvador, Isolina Aparecida Vilas Bôas, Edilson Marcelino Silva and Joel Augusto Muniz
Fitting nonlinear models to describe the germination of Brachiaria brizantha seeds
[Poster] [Video ] [Link for the poster presentation]

CP72: Nayguel Costa, Paulo Canas Rodrigues and Luciano Rebouças de Oliveira
Active deep learning for seismic facies classification
[Poster] [Video ] [Link for the poster presentation]

CP73: Rodrigo Esteves and Carlos Tadeu Pagani Zanini
Image classification and detection with Imagenet using Transfer Learning
[Poster] [Video ] [Link for the poster presentation]

CP74: Rodrigo M. R. de Medeiros and Marcelo Bourguignon
A Simple and Useful Regression Model for Fitting Count Data
[Poster] [Video ] [Link for the poster presentation]

CP75: Silvina Pistonesi, Jorge Martinez and Ana Georgina Flesia
Noise-Tolerant texture feature characterization through an improved CPLBP
[Poster] [Video ] [Link for the poster presentation]

CP76: Titis Jati Nugraha Saputra, Winita Sulandari and Isnandar Slamet
Forecasting the composite stock price index (CSPI) on the Jakarta Stock Exchange (JSE) using the high-order intuitionistic fuzzy time series method
[Poster] [Video ] [Link for the poster presentation]

CP77: Thiago Stephem da Motta and Carlos Tadeu Pagani Zanini
Deep neural networks for detection of waste in the deep ocean floor
[Poster] [Video ] [Link for the poster presentation]

CP78: Wélson Antônio de Oliveira, Natiele de Almeida Gonzaga, Rafaela de Carvalho Salvador, João Domingos Scalon and José Márcio de Mello
Spatial Analysis of the Distribution of the Native Vegetation Species Copaifera langsdorffii in a Forest Fragment of the Atlantic Forest Biome
[Poster] [Video ] [Link for the poster presentation]

Poster Session - Students of the Specialization on Data Science and Big Data (UFBA)

December 2, 2022

Room 1

Juracy Almeida
Federal University Of Bahia, Brazil

19:00 - 20:00 Virtual room

CP1-ECD: Débora Oliveira Santana
Regional and temporal patterns analysis of mortality in Brazil supported by a data warehouse
[Poster] [Video ] [Jury]

CP2-ECD: Hamilton Caldas Santos
Applicability and potential in the use of business intelligence tools in the reduction of home care costs
[Poster] [Video ] [Jury]

CP3-ECD: Leduan Gheller
A case study involving Business Intelligence as a tool to support enrollment management in educational institutions
[Poster] [Video ] [Jury]

Room 2

Crysttian Arantes Paixão
Federal University Of Bahia, Brazil

19:00 - 20:00 Virtual room

CP4-ECD: Isabella Calfa Vieira Costa
Analysis of a Brazilian electricity distributor based on population's tweets
[Poster] [Video ] [Jury]

CP5-ECD: Evandro Botti de Cerqueira
Text mining in social media for market analysis of products
[Poster] [Video ] [Jury]

CP6-ECD: Fagner Ferreira de Santana
Machine Learning as an aid tool in the school dropout treatment
[Poster] [Video ] [Jury]

Room 3

Paulo Henrique Ferreira da Silva
Federal University Of Bahia, Brazil

19:00 - 20:00 Virtual room

CP7-ECD: Paulo Roberto Carvalho Vasquez
Photovoltaic system as alternative energy source for northeast Brazil
[Poster] [Video ] [Jury]

CP8-ECD: Lucas de Almeida Gama Paixão
Prediction of Underground Mine Stope Stability: a Case Study Based on Supervised Learning Methods
[Poster] [Video ] [Jury]

CP9-ECD: Gregori Alisson de Sena Ramos
A study on data science applied to sporting events: An exploratory and descriptive analysis
[Poster] [Video ] [Jury]

Room 4

Jalmar Carrasco
Federal University Of Bahia, Brazil

20:00 - 21:00 Virtual room

CP10-ECD: Diego Silva Cunha
Machine learning to classify federal deputies elected in Brazilian elections
[Poster] [Video ] [Jury]

CP11-ECD: Gilson Ramos dos Santos
Business Intelligence: A financial performance indicator panel for a petroleum trade and services company. A multidimensional view to support decision making
[Poster] [Video ] [Jury]

Room 5

Allan Robert da Silva
Federal University Of Sergipe, Brazil

20:00 - 21:00 Virtual room

CP12-ECD: Carlos Antonio Guimarães
Data mining model for analysis of feelings and emotions about the actions of the municipal government of Salvador
[Poster] [Video ] [Jury]

CP13-ECD: Márcio Alexandre Silva Monteiro
Netflix movie recommendation system using alternating least squares matrix factorization
[Poster] [Video ] [Jury]

CP14-ECD: Taian Fonseca Feitosa
Use of machine learning to assess the influence of comorbidities on the risk of death from COVID-19
[Poster] [Video ] [Jury]

CP15-ECD: Rubens José Teixeira Machado Neto
Detection of banking organization using named entity recognition
[Poster] [Video ] [Jury]

CP16-ECD: Walmir dos Santos Cardoso Filho
Neural Networks Models Applied to Brazilian Soybean Production Prediction
[Poster] [Video ] [Jury]

CP17-ECD: Peter Gonçalves Morris
Wind persistence analysis for expansion of the Te Āpiti wind farm in New Zealand
[Poster] [Video ] [Jury]

Previous Editions

Our Endorsements
