About Me

My name is Matheus Andrade. I have a degree in Mechanical Engineering and a master's degree in Mechatronics.

I currently work as a researcher on topics related to Industry 4.0 and also as an application developer in the field of industrial automation.

I am developing personal projects in Data Science to gain experience in solving business problems and to improve my skills with the tools a data scientist uses on a daily basis.

I am looking for an opportunity to work professionally as a Data Scientist and improve company decision-making by building data-driven solutions.

Data Science Skills

Programming Languages and Database

  • Python for data analysis.
  • SQL for data extraction.
  • Databases: SQL Server, MySQL, PostgreSQL, InfluxDB

Statistics and Machine Learning

  • Descriptive Statistics.
  • Regression, Classification, Clustering, Learning to Rank, Time Series.
  • KNN, Random Forest, XGBoost, LightGBM, CatBoost, K-Means, GMM.
  • Algorithm Performance Metrics: Confusion Matrix, Accuracy, R², MAE, MAPE, Silhouette Score.
  • Machine Learning Libraries: Scikit Learn, Scipy, Pandas, Numpy.
  • Feature Engineering, Data Preparation, Dimensionality Reduction.

Data Visualization

  • Data Visualization Libraries: Matplotlib, Seaborn, Plotly, Folium.
  • Metabase, Grafana, Streamlit.

Software Engineering

  • Python Libraries: Flask, Tkinter, OpenCV, Streamlit
  • Git, Github, Gitlab
  • Anaconda
  • AWS (S3, EC2, RDS)
  • Render, Heroku Cloud
  • Telegram Bot, Google Sheets API

Professional Experiences

Working Experience

I have experience in Object Oriented Programming, Computer Vision, Multi Agent Systems, Evolutionary Computation, Database Administration and Machine Learning. I've already worked with Python, Matlab, SQL, HTML, CSS and JavaScript.

Data Science Projects

Customer Clustering for High-Value Customer Identification and Loyalty Program Creation.

Regression for Oil Production Prediction for Several Wells.

Customer Classification to Rank the Database According to Purchase Propensity.

Regression for Drugstores Sales Prediction.

Computer Vision for Liquid Column Measurement in Real Time.

Insights for the Real Estate Market to Help the CEO Find the Best Homes to Buy and Sell.

Insights for the Real Estate Market to Assist in the Observation and Analysis of House Prices.

Application Developer and Researcher

1 year of experience developing applications for the industrial automation sector.

Education

Master's Degree in Mechatronics

UFBA (Federal University of Bahia), PPGM (Graduate Program in Mechatronics)

During the master's degree, I developed a computer vision application for measuring a column of liquid and a digital twin aimed at improving performance, using machine learning, genetic algorithms and multi-agent systems as tools for production data analysis.

Degree in Mechanical Engineering

UNIFACS (Universidade Salvador)

Data Science Projects

Image alluding to the separation of customers into groups.

Loyalty Program Creation

The business team asked the data scientists to select the most valuable customers for the company. Recency, frequency and monetary value were considered by the business team as the main characteristics for grouping the customers into clusters.
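As an illustration of the recency, frequency and monetary features mentioned above, here is a minimal sketch of how they could be computed per customer before clustering. It is not the project's actual code: the function name `rfm_table`, the tuple layout and the sample purchases are all hypothetical, and plain Python is used instead of Pandas only to keep the sketch dependency-free.

```python
from collections import defaultdict
from datetime import date


def rfm_table(purchases, reference_date):
    """Return {customer_id: (recency_days, frequency, monetary)}.

    purchases: iterable of (customer_id, purchase_date, amount) tuples.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer_id, purchase_date, amount in purchases:
        # Keep the most recent purchase date per customer.
        if customer_id not in last_purchase or purchase_date > last_purchase[customer_id]:
            last_purchase[customer_id] = purchase_date
        frequency[customer_id] += 1
        monetary[customer_id] += amount
    return {
        cid: ((reference_date - last_purchase[cid]).days, frequency[cid], monetary[cid])
        for cid in last_purchase
    }


# Hypothetical purchase records, invented for illustration only.
purchases = [
    ("A", date(2023, 1, 10), 120.0),
    ("A", date(2023, 3, 5), 80.0),
    ("B", date(2022, 11, 20), 40.0),
]
rfm = rfm_table(purchases, reference_date=date(2023, 3, 31))
```

Each customer ends up with three numbers that can be scaled and passed to a clustering algorithm such as K-Means or GMM.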

Tools used in the project:

  • Python Libraries: Pandas, Seaborn, Scikit Learn, Matplotlib, Pickle.
  • UMAP, t-SNE.
  • AWS (RDS, S3, EC2).
  • Git, Github.
  • Visual Studio Code.
Image of an offshore oil platform.

Oil Production Prediction

Production prediction is one of the core problems in an oil company. The provided dataset describes a set of nearby wells located in the United States and their 12-month cumulative production. The company's data scientist needs to build a model from scratch to predict production.

Tools used in the project:

  • Python Libraries: Boruta, Scikit Learn, Numpy, Seaborn, Matplotlib, XGBoost, LightGBM, CatBoost.
  • Render Cloud.
  • Streamlit Cloud.
  • Git, Github.
  • Visual Studio Code.
Image with the text "Health Insurance".

Health Insurance Cross Sell

An insurance company wants to start selling vehicle insurance to customers who already have health insurance. It believes that one way to reach as many interested customers as possible with the fewest calls is to build a machine learning model that sorts the customer list to maximize the number of contracted services.
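The idea of sorting customers so that the first calls reach the most likely buyers can be sketched as follows. This is only an illustration, not the project's code: `recall_at_k` is a hypothetical helper, and the scores and labels are invented. The scores stand in for the purchase propensity a trained classifier would output.

```python
def recall_at_k(scores, interested, k):
    """Fraction of all interested customers found among the top-k scored customers."""
    # Sort customers from highest to lowest propensity score.
    ranked = sorted(zip(scores, interested), key=lambda pair: pair[0], reverse=True)
    total_interested = sum(interested)
    if total_interested == 0:
        return 0.0
    found = sum(label for _, label in ranked[:k])
    return found / total_interested


# Hypothetical model scores and true outcomes (1 = customer bought the insurance).
scores = [0.9, 0.2, 0.7, 0.4, 0.1]
interested = [1, 0, 1, 0, 1]
top2 = recall_at_k(scores, interested, k=2)
```

With a metric like this, the business team can estimate what share of interested customers would be reached with a fixed number of calls.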

Tools used in the project:

  • Python Libraries: Pandas, Seaborn, Boruta, Scikit Learn, Scikit Plot, Flask, Pickle.
  • Google Sheets API.
  • Heroku Cloud.
  • Git, Github.
  • Visual Studio Code.
Image representing a line chart.

Drugstore Sales Prediction

The CEO of Rossmann wants to renovate all stores and asked what the income of each store will be over the next 6 weeks. A regression model would be of great help.

Tools used in the project:

  • Python Libraries: Pandas, Seaborn, Boruta, Scikit Learn, Flask, Pickle.
  • BotFather, Telegram.
  • Heroku Cloud.
  • Git, Github.
  • Visual Studio Code.
Image of a robot's face to convey the idea of computer vision.

Computer Vision for Liquid Column Measurement

Computer vision system that measures, in real time, the essential oil produced by steam distillation and communicates with supervisory applications using Modbus TCP.
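One simple way to turn an image of the column into a height reading is sketched below. This is an assumption for illustration, not the method used in the project: it supposes that the liquid appears darker than the empty tube, that the image region is already cropped to the tube, and that a pixel-to-millimeter calibration is known. The function `liquid_height_mm` and the sample values are hypothetical, and plain Python lists stand in for an OpenCV image.

```python
def liquid_height_mm(gray_rows, threshold, mm_per_pixel):
    """Estimate column height from a grayscale image region.

    gray_rows: list of rows (top to bottom), each a list of 0-255 intensities.
    A row counts as liquid when its mean intensity is below `threshold`
    (assumes the liquid appears darker than the empty tube).
    """
    liquid_rows = 0
    # Walk from the bottom of the tube upward until the first non-liquid row.
    for row in reversed(gray_rows):
        if sum(row) / len(row) < threshold:
            liquid_rows += 1
        else:
            break
    return liquid_rows * mm_per_pixel


# Hypothetical 5x3 region: two bright (empty) rows above three dark (liquid) rows.
region = [
    [220, 225, 218],
    [215, 222, 219],
    [60, 55, 58],
    [52, 50, 57],
    [49, 51, 53],
]
height = liquid_height_mm(region, threshold=128, mm_per_pixel=0.5)
```

In a real deployment the resulting value would then be written to a Modbus TCP register so the supervisory application can read it.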

Tools used in the project:

  • Python Libraries: OpenCV, Numpy, pyModbusTCP, picamera.
  • Git, Github.
  • Google Colab.
  • Raspberry Pi, Camera Module V2.
Image representing the real estate market.

Insights for Real Estate Market Negotiation

House Rocket is a real estate company. Its data scientist should help the CEO answer two questions and build two tools that help in understanding the dataset.

Tools used in the project:

  • Python Libraries: Numpy, Pandas, Matplotlib, Seaborn, Plotly, Geopandas, Streamlit, Folium.
  • Heroku Cloud.
  • Git, Github.
  • Visual Studio Code.
Image with the word "rent" written in English.

Insights for Real Estate Market Rent

A real estate insights project that uses an Airbnb dataset from New York to help the company's CEO evaluate the price behavior of properties available in the city.

Tools used in the project:

  • Python Libraries: Pandas, Plotly.
  • Git, Github.
  • Jupyter Notebook.

Contact me: