→ Architecting and Building a Data Lake with 1M+ Companies
• Designed a data lake with a Lambda architecture to extract business insights from the data of 1M+ companies.
• Designed data governance (lineage, catalog, and audit) with Apache Atlas, and data security management with Kerberos, Sentry, Hive roles, and ACL policies.
• Generated data in Avro and Parquet formats with Snappy compression (see the sketch after this list).
• Managed the physical servers (configuration, temperature monitoring, adding/exchanging parts), the OS (network, users), and the applications using YARN, Docker, and Docker Compose.
• Improved query performance by 48x compared to the transactional environment.
• Tech Stack: Oracle Exadata, Cloudera, Hadoop
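A minimal PySpark sketch of the dual-format output step mentioned above; the paths, input source, and spark-avro package version are assumptions for illustration, not the production job:

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("company-ingestion")
    # spark-avro ships as an external package; version is assumed here
    .config("spark.jars.packages", "org.apache.spark:spark-avro_2.12:3.3.0")
    .getOrCreate()
)

companies = spark.read.json("/landing/companies/")  # hypothetical landing zone

# Parquet + Snappy for the columnar serving/query layer
(companies.write
    .mode("overwrite")
    .option("compression", "snappy")
    .parquet("/lake/curated/companies_parquet"))

# Avro + Snappy for the row-oriented raw layer
(companies.write
    .mode("overwrite")
    .format("avro")
    .option("compression", "snappy")
    .save("/lake/raw/companies_avro"))
```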
→ Creating a Multi-Tenant Environment for Data Scientists
• Designed and built a multi-tenant data analysis and modeling environment where users could run analyses directly on the cluster without having to install and configure libraries or Spark jobs themselves (a configuration sketch follows the tech stack).
• Tech Stack: JupyterHub, Spark, YARN, Docker, Kerberos, Ansible
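A minimal sketch of the kind of JupyterHub configuration this implies, assuming DockerSpawner and a pre-built notebook image with the Spark and Kerberos clients; the image name and credential-cache mount layout are hypothetical:

```python
# jupyterhub_config.py
c = get_config()  # noqa: F821 -- injected by JupyterHub at load time

# One isolated container per user, so nobody installs libraries on the cluster
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "registry.local/datalab-notebook:latest"  # hypothetical

# Point notebook kernels at the cluster's YARN resource manager
c.DockerSpawner.environment = {
    "SPARK_MASTER": "yarn",
    "HADOOP_CONF_DIR": "/etc/hadoop/conf",
}

# Mount per-user Kerberos credential caches read-only (assumed host layout)
c.DockerSpawner.volumes = {
    "/var/krb5/user/{username}": {"bind": "/tmp/krb5cc", "mode": "ro"},
}
```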
→ Architecting and Building a Data Warehouse with 300 TB+
• Built multiple pipelines for daily data ingestion and processing.
• Automated the creation of databases, tables, and partitions, as well as statistics updates (see the sketch after this list).
• Managed 300 TB+ of data on Hive/Impala.
• Enabled accounting analyses over a full month of data to run in a few minutes, which was not possible before.
• Tech Stack: HDFS, Hive, Impala, YARN, HUE, PySpark, Airflow
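A minimal sketch of the daily DDL-and-statistics automation, using impyla against Impala; the database, table, column, and host names are hypothetical:

```python
from datetime import date
from impala.dbapi import connect

DB, TABLE = "dw_accounting", "ledger_entries"  # hypothetical names
partition = date.today().strftime("%Y%m%d")

conn = connect(host="impala-coordinator.local", port=21050)  # hypothetical host
cur = conn.cursor()

# Idempotent setup: database and partitioned Parquet table
cur.execute(f"CREATE DATABASE IF NOT EXISTS {DB}")
cur.execute(f"""
    CREATE TABLE IF NOT EXISTS {DB}.{TABLE} (
        account_id BIGINT,
        amount     DECIMAL(18,2)
    )
    PARTITIONED BY (ds STRING)
    STORED AS PARQUET
""")

# Register today's partition, then refresh planner statistics for it
cur.execute(f"ALTER TABLE {DB}.{TABLE} ADD IF NOT EXISTS PARTITION (ds='{partition}')")
cur.execute(f"COMPUTE INCREMENTAL STATS {DB}.{TABLE} PARTITION (ds='{partition}')")
```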
→ Optimizing a Data Pipeline from 30-Day Latency to < 4 Hours
• Developed a Python framework, using object-oriented programming and the Template Method design pattern, to ingest and process more than 300k records/day (see the sketch after this list).
• Architected data movement between tasks using Redis.
• Reduced the framework's end-to-end latency from 30 days to under 4 hours.
• Tech Stack: Postgres, Oracle 19c, Airflow, Pytest, PySpark, Celery, Arrow, Hive, Impala, HDFS, YARN, Docker, Azure DevOps, Ansible.
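A minimal sketch of the Template Method structure with a Redis handoff between tasks; the class names, queue name, and in-memory example source are hypothetical stand-ins for the real connectors:

```python
import json
from abc import ABC, abstractmethod

import redis


class Pipeline(ABC):
    """Template Method: run() fixes the algorithm; subclasses fill in the steps."""

    def __init__(self, queue: str):
        self.queue = queue
        self.redis = redis.Redis(host="localhost", port=6379)  # assumed broker

    def run(self) -> None:
        # Invariant skeleton shared by every source-specific pipeline
        records = self.extract()
        transformed = [self.transform(r) for r in records]
        self.load(transformed)

    @abstractmethod
    def extract(self) -> list[dict]: ...

    @abstractmethod
    def transform(self, record: dict) -> dict: ...

    def load(self, records: list[dict]) -> None:
        # Hand records to the next task through a Redis list
        for record in records:
            self.redis.rpush(self.queue, json.dumps(record))


class OraclePipeline(Pipeline):
    def extract(self) -> list[dict]:
        return [{"id": 1, "name": " ACME "}]  # stand-in for an Oracle query

    def transform(self, record: dict) -> dict:
        return {**record, "name": record["name"].strip()}


if __name__ == "__main__":
    OraclePipeline(queue="ingest:oracle").run()
```

The design choice here is that adding a new source only means subclassing and overriding extract()/transform(), while orchestration and the Redis handoff stay in the base class.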