brunocampos01
Bruno Campos
Florianopolis, Brazil

Software Engineer with several years of experience specializing in Data Engineering, Analytics, DevOps and MLOps.

CodersRank Score

961.1
CodersRank Rank: Top 1%
Based on: Stack Overflow (18 events)

Top 100 Python Developer in Brazil
Top 50 SQL Developer in Brazil
Top 50 Shell Developer in Brazil

Work Experience
Lanlink Informática Ltda.
May 2020 - Feb 2022 (1 year 9 months)
Florianópolis, Brazil
Data Engineer
→ Architecting and Building a Data Lake with 1M+ Companies
• Designed a data lake with lambda architecture to extract business insights from the data of 1M+ companies.
• Designed the data governance (lineage, catalog and audit) using Apache Atlas and the data security management with Kerberos, Sentry, roles in Hive, and ACL policies.
• Generated data in Avro and Parquet formats with Snappy compression (sketch below).
• Managed the physical servers (configuration, temperature, adding/exchanging parts), the OS (network, users), and the applications using YARN, Docker, and Docker Compose.
• Made queries up to 48x faster than the equivalent queries in the transactional environment.
• Tech Stack: Oracle Exadata, Cloudera, Hadoop
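
A minimal PySpark sketch of the Avro/Parquet write step above; the session setup, table name, and paths are illustrative, and the Avro write assumes the spark-avro package is available.

```python
from pyspark.sql import SparkSession

# Illustrative session; on the cluster this would run under YARN.
spark = (
    SparkSession.builder
    .appName("datalake-write")
    .config("spark.sql.parquet.compression.codec", "snappy")
    .getOrCreate()
)

# 'companies_staged' is a placeholder for a staged source table.
df = spark.table("companies_staged")

# Columnar Parquet with Snappy for the analytical layer.
df.write.mode("overwrite").parquet("/datalake/curated/companies")

# Row-oriented Avro with Snappy; requires the spark-avro package
# (e.g. --packages org.apache.spark:spark-avro_2.12:<version>).
(
    df.write.mode("overwrite")
    .format("avro")
    .option("compression", "snappy")
    .save("/datalake/raw/companies")
)
```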

→ Creating a Multi-Tenant Environment for Data Scientists
• Designed and built a multi-tenant data analysis and modeling environment where users could work directly in the cluster without installing or configuring libraries or Spark jobs themselves (sketch below).
• Tech Stack: JupyterHub, Spark, YARN, Docker, Kerberos, Ansible
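
A minimal jupyterhub_config.py sketch of this kind of multi-tenant setup, assuming the dockerspawner package; the image name and resource limits are placeholders (the production setup also integrated Kerberos and YARN).

```python
# jupyterhub_config.py -- minimal multi-tenant sketch (assumes dockerspawner).
c = get_config()  # noqa: F821 -- injected by JupyterHub at load time

# One isolated container per user, so nobody has to install or
# configure libraries on a shared host; the image name is a placeholder.
c.JupyterHub.spawner_class = "dockerspawner.DockerSpawner"
c.DockerSpawner.image = "datalab/pyspark-notebook:latest"

# Per-user resource caps so tenants share the hardware fairly.
c.DockerSpawner.mem_limit = "4G"
c.DockerSpawner.cpu_limit = 2.0

# Persist each user's work in a named volume ({username} is expanded
# by DockerSpawner).
c.DockerSpawner.volumes = {"jupyterhub-user-{username}": "/home/jovyan/work"}
```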

→ Architecting and Building a Data Warehouse with 300 TB+
• Built multiple pipelines for daily data ingestion and processing.
• Automated the creation of databases, tables, and partitions, and the refresh of table statistics (sketch below).
• Managed 300 TB+ on Hive/Impala.
• Enabled accounting analyses over a full month of data in minutes, which had not been possible before.
• Tech Stack: HDFS, Hive, Impala, YARN, HUE, Pyspark, Airflow
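
A sketch of what the DDL automation can look like in PySpark with Hive support; the database, table, and columns are hypothetical.

```python
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("dw-ddl").enableHiveSupport().getOrCreate()
)

# Hypothetical names; a real framework would read these from configuration.
db, table = "finance_dw", "ledger_entries"

spark.sql(f"CREATE DATABASE IF NOT EXISTS {db}")
spark.sql(f"""
    CREATE TABLE IF NOT EXISTS {db}.{table} (
        account_id BIGINT,
        amount     DECIMAL(18, 2)
    )
    PARTITIONED BY (ds STRING)
    STORED AS PARQUET
""")

# Register partitions written directly to HDFS and refresh statistics,
# so the Hive/Impala planner has current row counts.
spark.sql(f"MSCK REPAIR TABLE {db}.{table}")
spark.sql(f"ANALYZE TABLE {db}.{table} COMPUTE STATISTICS")
```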

→ Data Pipeline Optimization from 30 Days Latency to < 4 Hours
• Developed a framework in Python, using object-oriented programming and the Template Method design pattern, to ingest and process more than 300k records/day (sketch below).
• Architected data movement between tasks using Redis.
• Reduced end-to-end latency from 30 days to under 4 hours.
• Tech Stack: Postgres, Oracle 19c, Airflow, Pytest, Pyspark, Celery, Arrow, Hive, Impala, HDFS, YARN, Docker, Azure DevOps, Ansible.
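
A stripped-down sketch of the Template Method structure such a framework can use: the base class fixes the pipeline skeleton, and subclasses fill in the source-specific steps. Class and method names are illustrative, not the originals.

```python
from abc import ABC, abstractmethod


class IngestionJob(ABC):
    """Template Method: run() fixes the order of the steps."""

    def run(self) -> None:
        records = self.extract()
        records = self.transform(records)
        self.load(records)

    @abstractmethod
    def extract(self) -> list[dict]: ...

    @abstractmethod
    def transform(self, records: list[dict]) -> list[dict]: ...

    @abstractmethod
    def load(self, records: list[dict]) -> None: ...


class OracleToHiveJob(IngestionJob):
    """Illustrative subclass; the real jobs queried Oracle/Postgres and
    handed intermediate results between tasks through Redis."""

    def extract(self) -> list[dict]:
        return [{"id": 1, "value": "raw"}]  # stand-in for an Oracle query

    def transform(self, records: list[dict]) -> list[dict]:
        return [{**r, "value": r["value"].upper()} for r in records]

    def load(self, records: list[dict]) -> None:
        print(f"writing {len(records)} records to Hive")  # stand-in


OracleToHiveJob().run()
```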
Skills: Apache Spark, Hadoop, Big Data, PostgreSQL, Python, Redis, Docker, Docker Compose, CI/CD, Ansible, Linux, Ubuntu, Bash scripting, database design, unit testing, OOP
Softplan Planejamento e Sistemas
Apr 2019 - Apr 2020 (1 year)
Florianópolis, Brazil
Data Engineer
→ Architecting and Building the Company's own Data Lake
• Automated the integration of multiple sources (DB2, SQL Server, Postgres, MySQL, SharePoint, Oracle 11) into the data lake using Python and SQL (sketch below).
• Coded custom MapReduce jobs using Hive UDFs.
• Built a data warehouse using Azure Data Factory, SQL Server, and Databricks; designed OLAP cubes in Analysis Services with DAX.
• Documented the project with MkDocs, hosted the site on GitLab, and automated publishing changes with GitLab CI/CD.
• Tech Stack: Apache (Azkaban, Ambari, Spark, Tez, Ranger, Sqoop, YARN, Hive) and Azure (HDInsights, Storage, VM)
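
A minimal, config-driven sketch of the multi-source integration in PySpark; the connection URLs and table names are placeholders, and credentials/JDBC drivers are omitted.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("multi-source-ingest").getOrCreate()

# Placeholder source list; the real setup covered DB2, SQL Server,
# Postgres, MySQL, SharePoint, and Oracle 11, driven by configuration.
SOURCES = [
    {"url": "jdbc:postgresql://pg-host:5432/erp", "table": "public.orders"},
    {"url": "jdbc:mysql://mysql-host:3306/crm", "table": "crm.leads"},
]

for src in SOURCES:
    df = (
        spark.read.format("jdbc")
        .option("url", src["url"])
        .option("dbtable", src["table"])
        .load()
    )
    # Land each source under its own raw-zone path.
    df.write.mode("overwrite").parquet(
        "/datalake/raw/" + src["table"].replace(".", "_")
    )
```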

→ Framework for Data Warehouse Creation Automation
• Developed a framework for resource automation on Azure, provisioning, configuring, and managing data warehouse services using Terraform, Python, and PowerShell (sketch below).
• The framework helped business units standardize on common technologies, simplified the creation of data warehouses, and cut cloud service costs by more than 7x.
• Tech Stack: Azure Analysis Services, Automation Account, Storage Account, Log Analytics, PowerBI
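
A minimal sketch of how Python can drive the Terraform side of such a framework; the module directory and var-file name are hypothetical.

```python
import subprocess
from pathlib import Path

# Hypothetical Terraform module for the data warehouse services.
MODULE_DIR = Path("modules/analysis_services")


def terraform(*args: str) -> None:
    """Run a Terraform CLI command inside the module directory."""
    subprocess.run(["terraform", *args], cwd=MODULE_DIR, check=True)


# One var-file per business unit keeps every stack standardized.
terraform("init", "-input=false")
terraform("apply", "-auto-approve", "-var-file=bu_finance.tfvars")
```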
Skills: Azure, cloud, MySQL, PostgreSQL, Python, Docker, Docker Compose, Bash scripting, database design, Linux, Ubuntu
Linx
Dec 2016 - Mar 2019 (2 years 3 months)
Florianópolis, Brazil
Big Data Developer
→ Cross-Cloud Migration Resulting in Savings of $10K+ USD per Month
• Designed, architected, and rebuilt the services and data hosted on Rackspace as self-managed AWS services, saving $10k+ USD per month (sketch below).
• Tech Stack: AWS (EMR, S3, EC2, DynamoDB, RDS, Route 53), HBase, Hadoop, HDFS, Hive, MySQL, Oozie, Zookeeper, Sqoop, Spark, Jenkins, Java, Shell Scripting, Chef
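
A small boto3 sketch of one slice of such a migration (pushing exported data into S3); the bucket and file names are placeholders, and credentials come from the standard AWS credential chain.

```python
import boto3

s3 = boto3.client("s3")

# Placeholder export files produced on the source (Rackspace) side.
for local_file in ["export/part-0000.gz", "export/part-0001.gz"]:
    # Mirror the local path as the S3 key in a migration bucket.
    s3.upload_file(local_file, "analytics-migrated", local_file)
```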

→ Pipeline Optimization & Enhancement
• Reduced the runtime of the main data pipeline, which processed more than 1 billion records (13 TB+) per day, from 20 hours to under 4 hours.
• Tech Stack: Airflow, S3, EMR(Spark, Hive), DynamoDB, RDS, Java, Python, Chef

→ Pipelines Migration
• Improved and simplified data pipeline management by consolidating flows spread across Oozie, Luigi, and NiFi into a single tool (Airflow), which made periodic reports and SLA enforcement possible (sketch below).
• Tech Stack: EMR, EC2, Java, Python, Shell Scripting, Chef
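
An illustrative Airflow DAG showing the consolidation pattern: each flow that previously lived in Oozie, Luigi, or NiFi becomes a task, with an SLA so late runs surface in reports. The DAG, task names, and commands are placeholders.

```python
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="centralized_pipelines",
    start_date=datetime(2019, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    ingest = BashOperator(
        task_id="ingest_events",
        bash_command="echo 'run former Oozie flow'",
        # Flag the run if it has not finished within 4 hours
        # of its scheduled time.
        sla=timedelta(hours=4),
    )
    report = BashOperator(
        task_id="daily_report",
        bash_command="echo 'build the periodic report'",
    )
    ingest >> report
```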
Skills: AWS, Java, MySQL, Python, Flask
