fabiosv
Fabio Valonga
São Paulo, Brazil

I am a versatile data professional with a background spanning data engineering, NLP, and web development. With a strong track record of creating and optimizing data pipelines, leading teams, and improving code quality, I thrive on complex challenges. My experience includes working with diverse tech stacks and cloud platforms, ensuring efficient data processing and analytics. I am passionate about delivering data-driven insights and fostering a culture of excellence in every role I take on.

CodersRank Score


928.1
CodersRank Rank: Top 1%

• Python: Mid Developer
• Ruby: Associate Developer
• JavaScript: Mid Developer

Work Experience
Hvar Consulting Services
Jun 2023 - Apr 2024 (10 months)
Remote
Tech Lead - Data Engineer
Summary:
• Migrating data from the SAP Data Warehouse to Azure Data Lake Storage Gen2/Databricks Lakehouse (Unity Catalog), enabling seamless integration with Power BI reports
• Providing leadership to the team and ensuring the successful execution of projects
• Ingesting SAP tables through Azure Data Factory pipelines and landing them in a star-schema architecture
• Supporting streaming data pipelines utilizing Spark Structured Streaming, Change Data Feed, and Delta Live Tables (see the sketch after this list)
• Exploring materialized views with Spark Structured Streaming for incremental joins
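
For illustration, a minimal PySpark sketch of the incremental pattern referenced above: reading Delta Change Data Feed and appending changes to a star-schema fact table. The table names, checkpoint path, and storage account are hypothetical, and `spark` is the ambient Databricks session.

```python
from pyspark.sql import functions as F

# Read the change feed of an assumed bronze table (CDF must be enabled on it).
changes = (
    spark.readStream
    .format("delta")
    .option("readChangeFeed", "true")
    .table("bronze.sap_sales_orders")
)

# Keep inserts and post-update images; drop deletes and pre-update images.
upserts = changes.filter(F.col("_change_type").isin("insert", "update_postimage"))

# Append incrementally into an assumed gold fact table.
(
    upserts.writeStream
    .option("checkpointLocation", "abfss://lake@account.dfs.core.windows.net/_chk/fact_sales")
    .toTable("gold.fact_sales")
)
```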


Day-to-day responsibilities:
• Designing data pipeline architectures to ensure optimal performance
• Assisting the team in overcoming technical challenges and roadblocks
• Mentoring and guiding squad members in their career development
• Defining and promoting code patterns for reusable and maintainable code for Databricks and Data Factory

Improvements/Accomplishments:
• Created a generic framework that standardized data wrangling across the team's pipelines
• Reduced compute costs by optimizing cluster usage/resources and applying Spark Structured Streaming
• Reduced cloud costs by migrating Data Factory parallelism to Databricks Jobs running Spark Streaming pipelines (see the sketch below)
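
One cost lever behind the jobs migration above is running each stream with an availableNow trigger, so a scheduled job cluster drains pending data and then terminates instead of running around the clock. A minimal sketch with assumed table and checkpoint names:

```python
# Assumed source/target tables and checkpoint path; `spark` is the ambient session.
query = (
    spark.readStream
    .table("bronze.sap_materials")
    .writeStream
    .trigger(availableNow=True)  # process everything available, then stop
    .option("checkpointLocation", "/mnt/lake/_chk/silver_materials")
    .toTable("silver.materials")
)
query.awaitTermination()  # the job ends once the backlog is drained
```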

Technology Stack:
• Azure Data Factory, Azure Functions, Azure Storage Gen2
• Azure Databricks
Skills: Databricks, Azure, PySpark, Unity Catalog
Softensity
Apr 2022 - Jun 2023 (1 year 2 months)
Remote
Data Engineer
Summary:
• Migrating data from RDS to S3 using EMR and Databricks, which involved Spark and Hadoop
• Establishing a data-serving mechanism from S3 through a REST API developed with Node.js and Express
• Creating a robust data pipeline leveraging Delta Lake libraries and Kafka (a deduplication sketch follows this list)
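
For illustration, a PySpark equivalent of the Kafka deduplication step mentioned above (the production code was Scala). The broker, topic, schema, and key column are assumptions, and the Kafka source requires the spark-sql-kafka connector.

```python
from pyspark.sql import functions as F
from pyspark.sql.types import StringType, StructField, StructType, TimestampType

# Hypothetical event schema for the illustration.
schema = StructType([
    StructField("event_id", StringType()),
    StructField("payload", StringType()),
    StructField("event_time", TimestampType()),
])

raw = (
    spark.readStream
    .format("kafka")
    .option("kafka.bootstrap.servers", "broker:9092")  # assumed broker
    .option("subscribe", "orders")                     # assumed topic
    .load()
)

events = (
    raw.select(F.from_json(F.col("value").cast("string"), schema).alias("e"))
    .select("e.*")
)

# Watermarked deduplication: keep one record per (event_id, event_time)
# and let Spark expire state older than one hour.
deduped = (
    events
    .withWatermark("event_time", "1 hour")
    .dropDuplicates(["event_id", "event_time"])
)

(
    deduped.writeStream
    .option("checkpointLocation", "s3://bucket/_chk/orders_dedup")  # assumed path
    .toTable("silver.orders")
)
```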

Day-to-day responsibilities:
• Processed historical data, improving performance and efficiency by coalescing thousands of small files and database dumps into a Delta table using batch and stream processing, specifically Databricks Autoloader (see the sketch after this list)
• Applied intricate business rules to the data, employing PySpark for data wrangling
• Supported an ongoing pipeline, adding new datasets with Scala: reading Kafka topics, applying deduplication and business rules, and saving the results to S3
• Facilitated data access via a REST API developed with Node.js and Express
• Pioneered code patterns for reusability and maintainability
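
A minimal Auto Loader sketch of the small-file compaction described in the first bullet; the bucket paths and table name are hypothetical.

```python
# Auto Loader incrementally discovers raw dumps and lands them in one Delta table.
df = (
    spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "s3://bucket/_schemas/raw_orders")
    .load("s3://bucket/raw/orders/")
)

(
    df.writeStream
    .option("checkpointLocation", "s3://bucket/_chk/raw_orders")
    .trigger(availableNow=True)  # compact the backlog, then stop
    .toTable("bronze.orders")
)
```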

Improvements/Accomplishments:
• Created a historical pipeline responsible for ingesting more than 1 petabyte of data, demonstrating my capability to handle large-scale data operations
• Developed data-quality reports, aiding in the identification of erroneous or missing data and rule changes
• Supported the creation of monitoring tools to ensure the health of the data pipeline
• Reduced operational costs through cluster optimization and adherence to clean-code practices

Technology Stack:
• Leveraging Java and Python for blockchain and REST API collectors
• Employing Databricks pipelines powered by PySpark, Delta Lake, and Autoloader
• Crafting EMR pipelines using Spark and Scala
• Managing RDS with PostgreSQL databases
• Serving data through REST API on ECS with Node.js
• Facilitating data access via WebSockets using Python

This role not only showcased my proficiency in data engineering but also highlighted my problem-solving skills, adaptability, and commitment to maintaining data quality and pipeline efficiency.
Skills: Node.js, Databricks, AWS, Spark, Python, Ganglia, HDFS, Kafka, Big Data, QuickSight, data visualization, Scala, EMR
Hvar Consulting Services
Dec 2021 - Oct 2023 (1 year 10 months)
Remote
Tech Lead - Data Engineer
(Part-time job)

Summary:
• Migrating data from the SAP Data Warehouse to AWS Redshift or the Databricks Lakehouse, enabling seamless integration with Power BI reports
• Providing leadership to the team and ensuring the successful execution of projects
• Supporting streaming data pipelines utilizing Spark Structured Streaming, Change Data Feed, and Delta Live Tables


Day-to-day responsibilities:
• Designing data pipeline architectures to ensure optimal performance
• Assisting the team in overcoming technical challenges and roadblocks
• Mentoring and guiding squad members in their career development
• Defining and promoting code patterns for reusable and maintainable code

Improvements/Accomplishments:
• Created a generic framework that standardized data wrangling across the team's pipelines
• Refactored the framework to run on AWS Glue, AWS EMR, and Azure Databricks (see the sketch below)
• Reduced compute costs by optimizing cluster usage/resources and applying Spark Structured Streaming
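
A minimal sketch of the kind of platform shim that lets one framework run on Glue, EMR, and Databricks; the function name and module layout are illustrative, not the framework's actual API.

```python
from pyspark.sql import SparkSession

def get_spark(platform: str) -> SparkSession:
    """Return a SparkSession for the target platform (illustrative helper)."""
    if platform == "glue":
        # AWS Glue exposes Spark through its GlueContext wrapper.
        from awsglue.context import GlueContext
        from pyspark.context import SparkContext
        return GlueContext(SparkContext.getOrCreate()).spark_session
    # On EMR and Databricks the ordinary builder (or the ambient session) works.
    return SparkSession.builder.appName("wrangling-framework").getOrCreate()
```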

Technology Stack:
• AWS Glue, Redshift, Athena, AWS S3
• Azure Databricks
Skills: PySpark, Databricks, Python, AWS, AWS Glue, Redshift, Azure, Data Factory, Amazon S3

Projects
AutoIT Gem
AutoIT is a Ruby gem similar to Selenium, except it can control anything built with native Windows frames, such as the Calculator app.
Skills: Ruby, RubyGems, DLL, gemspec, Bundler
Education
Instituto Federal de Educação, Ciência e Tecnologia de São Paulo - IFSP
Bachelor's degree in System Development, Information Technology
Jan 2013 - Jan 2016
IFSP is a federal institution that operates within CTI, the Renato Archer Information Technology Center, a research unit of the Ministry of Science, Technology, and Innovation (MCTI) that conducts research and development in information technology. Intense interaction with academia, through numerous research partnerships, and with industry, through cooperation projects with companies, keeps CTI at the state of the art in its main areas of activity, such as electronic components, microelectronics, systems, displays, software, and IT applications including robotics, decision-support software, and 3D technologies for industry and medicine.
ETEP - Escola Técnica de Paulínia
Certificate in Chemistry, Chemistry
Jan 2010 - Jan 2012
In Brazil, students can complete a certificate program to work in a specific field before entering university.
This certificate qualifies me to work in laboratories on chemistry-related work (QA, research, ...).
Coursera Project Network
Fine Tune BERT for Text Classification with TensorFlow
Jul 2021
DeepLearning.AI
Natural Language Processing with Probabilistic Models
Jul 2021
DeepLearning.AI
Natural Language Processing with Classification and Vector Spaces
Jun 2021
