Resume
Email: mabelvj@gmail.com
LinkedIn: mabelvj
Website: mabelvj.github.io
GitHub: mabelvj
StackOverflow: mabel-villalba
Skills
- Programming: Python, R, Bash scripting, Docker, Docker Compose, SQL, MongoDB, HTML, CSS, XPath, Regular Expressions, Matlab
- Python Frameworks: Scrapy, Scikit-learn, Pandas, Numpy, Statsmodels, Prophet, Matplotlib, Seaborn, PySpark, Jupyter Notebook, Quantopian, Backtrader, TensorFlow
- Data Science: Statistical analysis, Time Series, Deep Learning
- Data Engineering: AWS Lambda, S3, Kinesis, Athena, Glue, QuickSight
- Other: Agile, Scrum, AWS, Microsoft Azure, Linux, CI/CD, GitHub, GitLab, Bitbucket, LaTeX, Markdown
Languages
- Spanish: Native
- English: Full proficiency (Cambridge CAE, C1, 2012)
- French: Intermediate
Experience
Team Lead - Data Engineer - SEAT CODE
May 2023 – Present | Barcelona, Hybrid, Full-time
- Led a multidisciplinary team of data scientists, engineers, product owners, and designers in the development of an internal SEAT product.
- Collaborated with the Product Owner to set priorities, optimize resources, and create business cases.
- Designed and implemented ETL pipelines using AWS (Lambda, Kinesis, Glue) and managed cloud infrastructure provisioning with CloudFormation.
- Conducted 1:1 mentoring sessions, created career development plans, and guided personalized team growth through Profile Ladders.
- Managed performance reviews and salary evaluations, aligning individual growth with team objectives.
- Championed Agile methodologies to improve team productivity and streamline workflows.
Data Engineer - SEAT CODE
April 2021 – May 2023 (2 years, 1 month) | Barcelona, Hybrid, Full-time
- Developed predictive maintenance models for the SEAT Factory in Martorell using Machine Learning.
- Built and maintained an ETL pipeline on AWS using Lambda, Kinesis Firehose, and Glue.
- Built data visualizations using QuickSight.
- Implemented CI/CD using GitHub Actions.
- Managed infrastructure provisioning with CloudFormation.
Data Science Engineer - Mática Partners
December 2020 – April 2021 (5 months) | Barcelona, Hybrid, Full-time
- Built ETL systems using Docker and AWS S3 for business information extraction.
- Deployed and managed data processing pipelines using PySpark, Azure, and Jenkins.
Python Developer & Backend Developer - Scrapinghub
April 2019 – December 2020 (1 year, 9 months) | Remote, Full-time
- Developed scrapers and ETL processes using Scrapy, Pandas, and Numpy.
- Built unit testing frameworks with Pytest and integrated CI/CD pipelines.
- Managed data storage with MongoDB and MySQL, and deployed to cloud environments.
Data Science Contractor
June 2017 – April 2019 (1 year, 11 months) | Remote
- LISTedTECH: Built a tool for university data extraction; performed data cleaning with regular expressions.
- Windsor AI: Generated TV attribution and ROI reports using R and PostgreSQL.
- Arbuckle Capital: Developed an algorithmic trading system (Quantopian, Backtrader) using GARCH models for volatility prediction.
- SerpicoDEV: Built a commodity price prediction system using Python.
Mentor & Reviewer - Udacity
April 2017 – April 2019 (2 years) | Remote, Part-time
- Mentored 120+ students in the Data Analyst and Machine Learning Nanodegrees with an average rating of 4.7.
- Reviewed over 750 projects in Deep Learning, AI, and Statistical Analysis with an average rating of 4.93.
Predoctoral Researcher
October 2015 – October 2016 (1 year) | Optical Communications Group (UPC)
- Developed simulation scripts for optical devices using Python and Matlab.
- Simulated wavelength shifters for optical networks with 54 dB sideband rejection.
Education
- MSc in Photonics – Polytechnic University of Catalonia (UPC)
  2013-2014
  Collaborated with the Institute of Photonic Sciences (ICFO), UAB, UB.
- Telecommunication Engineering (BSc + MSc) – University of Malaga
  2005-2012
Courses
- Data Analyst Nanodegree – Udacity
  November 2016 – June 2017
- Machine Learning Engineer Nanodegree – Udacity
  July 2016 – November 2016
- Machine Learning Course – Stanford University (Coursera)
  2016
Contributions
- Open Source: Contributor to Scikit-learn and Pandas.
  - Fixed error in RidgeCV
  - Added store_cv_values to RidgeClassifierCV
  - Contributed to bug fixes and documentation improvements in Pandas and Scrapy.
- More Contributions: View all on my website.
- StackOverflow:
  - Answered questions related to Python, Pandas, Matplotlib, and machine learning with top tags including Pandas, Python, and Scikit-learn.
  - Top-rated answers on groupby operations, violin plots, and random forest classifiers.
Personal Projects
- Stocks Dashboard in Bokeh – Built using Bokeh (Python), displaying time series data automatically with customizable parameters.
- Right Whale Call Recognition – Implemented a Convolutional Neural Network for audio recognition with 95% accuracy (AUC).