barbavegeta/README.md

Hi, I’m Salvatore

Data Science & Bioinformatics
London, United Kingdom


About Me

I’m a biomedical scientist with 7+ years of experience in clinical laboratories and a growing focus on bioinformatics, data science, and precision medicine.
I bridge wet-lab techniques with computational workflows - from stem cell research to NGS data pipelines - combining biological expertise with data-driven insights.

  • 7+ years of experience across genomic, clinical, stem-cell, and ATMP laboratories (CooperGenomics, UCLH)
  • Experience in flow cytometry, NGS library prep, and clinical data reporting
  • Skilled in Python, R, SQL, and Bash for genomic analysis and automation
  • Cloud-native mindset: Google Cloud, AWS, and Kubernetes for scalable bioinformatics pipelines
  • Freelance AI data annotator (Mercor, Outlier, micro1, Alignerr) supporting model training for biomedical data
  • Currently pursuing MSc Bioinformatics @ Atlantic Technological University (remote)

Tech Stack

Programming: Python · R · SQL · Bash · Git
Bioinformatics: Bowtie2 · HISAT2 · SAMtools · BEDtools · DESeq2 · Bioconductor · Biopython
Data Viz: Tableau · ggplot2 · seaborn · matplotlib
Cloud: AWS · Google Cloud · GKE (Kubernetes)
AI/ML: Supervised data annotation, evaluation of LLM outputs, prompt and instruction design for technical content


Featured Projects

| Project | Description | Tools |
| --- | --- | --- |
| Genomic Data Science | End-to-end RNA-seq & variant analysis using HISAT2, StringTie, and DESeq2 | Python · R · Bash · Bioconductor |
| Salifort Motors | Predictive modelling to understand drivers of employee turnover and inform retention strategy | XGBoost · NumPy · SciPy · scikit-learn · Pandas · Statsmodels |
| TikTok Project | Exploratory analysis of engagement metrics to uncover content trends and optimisation levers | Matplotlib · Seaborn · Plotly · SciPy |
| Bellabeat Case Study | Fitbit data analysis and Tableau dashboard | R · dplyr · Tableau · SQL |
| AWS Solution Architecture | Cloud deployment diagrams & IaC design | AWS · ECS · S3 · Aurora |
| Fiber Business Intelligence Capstone | Data integration and visualization for business insights | BigQuery · Tableau · SQL |
| Portfolio Website | Personal website showcasing bioinformatics and data projects | HTML · CSS · JS |

Education & Certifications

MSc Bioinformatics - Atlantic Technological University (Remote) - 2025-Present
MSc Cell & Gene Therapy - University College London - 2021-2023
BSc Biomedical Science - University of Catania - 2014-2017

Data, Analytics, and Cloud Certifications:
Google: Data Analytics · Advanced Data Analytics · IT Automation with Python · Project Management · Business Intelligence
Google Cloud: Architecting with Google Kubernetes Engine
Amazon Web Services (AWS): Cloud Practitioner Essentials · Cloud Solutions Architect

Bioinformatics Certifications:
Johns Hopkins University: Genomic Data Science Specialization
Wellcome: Bioinformatics for Biologists: An Introduction to Linux, Bash Scripting, and R; Analysing and Interpreting Genomics Datasets

Programming & Data Science Courses:
freeCodeCamp: Data Analysis with Python; Relational Databases; Scientific Computing with Python & Databases
DE<code>LIFE: Genomes, Networks & Pathways; Data Science & Machine Learning with Python


Connect with Me

Pinned

  1. Genomic_Data_Science_Specialization (Public)

    Personal solutions, scripts, and command logs from the Johns Hopkins Genomic Data Science Specialization (alignment, RNA-seq, variant calling, and database querying).

    Jupyter Notebook

  2. Google_Advanced_Data_Analytics-Salifort_Motors (Public)

    HR analytics capstone for the Google Advanced Data Analytics certificate, building classification models to understand and predict employee attrition at Salifort Motors.

    HTML

  3. Google_Advanced_Data_Analytics-TikTok_Project (Public)

    Google Advanced Data Analytics project analysing TikTok engagement data to uncover content patterns and factors linked to higher user interaction.

    Jupyter Notebook

  4. Google_Data_Analytics-Bellabeat_Project (Public)

    Google Data Analytics capstone using Fitbit data for Bellabeat to explore activity, sleep, and heart-rate patterns and build Tableau dashboards for actionable insights.

    R

  5. AWS_Solution_Architect (Public)

    High-level AWS migration design for a three-tier web app and Hadoop analytics stack, using managed services (ECS, Aurora, EMR, Redshift, QuickSight) to modernise on-prem workloads.

  6. Google_Business_Intelligence-Google_Fiber (Public)

    Google Business Intelligence capstone for Google Fiber, merging market datasets and building Tableau dashboards to communicate revenue, margin, and customer trends to stakeholders.