Data Analyst
Professional Summary
Highly skilled and detail-oriented Data Analyst adept at extracting actionable insights from complex datasets. Proficient in Python, SQL, Excel, and Tableau, I leverage advanced analytical techniques to drive informed decision-making and optimize processes. I excel in designing data-driven solutions and translating technical findings into clear business recommendations, backed by strong communication and collaboration skills. With demonstrated expertise in data cleaning, manipulation, visualization, and trend identification, I am committed to delivering high-quality analyses that contribute to organizational success. Eager to apply my analytical expertise and passion for data to a dynamic, growth-oriented team.
Technical Skills
Programming & Databases:
- Python: Wrote scripts to manipulate databases and analyzed data with NumPy, Pandas, and Matplotlib in Visual Studio Code and Jupyter Notebooks. Familiar with using APIs for data ingestion.
- SQL: Utilized SQLite (via DB Browser) and other SQL environments to wrangle, query, and prepare data for analysis.
Data Analysis & Visualization:
- Excel: Leveraged advanced functions, pivot tables, and the Analysis ToolPak for data manipulation and reporting.
- Tableau: Developed interactive dashboards and visualizations to intuitively present data-driven insights and project outcomes.
- R, SAS, Stata: Proficient in using these statistical software packages for data analysis and modeling (demonstrated in academic projects and coursework).
Version Control & Project Management:
- Git/GitHub: Employed for version control, repository management, and showcasing portfolio projects.
- Agile: ICAgile Certified Professional (ICP). Experienced in Scrum methodology, including daily standups, sprint planning, and retrospectives using Jira.
Key Competencies & Techniques:
- Statistical Techniques: Regression (Linear, Logistic), t-tests, Chi-square, ANOVA, PCA, Factor Analysis.
- Machine Learning: Logistic Regression, Neural Networks, SVM, Clustering.
- Data Management, Statistical Modeling, Data Collection, Report Generation.
Soft Skills
- Communication: Effectively conveying complex data insights to diverse audiences (technical and non-technical) through presentations, reports, and visualizations.
- Problem-Solving: Analytical and methodical approach to identifying issues, evaluating solutions, and making data-driven decisions.
- Critical Thinking: Adept at questioning assumptions, interpreting data thoughtfully, and identifying underlying patterns and trends.
- Attention to Detail: Meticulous in data cleaning, validation, and analysis to ensure accuracy and reliability of findings.
- Collaboration: Proven ability to work effectively within cross-functional teams, contributing to shared goals and fostering a positive work environment (e.g., Agile/Scrum experience).
- Adaptability & Continuous Learning: Eager to embrace new technologies, methodologies, and tackle unfamiliar challenges in the evolving field of data analytics.
- Time Management & Organization: Skilled in prioritizing tasks, managing deadlines, and organizing complex projects efficiently.
Projects
U.S. Food Access Analysis: Uncovering Complex Realities (2010)
GitHub Repository | View SQL Scripts | Tableau Dashboard

For my capstone in the Savvy Coders Data Analytics + Python Bootcamp, I analyzed U.S. food access in 2010. The project moved beyond simplistic “food desert” notions to explore the interplay between geographic access, socioeconomic factors, and health indicators using USDA and County Health Rankings data.
Key Questions & Objectives:
- Identified states/counties most impacted by Low Income & Low Access (LILA), by population and proportion.
- Compared LILA disparities in urban vs. rural areas.
- Analyzed correlations between Low Food Access Rate (LFA_Rate) and indicators like poverty, vehicle ownership, and health outcomes.
Methodology & Skills Applied:
- SQL (SQLite): Data wrangling, cleaning, and aggregation.
- Python (Pandas, NumPy, Matplotlib): In-depth EDA, metric calculation, correlation analysis, and initial visualizations.
- Tableau: Developed an interactive dashboard with maps, charts, and scatter plots to communicate findings and key correlations.
Core Insights:
- The LFA_Rate (a measure of geographic proximity) showed weak or no direct linear correlation with any individual poverty or health indicator.
- Stronger relationships emerged between SNAP Participation & Poverty Rate (+0.69), and Child Poverty & Poor Health Rate (+0.71).
- Conclusion: Food access is multifaceted; geographic proximity alone offers a limited view. This project highlighted the complex web of factors influencing food security.
Tech Stack: SQL, Python (Pandas, NumPy, Matplotlib), Tableau
Work Experience
Graduate Teaching Assistant | University of Minnesota Duluth | Duluth, MN | Aug 2015 – May 2017
- Analyzed student performance data using Excel, implementing data-driven improvements that resulted in a 2% increase in average exam scores.
- Collaborated with course instructors to maintain accurate academic records for 100+ students, ensuring data integrity and contributing to student success.
- Effectively communicated complex mathematical concepts to diverse student groups, enhancing comprehension and engagement.
Education
- MS in Mathematical Sciences (Statistics Focus) – University of Minnesota Duluth, Duluth, MN – July 2024
- BA in Mathematical Science & IT – Westminster Name, City, MO – May 2018
Academic Projects
Multivariate Analysis of Bioaccumulation in Pueblo Reservoir | Jan 2024 – July 2024
- Cleaned and transformed complex environmental data, applying Box-Cox normalization to ensure data accuracy for subsequent analysis.
- Identified distinct patterns in trace element bioaccumulation across trophic levels by applying multivariate statistical techniques (PCA, Factor Analysis, Cluster Analysis) in R, contributing to the environmental understanding of the reservoir.
Abstract
This study investigates the accumulation patterns of 18 trace elements across different trophic levels within the Pueblo Reservoir ecosystem in Colorado, USA. The reservoir, located downstream of the historic Leadville Mining District, serves as an ideal site to study the impact of trace element contamination on aquatic life. Utilizing data from previous research, this study uses principal component analysis (PCA), factor analysis, and K-means clustering to analyze the concentrations of trace elements in various organisms on different trophic levels. The results reveal distinct patterns of accumulation, with certain elements exhibiting higher concentrations in higher trophic levels, suggesting biomagnification. Some other elements, however, are predominantly found in lower trophic levels. These findings underscore the importance of understanding trophic interactions and factors in assessing the ecological and human health risks associated with trace element contamination.
Keywords: Trace Elements, Bioaccumulation, Multivariate Analysis, Principal Component Analysis (PCA), Factor Analysis, K-means Clustering, Trophic Levels, Biomagnification, Ecological Risk, Human Health Risk, Pueblo Reservoir
Analyzing the Pima Indians Diabetes Dataset with Machine Learning | Jan 2024 – May 2024
- Conducted predictive modeling on the Pima Indians Diabetes Dataset using Logistic Regression and Artificial Neural Networks.
- Enhanced model performance through Principal Component Analysis (PCA) for dimensionality reduction.
- Leveraged Python (Scikit-learn) to develop and validate models.
Biostatistical Analysis of Oxytocin Administration Routes | Aug 2023 – Dec 2023
- Performed comparative analysis of oxytocin administration routes on postpartum hemorrhage outcomes using t-tests and chi-square tests.
- Developed a multiple linear regression model to assess the predictive power of various factors on postpartum hemoglobin levels, identifying key predictors.
Relevant Coursework
CS 4232 - Machine Learning & Data Mining
University of Minnesota Duluth | Spring 2024
- Explored foundational concepts in machine learning and data mining, including decision trees, neural networks, SVMs, and ensemble methods. Implemented algorithms using Python and associated libraries.
STAT 5511 - Regression Analysis
University of Minnesota Duluth | Spring 2023
- Conducted simple, polynomial, and multiple regression analyses using Stata. Utilized matrix formulation for estimation, testing, and prediction, including residual analysis and model selection.
STAT 5572 - Statistical Inference
University of Minnesota Duluth | Spring 2023
- Studied mathematical statistics in depth: Bayes and maximum-likelihood estimators, unbiased estimators, confidence intervals, and hypothesis testing (including likelihood ratio tests, most powerful tests, and goodness-of-fit tests).
PUBH 6450 - Biostatistics
University of Minnesota Twin Cities | Fall 2023
- Applied concepts of exploratory data analysis and statistical inference. Utilized SAS for hypothesis testing, ANOVA, regression, and nonparametric methods.
STAT 5531 - Probability Models
University of Minnesota Duluth | Fall 2022
- Developed and applied probability models to science and engineering problems, analyzing classical distributions (binomial, Poisson, exponential) and exploring Markov processes.
STAT 5571 - Probability
University of Minnesota Duluth | Fall 2021
- Mastered probability axioms, random variable distributions (discrete/continuous), joint/conditional distributions, mathematical expectation, moments, and correlation, establishing a strong theoretical foundation for advanced data analysis and modeling.