Yan-Ni (Annie) Tai

Data Analyst | UMich Applied Statistics Graduate 〽️💙

About Me

I’m a recent graduate from the University of Michigan with a master’s degree in Applied Statistics and a passion for solving business problems through data. I bring hands-on experience in SQL, Python, Power BI, and Tableau, developed through internships and academic projects in time series forecasting, sentiment analysis, and machine learning.

I am most skilled in SQL, Python, Tableau, A/B testing, and statistical modeling.

Experience

TSMC

Data Analytics Intern – Intelligent Manufacturing

Jun 2024 – Aug 2024

Using SQL, Power BI, and manufacturing KPIs, I turned raw output from 20+ machines into data-driven strategies that improved efficiency and reduced the inspection rate by 10%.

  • Analyzed data from 20+ machines using SQL to identify anomalies and optimize production planning.
  • Built Power BI dashboards for real-time KPI monitoring across manufacturing teams.
  • Reduced inspection effort by 10% by designing data-driven decision rules.

Northeast and Yilan Coast National Scenic Area Administration

Data Analyst Intern

Feb 2023 – Jun 2023

  • Built an Excel tool that streamlined the organization of 2,000+ tourism records, cutting manual processing time by 20%.
  • Uncovered insights into visitor trends and behavior to inform strategic planning.

National Cheng Kung University

Research Assistant

Sep 2021 – Mar 2023

  • Developed Python-based ETL pipelines to collect and process 60,000+ financial news records, streamlining data preparation for stock market research.
  • Built 10+ time series models to forecast stock price trends, improving forecast accuracy and analysis efficiency.
  • Refined an alternative credit scoring model by applying weighted multiple regression, increasing predictive reliability and supporting better decisions.

Projects

Passenger Transport Prediction

Spaceship_Titanic_Kaggle_Competition

Leveraged EDA, scikit-learn, and ensemble learning to engineer end-to-end predictive workflows on transport data.

  • Identified key patterns in demographics and spending behavior by conducting exploratory data analysis (EDA) on 13,000+ passenger records using Python, supporting data-driven modeling decisions.
  • Achieved 80.3% prediction accuracy by developing end-to-end machine learning pipelines using scikit-learn and applying ensemble methods (Voting Classifier).
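
A minimal sketch of the voting-ensemble idea, not the project's actual pipeline: it assumes synthetic data from `make_classification` in place of the Kaggle passenger records, and the two base models (logistic regression and a random forest) are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the passenger data (assumption, not the real dataset).
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Soft voting averages the predicted class probabilities of the base models.
ensemble = VotingClassifier(
    estimators=[
        ("logreg", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("forest", RandomForestClassifier(n_estimators=200, random_state=0)),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)

accuracy = ensemble.score(X_test, y_test)
print(f"Hold-out accuracy: {accuracy:.3f}")
```

Soft voting is used here because both base models expose `predict_proba`; switching to `voting="hard"` would take a majority vote over predicted labels instead.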

Stock Market Analysis Using Deep Learning and NLP During the COVID-19 Pandemic

Stock_Market_Analysis

Combined NLP, time series modeling, and COVID-19 data to enhance stock market forecasting, achieving a 30% improvement in trend prediction accuracy with LSTM models.

  • Scraped 60,000+ financial news articles and labeled sentiment using CKIP Tagger and a sentiment dictionary.
  • Merged sentiment, COVID-19 case counts, and technical indicators (MA, BIAS, momentum) for feature engineering.
  • Built ARIMA and LSTM models to forecast stock trends on TWII and sector indices.
  • Used expanding window cross-validation; LSTM + sentiment improved directional accuracy by 30%.
  • Demonstrated that combining financial NLP and pandemic data enhances market prediction during crises.
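
A rough sketch of the indicator features and the expanding-window evaluation described above, under stated assumptions: the price series is a synthetic random walk rather than TWII data, the 10-day window is arbitrary, and BIAS is taken to be the percentage deviation of the close from its moving average (a common definition, not confirmed as the one used in the project). scikit-learn's `TimeSeriesSplit` is used because its default behavior grows the training set with each fold.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import TimeSeriesSplit

# Synthetic closing prices (assumption: stands in for real index data).
rng = np.random.default_rng(0)
close = pd.Series(100 + rng.normal(0, 1, 300).cumsum(), name="close")

window = 10  # illustrative look-back length
features = pd.DataFrame({"close": close})
features["ma"] = close.rolling(window).mean()                        # moving average
features["bias"] = (close - features["ma"]) / features["ma"] * 100   # % deviation from MA
features["momentum"] = close - close.shift(window)                   # change over the window
features["up_next"] = (close.shift(-1) > close).astype(int)          # next-day direction label

# Drop warm-up rows without full indicators and the last row without a label.
features = features.iloc[window:-1].reset_index(drop=True)

# Expanding window: each fold trains on all data before its test block.
splitter = TimeSeriesSplit(n_splits=5)
folds = []
for train_idx, test_idx in splitter.split(features):
    folds.append((train_idx, test_idx))
    print(f"train rows 0..{train_idx[-1]}, test rows {test_idx[0]}..{test_idx[-1]}")
```

Each fold's test block sits strictly after its training rows, so no future information leaks into model fitting; a model such as ARIMA or an LSTM would be refit on `features.iloc[train_idx]` inside the loop.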

Education

University of Michigan

M.S. in Applied Statistics

2023 – 2025

  • GPA: 3.85/4.0
  • Relevant Coursework: Statistical Learning, Data Management and Analysis, Data Science in Python, Database Design

National Cheng Kung University

B.B.A. in Statistics

2018 – 2023

  • GPA: 3.88/4.0
  • Dean’s List / National Undergraduate Research Fellowship
  • Relevant Coursework: Regression, Multivariate Analysis, Time Series Analysis, Experimental Design, Machine Learning

A Little More About Me

Beyond data analysis, my interests and hobbies include:

  • NBA
  • Working Out
  • Surfing