Customer Review Analysis
A technical and analytical breakdown of a customer review analysis for a fitness club chain, based on Google and Trustpilot reviews.
📊 Project Overview
This project analyses over 10,000 online reviews for a fitness club chain, gathered from Google Reviews and Trustpilot. The aim is to uncover sentiment patterns, key themes, and hidden pain points that influence customer satisfaction and churn.
📂 Data Sources
- Platforms: Google Reviews and Trustpilot (2018–2025)
- Fields: free-text review, rating (1–5), datetime, location, and reviewer metadata where available.
- Volume: 30,000 text reviews across multiple branches and time periods.
💼 Business Problem
Customer reviews offer valuable insights, but manual analysis is slow and subjective.
This project automated sentiment and theme detection to reveal:
- What drives loyalty and positive experiences
- What causes dissatisfaction or churn
- How customer perception changes over time
The result was clear, actionable insights to support marketing, operations, and service improvement.
⚙️ Preprocessing Pipeline
Developed a multi-step NLP and analytics pipeline for textual, temporal, and spatial analysis:
1️⃣ Data Preparation
- Cleaned text (lowercasing, stopword removal, punctuation stripping)
- Removed duplicates and standardized timestamps
- Tokenized and lemmatized review text for NLP modeling
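The preparation steps above can be sketched in plain Python. This is a minimal illustration, not the project's actual code: the stopword list is a tiny placeholder, the `platform` and `text` field names are assumed, timestamps are assumed to be ISO 8601 strings, and lemmatization is left out because it needs an NLP library such as spaCy or NLTK.

```python
import string
from datetime import datetime, timezone

# Tiny illustrative stopword list (assumption); a real run would use a full list.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in", "it", "i"}

# Translation table that turns every ASCII punctuation character into a space.
PUNCT_TO_SPACE = str.maketrans(string.punctuation, " " * len(string.punctuation))


def clean_text(text: str) -> list[str]:
    """Lowercase, strip punctuation, split on whitespace, and drop stopwords."""
    tokens = text.lower().translate(PUNCT_TO_SPACE).split()
    return [tok for tok in tokens if tok not in STOPWORDS]


def to_utc(timestamp: str) -> datetime:
    """Parse an ISO 8601 timestamp and normalise it to UTC.

    Timestamps without an offset are assumed to already be in UTC.
    """
    dt = datetime.fromisoformat(timestamp)
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def deduplicate(reviews: list[dict]) -> list[dict]:
    """Keep the first review seen for each (platform, cleaned text) pair."""
    seen = set()
    unique = []
    for review in reviews:
        key = (review["platform"], " ".join(clean_text(review["text"])))
        if key not in seen:
            seen.add(key)
            unique.append(review)
    return unique
```

Deduplicating on the cleaned text rather than the raw text means that reviews differing only in case or punctuation are treated as the same review; whether that is desirable depends on how the platforms handle reposts.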
2️⃣ Exploratory Analysis
- Generated word frequency distributions
- Identified common themes via n-gram analysis
- Segmented reviews by star rating, platform, and year
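As a rough illustration of the n-gram step, the sketch below counts the most frequent n-grams within each star-rating segment. It assumes each review is a dict with a `rating` and an already-cleaned `tokens` list; both field names are hypothetical.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> list[tuple[str, ...]]:
    """Return all contiguous n-token windows; empty if the text is too short."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams_by_rating(reviews: list[dict], n: int = 2, k: int = 3) -> dict:
    """Map each star rating to its k most frequent n-grams with their counts."""
    counts: dict[int, Counter] = {}
    for review in reviews:
        segment = counts.setdefault(review["rating"], Counter())
        segment.update(ngrams(review["tokens"], n))
    return {rating: counter.most_common(k) for rating, counter in counts.items()}
```

The same grouping works for platform or year by swapping the key used in place of `review["rating"]`.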


3️⃣ Sentiment Analysis
- Applied FinBERT (a BERT model fine-tuned on financial text, repurposed here to classify review sentiment)
- Cross-checked the FinBERT labels against VADER polarity scores
- Mapped sentiment to time series to detect perception shifts
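The cross-check and time-series steps could look roughly like the sketch below. It assumes the model labels and VADER compound scores have already been computed, so no model is loaded here. The ±0.05 cut-offs are the thresholds conventionally used with VADER's compound score; the +1/0/-1 scoring and all function names are illustrative choices, not the project's confirmed method.

```python
from collections import defaultdict
from statistics import mean

# Illustrative numeric encoding of the three sentiment classes (assumption).
SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def vader_label(compound: float, threshold: float = 0.05) -> str:
    """Convert a VADER compound score in [-1, 1] into a three-way label."""
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def agreement_rate(model_labels: list[str], compound_scores: list[float]) -> float:
    """Share of reviews where the model label matches the VADER-derived label."""
    if not model_labels:
        raise ValueError("no labels to compare")
    matches = sum(
        label == vader_label(score)
        for label, score in zip(model_labels, compound_scores)
    )
    return matches / len(model_labels)


def monthly_net_sentiment(records: list[tuple[str, str]]) -> dict:
    """Average sentiment score per month from (YYYY-MM-DD date, label) pairs."""
    buckets = defaultdict(list)
    for date, label in records:
        buckets[date[:7]].append(SCORE[label])  # "YYYY-MM" prefix is the month key
    return {month: mean(scores) for month, scores in sorted(buckets.items())}
```

A low agreement rate does not say which method is wrong, only where the two disagree; those reviews are good candidates for manual inspection.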

4️⃣ Topic Modeling
- Used BERTopic to uncover recurring themes such as:
  - “Staff friendliness & professionalism”
  - “Facility cleanliness & maintenance”
  - “Contract & cancellation issues”
- Linked sentiment by topic to reveal emotion clusters (e.g., anger linked to billing or membership freezing)
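One simple way to link sentiment to topics is to compute the share of negative reviews within each topic. The sketch below assumes every review already has a topic name (for example from a topic model) and a sentiment label; the function name and the input shape are hypothetical.

```python
from collections import Counter, defaultdict


def negative_share_by_topic(assignments: list[tuple[str, str]]) -> list[tuple[str, float]]:
    """Rank topics by the fraction of their reviews labelled negative.

    `assignments` holds one (topic, sentiment label) pair per review.
    """
    counts = defaultdict(Counter)
    for topic, label in assignments:
        counts[topic][label] += 1
    shares = {
        topic: labels["negative"] / sum(labels.values())
        for topic, labels in counts.items()
    }
    # Highest negative share first, so the most painful themes surface at the top.
    return sorted(shares.items(), key=lambda item: item[1], reverse=True)
```

Using a share rather than a raw count keeps large but mostly positive topics from outranking small topics that are almost entirely negative.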


5️⃣ Visualization
- Built interactive dashboards (Chart.js & Power BI) covering:
  - Summary views of average ratings, sentiment distributions, and topic frequencies
  - Word clouds and keyword charts for an at-a-glance view of common terms
  - Location-based sentiment maps showing which branches were over- or under-performing
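The data behind a branch-level map could be prepared along the lines of the sketch below, which compares each branch's mean rating with the chain-wide mean. The 0.3-star margin and the three-review minimum are arbitrary illustrative values, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def flag_branches(
    reviews: list[tuple[str, int]], margin: float = 0.3, min_reviews: int = 3
) -> dict[str, str]:
    """Label each branch relative to the chain-wide mean rating.

    `reviews` holds one (branch name, star rating) pair per review.
    """
    reviews = list(reviews)
    overall = mean(rating for _, rating in reviews)

    by_branch = defaultdict(list)
    for branch, rating in reviews:
        by_branch[branch].append(rating)

    flags = {}
    for branch, ratings in by_branch.items():
        if len(ratings) < min_reviews:
            flags[branch] = "insufficient data"  # too few reviews to judge
            continue
        gap = mean(ratings) - overall
        if gap >= margin:
            flags[branch] = "over-performing"
        elif gap <= -margin:
            flags[branch] = "under-performing"
        else:
            flags[branch] = "in line"
    return flags
```

The minimum review count guards against flagging a branch on the strength of one or two reviews.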

📈 Key Findings
Overall Sentiment: 72% positive, 18% neutral, 10% negative
Top Positive Drivers:
- Professional and friendly trainers
- Modern facilities and class variety
Top Negative Drivers:
- Contract inflexibility
- Delays in support response
Platform Differences:
- Google Reviews leaned more positive (avg 4.3★)
- Trustpilot highlighted recurring issues (avg 3.6★)
Temporal Trends:
- Sentiment dipped during COVID lockdowns (2020–2021)
- Gradual recovery post-2022, with peak satisfaction in 2024
🔍 Analytical Insights
- Positive sentiment clusters centered on staff quality and gym atmosphere, showing these as loyalty anchors
- Negative clusters linked to administrative issues rather than service quality
- Sentiment analysis provided early detection of recurring pain points before they appeared in survey feedback
- Combining BERTopic topics with FinBERT sentiment labels added context by showing which themes drove positive or negative sentiment, and how strongly
✅ Recommendations
- Automate monthly sentiment monitoring for continuous insight
- Create a review response strategy targeting recurring pain points
- Train staff on top negative themes (contract and communication issues)
- Integrate dashboards into CRM or marketing systems for proactive action
- Expand analysis to social media mentions for cross-platform consistency
💡 Business & Regulatory Impact
- Improved brand perception tracking and customer understanding
- Enabled the marketing team to quantify feedback themes and measure progress
- Supported data-driven decisions on service design, staff training, and membership communication
- Provided a scalable framework for real-time sentiment tracking across all gym locations
🚀 Deliverables
- End-to-end NLP pipeline (Python & Jupyter)
- Interactive dashboards (Chart.js / Power BI)
- Sentiment & Topic summary report (PDF)
- Executive brief highlighting actionable insights
👩‍💻 My Role
- Designed the preprocessing and sentiment analysis pipeline in Python.
- Implemented BERTopic topic modelling and interpreted clusters.
- Created the Power BI dashboard layout and visual storytelling.
- Authored the final report and recommendations for stakeholders.
📘 Conclusion
This project demonstrates how unstructured customer feedback can be transformed into a structured, navigable insight layer using modern NLP. By combining sentiment scores, topic models, and interactive dashboards, businesses can move from anecdotal feedback to systematic, data-driven decisions about customer experience.