Customer Review Analysis

A technical and analytical breakdown of a customer review analysis based on Google and Trustpilot reviews for a fitness club chain.


📊 Project Overview

This project analyses over 10,000 online reviews for a fitness club chain gathered from Google Reviews and Trustpilot. The aim is to uncover sentiment patterns, key themes, and hidden pain points that influence customer satisfaction and churn.


📂 Data Sources

  • Platforms: Google Reviews and Trustpilot (2018–2025)
  • Fields: free-text review, rating (1–5), datetime, location, and sometimes reviewer metadata.
  • Volume: 30,000 text reviews across multiple branches and time periods.

💼 Business Problem

Customer reviews offer valuable insights, but manual analysis is slow and subjective.
This project automated sentiment and theme detection to reveal:

  • What drives loyalty and positive experiences
  • What causes dissatisfaction or churn
  • How customer perception changes over time

The result was clear, actionable insights to support marketing, operations, and service improvement.


⚙️ Preprocessing Pipeline

Developed a multi-step NLP and analytics pipeline for textual, temporal, and spatial analysis:

1️⃣ Data Preparation

  • Cleaned text (lowercasing, stopword removal, punctuation stripping)
  • Removed duplicates and standardized timestamps
  • Tokenized and lemmatized review text for NLP modeling
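The preparation steps above can be sketched with the standard library alone. This is a minimal illustration, not the project's actual code: the `STOPWORDS` set, the field names `platform` and `text`, and the rule that timestamps without a timezone are treated as UTC are all assumptions. Lemmatization is left out because it needs an NLP library such as NLTK or spaCy.

```python
import string
from datetime import datetime, timezone

# Tiny illustrative stopword list (assumption); a real run would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "is", "was", "to", "of", "in", "it", "i", "my"}


def clean_text(text: str) -> list[str]:
    """Lowercase, strip ASCII punctuation, split on whitespace, drop stopwords."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [tok for tok in text.split() if tok not in STOPWORDS]


def standardize_timestamp(raw: str) -> str:
    """Parse an ISO 8601 timestamp and return it normalised to UTC."""
    parsed = datetime.fromisoformat(raw)
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc).isoformat()


def deduplicate(reviews: list[dict]) -> list[dict]:
    """Keep the first review for each (platform, cleaned text) pair."""
    seen = set()
    unique = []
    for review in reviews:
        key = (review["platform"], " ".join(clean_text(review["text"])))
        if key not in seen:
            seen.add(key)
            unique.append(review)
    return unique
```

Deduplicating on the cleaned text rather than the raw text means that reposts differing only in case or punctuation collapse into one review. Whether that is desirable depends on the data.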

2️⃣ Exploratory Analysis

  • Generated word frequency distributions
  • Identified common themes via n-gram analysis
  • Segmented reviews by star rating, platform, and year
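As a rough sketch of the exploratory step, n-gram frequencies and segment counts can be computed with `collections.Counter`. The function names and the record keys (`rating`, `platform`, `year`) are hypothetical, and the input is assumed to be already tokenized.

```python
from collections import Counter


def ngrams(tokens: list[str], n: int) -> list[str]:
    """Return the contiguous n-grams of a token list as space-joined strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_ngrams(token_lists: list[list[str]], n: int = 2, k: int = 10) -> list[tuple[str, int]]:
    """Count n-grams across all reviews and return the k most frequent."""
    counts = Counter()
    for tokens in token_lists:
        counts.update(ngrams(tokens, n))
    return counts.most_common(k)


def segment_counts(reviews: list[dict], keys=("rating", "platform", "year")) -> Counter:
    """Count reviews per segment, where a segment is a tuple of the given keys."""
    return Counter(tuple(review[key] for key in keys) for review in reviews)
```

For example, `top_ngrams(token_lists, n=2, k=20)` would list the twenty most common bigrams, which is one simple way to surface candidate themes before topic modeling.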

3️⃣ Sentiment Analysis

  • Applied FinBERT (a BERT model fine-tuned on financial text, repurposed here for review sentiment) to label each review as positive, neutral, or negative
  • Cross-checked the FinBERT labels against VADER polarity scores
  • Mapped sentiment to time series to detect perception shifts
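One way the cross-check and the time-series mapping could look is sketched below. It assumes the model labels and VADER compound scores have already been computed elsewhere. The ±0.05 cut-offs are the thresholds conventionally suggested in VADER's documentation. The function names and the "net sentiment" measure (share of positive minus share of negative reviews) are illustrative choices, not the project's confirmed method.

```python
from collections import defaultdict


def vader_label(compound: float) -> str:
    """Map a VADER compound score to a label using the conventional ±0.05 cut-offs."""
    if compound >= 0.05:
        return "positive"
    if compound <= -0.05:
        return "negative"
    return "neutral"


def agreement_rate(model_labels: list[str], vader_compounds: list[float]) -> float:
    """Fraction of reviews where the model label matches the VADER-derived label."""
    pairs = list(zip(model_labels, vader_compounds))
    if not pairs:
        return 0.0
    matches = sum(1 for label, compound in pairs if label == vader_label(compound))
    return matches / len(pairs)


def monthly_net_sentiment(records) -> dict[str, float]:
    """Aggregate (iso_date, label) pairs into {'YYYY-MM': positive share - negative share}."""
    counts = defaultdict(lambda: {"positive": 0, "neutral": 0, "negative": 0})
    for iso_date, label in records:
        counts[iso_date[:7]][label] += 1  # 'YYYY-MM-DD' -> 'YYYY-MM'
    series = {}
    for month, c in sorted(counts.items()):
        total = sum(c.values())
        series[month] = (c["positive"] - c["negative"]) / total
    return series
```

A low agreement rate would flag reviews worth inspecting by hand, and plotting the monthly series makes shifts in perception visible.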

4️⃣ Topic Modeling

  • Used BERTopic to uncover recurring themes such as:
    • “Staff friendliness & professionalism”
    • “Facility cleanliness & maintenance”
    • “Contract & cancellation issues”
  • Linked sentiment by topic to reveal emotion clusters (e.g., anger linked to billing or membership freezing)
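Linking sentiment to topics amounts to a cross-tabulation once each review has a topic and a sentiment label. The sketch below assumes those assignments already exist, for instance from BERTopic and the sentiment model. The function names and the ranking by negative share are hypothetical.

```python
from collections import Counter, defaultdict


def sentiment_by_topic(rows) -> dict[str, dict[str, float]]:
    """Turn (topic, label) pairs into {topic: {label: share of that topic's reviews}}."""
    counts = defaultdict(Counter)
    for topic, label in rows:
        counts[topic][label] += 1
    return {
        topic: {label: n / sum(c.values()) for label, n in c.items()}
        for topic, c in counts.items()
    }


def most_negative_topics(rows, k: int = 3) -> list[str]:
    """Rank topics by their share of negative reviews, highest first."""
    shares = sentiment_by_topic(rows)
    ranked = sorted(shares, key=lambda t: shares[t].get("negative", 0.0), reverse=True)
    return ranked[:k]
```

Ranking by share rather than raw count keeps a small but consistently negative topic from being hidden behind larger ones.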

5️⃣ Visualization

  • Built interactive dashboards (Chart.js & Power BI) showing:
    • Average ratings, sentiment distributions, and topic frequencies (Power BI)
    • Word clouds and keyword charts for an at-a-glance view of common terms
    • Location-based sentiment maps showing which branches were over- or under-performing
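The branch-level view behind a sentiment map could be prepared roughly as follows. This is a sketch under stated assumptions: the numeric scoring (+1, 0, -1), the `margin` parameter, and the "over"/"under"/"average" labels are all illustrative, and the input must contain at least one review.

```python
from collections import defaultdict
from statistics import mean

# Assumed numeric encoding of sentiment labels.
SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def branch_performance(rows, margin: float = 0.1) -> dict[str, tuple[float, str]]:
    """Compare each branch's mean sentiment score with the overall mean.

    rows: iterable of (branch, label) pairs.
    Returns {branch: (mean score, 'over' | 'under' | 'average')}.
    """
    by_branch = defaultdict(list)
    for branch, label in rows:
        by_branch[branch].append(SCORE[label])
    # Overall mean is taken across all reviews, not across branch means.
    overall = mean(score for scores in by_branch.values() for score in scores)
    report = {}
    for branch, scores in by_branch.items():
        avg = mean(scores)
        if avg > overall + margin:
            status = "over"
        elif avg < overall - margin:
            status = "under"
        else:
            status = "average"
        report[branch] = (avg, status)
    return report
```

The resulting dictionary can be exported as JSON or CSV and joined to branch coordinates in the dashboard tool.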

📈 Key Findings

Overall Sentiment: 72% positive, 18% neutral, 10% negative

Top Positive Drivers:

  • Professional and friendly trainers
  • Modern facilities and class variety

Top Negative Drivers:

  • Contract inflexibility
  • Delays in support response

Platform Differences:

  • Google Reviews leaned more positive (avg 4.3★)
  • Trustpilot highlighted recurring issues (avg 3.6★)

Temporal Trends:

  • Sentiment dipped during COVID lockdowns (2020–2021)
  • Gradual recovery post-2022, with peak satisfaction in 2024

🔍 Analytical Insights

  • Positive sentiment clusters centered on staff quality and gym atmosphere, showing these as loyalty anchors
  • Negative clusters linked to administrative issues rather than service quality
  • Sentiment analysis provided early detection of recurring pain points before they appeared in survey feedback
  • Combining BERTopic topics with FinBERT sentiment added context: it showed not only how customers felt but also which themes drove that sentiment

✅ Recommendations

  • Automate monthly sentiment monitoring for continuous insight
  • Create a review response strategy targeting recurring pain points
  • Train staff on top negative themes (contract and communication issues)
  • Integrate dashboards into CRM or marketing systems for proactive action
  • Expand analysis to social media mentions for cross-platform consistency

💡 Business & Regulatory Impact

  • Improved brand perception tracking and customer understanding
  • Enabled the marketing team to quantify feedback themes and measure progress
  • Supported data-driven decisions on service design, staff training, and membership communication
  • Provided a scalable framework for real-time sentiment tracking across all gym locations

🚀 Deliverables

  • End-to-end NLP pipeline (Python & Jupyter)
  • Interactive dashboards (Chart.js / Power BI)
  • Sentiment & Topic summary report (PDF)
  • Executive brief highlighting actionable insights

👩‍💻 My Role

  • Designed the preprocessing and sentiment analysis pipeline in Python.
  • Implemented BERTopic topic modelling and interpreted clusters.
  • Created the Power BI dashboard layout and visual storytelling.
  • Authored the final report and recommendations for stakeholders.

📘 Conclusion

This project demonstrates how unstructured customer feedback can be transformed into a structured, navigable insight layer using modern NLP. By combining sentiment scores, topic models, and interactive dashboards, businesses can move from anecdotal feedback to systematic, data-driven decisions about customer experience.