Dataset Description
This analysis uses an e-commerce customer dataset from a clothing store that offers both online shopping and in-store style and clothing advice sessions. The dataset was sourced from Kaggle and contains behavioral data for 500 customers. The primary objective is to help the company decide whether to concentrate resources on its mobile app or its website, based on the yearly amount spent by customers.
Dataset Columns
Customer’s email address (unique identifier)
Customer’s physical address
Customer’s avatar color preference
Average session length in minutes when visiting the store
Statistics:
- Mean: 33.05 minutes
- Range: 29.53 - 36.14 minutes
- Std Dev: 0.99
Time spent on the mobile app in minutes
Statistics:
- Mean: 12.05 minutes
- Range: 8.51 - 15.13 minutes
- Std Dev: 0.99
Time spent on the website in minutes
Statistics:
- Mean: 37.06 minutes
- Range: 33.91 - 40.01 minutes
- Std Dev: 1.01
How long the customer has been a member in years
Statistics:
- Mean: 3.53 years
- Range: 0.27 - 6.92 years
- Std Dev: 1.00
The annual amount spent by the customer (target variable)
Statistics:
- Mean: $499.31
- Range: up to $765.52
- Std Dev: $79.31
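The per-column figures above (mean, range, standard deviation) can be recomputed with a few lines of Python. The sketch below uses only the standard library and assumes the reported standard deviations are sample standard deviations, which the source does not state. The helper name `summarize` is hypothetical, and the values are made up for illustration; they are not taken from the dataset.

```python
import statistics


def summarize(values):
    """Return mean, min, max, and sample standard deviation of a numeric column."""
    return {
        "mean": statistics.fmean(values),
        "min": min(values),
        "max": max(values),
        "std": statistics.stdev(values),  # sample standard deviation (n - 1)
    }


# Made-up values standing in for one column, e.g. time on the app in minutes.
time_on_app = [11.2, 12.6, 12.0, 13.1, 11.1]

summary = summarize(time_on_app)
print({name: round(value, 2) for name, value in summary.items()})
```

Running the same helper over each numeric column of the real file would give figures that can be compared with the ones listed here.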
Dataset Statistics
The dataset contains 500 customer records across 8 columns, with no missing values.
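A completeness claim like this is easy to verify. The following is a minimal sketch, assuming the data is available as CSV text with a header row. The function name `count_missing` is hypothetical, and the sample rows are made up, with one empty cell included on purpose to show what a gap looks like.

```python
import csv
import io


def count_missing(csv_text):
    """Return (row_count, {column: number_of_empty_cells}) for CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    missing = {name: 0 for name in reader.fieldnames}
    rows = 0
    for row in reader:
        rows += 1
        for name in reader.fieldnames:
            value = row[name]
            # A cell counts as missing if it is absent or blank.
            if value is None or value.strip() == "":
                missing[name] += 1
    return rows, missing


# Made-up sample; the second data row has an empty spend value.
sample = "Time on App,Yearly Amount Spent\n12.1,501.3\n11.8,\n12.4,487.9\n"

rows, missing = count_missing(sample)
print(rows, missing)
```

For the dataset described here, the expectation would be 500 rows and a count of zero for every column.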
Key Insights
- Sample Size: 500 customers
- Data Quality: Complete dataset with no missing values
- Feature Distribution: All four numeric features appear approximately normally distributed, with standard deviations close to 1.0 (0.99 to 1.01)
- Target Variable: Yearly Amount Spent varies substantially (SD: $79.31 around a mean of $499.31), which gives a regression model meaningful variation to explain
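To illustrate how the app-versus-website question could be approached with regression, here is a minimal single-feature least-squares sketch. It is not the analysis's actual method, only one simple option. The helper `simple_ols` is hypothetical and the numbers are made up so that the fit is exact.

```python
import statistics


def simple_ols(x, y):
    """Fit y = intercept + slope * x by least squares; also return Pearson r."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r


# Made-up values for illustration only (minutes on the app, dollars per year).
time_on_app = [11.0, 12.0, 13.0, 14.0]
yearly_spent = [460.0, 500.0, 540.0, 580.0]

slope, intercept, r = simple_ols(time_on_app, yearly_spent)
print(round(slope, 2), round(intercept, 2), round(r, 3))
```

Fitting the same function once with app time and once with website time, then comparing slopes and correlations, gives a first indication of which channel tracks spending more closely. A multi-feature regression would be needed to separate their effects from membership length and session length.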
Data Distribution Characteristics
Based on the descriptive statistics:
- Avg. Session Length shows moderate variation with most customers having sessions between 32-34 minutes
- Time on App averages around 12 minutes with a relatively tight distribution
- Time on Website has the highest mean value (37 minutes) among engagement metrics
- Length of Membership spans from new customers (< 1 year) to long-term members (nearly 7 years)
- Yearly Amount Spent ranges up to $765.52, providing a good spread for predictive modeling
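The statement that most sessions fall between 32 and 34 minutes can be sanity-checked from the reported mean and standard deviation alone. The sketch below assumes Avg. Session Length is normally distributed, which is an approximation and is not verified by the source.

```python
from statistics import NormalDist

# Mean and standard deviation of Avg. Session Length as reported above.
session_length = NormalDist(mu=33.05, sigma=0.99)

# Share of customers expected between 32 and 34 minutes under normality.
share = session_length.cdf(34) - session_length.cdf(32)
print(f"{share:.0%}")
```

Under that assumption the interval covers a little over two thirds of customers, which is consistent with "most" but leaves a sizeable minority outside it.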
The quartile values indicate roughly symmetric distributions for most features, which is convenient for linear regression modeling because no skew-correcting transformations appear necessary.
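One way to quantify symmetry from quartiles is Bowley's quartile skewness, (Q3 + Q1 - 2*Q2) / (Q3 - Q1), which is 0 when the median sits exactly midway between the first and third quartiles. The sketch below is an illustrative choice of measure, not one the source names. The function name `quartile_skewness` is hypothetical and both samples are made up.

```python
import statistics


def quartile_skewness(values):
    """Bowley skewness: (Q3 + Q1 - 2*Q2) / (Q3 - Q1); 0 means symmetric quartiles."""
    q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (q3 + q1 - 2 * q2) / (q3 - q1)


# Made-up values for illustration only.
symmetric = [1.0, 2.0, 3.0, 4.0, 5.0]
right_skewed = [1.0, 2.0, 3.0, 6.0, 10.0]

print(quartile_skewness(symmetric), quartile_skewness(right_skewed))
```

Values near 0 for each feature would support the symmetry claim. Clearly positive or negative values would suggest right or left skew, respectively.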