
Payment Platform User Churn Prediction (XGBoost)

Machine Learning · XGBoost

An end-to-end machine learning pipeline built during my Allinpay Payment Fintech internship to predict user churn on a payment platform and drive targeted retention strategies.

Python · pandas · scikit-learn · XGBoost · LightGBM · matplotlib · Jupyter

1. Project Overview

During my internship at Allinpay Payment Fintech, I observed a monthly user churn rate of 12%, directly causing approximately ¥2.4M in revenue loss per month. This project aims to build a machine learning model that predicts users likely to churn within 30 days, enabling the operations team to implement targeted retention interventions.

Business Problem
  • Monthly churn rate: 12%
  • Revenue loss: ¥2.4M / month

Project Goal
  • Early warning window: 30 days
  • Identify at-risk users and enable targeted interventions

Target Outcome
  • Churn reduction target: 25%
  • Expected revenue saved: ¥600K / month

ML Pipeline Overview

Data Collection → Feature Engineering → Model Training → Hyperparameter Tuning → Evaluation → Deployment

2. Dataset & Preprocessing

  • User samples: 50,000
  • Behavioral data span: 8 months
  • Positive rate (churned): 12%
#   Feature Name            Type         Description
1   user_id                 ID           Unique user identifier
2   registration_date       datetime     Account registration timestamp
3   last_login_days         int          Days since last platform login
4   transaction_count_30d   int          Number of transactions in last 30 days
5   transaction_amount_30d  float        Total transaction amount in last 30 days (¥)
6   avg_transaction_value   float        Average value per transaction (¥)
7   payment_methods_used    int          Number of distinct payment methods used
8   support_tickets         int          Customer support tickets filed
9   app_sessions_7d         int          App sessions in last 7 days
10  feature_usage_score     float        Composite score of feature utilization (0-1)
11  channel_source          categorical  User acquisition channel (organic, paid, referral, etc.)
12  device_type             categorical  Primary device type (iOS, Android, Web)
13  city_tier               categorical  City tier classification (Tier 1-4)
14  age_group               categorical  User age bracket (18-24, 25-34, 35-44, 45+)
15  is_churned              binary       Target variable: 1 if churned (12% positive rate)
data_loading.py
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split

# Load dataset
df = pd.read_csv('allinpay_user_data.csv', parse_dates=['registration_date'])
print(f"Dataset shape: {df.shape}")  # (50000, 15)
print(f"Churn rate: {df['is_churned'].mean():.2%}")  # 12.00%

# Train/test split with stratification (preserve class ratio)
X = df.drop(['user_id', 'is_churned'], axis=1)
y = df['is_churned']
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(f"Train: {X_train.shape[0]}, Test: {X_test.shape[0]}")  # 40000, 10000
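The categorical features (channel_source, device_type, city_tier, age_group) need numeric encoding before they reach the models. A minimal sketch with pandas one-hot encoding, using hypothetical sample values rather than the actual dataset:

```python
import pandas as pd

# Hypothetical mini-frame with two of the categorical columns from the feature table
df = pd.DataFrame({
    'channel_source': ['organic', 'paid', 'referral'],
    'device_type': ['iOS', 'Android', 'Web'],
    'last_login_days': [3, 45, 12],
})

# One-hot encode the categoricals; numeric columns pass through unchanged
encoded = pd.get_dummies(df, columns=['channel_source', 'device_type'])
```

The same transform must be fitted on the training split only and reapplied to the test split so that unseen categories do not leak column layout changes into evaluation.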

3. Feature Engineering

Based on domain knowledge and exploratory data analysis, I engineered 15 derived features across 4 categories, significantly improving the model's predictive power.

RFM Features

  • recency: Days since last transaction
  • frequency: Transaction count over the full period
  • monetary: Total spend over the full period
  • rfm_score: Combined R/F/M quintile scores (1-5)

Behavioral Features

  • session_trend: 7-day vs 30-day session ratio (activity direction)
  • feature_diversity: Number of distinct platform features used
  • payment_method_shift: Change in primary payment method
  • peak_hour_ratio: Proportion of activity during peak hours

Temporal Features

  • days_since_registration: Account age in days
  • weekend_ratio: Weekend vs weekday activity proportion
  • activity_decay_rate: Rate of activity decline over time (slope)
  • month_of_year: Cyclical encoding of month (sin/cos)

Engagement Features

  • login_frequency_change: Login frequency change (recent vs historical)
  • transaction_gap_increase: Increase in average gap between transactions
  • support_interaction_ratio: Support tickets per transaction
feature_engineering.py
# RFM Feature Engineering
df['recency'] = (pd.Timestamp.now() - df['last_transaction_date']).dt.days
df['frequency'] = df.groupby('user_id')['transaction_id'].transform('count')
df['monetary'] = df.groupby('user_id')['transaction_amount'].transform('sum')

# Quintile scoring (1=worst, 5=best)
df['r_score'] = pd.qcut(df['recency'], 5, labels=[5,4,3,2,1]).astype(int)
df['f_score'] = pd.qcut(df['frequency'].rank(method='first'), 5, labels=[1,2,3,4,5]).astype(int)
df['m_score'] = pd.qcut(df['monetary'].rank(method='first'), 5, labels=[1,2,3,4,5]).astype(int)

# Behavioral: activity trend detection
df['session_trend'] = df['app_sessions_7d'] / (df['app_sessions_30d'] / 4.28 + 1e-6)

# Activity decay: slope of each user's weekly activity across the 8 observed weeks
# (weekly_activity is assumed to hold one activity count per user-week row)
def weekly_slope(s):
    return np.polyfit(range(len(s)), s, 1)[0]

df['activity_decay_rate'] = df.groupby('user_id')['weekly_activity'].transform(weekly_slope)

# Engagement delta features
df['login_frequency_change'] = (
    df['login_count_recent_14d'] / (df['login_count_prior_14d'] + 1e-6) - 1
)
df['transaction_gap_increase'] = (
    df['avg_gap_recent_30d'] - df['avg_gap_prior_30d']
)
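The month_of_year feature listed above uses sin/cos cyclical encoding so that December and January end up numerically adjacent instead of 11 units apart. A minimal sketch, with the month values assumed to be derived from registration_date:

```python
import numpy as np
import pandas as pd

# Hypothetical month-of-year values (1-12)
df = pd.DataFrame({'month': [1, 6, 12]})

# Map each month onto the unit circle so the yearly cycle wraps smoothly
angle = 2 * np.pi * (df['month'] - 1) / 12
df['month_sin'] = np.sin(angle)
df['month_cos'] = np.cos(angle)
```

Tree models like XGBoost can split on the raw month, but the paired sin/cos form keeps the distance between adjacent months consistent across the year boundary.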

4. Model Comparison

I trained and compared four classification models using 5-fold cross-validation and a consistent hyperparameter search strategy. All models were evaluated on the same train/test split.

Model                 AUC-ROC  Precision  Recall  F1 Score  Result
Logistic Regression   0.78     0.71       0.65    0.68      -
Random Forest         0.85     0.79       0.73    0.76      -
XGBoost               0.89     0.83       0.78    0.80      Selected (best)
LightGBM              0.88     0.82       0.77    0.79      -

Why XGBoost?

  • Highest performance on every evaluation metric: AUC 0.89, F1 0.80
  • Built-in handling of missing values and class imbalance (via scale_pos_weight), well suited to financial data
  • Interpretable feature importance rankings that the business team can understand and act on
  • Fast inference (<10 ms per sample), meeting near-real-time prediction requirements
model_training.py
import xgboost as xgb
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import roc_auc_score, classification_report

# Handle class imbalance with scale_pos_weight
scale_ratio = (y_train == 0).sum() / (y_train == 1).sum()  # ~7.33

# XGBoost with hyperparameter tuning
param_grid = {
    'max_depth': [4, 6, 8],
    'learning_rate': [0.01, 0.05, 0.1],
    'n_estimators': [200, 500, 800],
    'min_child_weight': [1, 3, 5],
    'subsample': [0.8, 0.9],
    'colsample_bytree': [0.8, 0.9],
}

xgb_clf = xgb.XGBClassifier(
    objective='binary:logistic',
    scale_pos_weight=scale_ratio,  # re-weight the 12% positive class
    eval_metric='auc',
    random_state=42,
    # note: use_label_encoder was deprecated and removed in XGBoost >= 2.0
)

grid_search = GridSearchCV(
    xgb_clf, param_grid, scoring='roc_auc',
    cv=5, n_jobs=-1, verbose=1
)
grid_search.fit(X_train, y_train)

# Best model evaluation
best_model = grid_search.best_estimator_
y_pred_proba = best_model.predict_proba(X_test)[:, 1]
print(f"AUC-ROC: {roc_auc_score(y_test, y_pred_proba):.4f}")  # 0.8903

5. Results & Evaluation

Feature Importance — Top 10

1.  last_login_days: 0.18
2.  transaction_gap_increase: 0.14
3.  activity_decay_rate: 0.12
4.  session_trend: 0.10
5.  transaction_count_30d: 0.09
6.  feature_usage_score: 0.08
7.  support_tickets: 0.07
8.  payment_method_shift: 0.06
9.  login_frequency_change: 0.05
10. avg_transaction_value: 0.04

Confusion Matrix (threshold = 0.45)

                    Predicted Churned   Predicted Retained
Churned (Actual)    TP = 780            FN = 220
Retained (Actual)   FP = 160            TN = 4,840
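The same tabulation can be reproduced with scikit-learn by thresholding the predicted probabilities at 0.45. A minimal sketch on hypothetical labels and scores standing in for the real test set:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical labels and predicted churn probabilities (not the real test set)
y_true = np.array([1, 1, 0, 0, 1, 0])
y_prob = np.array([0.8, 0.3, 0.2, 0.5, 0.6, 0.1])

# Apply the tuned 0.45 threshold, then tabulate TN / FP / FN / TP
y_pred = (y_prob >= 0.45).astype(int)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
```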

Threshold Optimization

The default threshold of 0.50 misses too many potential churners. Through business cost-benefit analysis, I adjusted the threshold to 0.45 for a better precision-recall trade-off — in fintech, the cost of missing a churner far exceeds the cost of a single retention outreach to a retained user.

  • Default threshold: 0.50
  • Optimal threshold: 0.45
  • Recall gain: +5.2%
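The threshold choice can be made systematically by sweeping candidate cut-offs and scoring each one. A sketch using synthetic labels and scores as stand-ins for the test set; here F1 is maximized, though a business-cost-weighted score could be substituted:

```python
import numpy as np
from sklearn.metrics import f1_score

rng = np.random.default_rng(42)
# Synthetic stand-ins for y_test and the model's predicted probabilities
y_true = rng.binomial(1, 0.12, size=2000)
y_scores = np.clip(0.25 + 0.5 * y_true + rng.normal(0, 0.2, size=2000), 0, 1)

# Sweep thresholds and keep the one with the best F1
thresholds = np.arange(0.30, 0.71, 0.05)
best_t, best_f1 = max(
    ((t, f1_score(y_true, (y_scores >= t).astype(int))) for t in thresholds),
    key=lambda pair: pair[1],
)
```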

Final Model Metrics

  • Accuracy: 93.7%
  • Precision: 83.0%
  • Recall: 78.0%
  • F1 score: 0.80

6. Business Impact

Model predictions were translated into actionable operational strategies, achieving significant business returns through targeted retention campaigns.

  • 78% of churners identified 2 weeks before the churn event
  • ¥50 targeted coupon to high-risk users
  • ¥600K expected monthly revenue saved (25% churn reduction)
  • 8.5x ML system ROI (¥600K saved vs ¥70K cost)

ROI Cost-Benefit Analysis

Intervention Costs (Monthly)

  • Coupon distribution (~940 users × ¥50): ¥47,000
  • SMS / push notification costs: ¥3,000
  • ML infrastructure & maintenance: ¥15,000
  • Operational labor: ¥5,000
  • Total cost: ¥70,000

Expected Benefits (Monthly)

  • Prevented churners: ~1,500 users
  • Average monthly value per user (ARPU): ¥400
  • Revenue saved: ¥600,000
  • Customer lifetime value protected: ¥2.4M+
  • ROI: 8.5x (¥600K / ¥70K)
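The cost-benefit arithmetic above can be checked directly from the monthly figures:

```python
# ROI check using the monthly figures from the tables above
costs = {'coupons': 940 * 50, 'sms': 3_000, 'ml_infra': 15_000, 'labor': 5_000}
total_cost = sum(costs.values())            # ¥70,000

prevented_churners = 1_500
arpu = 400
revenue_saved = prevented_churners * arpu   # ¥600,000

roi = revenue_saved / total_cost            # ~8.57x, reported as 8.5x
```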

7. Conclusion

This project demonstrated the value of machine learning for user retention in payment fintech. Through systematic feature engineering and model optimization, the XGBoost model reached an AUC of 0.89, identifying 78% of potential churners two weeks before the churn event. The model-driven targeted retention strategy is expected to save ¥600K in monthly revenue at an 8.5x ROI.

Next Steps

Real-Time Scoring Pipeline

Deploy the model to a Kafka streaming architecture for real-time user behavior scoring (<100ms latency), upgrading from batch prediction to a real-time early warning system.

Deep Learning Exploration

Experiment with LSTM/Transformer sequence models to capture temporal patterns in user behavior, leveraging attention mechanisms to identify key churn signals, with an expected 3-5% AUC improvement.

CRM Integration

Automatically sync model predictions to the CRM system, triggering differentiated retention strategies (SMS/coupons/dedicated support) based on churn risk tiers for fully automated closed-loop operations.
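The tiered strategy described above can be sketched as a simple mapping from churn probability to retention action; the cut-offs and actions below are illustrative assumptions, with 0.45 reused as the model's decision threshold:

```python
import pandas as pd

# Hypothetical churn probabilities from the model for four users
scores = pd.Series([0.15, 0.48, 0.72, 0.91])

# Bucket users into risk tiers (cut-offs are assumptions, 0.45 = model threshold)
tiers = pd.cut(scores, bins=[0, 0.45, 0.70, 1.0],
               labels=['low', 'medium', 'high'])

# Each tier triggers a different retention action in the CRM
actions = {'low': 'none', 'medium': 'push notification',
           'high': 'coupon + dedicated support'}
plan = tiers.map(actions)
```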

Interested in ML & Product Analytics?

I am seeking product operations internship opportunities. Let's discuss data-driven growth, ML applications, or the payment fintech industry.