Tool 08 · Algorithm Deep Dive
Rule Engine + Multi-class XGBoost
RICEFW (Reports, Interfaces, Conversions, Enhancements, Forms, Workflows) is SAP's taxonomy for custom development. Classifying objects correctly matters because a Conversion program costs 3-5× as much as a Report; misclassify one and the estimate can be wrong by hundreds of hours. The taxonomy has clear patterns but limited labeled data.
Hybrid approach: Rules + XGBoost. A deterministic rules layer handles unambiguous patterns (e.g., a name starting with "Z_CDS_" → Enhancement). XGBoost handles the remaining ambiguous cases using object name embeddings, description text, module context, and requirement text. Because the rules need no training examples, the limited labeled data is spent only on the cases that actually require a learned model.
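The rules-then-model dispatch can be sketched as follows. This is a minimal illustration, not the tool's actual code: the record field names (`name`, `description`, `type`, `tcode`), the `source` key, and the stub model are assumptions, and the rule list simply mirrors the seven rules shown in the pipeline diagram on this page.

```python
from typing import Callable, Optional, Tuple

def rules_layer(obj: dict) -> Optional[str]:
    """Return a RICEFW class letter if a deterministic rule fires, else None.

    Field names are hypothetical; rules are checked in the diagram's order.
    """
    name = obj.get("name", "").upper()
    desc = obj.get("description", "").lower()
    obj_type = obj.get("type", "").upper()
    tcode = obj.get("tcode", "").upper()

    if name.startswith("Z_CDS_"):
        return "E"  # Rule 1: CDS naming pattern -> Enhancement
    if "_FORM_" in name:
        return "F"  # Rule 2 -> Form
    if tcode.startswith("Z"):
        return "E"  # Rule 3: custom transaction code -> Enhancement
    if obj_type == "REPT":
        return "R"  # Rule 4 -> Report
    if obj_type == "FUNC" and "BAPI" in name:
        return "I"  # Rule 5 -> Interface
    if "CONV" in name or "MIGR" in name:
        return "C"  # Rule 6 -> Conversion
    if "workflow" in desc:
        return "W"  # Rule 7 -> Workflow
    return None

def classify(obj: dict, model: Callable[[dict], Tuple[str, float]]) -> dict:
    """Apply the rules first; fall back to the learned model only on no match."""
    rule_class = rules_layer(obj)
    if rule_class is not None:
        return {"ricefw_class": rule_class, "confidence": 1.0, "source": "rule"}
    label, prob = model(obj)  # stand-in for the XGBoost classifier
    return {"ricefw_class": label, "confidence": prob, "source": "model"}
```

Any callable that returns a `(label, probability)` pair can stand in for the model while testing the rules, for example `classify(obj, lambda o: ("R", 0.8))`.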
Typical effort multipliers: Report (1×), Interface (2.5×), Conversion (3.5×), Enhancement (2×), Form (1.8×), Workflow (3×)
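To see how a misclassification moves an estimate, the multipliers above can be applied to a baseline. The sketch below assumes a hypothetical 40-hour baseline for a Report-equivalent object; only the multiplier values come from this page.

```python
# Typical effort multipliers from the text, keyed by RICEFW class letter.
MULTIPLIERS = {"R": 1.0, "I": 2.5, "C": 3.5, "E": 2.0, "F": 1.8, "W": 3.0}

def scaled_effort(base_report_hours: float, ricefw_class: str) -> float:
    """Scale a Report-equivalent baseline by the class multiplier."""
    return base_report_hours * MULTIPLIERS[ricefw_class]

base = 40.0  # hypothetical baseline, not a figure from the source
as_report = scaled_effort(base, "R")
as_conversion = scaled_effort(base, "C")
gap = as_conversion - as_report
print(f"Report: {as_report} h, Conversion: {as_conversion} h, gap: {gap} h")
```

Under that assumed baseline, labeling a single Conversion as a Report understates the estimate by 100 hours, and the gap accumulates across every misclassified object in the inventory.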
Deterministic rules handle ~22% of cases: object naming patterns (Z_CDS_ → E), transaction codes (Z* → E), and object types in the transport (REPT → R; FUNC with "BAPI" in the name → I).
A multi-class XGBoost classifier (6 classes) scores the remaining cases over 38 features: keyword signals, object pattern features, module context, verb-object n-grams, and description embeddings.
A separate XGBoost regressor predicts effort hours from the RICEFW class, complexity features, and module.
Low-confidence predictions (confidence < 0.65) are flagged for human review, and the reviewed labels are then added to the training set.
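The review loop in the last step might look like the sketch below. The 0.65 threshold is the one stated above; the function names, the prediction dictionary shape, and the `(object, label)` training-set format are assumptions made for illustration.

```python
REVIEW_THRESHOLD = 0.65  # threshold stated in the text

def route(predictions, threshold=REVIEW_THRESHOLD):
    """Split predictions into auto-accepted ones and a human review queue."""
    accepted, review_queue = [], []
    for pred in predictions:
        if pred["confidence"] < threshold:
            review_queue.append(pred)
        else:
            accepted.append(pred)
    return accepted, review_queue

def apply_review(review_queue, human_labels, training_set):
    """Append each reviewed object, with its human-assigned class, to the training set."""
    for pred in review_queue:
        label = human_labels.get(pred["object"])
        if label is not None:  # skip items the reviewer has not labeled yet
            training_set.append((pred["object"], label))
    return training_set

# Hypothetical predictions; object names are invented for the example.
preds = [
    {"object": "ZMM_VENDOR_SYNC", "ricefw_class": "I", "confidence": 0.94},
    {"object": "ZFI_LEGACY_LOAD", "ricefw_class": "R", "confidence": 0.52},
]
accepted, queue = route(preds)
training = apply_review(queue, {"ZFI_LEGACY_LOAD": "C"}, [])
```

Here the second object falls below the threshold, goes to review, and re-enters the training set with the reviewer's label rather than the model's guess.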
┌─────────────────────────────────────────────────────────────────────────────────────────┐
│ RICEFW CLASSIFIER PIPELINE │
├─────────────────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌──────────────┐ │
│ │ INPUT: │ Object: "Z_CDS_VENDOR_MASTER" │
│ │ Object Meta │ Description: "Core Data Service for Vendor Master with BOPF" │
│ └──────┬───────┘ Module: MM (from Tool 03) │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ STEP 1: RULES LAYER (Unambiguous Cases) │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ Rule 1: Name starts with "Z_CDS_" → Enhancement (E) │ │ │
│ │ │ Rule 2: Name contains "_FORM_" → Form (F) │ │ │
│ │ │ Rule 3: Transaction starts with "Z" → Enhancement (E) │ │ │
│ │ │ Rule 4: Type = "REPT" in transport → Report (R) │ │ │
│ │ │ Rule 5: Type = "FUNC" and name has "BAPI" → Interface (I) │ │ │
│ │ │ Rule 6: Name has "CONV" or "MIGR" → Conversion (C) │ │ │
│ │ │ Rule 7: Description contains "workflow" → Workflow (W) │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ │ │ │
│ │ If match found → STOP, return class with confidence=1.0 │ │
│ │ Else → Continue to XGBoost │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ (if no rule match) │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ STEP 2: FEATURE ENGINEERING │ │
│ │ │ │
│ │ Build 38-dimensional feature vector: │ │
│ │ • Object name n-grams (character-level, 3-5 grams) │ │
│ │ • Description TF-IDF vector (top 100 terms) │ │
│ │ • Module one-hot encoding (FI, CO, MM, SD, PP, etc.) │ │
│ │ • Keyword presence: "report", "interface", "bapi", "idoc", │ │
│ │ "conversion", "enhancement", "exit", "form", "smartform", │ │
│ │ "workflow", "cds", "odata", "fiori" │ │
│ │ • Requirement context (from Tool 02) │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ STEP 3: XGBOOST CLASSIFICATION │ │
│ │ │ │
│ │ ┌─────────────────────────────────────────────────────────────┐ │ │
│ │ │ XGBoost Multi-Class (6 classes, 500 trees, max-depth=6) │ │ │
│ │ │ │ │ │
│ │ │ Input Feature Vector (38-dim) │ │ │
│ │ │ ↓ │ │ │
│ │ │ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐ ┌───────┐│ │ │
│ │ │ │ R │ │ I │ │ C │ │ E │ │ F │ │ W ││ │ │
│ │ │ │ 0.02 │ │ 0.01 │ │ 0.01 │ │ 0.94 │ │ 0.01 │ │ 0.01 ││ │ │
│ │ │ └───────┘ └───────┘ └───────┘ └───────┘ └───────┘ └───────┘│ │ │
│ │ │ │ │ │
│ │ │ Softmax: P(class) = exp(z_i) / Σ exp(z_j) │ │ │
│ │ │ │ │ │
│ │ │ Predicted: Enhancement (E) with 0.94 confidence │ │ │
│ │ └─────────────────────────────────────────────────────────────┘ │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ STEP 4: EFFORT ESTIMATION │ │
│ │ │ │
│ │ Separate XGBoost Regressor: │ │
│ │ Features = [Class one-hot, Complexity score, Module, │ │
│ │ Keyword density, Requirement length] │ │
│ │ ↓ │ │
│ │ Predicted Effort: 24.5 hours │ │
│ │ Confidence Interval: [18.2, 30.8] hours │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────────────────────────────────────────────────────┐ │
│ │ OUTPUT JSON: │ │
│ │ { │ │
│ │ "object": "Z_CDS_VENDOR_MASTER", │ │
│ │ "ricefw_class": "E", │ │
│ │ "confidence": 0.94, │ │
│ │ "effort_hours": 24.5, │ │
│ │ "effort_interval": [18.2, 30.8], │ │
│ │ "explanation": ["Z_CDS_ prefix", "CDS in description"] │ │
│ │ } │ │
│ └──────────────────────────────────────────────────────────────────┘ │
│ │
└─────────────────────────────────────────────────────────────────────────────────────────┘
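The softmax step in the diagram, P(class) = exp(z_i) / Σ exp(z_j), can be written in a few lines. This sketch uses invented raw scores rather than real model output, and `predict` is a hypothetical helper. It illustrates that the six class probabilities sum to 1 by construction and that the reported confidence is simply the largest of them.

```python
import math

CLASSES = ["R", "I", "C", "E", "F", "W"]

def softmax(scores):
    """Numerically stable softmax: subtract the max score before exponentiating."""
    top = max(scores)
    exps = [math.exp(z - top) for z in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(scores):
    """Return (class letter, confidence, full probability map) for raw scores."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return CLASSES[best], probs[best], dict(zip(CLASSES, probs))

# Hypothetical raw per-class scores for one object (not real model output).
raw_scores = [0.5, 0.1, -0.9, 3.6, -0.3, -1.2]
label, confidence, prob_map = predict(raw_scores)
print(label, round(confidence, 2), round(sum(prob_map.values()), 6))
```

Subtracting the maximum score leaves the result unchanged mathematically and avoids overflow when the raw scores are large.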
Tool 08 quantifies custom development effort and feeds Clean Core analysis.
| Metric | Value | Basis / Notes |
|---|---|---|
| Classification Accuracy | 93.7% | 11,800 labeled objects (test split) |
| F1 Score (Macro) | 0.91 | Balanced across 6 classes |
| Effort MAE | 6.4 hours | vs. actual logged effort in SolMan |
| Effort MAPE | 18.2% | Mean absolute percentage error |
| Rules Coverage | 22% | % of cases handled deterministically |
| Active Learning Improvement | +2.3% | Accuracy gain from human feedback |
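For reference, the three headline metrics in the table follow standard definitions, sketched here in plain Python. The sample values are invented to show the arithmetic and are not drawn from the reported evaluation.

```python
def mae(actual, predicted):
    """Mean absolute error, in the same unit as the inputs (hours here)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; assumes no actual value is zero."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def macro_f1(y_true, y_pred, classes):
    """Unweighted mean of per-class F1, so rare classes count as much as common ones."""
    scores = []
    for c in classes:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

# Invented sample data, for illustration only.
actual_hours = [10.0, 20.0, 40.0]
predicted_hours = [12.0, 18.0, 40.0]
effort_mae = mae(actual_hours, predicted_hours)
effort_mape = mape(actual_hours, predicted_hours)
f1 = macro_f1(["R", "R", "I", "E"], ["R", "I", "I", "E"], ["R", "I", "E"])
```

Macro averaging is what makes the F1 figure sensitive to the smaller classes: a model that does well only on the most common class would score high on accuracy but low on macro F1.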
Result: Correctly classified as Interface. Fed to Tool 04 with appropriate effort multiplier.