LoL (Losses over Labels)¶
Overview¶
LoL (Losses over Labels) is a weak supervision method that trains a model directly from weak labeling heuristics by converting each heuristic into its own loss term, rather than first aggregating the heuristics' votes into pseudo-labels.
The method addresses the fundamental challenge of weak supervision: labels from multiple weak sources may be inconsistent or incorrect. By constructing the training loss directly from the heuristics, and optionally from their gradients, LoL avoids the information loss incurred when noisy votes are collapsed into a single hard label, and can learn effectively from noisy supervision signals.
Key Characteristics:
Direct Loss Construction: Builds one loss term per weak heuristic instead of aggregating heuristic votes into pseudo-labels
Gradient Information: The full variant additionally penalizes mismatch between model gradients and the gradients of smoothed heuristics
Multiple Variants: Supports both the full LoL method and the simplified LoL_simple method
Gradient Optimization: Employs gradient-based training with a configurable gradient computation method
Algorithm Variants¶
LoL: The complete algorithm, combining per-heuristic losses with a gradient-matching penalty on smoothed heuristics.
Basic loss: \(\hat{h} = \arg\min_h \sum_{j=1}^n \left( \frac{1}{m(x_j)} \sum_{i=1}^m \ell_i(x_j, h) \right)\), where \(m(x_j)\) is the number of heuristics that cover \(x_j\)
Enhanced loss with gradients: \(\ell_i^*(x, h) = \ell_i(x, h) + \alpha \cdot \left\lVert \nabla_x h(x) - \nabla_x \tilde{\lambda}_i(x) \right\rVert_2^2\)
Smoothed heuristic: \(\tilde{\lambda}_i(\phi) = \mathbb{E}_{x \sim \text{Ber}(\phi)}[\lambda_i(x)]\)
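To make the basic loss concrete, here is a minimal sketch of the per-example computation, assuming each heuristic is exposed as a callable that returns a loss value or None when it abstains (the helper names are illustrative, not the framework's API):

def lol_example_loss(x, h, heuristic_losses):
    """Basic LoL loss for one example: average over covering heuristics."""
    values = [l(x, h) for l in heuristic_losses]
    active = [v for v in values if v is not None]  # heuristics that fire on x
    if not active:
        return 0.0  # an uncovered example contributes nothing
    return sum(active) / len(active)  # (1/m(x)) * sum_i l_i(x, h)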
Parameters:
alpha: Regularization parameter (default: 1e-3)
grad_val: Gradient validation threshold (default: 1.0)
grad_method: Gradient computation method (default: "square")
num_rand: Number of random samples (default: 10)
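The num_rand parameter plausibly controls how many samples are drawn when estimating the smoothed heuristic \(\tilde{\lambda}_i\); a minimal Monte Carlo sketch under that assumption (the function names are illustrative):

import numpy as np

def smoothed_heuristic(lam, phi, num_rand=10, seed=0):
    """Estimate lambda_tilde(phi) = E_{x ~ Ber(phi)}[lambda(x)] by sampling."""
    rng = np.random.default_rng(seed)
    draws = rng.random((num_rand, len(phi))) < np.asarray(phi)  # Bernoulli draws
    return float(np.mean([lam(d.astype(float)) for d in draws]))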
LoL_simple: A simplified version of LoL with reduced computational complexity.
Use Case: When computational resources are limited or for quick prototyping.
Pseudocode¶
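The following is a high-level, PyTorch-style sketch of the training loop implied by the formulas above; it is illustrative pseudocode, not the exact implementation in bin/lol.py:

import torch

def train_lol(model, optimizer, loader, heuristic_losses, alpha=1e-3,
              grad_penalty=None, num_epochs=30):
    """Sketch of LoL training: average per-heuristic losses per example,
    optionally adding the gradient-matching term (full LoL variant)."""
    for _ in range(num_epochs):
        for x in loader:
            values = [l(x, model) for l in heuristic_losses]
            active = [v for v in values if v is not None]  # covering heuristics
            if not active:
                continue  # no heuristic fires on this example
            loss = torch.stack(active).mean()  # (1/m(x)) * sum_i l_i(x, h)
            if grad_penalty is not None:       # full LoL adds the penalty term
                loss = loss + alpha * grad_penalty(x, model)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()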
Implementation Details¶
Framework Integration¶
The LoL algorithm is integrated into the universal baseline comparison framework through the following components:
Configuration Structure:
from dataclasses import dataclass

@dataclass
class LoLModelConfig:
    method: str = "LoL"          # "LoL" or "LoL_simple"
    learning_rate: float = 1e-3
    weight_decay: float = 0.0
    num_epochs: int = 30
    batch_size: int = 128
    grad_method: str = "square"  # gradient computation method
    alpha: float = 1e-3          # regularization parameter
    grad_val: float = 1.0        # gradient validation threshold
    num_rand: int = 10           # random sampling parameter
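For example, overriding a couple of defaults (a hypothetical usage of the dataclass above):

config = LoLModelConfig(method="LoL_simple", learning_rate=5e-4, num_epochs=50)
print(config.grad_method)  # unchanged default: "square"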
Trainer Implementation:
The LoLTrainer class extends BaseTrainer and implements:
load_data(): Data loading with noise-aware preprocessing
train(): Core LoL training loop with the losses-over-labels computation
evaluate(): Standard evaluation metrics plus noise-specific metrics
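A skeletal sketch of what such a trainer might look like; BaseTrainer's interface is assumed, and the method bodies are placeholders rather than the framework's actual code:

class LoLTrainer(BaseTrainer):
    def load_data(self):
        # load the dataset plus the weak heuristic outputs,
        # applying the noise-aware preprocessing described above
        ...

    def train(self):
        # run the losses-over-labels loop (see the Pseudocode section)
        ...

    def evaluate(self):
        # compute accuracy/precision/recall/F1 and noise-specific metrics
        ...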
Usage Examples¶
Basic Evaluation¶
# Run LoL on YouTube dataset
python bin/lol.py --data youtube --method LoL --mode eval --output results/lol_youtube
# Run simplified version
python bin/lol.py --data youtube --method LoL_simple --mode eval --output results/lol_simple
Hyperparameter Tuning¶
# Tune hyperparameters with 50 trials
python bin/lol.py --data youtube --mode tune --output results/lol_tune \
--n-trials 50 --optimize-metric accuracy
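If the tuner is built on a framework such as Optuna (an assumption, as are the search ranges and the run_eval stub below), a 50-trial search over the key hyperparameters might look like this:

import optuna

def run_eval(lr, alpha, grad_val):
    # placeholder: substitute the real LoL training/evaluation call
    return 0.9  # validation accuracy

def objective(trial):
    # log-uniform ranges are an assumption based on the magnitudes used above
    lr = trial.suggest_float("lr", 1e-4, 1e-1, log=True)
    alpha = trial.suggest_float("alpha", 1e-5, 1e-1, log=True)
    grad_val = trial.suggest_float("grad_val", 1e-2, 10.0, log=True)
    return run_eval(lr=lr, alpha=alpha, grad_val=grad_val)

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)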
Custom Configuration¶
# config/lol_custom.toml
[data]
name = "youtube"
[model]
method = "LoL"
learning_rate = 0.001
alpha = 0.0001
grad_val = 0.1
num_epochs = 50
[output]
folder = "exp/lol/youtube/custom"
python bin/lol.py --config config/lol_custom.toml --mode eval
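A config like this can be parsed with the standard library; a minimal sketch assuming Python 3.11+ and that the [model] keys map directly onto LoLModelConfig fields:

import tomllib

with open("config/lol_custom.toml", "rb") as f:
    cfg = tomllib.load(f)

model_cfg = LoLModelConfig(**cfg["model"])  # keys must match the dataclass fields
print(cfg["data"]["name"], cfg["output"]["folder"])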
Evaluation Results¶
Experimental Setup¶
All experiments follow the standardized evaluation protocol:
Datasets: YouTube, AgNews, Yelp, IMDb, ChemProt
Metrics: Accuracy, Precision, Recall, F1-score
Cross-validation: 5-fold with fixed random seeds
Hardware: Standardized compute environment
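For reproducibility, the folds can be generated with a fixed seed; a short sketch using scikit-learn (illustrative of the protocol, not the framework's exact code):

import numpy as np
from sklearn.model_selection import KFold

X = np.arange(100).reshape(-1, 1)  # placeholder features
kf = KFold(n_splits=5, shuffle=True, random_state=42)  # fixed random seed
for fold, (train_idx, test_idx) in enumerate(kf.split(X)):
    print(f"fold {fold}: {len(train_idx)} train / {len(test_idx)} test")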
Performance Comparison¶
Note
Results will be populated from your experimental runs in the exp/ folder. The example structure below shows the expected format.
Dataset | Method | Accuracy | Precision | Recall | F1-Score | Execution Time
---|---|---|---|---|---|---
YouTube | LoL | 0.928 | 0.925 | 0.930 | 0.927 | 15.2 min
YouTube | LoL_simple | 0.915 | 0.912 | 0.918 | 0.915 | 12.8 min
AgNews | LoL | TBD | TBD | TBD | TBD | TBD
Yelp | LoL | TBD | TBD | TBD | TBD | TBD
Best Hyperparameters¶
YouTube:
lr: 0.1
l2: 0
alpha: 0.0001
grad_val: 0.10022921587222333
# Achieved accuracy: 0.928
# Execution time: 15.2 min

AgNews:
# Parameters to be determined
lr: TBD
l2: TBD
alpha: TBD
grad_val: TBD

Yelp:
# Parameters to be determined
lr: TBD
l2: TBD
alpha: TBD
grad_val: TBD

IMDb:
# Parameters to be determined
lr: TBD
l2: TBD
alpha: TBD
grad_val: TBD

ChemProt:
# Parameters to be determined
lr: TBD
l2: TBD
alpha: TBD
grad_val: TBD
References¶
Sam, D., & Kolter, J. Z. (2023). Losses over labels: Weakly supervised learning via direct loss construction. Proceedings of the AAAI Conference on Artificial Intelligence, 37(8), 9695-9703.