Federated Learning for Movie Recommendations

Project Overview

Project Repository

Developed a privacy-preserving movie recommendation system using federated learning techniques on the MovieLens dataset, comparing performance with centralized approaches.

Key Contributions

- Implemented a Neural Collaborative Filtering (NCF) model using PyTorch
- Applied federated learning using PySyft to preserve user privacy
- Conducted a comparative analysis of federated and centralized approaches

Technical Highlights

Data Processing:
- Preprocessed MovieLens dataset, encoding user/movie IDs and normalizing ratings
- Implemented data splitting for training and testing
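
The preprocessing steps above could look roughly like the sketch below. It is an illustration only, not the project's actual code: the function names, the 80/20 split, and the 0.5 to 5.0 rating range are assumptions (some MovieLens variants use 1 to 5), and only the Python standard library is used.

```python
import random

def encode_ids(raw_ids):
    """Map arbitrary raw IDs to contiguous indices 0..n-1 (usable as embedding rows)."""
    mapping = {}
    for raw in raw_ids:
        if raw not in mapping:
            mapping[raw] = len(mapping)
    return mapping

def preprocess(ratings, min_rating=0.5, max_rating=5.0):
    """Encode IDs and min-max normalize ratings to [0, 1].

    ratings: list of (user_id, movie_id, rating) tuples.
    The default rating range is an assumption; adjust it to the dataset variant.
    """
    user_index = encode_ids(u for u, _, _ in ratings)
    movie_index = encode_ids(m for _, m, _ in ratings)
    span = max_rating - min_rating
    rows = [
        (user_index[u], movie_index[m], (r - min_rating) / span)
        for u, m, r in ratings
    ]
    return rows, user_index, movie_index

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

Contiguous indices matter because embedding tables are addressed by row number, and normalized targets pair naturally with a bounded model output.
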
Model Architecture:
- Developed NCF model combining Matrix Factorization and Multi-Layer Perceptron
- Implemented separate embedding layers for enhanced feature learning
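
One way to read the architecture above is a two-branch network in which the Matrix Factorization branch and the MLP branch each get their own user and movie embeddings. The sketch below follows that reading. The layer sizes, the class name, and the sigmoid output (chosen to match ratings normalized to [0, 1]) are assumptions, not details taken from the project.

```python
import torch
import torch.nn as nn

class NCF(nn.Module):
    """Two-branch NCF sketch: an element-wise MF branch plus an MLP branch."""

    def __init__(self, num_users, num_movies, mf_dim=8, mlp_dim=16, hidden=(32, 16)):
        super().__init__()
        # Separate embedding tables per branch, so each can learn its own features.
        self.mf_user = nn.Embedding(num_users, mf_dim)
        self.mf_movie = nn.Embedding(num_movies, mf_dim)
        self.mlp_user = nn.Embedding(num_users, mlp_dim)
        self.mlp_movie = nn.Embedding(num_movies, mlp_dim)

        # MLP over the concatenated user and movie embeddings.
        layers, in_dim = [], 2 * mlp_dim
        for out_dim in hidden:
            layers += [nn.Linear(in_dim, out_dim), nn.ReLU()]
            in_dim = out_dim
        self.mlp = nn.Sequential(*layers)

        # Final layer fuses both branches into a single score.
        self.out = nn.Linear(mf_dim + in_dim, 1)

    def forward(self, users, movies):
        mf = self.mf_user(users) * self.mf_movie(movies)  # element-wise product
        mlp_in = torch.cat([self.mlp_user(users), self.mlp_movie(movies)], dim=-1)
        fused = torch.cat([mf, self.mlp(mlp_in)], dim=-1)
        # Sigmoid keeps predictions in [0, 1]; an assumption tied to normalized ratings.
        return torch.sigmoid(self.out(fused).squeeze(-1))
```

Calling `NCF(num_users, num_movies)(users, movies)` with two equal-length index tensors returns one predicted score per user-movie pair.
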
Federated Learning Implementation:
- Used PySyft to simulate a distributed training environment
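
The source does not show which PySyft calls were used, and the PySyft API differs considerably between versions, so the sketch below illustrates the underlying idea with plain PyTorch instead: a simulated round of federated averaging (FedAvg) in which each client trains a private copy of the model and only weights are sent back. The function names, the SGD optimizer, and the MSE loss are assumptions. For brevity the model takes a single input tensor, whereas an NCF model would take user and movie indices.

```python
import copy
import torch
import torch.nn as nn

def local_update(global_model, inputs, targets, epochs=1, lr=0.1):
    """Train a private copy of the global model on one client's data."""
    model = copy.deepcopy(global_model)
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        optimizer.zero_grad()
        loss_fn(model(inputs), targets).backward()
        optimizer.step()
    # Only the weights and the sample count leave the client, never the raw data.
    return model.state_dict(), len(targets)

def federated_average(client_results):
    """Average client state dicts, weighting each client by its sample count."""
    total = sum(n for _, n in client_results)
    first_state = client_results[0][0]
    return {
        key: sum(state[key].float() * (n / total) for state, n in client_results)
        for key in first_state
    }

def federated_round(global_model, client_datasets, epochs=1, lr=0.1):
    """Run one communication round over all simulated clients."""
    results = [local_update(global_model, x, y, epochs, lr) for x, y in client_datasets]
    global_model.load_state_dict(federated_average(results))
    return global_model
```

In this sketch the `epochs` argument controls how much local computation happens between communication rounds, which is the usual knob for trading communication frequency against client-side work.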

Key Findings

- Federated NCF reached an RMSE of 1.048, about 9% higher than centralized NCF (RMSE: 0.9619)
- Larger batch sizes improved federated learning performance
- More complex models (NCF) outperformed simpler ones (Matrix Factorization) in federated settings
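
For reference, RMSE is the square root of the mean squared difference between predicted and actual ratings. The helper below is a minimal sketch of that metric plus a relative-gap calculation; the function names are illustrative, and the source does not say on which rating scale the reported RMSE values were computed.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length sequences of ratings."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = ((p - a) ** 2 for p, a in zip(predicted, actual))
    return math.sqrt(sum(squared_errors) / len(actual))

def relative_gap(federated_rmse, centralized_rmse):
    """Fractional increase in RMSE of the federated model over the centralized one."""
    return (federated_rmse - centralized_rmse) / centralized_rmse
```

With the reported figures, `relative_gap(1.048, 0.9619)` is roughly 0.09.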

Challenges Overcome

- Balanced communication frequency and computation in the federated setup
- Managed memory constraints in PySyft simulations
- Debugged the distributed learning environment