Breaking the Memory Barrier
Oct 27, 2024 · 15m 32s
Description
🧠Breaking the Memory Barrier: Near Infinite Batch Size Scaling for Contrastive Loss
This research paper introduces Inf-CL, a novel approach for contrastive learning that dramatically reduces GPU memory usage during training, allowing for near-infinite batch sizes. The authors address the issue of quadratic memory growth in traditional methods by implementing a tile-based computation strategy that partitions the contrastive loss calculation into smaller, sequentially computed blocks. To further enhance efficiency, they propose a multi-level tiling strategy that leverages ring-based communication at the GPU level and fused kernels at the CUDA core level, minimizing I/O overhead. The experiments demonstrate that Inf-CL significantly outperforms previous methods, achieving unprecedented batch sizes while maintaining accuracy and comparable training speed. This breakthrough opens new possibilities for large-scale contrastive learning, paving the way for advancements in areas such as self-supervised learning and dense text retrieval.
📎 Link to paper
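To make the tiling idea concrete, here is a minimal, hypothetical PyTorch sketch (not the authors' implementation) of an image-to-text InfoNCE loss computed tile by tile with an online log-sum-exp, so the full N×N similarity matrix is never materialized. The function name, tile_size, and temperature are illustrative assumptions; the actual Inf-CL further fuses this into CUDA kernels, adds a matching backward pass, and exchanges tiles across GPUs via ring communication.

```python
import torch

def tiled_infonce_loss(img_emb, txt_emb, tile_size=1024, temperature=0.07):
    """Hypothetical sketch of a tile-wise contrastive (InfoNCE) loss.

    Rows of the similarity matrix are processed in tiles, and the
    softmax denominator is accumulated with an online log-sum-exp,
    so peak activation memory scales with the tile size instead of N^2.
    (With plain autograd, per-tile activations are still retained for
    backward; Inf-CL avoids this with custom fused kernels.)
    """
    n = img_emb.shape[0]
    loss = 0.0
    for i in range(0, n, tile_size):
        rows = img_emb[i:i + tile_size]                     # (b, d) image tile
        running_max = torch.full((rows.shape[0],), float("-inf"),
                                 device=rows.device)
        running_sum = torch.zeros(rows.shape[0], device=rows.device)
        pos_logit = torch.zeros(rows.shape[0], device=rows.device)
        for j in range(0, n, tile_size):
            cols = txt_emb[j:j + tile_size]                 # (c, d) text tile
            logits = rows @ cols.T / temperature            # only a (b, c) tile
            # Numerically stable online log-sum-exp update.
            new_max = torch.maximum(running_max, logits.max(dim=1).values)
            running_sum = (running_sum * torch.exp(running_max - new_max)
                           + torch.exp(logits - new_max[:, None]).sum(dim=1))
            running_max = new_max
            # Positive pairs sit on the global diagonal of the full matrix.
            diag = torch.arange(rows.shape[0], device=rows.device) + i
            hit = (diag >= j) & (diag < j + cols.shape[0])
            pos_logit[hit] = logits[hit, diag[hit] - j]
        # Per-row cross-entropy: log-sum-exp minus the positive logit.
        loss = loss + (running_max + torch.log(running_sum) - pos_logit).sum()
    return loss / n
```

Under these assumptions, the only large tensors alive at any moment are one (tile_size × d) block of each embedding and one (tile_size × tile_size) logit tile, which is why batch size can grow far beyond what a materialized similarity matrix would allow.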
Information
Author | Shahriar Shariati
Organization | Shahriar Shariati