This repository contains the code for the paper CiT: Curation in Training for Effective Vision-Language Data. For the first time, CiT curates/optimizes training data during the (pre-)training of a CLIP-style ...