This repository contains the code for the paper "CiT: Curation in Training for Effective Vision-Language Data". For the first time, CiT curates/optimizes training data during (pre-)training a CLIP-style ...