Hugging Face’s Xet team is working to make the Hub’s storage architecture more efficient so that users can store and update data and models quickly and easily. Hugging Face hosts nearly 11PB of datasets, and Parquet files alone account for over 2.2PB of that storage, so optimizing Parquet storage is a very high priority. [...]
The post Improved deduplication of Hugging Face Hub Parquet first appeared on Versa AI hub.
from Blog - Versa AI hub https://versaaihub.com/improved-deduplication-of-hug-face-hub-parquet/?utm_source=rss&utm_medium=rss&utm_campaign=improved-deduplication-of-hug-face-hub-parquet