A system for quickly generating training data with weak supervision
Updated May 2, 2024 - Python
A GPU-accelerated library containing highly optimized building blocks and an execution engine for data processing to accelerate deep learning training and inference applications.
🔥🔥High-Performance Face Recognition Library on PaddlePaddle & PyTorch🔥🔥
TextAttack is a Python framework for adversarial attacks, data augmentation, and model training in NLP https://textattack.readthedocs.io/en/master/
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Medical imaging toolkit for deep learning
A Python library for audio data augmentation. Inspired by albumentations. Useful for machine learning.
One-click Chinese data augmentation package; NLP data augmentation, BERT data augmentation, EDA: pip install nlpcda
fastdup is a powerful, free tool designed to rapidly generate valuable insights from image and video datasets. It helps enhance the quality of both images and labels, while significantly reducing data operation costs, all with unmatched scalability.
List of useful data augmentation resources. Here you will find some uncommon techniques, libraries, links to GitHub repos, papers, and more.
Data augmentation for NLP, presented at EMNLP 2019
Natural language processing (NLP): Xiaojiang robot (retrieval-based chit-chat chatbot), BERT sentence embeddings and similarity (Sentence Similarity), XLNET sentence embeddings and similarity (text xlnet embedding), text classification, entity extraction (ner, bert+bilstm+crf), data augmentation (text augment, data enhance), synonymous sentence and synonym generation, sentence main-part extraction (mainpart), Chinese short-text similarity, text feature engineering, keras-http-service invocation
Code for TKDE paper "Self-supervised learning on graphs: Contrastive, generative, or predictive"
An implementation of the EDA paper for Chinese corpora. An EDA data augmentation tool for Chinese corpora. NLP data augmentation. Paper reading notes.
Data Augmentation For Object Detection
Fast audio data augmentation in PyTorch. Inspired by audiomentations. Useful for deep learning.
Awesome papers about generative Information Extraction (IE) using Large Language Models (LLMs)
Collection of papers and resources for data augmentation for NLP.
Natural Language Toolkit for Indic Languages aims to provide out-of-the-box support for various NLP tasks that an application developer might need
This repository collects papers for "A Survey on Knowledge Distillation of Large Language Models". We break down KD into Knowledge Elicitation and Distillation Algorithms, and explore the Skill & Vertical Distillation of LLMs.