Machine Learning

Algorithms, computing platforms, datasets, and API services related to machine learning.


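Other
Runtime Platforms
Online Platforms and Communities
- AI Image
AI Tool Collections
Music Generation
Digital Humans
Visualization
Chat
Diffusion Models
Image Generation
- Stable Diffusion Prompt
Prompt Engineering
- Prompt Optimization Tools
- Prompt Shortcuts
Reinforcement Learning
AutoML
OpenAI
Datasets
- Text Detection and Recognition Datasets
- Computer Vision Datasets
- Corpora
Information Extraction
Natural Language Processing (NLP)
- Word Segmentation
- Natural Language Generation (NLG)
Speech Recognition
- Speech-to-Text (STT)
- Text-to-Speech (TTS)
Computer Vision
- OCR
Other People's Lists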

Other

shell_gpt: Calls OpenAI or Ollama from the command line to process file contents. Supports shell pipes.
litellm: Call all LLM APIs using the OpenAI format. Use Bedrock, Azure, OpenAI, Cohere, Anthropic, Ollama, Sagemaker, HuggingFace, Replicate (100+ LLMs)

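Runtime Platforms

TensorFlow: If you need to deploy large-scale deep learning models in production, TensorFlow may be the better fit. It scales well, supports distributed computing and multiple programming languages, and is widely used in industry.
- tfjs: TensorFlow JS library
PyTorch: If you care more about research than about implementation alone, PyTorch may be the better fit. Its design philosophy is "define-by-run": the computation graph is defined as the code runs, which makes experiments more flexible and intuitive.
Keras: If you are a beginner, or want to build and train neural network models quickly, Keras may be the better choice. Its simple API and high-level abstractions help you assemble and train networks fast.
ColossalAI: Low-cost AI training (on a single consumer-grade GPU)
llama.cpp: Lowers the numerical precision of model parameters (quantization) so that LLMs can run on consumer-grade computers, at some cost in accuracy.
Ollama: Built on llama.cpp; runs LLMs locally. Supports macOS, Linux, and Windows. Offers both command-line and HTTP API interaction. Handles model download and management, with an officially maintained set of pre-quantized models. You can also write a Modelfile to build your own customized model from a GGUF model. Can be started in a container.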
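
The precision reduction mentioned for llama.cpp can be illustrated with a minimal sketch. This is not llama.cpp's implementation (its real quantization formats are more elaborate); it assumes the simplest symmetric 8-bit scheme, and the function names `quantize_int8` and `dequantize` are hypothetical. It shows why quantized weights are smaller and why a little accuracy is lost.

```python
def quantize_int8(weights):
    """Symmetric quantization: map floats to integers in [-127, 127].

    Assumption for illustration: one scale for the whole list.
    """
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [v * scale for v in quantized]


# Made-up example weights, not taken from any real model.
weights = [0.12, -0.5, 0.33, 0.9, -0.07]

q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# The rounding step is where information is lost: each restored value
# can differ from the original by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))

print("quantized:", q)
print("max error:", max_error)
```

Each weight is stored as a small integer plus one shared scale, instead of a full-precision float. The rounding error is bounded by `scale / 2`, which is the accuracy cost noted above.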

Online Platforms and Communities

https://www.kaggle.com/
https://huggingface.co/
https://replicate.com/ : Cloud environment for training and running models at an affordable price
- Train your own LoRA model online
https://paperswithcode.com/
https://openbayes.com/ : An AI research organization in China

AI Image

text-to-image communities

AI Tool Collections

https://www.futuretools.io/
https://ai-bot.cn/
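https://convert.leiapix.com/ : Adds a 3D effect to 2D images.
https://flowgpt.ai/ : Chains ChatGPT question-and-answer exchanges together as a flowchart. Well suited to building teaching templates.
https://www.chatpdf.com/ : Helps users read e-books

Music Generation

https://mubert.com/ : Generates music from text

Digital Humans

d-id: Commercial service. Generates talking-face video from a face image plus text or audio. Supports adjusting the audio language and intonation.
https://synclabs.so/ : Commercial service. Same as above. Supports API calls.
SadTalker: Open-source software. Same as above.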

Visualization

Netron: a viewer for neural network, deep learning and machine learning models.

Chat

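open-webui: User-friendly WebUI for LLMs. Supports connecting to OpenAI and Ollama. Supports RAG and document upload.
- lobe-chat: Alternative. Chat WebUI. Supports connecting to OpenAI and Ollama. Does not support document upload. Ollama model syncing currently has issues.
https://github.com/elyase/awesome-gpt3
chatgpt-web: Easily set up a ChatGPT chat website

Diffusion Models

ControlNet: Controls Diffusion models by adding extra conditions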
deep-floyd/IF

Image Generation

flux: Open-source model. Very capable; output quality rivals Midjourney.
CompVis/stable-diffusion: A latent text-to-image diffusion model
Stability-AI/stablediffusion: High-Resolution Image Synthesis with Latent Diffusion Models
- https://huggingface.co/stabilityai/stable-diffusion-2-base
apple/ml-stable-diffusion: Converts SD models into Apple Core ML models
AUTOMATIC1111/stable-diffusion-webui: The most popular WebUI for SD
- stable-diffusion-webui-chinese: Chinese language pack for the WebUI
- cmdr2/stable-diffusion-ui: Alternative
Draw Things: Stable Diffusion for Mac and iPhone. Supports custom models, LoRA, and ControlNet. No token limit.
ComfyUI: Operates SD, Flux, and other models through a graphical, workflow-based interface.

Stable Diffusion Prompt

https://openart.ai/promptbook : Basic prompt tutorial
Spell (prompt) generator
https://spell.novelai.dev/ : Recovers the prompt from an image generated by Stable Diffusion. Source code
https://prompthero.com/stable-diffusion-prompts
https://openart.ai/discovery : Prompt community
Dalabad/stable-diffusion-prompt-templates

Prompt Engineering

Prompt Optimization Tools

ChatGPT - Prompt Optimizer

Prompt Shortcuts

Reinforcement Learning

Gymnasium: A standard API for reinforcement learning and a diverse set of reference environments (formerly Gym)

AutoML

OpenAI

OpenAI Cookbook: Examples and guides for using the OpenAI API

Datasets

Text Detection and Recognition Datasets

ICDAR 2013
ICDAR 2015
ICDAR 2017
ICDAR 2019
ICDAR 2021 https://icdar21-mapseg.github.io/
COCO-Text V2.0: contains 63,686 images with 239,506 annotated text instances.
SynthText: A text recognition dataset with many distracting elements
Text Recognition Data: 9 million images covering 90k English words.
SCUT-CTW1500: contains 1,500 images: 1,000 for training and 500 for testing. In particular, it provides 10,751 cropped text instance images, including 3,530 with curved text. The images are manually harvested from the Internet, image libraries such as Google Open-Image, or phone cameras. The dataset contains a lot of horizontal and multi-oriented text.
Total-Text: It consists of 1555 images with more than 3 different text orientations: Horizontal, Multi-Oriented, and Curved, one of a kind.

Computer Vision Datasets

https://visualdata.io/discovery

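Corpora

Information Extraction

https://prodi.gy/ : Polished interface and powerful features
snorkel: A system for rapidly creating, modeling, and managing training data with weak supervision
Information-Extraction-Chinese: Chinese named entity recognition and relation extraction
YEDDA: Supports Chinese
funNLP: A collection of tool libraries for information extraction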
UBIAI: Easy-to-use text annotation tool for teams with most comprehensive auto-annotation features. Supports NER, relations and document classification as well as OCR annotation for invoice labeling.

Natural Language Processing (NLP)

https://github.com/apachecn/AiLearning
https://github.com/crownpku/Awesome-Chinese-NLP
HanLP: A collection of tool libraries for Chinese text processing
fastNLP: A Modularized and Extensible NLP Framework
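小明 NLP: Provides Chinese word segmentation, part-of-speech tagging, spell checking, text-to-pinyin conversion, sentiment analysis, text summarization, and character radical lookup
JioNLP: Chinese NLP preprocessing and parsing toolkit
OpenCC: Conversion between Traditional and Simplified Chinese
ckiplab: Research projects from the Institute of Information Science and the Institute of Linguistics at Academia Sinica, Taiwan
Chinese-Word-Vectors: Pre-trained Chinese word vectors

Word Segmentation

jieba: "Jieba" Chinese word segmentation, Python version.
jieba-rs: "Jieba" Chinese word segmentation, Rust version.

Natural Language Generation (NLG)

Speech Recognition

Speech-to-Text (STT)

Text-to-Speech (TTS)

ChatTTS: Open-source Chinese-English bilingual TTS model. Does not support voice cloning.
coqui-ai/TTS: Open source. Supports 16 languages. Supports voice cloning.
MockingBird: Supports voice cloning. Good Chinese support.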

Computer Vision

OpenCV
pytorch-image-models: PyTorch image models, scripts, pretrained weights – ResNet, ResNeXT, EfficientNet, EfficientNetV2, NFNet, Vision Transformer, MixNet, MobileNet-V3/V2, RegNet, DPN, CSPNet, and more
yolov5

OCR

EasyOCR: Ready-to-use OCR with 80+ supported languages and all popular writing scripts including Latin, Chinese, Arabic, Devanagari, Cyrillic and etc.
tesseract: Open-source OCR engine
tesseract.js: A JavaScript port of tesseract
PaddleOCR

Other People's Lists

https://github.com/tommy9301122/GitHub_Star
