轶哥

What Data Is Needed for Supervised Fine-Tuning (SFT)?

Updated: 2023-08-24 13:56:53
First published: 2023-08-23 23:21:29
Large Models
2933

The article primarily discusses the types and quality of data required for Supervised Fine-Tuning (SFT). It covers the following aspects:

Objectives of Supervised Fine-Tuning: Improving performance on specific tasks, domain adaptability, and the model's interpretability and controllability, with the overarching goal of boosting system robustness.
Core Considerations: These include data diversity, not treating SFT as a mere way to add more data, appropriately incorporating few-shot and chain-of-thought (CoT) data, emphasizing data quality over quantity, and recognizing that increasing data volume without diversity brings diminishing returns.
Data Quality Requirements: These cover length limits for questions and answers, the accuracy of answers, the selection of data based on industry requirements, the diversity of the required NLP abilities, and a caution against including too much vertical-domain data.
Specific Examples: The article provides both good and poor dataset examples to illustrate how to choose and evaluate data.
Q&A Section: This part explains why including code-writing ability in SFT is essential, emphasizing its importance for improving reasoning and structured-output abilities.

In summary, the article offers comprehensive guidance on how to conduct supervised fine-tuning, underlining the importance of data diversity and quality, and presents implementation strategies and examples to support these points.

Installing and Using Professional GPUs Such as the A800/A100 in an Ordinary PC

Updated: 2023-08-23 23:23:06
First published: 2023-06-28 00:25:45
Large Models
9362

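In productivity applications, professional GPUs are used not only for AI model training and inference but also for aerodynamics simulation, scientific computing, and data analysis. In some cases, a tower workstation is the best choice for many professional users.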

Why choose a consumer-grade platform?

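Renting professional GPU compute through a cloud service is very convenient, but latency is bounded by the speed of light: using a data center in a different city adds considerable delay. Some professional applications have strict latency requirements, so they have no choice but to run on a physical machine fitted with a professional GPU.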

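In recent years I have been working on AI for vertical domains. Because I serve government clients, the data security requirements are fairly high, so I chose to build my own tower workstation and keep it in my studio for model debugging; its Thunderbolt 4 port lets me transfer large amounts of data quickly. For production, the models are of course deployed in a server room, which gives a sensible division of labor.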

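Using an A800/A100 professional GPU in a server is very simple: plug it in and you are done. But the fans in a rack server routinely spin at around ten thousand RPM, and the enormous noise they produce makes such a server impossible to keep in an ordinary office.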

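Different people have different requirements for professional applications. Some people know full well that gaming cards differ from professional cards in performance at certain compute precisions, and fall far behind in memory bandwidth and multi-GPU interconnect capability, yet they still insist on training AI models on multiple 4090 gaming cards. They neither use professional software nor need higher data throughput. In the same way, for reasons of cost, environment, security, and more, some users simply want to install and use a professional GPU in an ordinary PC. In my own tests, the 4090 is faster at AI image generation than the vast majority of professional cards, so in that scenario the 4090 is the cost-effective choice. In short, I hope everyone can look at this question with an open mind.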

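Consumer-grade equipment is far less stable than enterprise-grade equipment, and I use consumer platforms only in development environments. Do not try to put a workstation straight into a rack for deployment, or you will just be making trouble for yourself.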