From 634ce8fff824b7444ca33a6e32d98625a9119188 Mon Sep 17 00:00:00 2001
From: carry <2641257231@qq.com>
Date: Sat, 26 Apr 2025 01:58:30 +0800
Subject: [PATCH] docs: update the paper abstract and keywords
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Updated the Chinese and English abstract sections with a detailed description of the document-driven adaptive fine-tuning framework for large code models, covering its core innovations, technical implementation, and experimental results. Also updated the keyword lists to reflect the paper's content more comprehensively.
---
 paper/latex/chapters/abstract.tex | 12 ++++++++----
 1 file changed, 8 insertions(+), 4 deletions(-)

diff --git a/paper/latex/chapters/abstract.tex b/paper/latex/chapters/abstract.tex
index 7de4197..bd12661 100644
--- a/paper/latex/chapters/abstract.tex
+++ b/paper/latex/chapters/abstract.tex
@@ -7,12 +7,16 @@
 
 % 中文摘要
 \begin{onecolabstract}
-    \noindent{}\makebox[5em][l]{{\zihao{4}\textbf{摘要}}}{\songti \zihao{-4}本文研究了一种基于文档驱动的自适应编码大模型微调框架。该框架通过分析文档结构自动生成训练样本,实现了模型参数的高效优化。实验结果表明,该方法在多个NLP任务上取得了显著的性能提升,同时减少了人工标注的工作量。}\par
-    \noindent{}\makebox[5em][l]{{\zihao{4}\textbf{关键词}}}{\zihao{-4}\songti 关键词1;关键词2}\par
+    \noindent{}\makebox[5em][l]{{\zihao{4}\textbf{摘要}}}{\songti \zihao{-4}大型语言模型(LLMs)在通用代码生成任务中表现出色,但在处理包含专有知识的企业私有代码库时,其性能往往受限。针对此问题,本文提出并实现了一个基于文档驱动的自适应编码大模型微调框架。该框架的核心创新在于:首先,通过深度解析技术文档(以Markdown格式为例),自动抽取关键信息(如函数签名、类定义、用法示例等)并结合预设模板生成高质量的指令微调(SFT)训练语料;其次,利用参数高效微调技术(如QLoRA)对预训练的编码大模型(以Qwen为例)进行针对性优化,使其精准适配私有库的特定语法、结构和编程范式;最后,整合了包括数据持久化(SQLite+TinyDB)、训练监控(TensorBoard)和交互式前端(Gradio)在内的完整工作流。实验结果表明,该框架能够有效提升大模型在私有库代码生成任务上的准确性和实用性,显著减少对人工标注的依赖,为实现企业级软件开发的智能化和高效化提供了一套自动化、可扩展的解决方案。
+    }\par
+    \noindent{}\makebox[5em][l]{{\zihao{4}\textbf{关键词}}}{\zihao{-4}\songti 大型语言模型; 代码生成; 模型微调; 参数高效微调; QLoRA; 文档驱动; 自动化; 私有库; 自然语言处理; Gradio
+    }\par
 \end{onecolabstract}
 
 % 英文摘要
 \begin{onecolabstract}
-    \noindent{}\makebox[10em][l]{{\zihao{4} \textbf{ABSTRACT}}}{\zihao{-4}This paper proposes a document-driven adaptive fine-tuning framework for large coding models. By analyzing document structures to automatically generate training samples, the framework achieves efficient optimization of model parameters. Experimental results demonstrate significant performance improvements on multiple NLP tasks while reducing manual annotation workload.}\par
-    \noindent{}\makebox[10em][l]{{\zihao{4}\textbf{KEYWORDS}}}{\zihao{-4}Document-driven; Adaptive fine-tuning; Large language models; NLP tasks; Automatic annotation}\par
+    \noindent{}\makebox[10em][l]{{\zihao{4} \textbf{ABSTRACT}}}{\zihao{-4}Large Language Models (LLMs) excel at general code generation tasks, but their performance is often limited on enterprise private code repositories that contain proprietary knowledge. To address this problem, this paper proposes and implements a document-driven adaptive fine-tuning framework for large code models. The framework's core innovations are threefold: first, it parses technical documentation in depth (using Markdown as an example), automatically extracts key information (such as function signatures, class definitions, and usage examples), and combines it with preset templates to generate a high-quality supervised fine-tuning (SFT) corpus; second, it applies parameter-efficient fine-tuning techniques (such as QLoRA) to optimize a pre-trained large code model (Qwen as an example), adapting it precisely to the private library's specific syntax, structure, and programming paradigms; finally, it integrates a complete workflow covering data persistence (SQLite+TinyDB), training monitoring (TensorBoard), and an interactive frontend (Gradio). Experimental results show that the framework effectively improves the accuracy and practicality of large models on private-library code generation tasks, significantly reduces reliance on manual annotation, and provides an automated, scalable solution for intelligent and efficient enterprise software development.
+    }\par
+    \noindent{}\makebox[10em][l]{{\zihao{4}\textbf{KEYWORDS}}}{\zihao{-4}Large Language Models; Code Generation; Model Fine-tuning; Parameter-Efficient Fine-tuning; QLoRA; Document-Driven; Automation; Private Library; Natural Language Processing; Gradio
+    }\par
 \end{onecolabstract}
\ No newline at end of file
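For reference, the document-to-SFT-corpus step the new abstract describes can be sketched as below. This is a minimal illustration assuming Markdown docs with fenced Python code blocks; the regexes, instruction template, and JSON field names are assumptions for illustration, not the project's actual implementation.

import json
import re

# Hypothetical preset template; the real pipeline presumably keeps a set of
# such templates, as the abstract describes.
INSTRUCTION_TEMPLATE = (
    "You are a coding assistant for a private library. "
    "Show how to call `{name}` and explain its parameters."
)

def extract_signatures(markdown_text):
    """Collect function signatures from fenced Python code blocks."""
    signatures = []
    for block in re.findall(r"```python\n(.*?)```", markdown_text, re.DOTALL):
        signatures += re.findall(r"^def\s+(\w+\(.*?\)):", block, re.MULTILINE)
    return signatures

def build_sft_samples(markdown_text):
    """Turn each extracted signature into one instruction/output pair."""
    samples = []
    for sig in extract_signatures(markdown_text):
        name = sig.split("(")[0]
        samples.append({
            "instruction": INSTRUCTION_TEMPLATE.format(name=name),
            # In the real pipeline the output would be filled in from the
            # surrounding documentation (description, usage example).
            "output": f"Use `{sig}` from the private library as follows: ...",
        })
    return samples

if __name__ == "__main__":
    doc = "# API\n```python\ndef connect(host: str, port: int = 5432):\n    ...\n```\n"
    print(json.dumps(build_sft_samples(doc), ensure_ascii=False, indent=2))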
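For the fine-tuning step, a minimal QLoRA setup with Hugging Face transformers, bitsandbytes, and peft might look like the following. The checkpoint name, rank, and target modules are illustrative assumptions, not the configuration the paper reports.

import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

MODEL_NAME = "Qwen/Qwen2.5-Coder-7B-Instruct"  # assumed Qwen code checkpoint

# Load the base model with 4-bit NF4 quantization (the "Q" in QLoRA).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(
    MODEL_NAME, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# Attach low-rank adapters to the attention projections; only these adapter
# weights are trained, which is what makes the method parameter-efficient.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of weights train

The adapted model can then be trained on the generated SFT samples with any standard SFT trainer and monitored via TensorBoard, matching the workflow the abstract outlines.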