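{"username":"biwenshuai","header":{"name":"Bi Wenshuai (毕文帅)","tagline":"NLP Algorithm Engineer / LLM Development Engineer / Agent Development Engineer"},"personalInfo":{"email":"biwenshuai1992@gmail.com","phone":"","location":"","pronouns":"","mbti":"","birthday":""},"experience":[{"company":"Ubiquant Investment, Computing Power Business Line","role":"Senior AI Application Development Engineer","startDate":"2025/11","endDate":"Present","bullets":["Responsible for the design and development of an enterprise Agent platform, an enterprise context system, and a large-scale data processing system.","Built enterprise AI application capabilities on a four-layer architecture of business intent, orchestration, skills, and context.","Drove multimodal hybrid retrieval, context compression, permission management, and automated enterprise knowledge base construction into production."],"tags":["Agent","LLM Applications","RAG","MCP","Enterprise AI Platform"]},{"company":"BGI Research","role":"Senior Natural Language Processing Engineer","startDate":"2021/08","endDate":"2024/10","bullets":["Responsible for projects including a bioinformatics analysis Agent system, a spatiotemporal customer-service Q&A system, an article writing assistant system, and foundational service modules for LLM applications.","Applied Agent, RAG, GraphRAG, embedding model fine-tuning, and rerank model fine-tuning to make research and bioinformatics workflows more efficient.","Contributed to research projects such as preeclampsia prediction and multi-omics model fusion, producing SCI papers and invention patents."],"tags":["NLP","Bioinformatics","Agentic RAG","Knowledge Graph","AI for Research"]},{"company":"Nokia Shanghai Bell","role":"NLP Algorithm Engineer","startDate":"2018/08","endDate":"2021/08","bullets":["Responsible for NLP, machine learning, knowledge graph, and predictive modeling projects in telecom and energy scenarios.","Delivered projects including Shenzhen Energy flue gas prediction, Beijing Mobile NPS satisfaction prediction, Beijing Mobile alarm root cause analysis, and Zhejiang Mobile supply chain knowledge graph construction.","Supported business analysis and process issue discovery through data mining, packaged graph queries, and quality monitoring and alerting."],"tags":["NLP","Machine Learning","Knowledge Graph","Predictive Modeling","Data Mining"]},{"company":"China National Petroleum Corporation (CNPC)","role":"Database Development Engineer","startDate":"2015/07","endDate":"2018/07","bullets":["Worked on database development and gained experience building enterprise data systems."],"tags":["Database Development","Data Systems"]}],"education":[{"school":"Harbin Institute of Technology","major":"","degree":"Undergraduate / Bachelor's degree","startDate":"2011.09","endDate":"2015.06"}],"projectsRecent":[{"title":"Enterprise Agent Platform Development","description":"Built an enterprise Agent platform on Ironclaw, adapted to enterprise scenarios to improve sales efficiency.","url":"","tags":["Agent","LangGraph","Enterprise AI Platform"]},{"title":"Enterprise Context System Design and Construction","description":"Turns brand content, user feedback, project documents, creative processes, and strategic judgments into enterprise context that can be understood, invoked, and evolved.","url":"","tags":["Context 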
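Engineering","RAG","Knowledge Base"]},{"title":"Bioinformatics Analysis Agent System Development","description":"After a user submits bioinformatics data or a task, the system automatically decomposes the task, calls tools, and runs the task.","url":"","tags":["Bioinformatics","Multi-Agent","MCP"]}],"projectsDetailed":[{"title":"Enterprise Context System Design and Construction","type":"Enterprise AI Platform","startDate":"2026.01","endDate":"2026.04","url":"","award":"","bullets":["Project description: Turns large volumes of complex unstructured enterprise information, including brand content, user feedback, project documents, creative processes, and strategic judgments, into context that can be understood, invoked, and continuously evolved.","Implementation: Automated enterprise knowledge base construction; content is automatically identified, labeled, and structured on ingestion, upgrading file storage into a semantic context network.","Implementation: Built an enterprise-specific decision knowledge engine through multimodal hybrid retrieval, with context compression and progressive disclosure.","Implementation: Built strict permission management so that different teams, projects, and roles have differentiated context management permissions."],"tags":["Context Engineering","RAG","Multimodal Retrieval","Enterprise Knowledge Base","Permission Management"]},{"title":"Enterprise Agent Platform Development","type":"Enterprise AI Platform","startDate":"2025.11","endDate":"2026.04","url":"","award":"","bullets":["Project description: Built an enterprise Agent platform on Ironclaw that fits most enterprise use cases and helps increase sales.","Implementation: Built a four-layer architecture consisting of an intent layer, an orchestration layer, a skill layer, and a context layer.","The orchestration layer performs divergent reasoning around business intent, decomposes a task into multiple candidate execution paths, and evaluates them dynamically against quality, cost, and latency constraints.","The skill layer covers content generation, OA systems, intelligent operations, and intelligent research; the context layer relies on the enterprise context system to hold and continuously evolve key enterprise assets."],"tags":["Agent","Ironclaw","Task Orchestration","Enterprise AI","Intelligent Operations"]},{"title":"Large-Scale Data Processing System","type":"Data Engineering / LLM Data Processing","startDate":"2025.11","endDate":"2026.04","url":"","award":"","bullets":["Project description: Converts roughly 90T of PDF data into model-trainable data.","Implementation: Built a large-scale data processing framework with modules for state management, error retry, and anomaly detection.","Uses MinerU 2.5 to recognize text, tables, images, and formulas in PDFs.","Flags inaccurately recognized blocks based on confidence scores and re-recognizes low-confidence content with an LLM."],"tags":["Data Processing","PDF Parsing","MinerU","LLM Data","Quality Control"]},{"title":"Bioinformatics Analysis Agent System Development","type":"Bioinformatics AI / Agent System","startDate":"2024.12","endDate":"2025.03","url":"","award":"Deployed to production and well received by the customer's technical team","bullets":["Project description: After a user submits bioinformatics data or a bioinformatics task, the LLM automatically decomposes and runs the task.","Implementation: Built a multi-Agent framework comprising SupervisorAgent, PlanningAgent, CodeAgent, SearchAgent, and others.","CoreAgent automatically wraps bioinformatics analysis tools as MCP Servers.","Uses GraphRAG to assist bioinformatics code generation, and fine-tuned a tool-calling model on bioinformatics domain corpora.","The overall framework is implemented with LangGraph."],"tags":["Bioinformatics","LangGraph","MCP","GraphRAG","Tool Calling"]},{"title":"Spatiotemporal Customer-Service Q&A System","type":"Intelligent Customer Service / 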
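RAG","startDate":"2024.03","endDate":"2024.10","url":"","award":"Deployed to production; can fully replace human agents","bullets":["Project description: Has background knowledge of spatiotemporal omics and can answer a wide range of questions about the spatiotemporal cloud platform.","Implementation: Combined Agent and RAG techniques to improve the customer-service system's overall generalization.","Responsible for knowledge base construction and optimization, and for fine-tuning bioinformatics-domain embedding and rerank models.","Used hybrid retrieval and graph augmentation to improve answer accuracy, with support for text and chart output."],"tags":["RAG","Agent","Hybrid Retrieval","rerank","Spatiotemporal Omics"]},{"title":"Article Writing Assistant System","type":"Research Writing / Agentic RAG","startDate":"2024.01","endDate":"2024.05","url":"","award":"Deployed to production; significantly improved researchers' efficiency in writing reviews and conducting research","bullets":["Project description: Helps users automatically draft review articles in biology-related fields.","Implementation: Built a literature database, a literature knowledge extraction pipeline, and a literature knowledge graph.","Designed a material collection Agent, OutlineAgent, WriteAgent, and a polishing Agent.","Implemented an accurate knowledge retrieval module; the overall framework is based on LangChain, and Agentic RAG is based on LlamaIndex."],"tags":["LangChain","LlamaIndex","Agentic RAG","Knowledge Graph","Research Writing"]},{"title":"Preeclampsia Prediction Model (Research)","type":"Research / Medical AI","startDate":"2023.03","endDate":"2024.05","url":"","award":"Published two SCI papers; system deployed at Guangdong Women and Children Hospital","bullets":["Project description: Predicts preeclampsia risk for women in early pregnancy to support early treatment and protection.","Project summary: Improved prediction accuracy through multi-model fusion inference."],"tags":["Medical AI","Predictive Modeling","Multi-Model Fusion","SCI"]},{"title":"Foundational Service Modules for LLM Applications","type":"LLM Infrastructure","startDate":"2023.02","endDate":"2024.03","url":"","award":"Significantly simplified the LLM application development workflow and improved development efficiency","bullets":["Project description: Comprises a unified API service module, a prompt version manager, a QA testing toolkit, an automatic fine-tuning scheduler, QA data mining, and an LLM service call monitor.","Project summary: Provides foundational service capabilities for LLM application development."],"tags":["LLMOps","Prompt Management","Fine-tuning","QA Evaluation","Service Monitoring"]},{"title":"Multi-Omics Model Fusion (Research)","type":"Research / Multi-Omics","startDate":"2021.11","endDate":"2022.05","url":"","award":"Improved cancer prediction accuracy by 5 percentage points; published one SCI paper and one patent","bullets":["Project description: Fuses low-level data from different omics with a fusion algorithm to improve downstream task accuracy."],"tags":["Multi-Omics","Model Fusion","Cancer Prediction","SCI","Patent"]},{"title":"Shenzhen Energy Flue Gas Prediction Project","type":"Industrial Predictive Modeling","startDate":"2020.11","endDate":"2021.05","url":"","award":"Production results: 80.5% accuracy within a ±10% error band and 99.6% within a ±20% error band","bullets":["Project description: Predicts NOx and HCl values ten minutes ahead from waste incineration data, with real-time data quality monitoring and alerting on existing data."],"tags":["Time Series Forecasting","Industrial Data","Data Quality Monitoring","Alerting"]},{"title":"Beijing Mobile NPS Satisfaction Prediction","type":"Machine Learning / 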
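NLP","startDate":"Not specified","endDate":"","url":"","award":"Precision 0.6, recall 0.1; the business prioritizes precision","bullets":["Project description: Predicts NPS satisfaction using machine learning and NLP techniques."],"tags":["NLP","Machine Learning","Satisfaction Prediction"]},{"title":"Beijing Mobile Alarm Root Cause Analysis","type":"Data Mining","startDate":"2021.03","endDate":"2021.05","url":"","award":"Deployed to production; helps experts locate root-cause alarms easily","bullets":["Project description: Performs alarm root cause analysis using data mining techniques."],"tags":["Data Mining","Root Cause Analysis","Alarm Analysis"]},{"title":"Zhejiang Mobile Supply Chain Knowledge Graph Construction","type":"Knowledge Graph","startDate":"2020.09","endDate":"2021.01","url":"","award":"","bullets":["Project description: Builds a supply chain knowledge graph for Zhejiang Mobile to uncover supply chain process issues and visualize the supply chain.","Project summary: Pre-packaged graph-based query statements make it easier to query, identify, and detect problems at each stage of the process."],"tags":["Knowledge Graph","Graph Queries","Supply Chain","Visualization"]},{"title":"Papers and Patents Published in the Past Three Years","type":"Papers / Patents","startDate":"2022","endDate":"2025","url":"","award":"","bullets":["Bi, W., Ma, Y., et al. (2024). An entity extraction pipeline for medical text records using large language models. JMIR, 26, e54580. DOI: 10.2196/54580.","Wang, L., Bi, W., et al. (2024). Investigating the impact of prompt engineering on the performance of large language models for standardizing obstetric diagnosis text: Comparative study. JMIR Formative Research, 8, e53216. DOI: 10.2196/53216.","Wang, L., Ma, Y., Bi, W., et al. (2024). An early screening model for preeclampsia: Utilizing zero-cost maternal predictors exclusively. Hypertens Res, 47(4), 1051-1062. DOI: 10.1038/s41440-023-01573-8.","Huang, R., Yao, Y., Tong, X., et al., Bi, W., et al. (2023). Tracing the evolving dynamics and research hotspots of microbiota and immune microenvironment. Microbiol Spectr, 11(5), e0013523. DOI: 10.1128/spectrum.00135-23.","Bi, W., Huang, R., Yao, Y., Tong, X., et al. (2025). Robust multi-omics subtyping of hepatocellular carcinoma. Manuscript in preparation.","Bi, W., Wang, L., et al. (2023). Multi-source data fusion method and apparatus (多源数据融合方法及装置). Chinese invention patent (published), publication No. CN119229962A.","Bi, W., Wang, L., et al. (2022). Pregnancy risk prediction method and apparatus (妊娠期风险预测方法及装置). 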
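Chinese invention patent (published), publication No. CN118262905A."],"tags":["JMIR","Medical NLP","Medical AI","SCI","Patent"]}],"skills":[{"name":"Large Models and LLM Development","items":["Proficient in large language model development; familiar with the Transformer architecture, self-supervised learning, instruction tuning, and model quantization.","Skilled with tools and frameworks such as Hugging Face, LangChain, and LlamaIndex.","Experienced in applying and tuning open-source models such as Qwen, LLaMA, and DeepSeek; skilled in prompt optimization."]},{"name":"Agents and Multi-Agent Systems","items":["Proficient in the LLM application and Agent development stack; skilled with frameworks such as LangGraph, AutoGen, and CAMEL.","Experienced in modular construction of Planning, Memory, and Tools components; able to design complex task planning, long-term memory management, and tool-calling mechanisms.","Skilled with MCP; able to build multi-Agent systems that collaborate efficiently."]},{"name":"Model Customization and Inference Optimization","items":["Familiar with LLM customization techniques such as Embedding, Fine-tuning, LoRA, QLoRA, and quantization (INT8/INT4).","Skilled with training and inference acceleration tools such as PyTorch, Hugging Face Transformers, PEFT, DeepSpeed, and vLLM.","Experienced in model compression and performance optimization."]},{"name":"Evaluation, RAG, and Knowledge Platforms","items":["Familiar with LLM evaluation methods and metrics; able to build a complete evaluation system covering both automated and human evaluation workflows.","Familiar with vector databases such as Milvus, Pinecone, and Chroma, and with knowledge base construction techniques.","Able to develop efficient retrieval-augmented generation (RAG) systems and enterprise knowledge platforms; uses Graph RAG to improve knowledge connectivity and reasoning."]},{"name":"Personal Strengths and Community Contributions","items":["Open-source community contributor: contributor to the open-source Chinese edition of llama3, publisher of an open-source prompt management plugin, and code contributor to langmanus.","Contracted author on Jianshu (machine learning and deep learning); author of Botblog.app and AgentShare.","Strong interest in and enthusiasm for AI technologies; strong learning, hands-on, organizational, and stress-management abilities."]}],"contact":[{"label":"Email","url":"mailto:biwenshuai1992@gmail.com"},{"label":"Jianshu","url":"https://www.jianshu.com/u/5634325704f5"},{"label":"Botblog.app","url":"https://botblog.app"}],"meta":{"updatedAt":"2026-06-04T12:24:35.042Z"}}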