PI Auto Research

Automate Experiments for Performance Optimization and Model Tuning

Welcome to PI Auto Research (pi-autoresearch), an open-source project for developers and their workflows. It hands the tedious cycle of "modify code → run tests/benchmarks → check metrics → decide whether to keep" to a program that runs it continuously. You define a quantifiable optimization goal and a trustworthy way to validate it; the system then iteratively proposes changes, measures them, and keeps improvements or rolls back regressions.

Common starting points include reducing unit test or CI runtime, shrinking frontend bundle size, speeding up builds, lowering LLM training loss, and any other optimization task whose numeric metric a script can reproduce reliably.

About PI Auto Research

Advantages

10+ Years of Automated Optimization Experience

PI Auto Research has been applied successfully to a range of projects, significantly improving development efficiency and system performance.

Automated experiment workflows reduce manual intervention, so developers can focus on solving core problems instead of repeating mechanical steps.

Author: github.com/adisinghstudent

Automated Experiment Workflow

Intelligent Metric Evaluation

Professional Team Support

Features

Flexible Optimization Goals

Any goal that can be measured by a number or a stable rule can enter the loop: frontend bundle size, build time, API latency, total test duration, or loss and auxiliary metrics during training.

Terminal Visualization Panel

Shows progress and comparisons directly in the command line, so you can follow trends during long-running tasks without stitching logs together by hand.

Confidence Scoring System

A scoring mechanism reduces misjudgments caused by occasional jitter and measurement noise, so "keep / roll back" decisions reflect real gains rather than the luck of a single run.
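One way to picture such a score is sketched below. This is an illustration only, not the tool's actual formula: the functions `mad`, `confidence`, and `decide`, the use of the median absolute deviation as the noise estimate, and the threshold of 2.0 are all assumptions made for this example. It compares repeated measurements where lower is better, such as seconds.

```python
import statistics


def mad(samples):
    """Median absolute deviation: a noise estimate that tolerates outliers."""
    center = statistics.median(samples)
    return statistics.median([abs(x - center) for x in samples])


def confidence(baseline, candidate):
    """Median improvement expressed in multiples of the measurement noise.

    Hypothetical scoring rule for illustration; lower sample values are better.
    """
    improvement = statistics.median(baseline) - statistics.median(candidate)
    noise = max(mad(baseline), mad(candidate))
    if noise == 0:
        # Perfectly stable measurements: any improvement is unambiguous.
        return float("inf") if improvement > 0 else 0.0
    return improvement / noise


def decide(baseline, candidate, threshold=2.0):
    """Keep a change only when the gain clearly exceeds the noise."""
    return "keep" if confidence(baseline, candidate) >= threshold else "revert"


baseline = [10.0, 10.2, 9.9, 10.1, 10.0]
print(decide(baseline, [9.0, 9.1, 8.9, 9.0, 9.2]))    # clear gain
print(decide(baseline, [9.9, 10.3, 9.8, 10.1, 10.0]))  # within noise
```

Using medians instead of single runs is the design point here: one unusually fast or slow run does not flip the decision.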

User Guide

Installation Steps (Overview)

1. Get Source Code

Get the source code or release notes from GitHub - PI Auto Research.

2. Configure Environment

Follow the documentation in the repository to install dependencies and configure the API and runtime environment, including tokens and budget policies.

3. Start Experiment

State the optimization goal and the pass/fail criteria, then start the automated experiment workflow. Keep each validation run as short as possible so more iterations fit into the same time.

Example Integration Approach

Add a standalone script to your repository that invokes your test or benchmark commands in one place and prints metric lines that are easy to parse. The main program couples only to this script, which makes it easy to swap the measurement implementation later. For the exact commands and configuration, refer to the official README.
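A minimal sketch of such a script follows. The `METRIC name=value unit=...` line format and the function names `measure`, `format_metric`, and `main` are assumptions made for this example, not a format the tool prescribes; check the official README for what it actually expects.

```python
import subprocess
import sys
import time


def measure(command):
    """Run a command and return (exit_code, wall_clock_seconds)."""
    start = time.perf_counter()
    result = subprocess.run(
        command, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return result.returncode, time.perf_counter() - start


def format_metric(name, value, unit):
    """One metric per line, in a hypothetical easy-to-parse format."""
    return f"METRIC {name}={value:.3f} unit={unit}"


def main(argv):
    # The default command is a placeholder; pass your real test or
    # benchmark command as arguments instead.
    command = argv or [sys.executable, "-c", "pass"]
    exit_code, seconds = measure(command)
    print(format_metric("duration", seconds, "s"))
    return exit_code
```

Calling `main(sys.argv[1:])` from a main guard and passing its return value to `sys.exit` turns this into a command-line script. Silencing the wrapped command's output keeps the metric line as the only thing a caller has to parse.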

Skill Definition · pi-autoresearch-loop (SKILL.md)

The following is a web excerpt of the skill manifest that ships with pi-autoresearch. It is semantically consistent with SKILL.md in the repository and is reproduced here for easier searching and sharing.

Skill metadata

---
name: pi-autoresearch-loop
description: Autonomous experiment loop for pi that continuously tries optimizations, measures results, and keeps what works
triggers:
  - autoresearch
  - autonomous experiment loop
  - optimize automatically
  - run experiment loop
  - continuous optimization
  - benchmark and improve
  - start autoresearch session
  - keep what works discard what doesnt
---

The triggers also serve as English keywords for matching this skill in assistants or documentation:

  • autoresearch
  • autonomous experiment loop
  • optimize automatically
  • run experiment loop
  • continuous optimization
  • benchmark and improve
  • start autoresearch session
  • keep what works discard what doesnt

pi-autoresearch — Autonomous Experiment Loop

Skill by ara.so — Daily 2026 Skills collection

Autonomous experiment loop extension for pi. Continuously proposes changes, benchmarks them, commits wins, reverts losses, and repeats — forever. Works for any measurable target: test speed, bundle size, build time, LLM training loss, Lighthouse scores.
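The propose, benchmark, commit-or-revert cycle described above can be sketched generically as follows. This is not the extension's implementation: `run_loop` and its four callables are hypothetical names, the metric is assumed to be "lower is better", and the sketch runs for a fixed number of rounds rather than forever.

```python
def run_loop(propose, measure, commit, revert, rounds):
    """Try `rounds` changes, keeping only those that improve the metric.

    All four callables are supplied by the caller (hypothetical hooks):
    propose(i) applies change i, measure() returns the metric,
    commit(i) makes the change permanent, revert(i) undoes it.
    """
    best = measure()  # baseline before any change
    history = []
    for i in range(rounds):
        propose(i)
        value = measure()
        if value < best:
            commit(i)
            best = value
            history.append((i, value, "keep"))
        else:
            revert(i)
            history.append((i, value, "revert"))
    return best, history


# Simulated workspace: each proposal shifts the metric by a fixed delta.
state = {"value": 10.0, "saved": 10.0}
deltas = [-1.0, 2.0, -0.5]

best, history = run_loop(
    propose=lambda i: state.update(value=state["saved"] + deltas[i]),
    measure=lambda: state["value"],
    commit=lambda i: state.update(saved=state["value"]),
    revert=lambda i: state.update(value=state["saved"]),
    rounds=len(deltas),
)
print(best, [decision for _, _, decision in history])
```

In a real repository the hooks would wrap an edit step, the benchmark command, and version-control commit and reset operations.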

Installation

pi install https://github.com/davebcn87/pi-autoresearch

FAQ

How do I get started with PI Auto Research?

Clone the repository → configure the runtime environment and model/tool access → prepare a minimal, reproducible validation script → write down the optimization goal and guardrail metrics (including constraints such as "test count unchanged" where needed) → start the loop and periodically review the commit history and session records.
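To make the guardrail idea concrete, here is a small sketch of a keep/revert rule that checks constraints before it looks at the gain. The `Run` record, the `should_keep` function, and the 2% minimum gain are assumptions made for this example, not part of the tool.

```python
from dataclasses import dataclass


@dataclass
class Run:
    """Result of one validation run (hypothetical record for this sketch)."""
    duration_s: float
    tests_run: int
    tests_failed: int


def should_keep(baseline, candidate, min_gain=0.02):
    """Accept a change only if guardrails hold and the speedup is large enough."""
    # Guardrail 1: nothing may fail.
    if candidate.tests_failed > 0:
        return False
    # Guardrail 2: "test count unchanged", so deleting tests cannot pass as a speedup.
    if candidate.tests_run != baseline.tests_run:
        return False
    gain = (baseline.duration_s - candidate.duration_s) / baseline.duration_s
    return gain >= min_gain


baseline = Run(duration_s=100.0, tests_run=250, tests_failed=0)
print(should_keep(baseline, Run(90.0, 250, 0)))  # faster, same tests
print(should_keep(baseline, Run(50.0, 200, 0)))  # faster only because tests vanished
```

Checking guardrails first means a large apparent gain can never override a broken constraint.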

Which programming languages does the plugin support?

The tool is generally language-agnostic: what matters is whether your validation command runs reliably in the target repository. Cases discussed in practice include algorithm work, multi-module Java projects, and Go libraries.

How do I report issues or request support?

Use GitHub Issues and include your environment details, reproduction steps, and minimal logs. Redact any tokens or private data first.

Community & Practice Highlights

The following summarizes user-side experience to help clarify where the tool applies. For actual behavior, the official repository and documentation are authoritative.

Independent Technical Writing and Review

One author has documented, from hands-on testing, how pi-autoresearch-style "optimization loops" work: session artifacts (such as a plan document, a validation script, and an append-only experiment log) tie hypotheses, measurements, and conclusions into a traceable record. The write-up also stresses that the shorter the validation script and the more restrained its output, the more experiment rounds fit into a given amount of time. Further reading: Bartosz Ocytko: pi-autoresearch optimization loops.
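An append-only experiment log of the kind described above could look like the sketch below. The JSON Lines layout, the field names, and the helpers `append_entry` and `read_entries` are assumptions made for this example; the tool's real session artifacts may be structured differently.

```python
import json
import time


def append_entry(path, hypothesis, metric, value, decision):
    """Append one experiment record as a single JSON line.

    Opening in "a" mode only ever adds to the end of the file,
    so earlier records are never rewritten.
    """
    entry = {
        "timestamp": time.time(),
        "hypothesis": hypothesis,
        "metric": metric,
        "value": value,
        "decision": decision,
    }
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")


def read_entries(path):
    """Load all records in the order they were written."""
    with open(path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

For example, `append_entry("experiments.jsonl", "cache parsed templates", "duration_s", 9.0, "keep")` records one round, and `read_entries("experiments.jsonl")` replays the full history for review.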

Goal-Driven Optimization in Large Codebases

Public discussions also include cases where, given a single performance goal, an automated agent carried out multiple rounds of rewriting and evaluation on a complex subsystem such as templating or rendering and ended up with substantial gains in speed and memory usage. This indirectly shows that "clear goals + reliable benchmarks" remain the key to success.

Always verify metrics and behavior in your own environment to avoid overfitting to narrow benchmarks or weakening test coverage. For more community discussion, see LinkedIn.

Source Code

Main Repository: https://github.com/davebcn87/pi-autoresearch