I was tired of resolving issues on GitHub, so I built my own AI bot... 🤔

TL;DR

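In this article, you will build an AI coder that fixes documentation and bugs in GitHub repositories, creates new features, and debugs existing issues. The AI coder takes an issue, works out a solution, creates a separate branch, writes the code, and pushes the changes to the remote repository as a pull request.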

We will use Composio, GitHub, Docker, and OpenAI GPT-4o to build the project.

Here is how to build it.

Connect GitHub to Composio to manage the repository.
Use OpenAI GPT-4o to understand the code and take actions.
Add Docker so the bot can write and execute code.

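Composio - The first AI integration platform

Here is a quick introduction to us.

Composio is an open-source tooling infrastructure for building robust and reliable AI applications. We provide over 100 tools and integrations across categories such as CRM, HRM, sales, productivity, development, and social media.

We handle user authentication and authorization for all of these applications and make it simple to connect their API endpoints to various AI models and frameworks.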

Please help us with a star. 🥹

It will help us create more articles like this 💖

Star the Composio.dev repository ⭐

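How does it work?

Before diving into the code, let's look at how the AI coder works.

The bot can

Access any given GitHub repository.
Read, write, update, and delete files as needed using Composio's local tools.
Orchestrate the workflow with the OpenAI SDK. Composio is framework-agnostic, though, so you can also use frameworks such as LangChain and LlamaIndex.
Take an issue and create a separate branch for the changes.
Run programs in a sandboxed coding environment such as Docker, or in a cloud-hosted environment such as E2B or FlyIo.
Finally, push the fix to the remote repository.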

Prerequisites for building the SWE agent

You need the following to complete this project. Make sure each one is configured correctly before moving on.

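Composio API key

Create a user account with Composio to get one.

Log in with GitHub, Gmail, or any other email ID. Once you are logged in, the dashboard shows the catalog of applications you can connect to empower your AI bot.

Next, navigate to the Settings tab and copy the API key.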

OpenAI API key

Create a user account and generate an API key. Make sure you have added credits so that you can run inference with its models.

GitHub access token

Next, create an access token for your GitHub account so that Composio can push changes to it. Follow the link and create a basic access token.

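Let's get started 🔥

Dependencies

Start by installing the dependencies with your preferred package manager. The recommended option is pnpm, but you can also use npm or yarn.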

pnpm install -g composio-core

Set up environment variables

You will need GITHUB_ACCESS_TOKEN, OPENAI_API_KEY, COMPOSIO_API_KEY, GITHUB_USER_NAME, and GITHUB_USER_EMAIL to complete the project.

Create a .env file and add the variables above.

GITHUB_ACCESS_TOKEN="your access token"
OPENAI_API_KEY="openai_key"
COMPOSIO_API_KEY="composio-api-key"
GITHUB_USER_NAME="GitHub username"
GITHUB_USER_EMAIL="GitHub user email"

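Run the following command to export them as environment variables. The one-liner relies on shell word splitting, so it only works for values that contain no spaces.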

export $(grep -v '^#' .env | xargs)
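A missing variable otherwise only surfaces later as an authentication error, so it can help to check all five names once at startup. The sketch below is one way to do that; `findMissingEnvVars` and `assertEnv` are hypothetical helpers, not part of Composio or the template repository.

```typescript
// Variable names are taken from the .env file above.
// The helpers themselves are hypothetical and only illustrate the idea.
const REQUIRED_ENV_VARS = [
  "GITHUB_ACCESS_TOKEN",
  "OPENAI_API_KEY",
  "COMPOSIO_API_KEY",
  "GITHUB_USER_NAME",
  "GITHUB_USER_EMAIL",
] as const;

// Returns the names of required variables that are unset or blank.
function findMissingEnvVars(
  env: Record<string, string | undefined>
): string[] {
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim());
}

// Throws one readable error that lists everything that is missing.
function assertEnv(env: Record<string, string | undefined>): void {
  const missing = findMissingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv(process.env)` at the top of the entry point would stop the run before any API call is made.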

Project structure

The project is organized as follows:

src
├── agents
│   └── swe.ts
├── app.ts
├── prompts.ts
└── utils.ts

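Here is a brief description of each file.

agents/swe.ts: contains the code that implements the software engineering bot.
app.ts: the main entry point of the application.
prompts.ts: defines the prompts used by the agent.
utils.ts: utility functions used throughout the project.

To get started quickly, clone this repository and install the remaining dependencies.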

git clone https://github.com/ComposioHQ/swe-js-template.git swe-js

cd swe-js && pnpm i

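You are now all set. Let's start writing the code for our AI agent.

Define prompts and goals

Let's begin by defining the prompts and goals for the AI coder. Explaining each step in detail is essential, because these definitions significantly affect how well the agent performs and what it is able to do.

Create a prompts.ts file if you do not have one yet.

Now, define the agent's role and goal.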

export const ROLE = "Software Engineer";

export const GOAL = "Fix the coding issues given by the user, and finally generate a patch with the newly created files using `filetool_git_patch` tool";

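Here, we define the role as a software engineer (SWE) and the goal as fixing any coding issue and creating a patch with filetool_git_patch. This is the Composio action from the GitHub integration that creates patch files.

Now, define a detailed backstory and description for the AI coder.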

export const BACKSTORY = `You are an autonomous programmer; your task is to
solve the issue given in the task with the tools in hand. Your mentor gave you
the following tips.
  1. Please clone the github repo using the 'FILETOOL_GIT_CLONE' tool, and if it
     already exists - you can proceed with the rest of the instructions after
     going into the directory using \`cd\` shell command.
  2. PLEASE READ THE CODE AND UNDERSTAND THE FILE STRUCTURE OF THE CODEBASE
    USING GIT REPO TREE ACTION.
  3. POST THAT READ ALL THE RELEVANT READMES AND TRY TO LOOK AT THE FILES
    RELATED TO THE ISSUE.
  4. Form a thesis around the issue and the codebase. Think step by step.
    Form pseudocode in case of large problems.
  5. THEN TRY TO REPLICATE THE BUG THAT THE ISSUES DISCUSS.
     - If the issue includes code for reproducing the bug, we recommend that you
      re-implement that in your environment, and run it to make sure you can
      reproduce the bug.
     - Then start trying to fix it.
     - When you think you've fixed the bug, re-run the bug reproduction script
      to make sure that the bug has indeed been fixed.
     - If the bug reproduction script does not print anything when it runs
      successfully, we recommend adding a print("Script completed successfully, no errors.")
      command at the end of the file so that you can be sure that the script
      indeed ran fine all the way through.
  6. If you run a command that doesn't work, try running a different one.
    A command that did not work once will not work the second time unless you
    modify it!
  7. If you open a file and need to get to an area around a specific line that
    is not in the first 100 lines, say line 583, don't just use the scroll_down
    command multiple times. Instead, use the goto 583 command. It's much quicker.
  8. If the bug reproduction script requires inputting/reading a specific file,
    such as buggy-input.png, and you'd like to understand how to input that file,
    conduct a search in the existing repo code to see whether someone else has
    already done that. Do this by running the command find_file "buggy-input.png"
    If that doesn't work, use the Linux 'find' command.
  9. Always make sure to look at the currently open file and the current working
    directory (which appears right after the currently open file). The currently
    open file might be in a different directory than the working directory!
    Some commands, such as 'create', open files, so they might change the
    currently open file.
  10. When editing files, it is easy to accidentally specify a wrong line
    number or write code with incorrect indentation. Always check the code after
    you issue an edit to ensure it reflects what you want to accomplish.
    If it didn't, issue another command to fix it.
  11. When you FINISH WORKING on the issue, USE THE 'filetool_git_patch' ACTION with the
      new files using the "new_file_paths" parameters created to create the final patch to be submitted to fix the issue. Example,
      if you add \`js/src/app.js\`, then pass \`new_file_paths\` for the action like below,
      {
        "new_file_paths": ["js/src/app.js"]
      }
`;

export const DESCRIPTION = `We're solving the following issue within our repository. 
Here's the issue text:
  ISSUE: {issue}
  REPO: {repo}

Now, you're going to solve this issue on your own. When you're satisfied with all
your changes, you can submit them to the code base by simply
running the submit command. Note, however, that you cannot use any interactive
session commands (e.g. python, vim) in this environment, but you can write
scripts and run them. E.g. you can write a Python script and then run it
with \`python </path/to/script>.py\`.

If you face a "module not found error", you can install dependencies.
Example: in case the error is "pandas not found", install pandas like this \`pip install pandas\`

Respond to the human as helpfully and accurately as possible`;

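In the code block above, we have carefully and clearly defined the steps the agent must take to complete the task.

This helps the LLM know how to respond when it runs into common programming hurdles.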
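The DESCRIPTION prompt contains {issue} and {repo} placeholders. If you want to substitute them before the prompt is sent, a small helper along the following lines would do; `fillTemplate` is a hypothetical name used for illustration and does not come from Composio or the template repository.

```typescript
// Hypothetical helper: replaces every {key} placeholder with its value.
// Unknown placeholders are left untouched so that mistakes stay visible.
function fillTemplate(
  template: string,
  values: Record<string, string>
): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? values[key] : match
  );
}

// Shortened stand-in for the DESCRIPTION prompt defined above.
const template = "ISSUE: {issue}\nREPO: {repo}";

// Example values are made up for the demonstration.
const filled = fillTemplate(template, {
  issue: "Fix the typo in README.md",
  repo: "octocat/hello-world",
});

console.log(filled);
```

The pattern only matches braces that wrap a single word, so the JSON example at the end of BACKSTORY would pass through unchanged.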

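Define utility functions

In this section, we will define two main functions: fromGithub, which collects the details of a given GitHub issue, and getBranchNameFromIssue, which derives a branch name from it.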

import * as fs from 'fs';
import * as path from 'path';
import * as readline from 'readline';
import { ComposioToolSet } from "composio-core/lib/sdk/base.toolset";
import { nanoid } from "nanoid";

type InputType = any;

function readUserInput(
  prompt: string,
  metavar: string,
  validator: (value: string) => InputType
): Promise<InputType> {
  const rl = readline.createInterface({
    input: process.stdin,
    output: process.stdout
  });

  return new Promise<InputType>((resolve, reject) => {
    rl.question(`${prompt} > `, (value) => {
      try {
        const validatedValue = validator(value);
        rl.close();
        resolve(validatedValue);
      } catch (e) {
        console.error(`Invalid value for \`${metavar}\` error parsing \`${value}\`; ${e}`);
        rl.close();
        reject(e);
      }
    });
  });
}

function createGithubIssueValidator(owner: string, name: string, toolset: ComposioToolSet) {
  return async function githubIssueValidator(value: string): Promise<string> {
    const resolvedPath = path.resolve(value);
    if (fs.existsSync(resolvedPath)) {
      return fs.readFileSync(resolvedPath, 'utf-8');
    }

    if (/^\d+$/.test(value)) {
      const responseData = await toolset.executeAction('github_issues_get', {
        owner,
        repo: name,
        issue_number: parseInt(value, 10),
      });
      return responseData.body as string;
    }

    return value;
  };
}

export async function fromGithub(toolset: ComposioToolSet): Promise<{ repo: string; issue: string }> {
  const owner = await readUserInput(
    'Enter github repository owner',
    'github repository owner',
    (value: string) => value
  );
  const name = await readUserInput(
    'Enter github repository name',
    'github repository name',
    (value: string) => value
  );
  const repo = `${owner}/${name.replace(",", "")}`;
  const issue = await readUserInput(
    'Enter the github issue ID or description or path to the file containing the description',
    'github issue',
    createGithubIssueValidator(owner, name, toolset)
  );
  return { repo, issue };
}

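Here is what happens in the code block above.

readUserInput: a helper that reads user input from the command line. We only need the GitHub repository owner, the repository name, and the issue number or description.
createGithubIssueValidator: returns a validator for GitHub issues. It accepts input as a file path, a numeric issue ID, or a plain-text description. If the input is a numeric issue ID, it fetches the issue details from GitHub using Composio's github_issues_get action.
fromGithub: combines the helpers above to collect the user input, validate the GitHub issue, and return the repository name together with the provided issue.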
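The three-way decision inside the validator can be isolated as a pure function, which makes the order of the checks easier to see. This is only a sketch: `classifyIssueInput` and the `IssueInput` type are hypothetical, and the file check is injected so that the example runs without touching the disk.

```typescript
// Hypothetical result type describing what the user typed.
type IssueInput =
  | { kind: "file"; path: string }
  | { kind: "issueNumber"; issueNumber: number }
  | { kind: "description"; text: string };

// `fileExists` is injected for the sketch; the validator above calls
// fs.existsSync on the resolved path instead.
function classifyIssueInput(
  value: string,
  fileExists: (path: string) => boolean
): IssueInput {
  // 1. An existing file wins, even if its name consists only of digits.
  if (fileExists(value)) {
    return { kind: "file", path: value };
  }
  // 2. A string of digits is treated as a GitHub issue number.
  if (/^\d+$/.test(value)) {
    return { kind: "issueNumber", issueNumber: parseInt(value, 10) };
  }
  // 3. Anything else is used as the issue description itself.
  return { kind: "description", text: value };
}
```

Because the file check runs first, a file named `42` in the working directory would shadow issue number 42, which mirrors the behavior of the validator above.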

Now, define getBranchNameFromIssue, which creates a branch name from the issue description.

export function getBranchNameFromIssue(issue: string): string {
  return "swe/" + issue.toLowerCase().replace(/\s+/g, '-') + "-" + nanoid();
}

This gives the AI coder a unique branch name, derived from the issue, for its changes.
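getBranchNameFromIssue only replaces whitespace, while Git ref names also reject characters such as `~`, `^`, `:`, and `?`, and a long issue body produces a very long branch name. A stricter variant might look like the sketch below; `toBranchName` and the 40-character limit are assumptions made for illustration, and the random suffix is passed in so that the function stays deterministic.

```typescript
// Hypothetical variant of getBranchNameFromIssue. The suffix is supplied by
// the caller (for example a nanoid() value) to keep the function pure.
function toBranchName(
  issue: string,
  suffix: string,
  maxSlugLength = 40
): string {
  const slug = issue
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-") // collapse every run of other characters
    .slice(0, maxSlugLength) // keep the branch name reasonably short
    .replace(/^-+|-+$/g, ""); // trim leading and trailing dashes
  // Fall back to a fixed word when nothing usable is left.
  return `swe/${slug || "issue"}-${suffix}`;
}

console.log(toBranchName("Fix: crash when `config.json` is missing!", "ab12"));
// prints "swe/fix-crash-when-config-json-is-missing-ab12"
```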

Define the SWE agent

Now for the main event: you will define the SWE agent using OpenAI Assistants and the Composio toolset.

First, import the libraries and define the LLM and tools.

import { OpenAIToolSet, Workspace } from 'composio-core';
import { BACKSTORY, DESCRIPTION, GOAL } from '../prompts';
import OpenAI from 'openai';

// Initialize tool.
const llm = new OpenAI({
    apiKey: process.env.OPENAI_API_KEY,
});
const composioToolset = new OpenAIToolSet({ 
    workspaceConfig: Workspace.Docker({})
});

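In the code block above,

We created an OpenAI instance with the API key.
We also created an OpenAIToolSet instance with workspaceConfig set to Docker. This uses Docker to give the AI coder a sandboxed coding environment. You can also use cloud code interpreters such as E2B and FlyIo.

Now, let's define the AI coder.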

export async function initSWEAgent(): Promise<{composioToolset: OpenAIToolSet; assistantThread: OpenAI.Beta.Thread; llm: OpenAI; tools: Array<any>}> {
    let tools = await composioToolset.getTools({
        apps: [
            "filetool",
            "fileedittool",
            "shelltool"
        ],
    });

    tools = tools.map((a) => {
        if ((a.function?.description?.length || 0) > 1024) {
            a.function.description = a.function.description?.substring(0, 1024);
        }
        return a;
    });

    tools = tools.map((tool) => {
        const updateNullToEmptyArray = (obj: any) => {
            for (const key in obj) {
                if (obj[key] === null) {
                    obj[key] = [];
                } else if (typeof obj[key] === 'object' && !Array.isArray(obj[key])) {
                    updateNullToEmptyArray(obj[key]);
                }
            }
        };

        updateNullToEmptyArray(tool);
        return tool;
    });

    const assistantThread = await llm.beta.threads.create({
        messages: [
            {
                role: "assistant",
                content:`${BACKSTORY}\n\n${GOAL}\n\n${DESCRIPTION}`
            }
        ]
    });

    return { assistantThread, llm, tools, composioToolset };
}

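Here is what happens in the code block above.

Get tools: fetches the filetool, fileedittool, and shelltool tools from the Composio toolset. As the names suggest, they are used to access files, edit files, and run commands in a shell.
Trim tool descriptions: limits each tool description to 1,024 characters so that the descriptions do not eat into the LLM context window.
Update null values: replaces null values in the tool configuration with empty arrays.
Create the assistant thread: starts an OpenAI Assistants thread with the prompts we defined earlier.
Return statement: returns the tools, the assistant thread, the OpenAI instance, and the Composio toolset.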
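The two clean-up passes can be tried out on a plain object without calling Composio. The sketch below uses a minimal stand-in type, `ToolLike`, in place of the objects returned by getTools; both helper names are hypothetical, and unlike the in-place version above they return copies.

```typescript
// Minimal stand-in for an OpenAI function tool definition.
interface ToolLike {
  type: "function";
  function: { name: string; description?: string; parameters?: unknown };
}

const MAX_DESCRIPTION_LENGTH = 1024;

// Truncates an overlong description. The length is compared directly here;
// note that `length || 0 > 1024` would parse as `length || (0 > 1024)`.
function trimDescription(tool: ToolLike): ToolLike {
  const description = tool.function.description;
  if (description !== undefined && description.length > MAX_DESCRIPTION_LENGTH) {
    return {
      ...tool,
      function: {
        ...tool.function,
        description: description.substring(0, MAX_DESCRIPTION_LENGTH),
      },
    };
  }
  return tool;
}

// Recursively replaces null values with empty arrays. Arrays are returned
// as they are, mirroring the loop above, which does not descend into them.
function nullsToEmptyArrays(value: unknown): unknown {
  if (value === null) return [];
  if (Array.isArray(value) || typeof value !== "object") return value;
  const source = value as Record<string, unknown>;
  const result: Record<string, unknown> = {};
  for (const key of Object.keys(source)) {
    result[key] = nullsToEmptyArrays(source[key]);
  }
  return result;
}
```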

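Define the entry point of the application

This is the final part, where we define the entry point of the application. Start by loading the environment variables and importing the required modules.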

import dotenv from "dotenv";
dotenv.config();

import { fromGithub, getBranchNameFromIssue } from './utils';
import { initSWEAgent } from './agents/swe';
import { GOAL } from './prompts';

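This code block

Loads the environment variables.
Imports the necessary utility functions.
Imports the SWE agent and the agent goal we defined earlier.

Now, define the main function.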

async function main() {
  /**Run the agent.**/
  const { assistantThread, llm, tools, composioToolset } = await initSWEAgent();
  const { repo, issue } = await fromGithub(composioToolset);

  const assistant = await llm.beta.assistants.create({
    name: "SWE agent",
    instructions: GOAL + `\nRepo is: ${repo} and your goal is to ${issue}`,
    model: "gpt-4o",
    tools: tools
  });

  await llm.beta.threads.messages.create(
    assistantThread.id,
    {
      role: "user",
      content: issue
    }
  );

  const stream = await llm.beta.threads.runs.createAndPoll(assistantThread.id, {
    assistant_id: assistant.id,
    instructions: `Repo is: ${repo} and your goal is to ${issue}`,
    tool_choice: "required"
  });

  await composioToolset.waitAndHandleAssistantToolCalls(llm as any, stream, assistantThread, "default");

  const response = await composioToolset.executeAction("filetool_git_patch", {
  });

  if (response.patch && response.patch?.length > 0) {
    console.log('=== Generated Patch ===\n' + response.patch, response);
    const branchName = getBranchNameFromIssue(issue);
    const output = await composioToolset.executeAction("SHELL_EXEC_COMMAND", {
      cmd: `cp -r ${response.current_working_directory} git_repo && cd git_repo && git config --global --add safe.directory '*' && git config --global user.name ${process.env.GITHUB_USER_NAME} && git config --global user.email ${process.env.GITHUB_USER_EMAIL} && git checkout -b ${branchName} && git commit -m 'feat: ${issue}' && git push origin ${branchName}`
    });

    // Wait for 2s
    await new Promise((resolve) => setTimeout(() => resolve(true), 2000));

    console.log("Have pushed the code changes to the repo. Let's create the PR now", output);

    await composioToolset.executeAction("GITHUB_PULLS_CREATE", {
      owner: repo.split("/")[0],
      repo: repo.split("/")[1],
      head: branchName,
      base: "master",
      title: `SWE: ${issue}`
    })

    console.log("Done! The PR has been created for this issue in " + repo);
  } else {
    console.log('No output available - no patch was generated :(');
  }

  await composioToolset.workspace.close();
}

main();
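The shell command in main interpolates the issue text, user name, and email directly, so an issue that contains a single quote would end the commit message early. One common way to guard against this is to quote each value before building the command. The sketch below shows the idea for the commit step only; `shellQuote` and `buildCommitCommand` are hypothetical helpers, and the quoting assumes a POSIX shell inside the workspace.

```typescript
// Wraps a value in single quotes for a POSIX shell and escapes embedded
// single quotes with the '\'' idiom (close quote, escaped quote, reopen).
function shellQuote(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper that assembles only the commit step of the command.
function buildCommitCommand(issue: string): string {
  return `git commit -m ${shellQuote(`feat: ${issue}`)}`;
}

console.log(buildCommitCommand("don't crash on empty input"));
// prints: git commit -m 'feat: don'\''t crash on empty input'
```

The same helper could wrap the values passed to `git config --global user.name` and `user.email`, which would otherwise be split if the name contains a space.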