
06 → Search with glob and ripgrep

Overview

What you'll build

You're giving the assistant the ability to discover files and search through their contents. The glob tool scans file paths, while grep wraps ripgrep so the agent can find matches with context—all without leaving the project sandbox enforced by PathValidation.

Why it matters

  • Glob lets the agent list candidates (e.g., “find all *.mdx docs”).
  • Grep surfaces specific matches—perfect for “where is ConfigService mentioned?”
  • Tight validation protects your filesystem (no wandering into /etc or following symlinks out of the repo).

Big picture

This step builds on the file-reading foundation from Step 03. Later steps (07/08) rely on search results to preview edits and render markdown, so getting safe, predictable search is critical now.

Core concepts (Effect-TS)

Validating Working Directories

Always resolve user-provided paths through PathValidation.ensureWithinCwd. It normalizes absolute/relative inputs and fails if the resolved path tries to escape the project root.
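
The containment check described here can be sketched with Node's path module. This is a minimal illustration, not the project's PathValidation service: the function name ensureWithin and its throw-on-failure behavior are assumptions, and it does not resolve symlinks (a real implementation would likely compare real paths as well).

```typescript
import * as path from 'node:path'

// Hypothetical stand-in for PathValidation.ensureWithinCwd: resolves `input`
// against `root` and throws if the result lies outside `root`.
export function ensureWithin(root: string, input: string): string {
  const absoluteRoot = path.resolve(root)
  const resolved = path.resolve(absoluteRoot, input)
  // A relative path that starts with ".." (or is absolute, e.g. another
  // drive on Windows) means `resolved` is not inside `absoluteRoot`.
  const relative = path.relative(absoluteRoot, resolved)
  const escapes =
    relative === '..' ||
    relative.startsWith(`..${path.sep}`) ||
    path.isAbsolute(relative)
  if (escapes) {
    throw new Error(`Path escapes project root: ${input}`)
  }
  return resolved
}
```

Comparing via path.relative rather than a plain string prefix avoids treating a sibling such as /repo-other as if it were inside /repo.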

Default Excludes

Ripgrep accepts --glob patterns to ignore expensive folders. Provide a sensible default (e.g., node_modules, .git, build artifacts) and let callers add more. This keeps searches fast and avoids scanning binaries.
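
One way to combine defaults with caller-supplied excludes is sketched below. The helper name excludeArgs and the default list are illustrative assumptions; the only ripgrep behavior relied on is that a --glob value prefixed with ! excludes matching paths.

```typescript
// Assumed defaults for illustration; tune for your repository.
const SKETCH_DEFAULT_EXCLUDES: readonly string[] = [
  '**/node_modules/**',
  '**/.git/**',
  '**/dist/**',
]

// Turns exclude patterns into ripgrep `--glob '!pattern'` argument pairs,
// always keeping the defaults and appending caller-supplied extras once each.
export function excludeArgs(extra: readonly string[] = []): string[] {
  const seen = new Set<string>()
  const args: string[] = []
  for (const pattern of [...SKETCH_DEFAULT_EXCLUDES, ...extra]) {
    if (seen.has(pattern)) continue // skip duplicates of a default
    seen.add(pattern)
    args.push('--glob', `!${pattern}`)
  }
  return args
}
```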

Streaming Ripgrep Output

Ripgrep’s --json flag produces machine-readable events. Pipe them through a parser (parseGrepOutput) to accumulate matches, context lines, and file counts.
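
With --json, ripgrep writes one JSON object per line, each carrying a type such as begin, match, context, end, or summary. The sketch below shows one way to fold those events into matches with context. It is not the project's parseGrepOutput: the name parseRgJson, the ParsedMatch shape, and the rule for attaching context lines are assumptions. It counts begin events, which ripgrep emits for files that contain a match, so the counter is named filesWithMatches rather than filesSearched.

```typescript
// Only the fields this sketch reads from each `rg --json` event.
interface RgEvent {
  type: string
  data?: {
    path?: { text?: string }
    lines?: { text?: string }
    line_number?: number | null
  }
}

export interface ParsedMatch {
  file: string
  line: number
  text: string
  context: string[]
}

const stripNewline = (text: string): string => text.replace(/\r?\n$/, '')

export function parseRgJson(output: string): { matches: ParsedMatch[]; filesWithMatches: number } {
  const matches: ParsedMatch[] = []
  let filesWithMatches = 0
  let pending: string[] = [] // context lines seen before the first match in a file
  let lastMatch: ParsedMatch | undefined

  for (const raw of output.split('\n')) {
    if (raw.trim() === '') continue
    let event: RgEvent
    try {
      event = JSON.parse(raw) as RgEvent
    } catch {
      continue // skip anything that is not a JSON event
    }
    const text = stripNewline(event.data?.lines?.text ?? '')

    if (event.type === 'begin') {
      filesWithMatches += 1
      pending = []
      lastMatch = undefined
    } else if (event.type === 'context') {
      // Simplification: attach to the closest preceding match in this file,
      // or hold the line for the first match if none has been seen yet.
      if (lastMatch) lastMatch.context.push(text)
      else pending.push(text)
    } else if (event.type === 'match') {
      lastMatch = {
        file: event.data?.path?.text ?? '',
        line: event.data?.line_number ?? 0,
        text,
        context: pending,
      }
      pending = []
      matches.push(lastMatch)
    }
  }
  return { matches, filesWithMatches }
}
```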


Implementation


Step 1: Glob safely within the project root

src/tools/SearchTools.ts
import * as Schema from 'effect/Schema'
import * as Effect from 'effect/Effect'
import { PathValidation } from '../services/PathValidation'

// Define the input schema
const GlobParameters = Schema.Struct({
  pattern: Schema.String,
  cwd: Schema.optional(Schema.String),
  dot: Schema.optional(Schema.Boolean),
  absolute: Schema.optional(Schema.Boolean),
  onlyFiles: Schema.optional(Schema.Boolean),
  maxResults: Schema.optional(Schema.Number),
})

const glob = ({
  pattern,
  cwd,
  dot = false,
  absolute = false,
  onlyFiles = true,
  maxResults = 1000,
}: Schema.Schema.Type<typeof GlobParameters>) =>
  Effect.gen(function* () {
    const pathValidation = yield* PathValidation
    const workingDir = yield* pathValidation.ensureWithinCwd(cwd ?? '.')

    const globber = new Bun.Glob(pattern)
    const iterator = globber.scan({
      cwd: workingDir,
      dot,
      absolute,
      onlyFiles,
      followSymlinks: false,
      throwErrorOnBrokenSymlink: false,
    })

    const matches: string[] = []
    for await (const val of iterator) {
      matches.push(val as string)
      if (matches.length >= maxResults) break
    }

    matches.sort()

    return {
      pattern,
      matches,
      count: matches.length,
      truncated: matches.length >= maxResults,
      cwd: absolute ? workingDir : pathValidation.relativePath(workingDir),
    } as const
  })
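
The truncated flag in the glob tool is also true when the scan yields exactly maxResults paths and nothing more. If that distinction matters, one option is to pull a single extra item before deciding. The helper below is a hypothetical sketch (takeWithLimit is not part of the project) that works with any async iterable, including the scanner used above.

```typescript
// Collects at most `limit` items from an async iterable and reports whether
// at least one more item was available (i.e. the result was truncated).
export async function takeWithLimit<T>(
  source: AsyncIterable<T>,
  limit: number,
): Promise<{ items: T[]; truncated: boolean }> {
  const items: T[] = []
  for await (const item of source) {
    if (items.length >= limit) {
      // One item beyond the limit exists, so the result is truncated.
      return { items, truncated: true }
    }
    items.push(item)
  }
  return { items, truncated: false }
}
```

The cost is reading one path beyond the limit; in exchange, a scan that produces exactly the limit is no longer reported as truncated.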

Step 2: Build ripgrep arguments with sane defaults

src/tools/SearchTools.ts
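// Directories skipped on every search; caller-supplied excludes are added on top.
const DEFAULT_EXCLUDES = ['**/node_modules/**', '**/.git/**', '**/dist/**', '**/.turbo/**']

const buildGrepArgs = (
  pattern: string,
  isRegex: boolean,
  includePatterns: readonly string[] | undefined,
  excludePatterns: readonly string[] | undefined,
  contextLines: number,
  maxResults: number,
): string[] => {
  const args = [
    '--json',
    '--max-count', // ripgrep applies this limit per file, not across the whole search
    maxResults.toString(),
    '--context',
    contextLines.toString(),
    '--with-filename',
    '--line-number',
  ]

  // Treat the pattern as a literal string unless the caller asked for a regex.
  if (!isRegex) args.push('--fixed-strings')

  // Always apply the defaults, then any extra excludes from the caller.
  for (const exclusion of [...DEFAULT_EXCLUDES, ...(excludePatterns ?? [])]) {
    args.push('--glob', `!${exclusion}`)
  }

  for (const inclusion of includePatterns ?? []) {
    args.push('--glob', inclusion)
  }

  // `--` ends option parsing so a pattern starting with `-` is not read as a flag.
  args.push('--', pattern, '.')
  return args
}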

Step 3: Execute ripgrep through the command runtime

src/tools/SearchTools.ts
import * as Schema from 'effect/Schema'
import * as Effect from 'effect/Effect'
import { PathValidation } from '../services/PathValidation'
import { EmptyPattern } from '../types/errors'

// Define the input schema
const GrepParameters = Schema.Struct({
  pattern: Schema.String,
  isRegex: Schema.optional(Schema.Boolean),
  includePatterns: Schema.optional(Schema.Array(Schema.String)),
  excludePatterns: Schema.optional(Schema.Array(Schema.String)),
  contextLines: Schema.optional(Schema.Number),
  maxResults: Schema.optional(Schema.Number),
  searchPath: Schema.optional(Schema.String),
})

const grep = ({
  pattern,
  isRegex = false,
  includePatterns,
  excludePatterns,
  contextLines = 2,
  maxResults = 100,
  searchPath,
}: Schema.Schema.Type<typeof GrepParameters>) =>
  Effect.gen(function* () {
    // Reject empty or whitespace-only patterns before touching the filesystem.
    if (pattern.trim().length === 0) {
      return yield* Effect.fail(new EmptyPattern({ pattern }))
    }

    const pathValidation = yield* PathValidation
    const cwd = yield* pathValidation.ensureWithinCwd(searchPath ?? '.')

    const args = buildGrepArgs(
      pattern,
      isRegex,
      includePatterns,
      excludePatterns,
      contextLines,
      maxResults,
    )

    // runRgString, parseGrepOutput, and `path` are not defined in this excerpt;
    // they are assumed to be in scope from the rest of SearchTools.ts
    // (parseGrepOutput likely lives in src/tools/search/GrepParser.ts).
    const output = yield* runRgString(args, cwd).pipe(Effect.orDie)
    const { matches, filesSearched } = parseGrepOutput(output, path, cwd, contextLines)

    return {
      pattern,
      isRegex,
      filesSearched,
      matches,
      truncated: matches.length >= maxResults,
    } as const
  })

Remember to provide PathValidation.layer when constructing the managed runtime in ToolRegistry (mirrors the dependency you added for FileTools).
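
Because the grep tool pipes the runner through Effect.orDie, how the runner treats ripgrep's exit status matters: ripgrep exits 0 when something matched, 1 when nothing matched, and 2 on error. If exit 1 were surfaced as a failure, an empty search would become a defect. How runRgString handles this is not shown here, so the sketch below uses hypothetical names (RgOutcome, interpretRgExit, runRg) and Node's child_process rather than the project's command runtime.

```typescript
import { spawn } from 'node:child_process'

export type RgOutcome =
  | { kind: 'matches'; stdout: string }
  | { kind: 'no-matches' }
  | { kind: 'error'; message: string }

// Maps ripgrep's exit status to an outcome: 0 = matches, 1 = none, else error.
export function interpretRgExit(code: number | null, stdout: string, stderr: string): RgOutcome {
  if (code === 0) return { kind: 'matches', stdout }
  if (code === 1) return { kind: 'no-matches' }
  return { kind: 'error', message: stderr.trim() || `rg exited with code ${code}` }
}

// Hypothetical runner (not the project's runRgString): spawns `rg` directly,
// without a shell, so arguments are never re-interpreted.
export function runRg(args: readonly string[], cwd: string): Promise<RgOutcome> {
  return new Promise((resolve) => {
    const child = spawn('rg', [...args], { cwd })
    let stdout = ''
    let stderr = ''
    child.stdout.setEncoding('utf8').on('data', (chunk: string) => {
      stdout += chunk
    })
    child.stderr.setEncoding('utf8').on('data', (chunk: string) => {
      stderr += chunk
    })
    // Fires when the binary cannot be started, e.g. rg is not installed.
    child.on('error', (err) => resolve({ kind: 'error', message: err.message }))
    child.on('close', (code) => resolve(interpretRgExit(code, stdout, stderr)))
  })
}
```

Keeping the exit-status mapping in a pure function makes it easy to unit test without ripgrep installed.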


Security & Performance Notes
  • Enforcing ensureWithinCwd prevents path traversal and keeps searches local to the project.
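  • Limiting maxResults protects against massive result sets (adjust in the schema if you need bulk results). Note that ripgrep's --max-count caps matching lines per file, so the combined total can still exceed maxResults; cap the merged list as well if you need a hard limit.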
  • Add extra excludes for large binary directories (**/vendor/**, **/coverage/**) if your repo needs it.

Testing & Validation
  1. Run glob for src/**/*.ts and confirm results stay inside the repo.
  2. Run grep for a rare token and verify filesSearched increments correctly.
  3. Try an empty or whitespace-only pattern; grep should fail with an EmptyPattern error rather than invoking ripgrep.
  4. Pass ../.. as searchPath; expect a failure from PathValidation rather than a filesystem escape.

Common Issues
| Problem | Likely Cause | Fix |
|---|---|---|
| Glob returns absolute paths unexpectedly | absolute: true used without UI normalization | Use pathValidation.relativePath before returning to the assistant |
| Ripgrep scans binary folders | DEFAULT_EXCLUDES missing build/vendor dirs | Add repo-specific excludes when configuring SearchTools |
| runRgString throws “command not found” | ripgrep not installed in Bun runtime | Install rg on your system or bundle it alongside the CLI |
| Path traversal allowed | Forgot to provide PathValidation.layer | Ensure ToolRegistry runtime composes PathValidation.layer before SearchTools.layer |

Connections

Builds on: Step 03 (file reading)

Sets up: Steps 07/08 (edit previews and markdown rendering)

Related code:

  • src/tools/SearchTools.ts
  • src/tools/search/GrepParser.ts
  • src/adapters/SearchToolAdapters.ts
  • src/services/ToolRegistry.ts