06 → Search with glob and ripgrep
Overview
What you'll build
You're giving the assistant the ability to discover files and search through their contents. The glob tool scans file paths, while grep wraps ripgrep so the agent can find matches with context—all without leaving the project sandbox enforced by PathValidation.
Why it matters
- Glob lets the agent list candidates (e.g., “find all `*.mdx` docs”).
- Grep surfaces specific matches, perfect for “where is `ConfigService` mentioned?”
- Tight validation protects your filesystem (no wandering into `/etc` or following symlinks out of the repo).
Big picture
This step builds on the file-reading foundation from Step 03. Later steps (07/08) rely on search results to preview edits and render markdown, so getting safe, predictable search is critical now.
Core concepts (Effect-TS)
Validating Working Directories
Always resolve user-provided paths through PathValidation.ensureWithinCwd. It normalizes absolute/relative inputs and fails if the resolved path tries to escape the project root.
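The containment check described above can be sketched with Node's `path` module. This is only an illustration: `resolveWithinRoot` is a hypothetical name, it throws instead of returning an Effect failure, and the tutorial's actual `PathValidation.ensureWithinCwd` may work differently. The check is purely lexical, so symlinks that point outside the root would need a separate `realpath`-based check.

```typescript
import * as path from "node:path";

// Hypothetical helper (not the tutorial's PathValidation service):
// resolves `input` against `root` and rejects anything outside it.
export function resolveWithinRoot(root: string, input: string): string {
  const absoluteRoot = path.resolve(root);
  // path.resolve handles both relative and absolute inputs.
  const resolved = path.resolve(absoluteRoot, input);
  const relative = path.relative(absoluteRoot, resolved);
  // An escape shows up as a leading ".." segment; on Windows, a path on
  // another drive comes back absolute instead.
  const escapes =
    relative === ".." ||
    relative.startsWith(".." + path.sep) ||
    path.isAbsolute(relative);
  if (escapes) {
    throw new Error(`Path escapes project root: ${input}`);
  }
  return resolved;
}
```

With a root of `/repo`, the input `src/a.ts` resolves to a path inside the root, while `../..` and `/etc` are rejected before any filesystem access happens.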
Default Excludes
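Ripgrep accepts `--glob` patterns, and a pattern prefixed with `!` excludes matching paths. Provide a sensible default list for expensive folders (e.g., `node_modules`, `.git`, build artifacts) and let callers pass their own. In `buildGrepArgs` below, a caller-supplied `excludePatterns` replaces the defaults rather than extending them, so repeat the defaults if you still want them. This keeps searches fast and avoids scanning generated or binary files.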
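If you would rather extend the default list than replace it, a small merge helper is one option. The sketch below is an assumption, not part of the tutorial code: `toExcludeArgs` is a hypothetical name, and the default list is shortened for illustration.

```typescript
// Shortened default list, for illustration only.
const DEFAULT_EXCLUDE_GLOBS: readonly string[] = [
  "**/node_modules/**",
  "**/.git/**",
  "**/dist/**",
];

// Hypothetical helper: merges defaults with caller-supplied globs,
// drops duplicates, and emits ripgrep `--glob !pattern` argument pairs.
export function toExcludeArgs(extra: readonly string[] = []): string[] {
  const seen: string[] = [];
  const args: string[] = [];
  for (const glob of [...DEFAULT_EXCLUDE_GLOBS, ...extra]) {
    if (seen.indexOf(glob) !== -1) continue; // skip duplicates
    seen.push(glob);
    args.push("--glob", `!${glob}`);
  }
  return args;
}
```

Calling `toExcludeArgs(["**/coverage/**"])` yields the three default exclusions followed by the coverage exclusion, each as a `--glob` / `!pattern` pair.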
Streaming Ripgrep Output
Ripgrep’s --json flag produces machine-readable events. Pipe them through a parser (parseGrepOutput) to accumulate matches, context lines, and file counts.
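To make the event stream concrete, here is a minimal sketch of a line-by-line parser. With `--json`, ripgrep prints one JSON object per line with a `type` such as `begin`, `match`, `context`, `end`, or `summary`. The names `parseRgJson` and `ParsedLine` are hypothetical, only the fields read below are modeled, and the tutorial's real `parseGrepOutput` (which takes more arguments) may be structured differently.

```typescript
// Only the fields this sketch reads from ripgrep's JSON events.
interface RgEvent {
  type: string;
  data?: {
    path?: { text?: string };
    lines?: { text?: string };
    line_number?: number | null;
  };
}

export interface ParsedLine {
  file: string;
  line: number;
  text: string;
}

// Hypothetical parser: splits output into matches and context lines and
// counts `begin` events (one per file that ripgrep reports on).
export function parseRgJson(output: string): {
  matches: ParsedLine[];
  context: ParsedLine[];
  filesBegun: number;
} {
  const matches: ParsedLine[] = [];
  const context: ParsedLine[] = [];
  let filesBegun = 0;

  for (const raw of output.split("\n")) {
    if (raw.trim() === "") continue;
    let event: RgEvent;
    try {
      event = JSON.parse(raw) as RgEvent;
    } catch {
      continue; // ignore lines that are not valid JSON
    }
    if (event.type === "begin") {
      filesBegun += 1;
      continue;
    }
    if (event.type !== "match" && event.type !== "context") continue;
    const entry: ParsedLine = {
      file: event.data?.path?.text ?? "",
      line: event.data?.line_number ?? 0,
      // ripgrep includes the trailing newline in `lines.text`.
      text: (event.data?.lines?.text ?? "").replace(/\r?\n$/, ""),
    };
    (event.type === "match" ? matches : context).push(entry);
  }
  return { matches, context, filesBegun };
}
```

Keeping matches and context in separate arrays is a simplification; a fuller parser would attach context lines to the match they surround.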
Implementation
Step 1: Glob safely within the project root
```typescript
import * as Schema from 'effect/Schema'
import * as Effect from 'effect/Effect'
import { PathValidation } from '../services/PathValidation'

// Define the input schema
const GlobParameters = Schema.Struct({
  pattern: Schema.String,
  cwd: Schema.optional(Schema.String),
  dot: Schema.optional(Schema.Boolean),
  absolute: Schema.optional(Schema.Boolean),
  onlyFiles: Schema.optional(Schema.Boolean),
  maxResults: Schema.optional(Schema.Number),
})

const glob = ({
  pattern,
  cwd,
  dot = false,
  absolute = false,
  onlyFiles = true,
  maxResults = 1000,
}: Schema.Schema.Type<typeof GlobParameters>) =>
  Effect.gen(function* () {
    const pathValidation = yield* PathValidation
    const workingDir = yield* pathValidation.ensureWithinCwd(cwd ?? '.')
    const globber = new Bun.Glob(pattern)
    const iterator = globber.scan({
      cwd: workingDir,
      dot,
      absolute,
      onlyFiles,
      followSymlinks: false,
      throwErrorOnBrokenSymlink: false,
    })
    const matches: string[] = []
    for await (const val of iterator) {
      matches.push(val as string)
      if (matches.length >= maxResults) break
    }
    matches.sort()
    return {
      pattern,
      matches,
      count: matches.length,
      truncated: matches.length >= maxResults,
      cwd: absolute ? workingDir : pathValidation.relativePath(workingDir),
    } as const
  })
```
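One detail in the loop above: `truncated` is computed as `matches.length >= maxResults`, so it is also `true` when exactly `maxResults` files exist and nothing was cut off. If that distinction matters, one option is to look one item past the limit. The helper below is a hypothetical sketch that works on any async iterable, not only the scan iterator; `numbers` is just a stand-in source for trying it out.

```typescript
// Hypothetical helper: collects up to `limit` items and reports truncation
// only when at least one more item was actually available.
export async function collectLimited<T>(
  source: AsyncIterable<T>,
  limit: number,
): Promise<{ items: T[]; truncated: boolean }> {
  const items: T[] = [];
  for await (const item of source) {
    if (items.length >= limit) {
      // A value beyond the limit exists, so the result really was cut short.
      return { items, truncated: true };
    }
    items.push(item);
  }
  return { items, truncated: false };
}

// Stand-in source for experimenting with the helper.
export async function* numbers(count: number): AsyncGenerator<number> {
  for (let i = 0; i < count; i++) yield i;
}
```

For example, `await collectLimited(numbers(3), 3)` reports `truncated: false`, while `await collectLimited(numbers(4), 3)` returns three items with `truncated: true`.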
Step 2: Build ripgrep arguments with sane defaults
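```typescript
const DEFAULT_EXCLUDES = ['**/node_modules/**', '**/.git/**', '**/dist/**', '**/.turbo/**']

const buildGrepArgs = (
  pattern: string,
  isRegex: boolean,
  includePatterns: readonly string[] | undefined,
  excludePatterns: readonly string[] | undefined,
  contextLines: number,
  maxResults: number,
): string[] => {
  const args = [
    '--json',
    // Note: ripgrep applies --max-count per file, not across the whole search.
    '--max-count',
    maxResults.toString(),
    '--context',
    contextLines.toString(),
    '--with-filename',
    '--line-number',
  ]
  // Treat the pattern as a literal string unless the caller asked for a regex.
  if (!isRegex) args.push('--fixed-strings')
  // When several --glob flags match the same path, ripgrep lets the later one
  // win. Includes therefore go first and excludes last, so an include cannot
  // re-admit an excluded folder.
  for (const inclusion of includePatterns ?? []) {
    args.push('--glob', inclusion)
  }
  // Caller-supplied excludes replace the defaults rather than extending them.
  for (const exclusion of excludePatterns ?? DEFAULT_EXCLUDES) {
    args.push('--glob', `!${exclusion}`)
  }
  // '--' ends option parsing, so a pattern starting with '-' is not read as a flag.
  // '.' searches the current directory; the caller sets cwd to the validated path.
  args.push('--', pattern, '.')
  return args
}
```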
Step 3: Execute ripgrep through the command runtime
```typescript
import * as Schema from 'effect/Schema'
import * as Effect from 'effect/Effect'
import { PathValidation } from '../services/PathValidation'
import { EmptyPattern } from '../types/errors'
// Not shown in this step: `runRgString`, `parseGrepOutput`, and the `path`
// value passed to the parser come from elsewhere in the project.

// Define the input schema
const GrepParameters = Schema.Struct({
  pattern: Schema.String,
  isRegex: Schema.optional(Schema.Boolean),
  includePatterns: Schema.optional(Schema.Array(Schema.String)),
  excludePatterns: Schema.optional(Schema.Array(Schema.String)),
  contextLines: Schema.optional(Schema.Number),
  maxResults: Schema.optional(Schema.Number),
  searchPath: Schema.optional(Schema.String),
})

const grep = ({
  pattern,
  isRegex = false,
  includePatterns,
  excludePatterns,
  contextLines = 2,
  maxResults = 100,
  searchPath,
}: Schema.Schema.Type<typeof GrepParameters>) =>
  Effect.gen(function* () {
    // Reject blank patterns before spawning ripgrep.
    if (pattern.trim().length === 0) {
      return yield* Effect.fail(new EmptyPattern({ pattern }))
    }
    // Resolve the search root and fail if it escapes the project.
    const pathValidation = yield* PathValidation
    const cwd = yield* pathValidation.ensureWithinCwd(searchPath ?? '.')
    const args = buildGrepArgs(
      pattern,
      isRegex,
      includePatterns,
      excludePatterns,
      contextLines,
      maxResults,
    )
    // Effect.orDie turns a runner failure into a defect instead of a typed error.
    const output = yield* runRgString(args, cwd).pipe(Effect.orDie)
    const { matches, filesSearched } = parseGrepOutput(output, path, cwd, contextLines)
    return {
      pattern,
      isRegex,
      filesSearched,
      matches,
      truncated: matches.length >= maxResults,
    } as const
  })
```
Remember to provide PathValidation.layer when constructing the managed runtime in ToolRegistry (mirrors the dependency you added for FileTools).
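This step uses `runRgString` without showing it. As a rough idea of what such a runner has to deal with, here is a Promise-based sketch using Node's `child_process` module, which Bun also implements. Everything here is an assumption for illustration: `runRgSketch` and `classifyRgExit` are hypothetical names, and the project's real helper presumably returns an Effect. The one firm detail is ripgrep's exit codes: 0 means matches were found, 1 means no matches, and 2 means an error.

```typescript
import { spawn } from "node:child_process";

// ripgrep exit codes: 0 = matches found, 1 = no matches, 2 = error.
export function classifyRgExit(code: number | null): "matches" | "no-matches" | "error" {
  if (code === 0) return "matches";
  if (code === 1) return "no-matches";
  return "error"; // includes null, i.e. the process was killed by a signal
}

// Hypothetical runner: resolves with stdout for exit codes 0 and 1,
// rejects for anything else or when `rg` cannot be started.
export function runRgSketch(args: readonly string[], cwd: string): Promise<string> {
  return new Promise((resolve, reject) => {
    const child = spawn("rg", [...args], { cwd });
    let stdout = "";
    let stderr = "";
    child.stdin.end(); // nothing is piped in; ripgrep searches the given path
    child.stdout.setEncoding("utf8");
    child.stderr.setEncoding("utf8");
    child.stdout.on("data", (chunk: string) => { stdout += chunk; });
    child.stderr.on("data", (chunk: string) => { stderr += chunk; });
    // Fires with ENOENT when `rg` is not on PATH.
    child.on("error", reject);
    child.on("close", (code) => {
      if (classifyRgExit(code) === "error") {
        reject(new Error(`rg exited with ${String(code)}: ${stderr.trim()}`));
      } else {
        resolve(stdout);
      }
    });
  });
}
```

Treating exit code 1 as success matters: a search with no hits is a normal result, and the parser can simply return an empty match list.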
Security & Performance Notes
- Enforcing `ensureWithinCwd` prevents path traversal and keeps searches local to the project.
- Limiting `maxResults` protects against massive result sets (adjust in the schema if you need bulk results).
- Add extra excludes for large binary directories (`**/vendor/**`, `**/coverage/**`) if your repo needs it.
Testing & Validation
- Run `glob` for `src/**/*.ts` and confirm results stay inside the repo.
- Run `grep` for a rare token and verify that `filesSearched` is counted correctly.
- Try an empty pattern; the call should fail with `EmptyPattern` and a helpful error.
- Pass `../..` as `searchPath`; expect a failure from `PathValidation` rather than a filesystem escape.
Common Issues
| Problem | Likely Cause | Fix |
|---|---|---|
| Glob returns absolute paths unexpectedly | absolute: true used without UI normalization | Use pathValidation.relativePath before returning to the assistant |
| Ripgrep scans binary folders | DEFAULT_EXCLUDES missing build/vendor dirs | Add repo-specific excludes when configuring SearchTools |
| `runRgString` throws “command not found” | ripgrep (`rg`) not installed or not on `PATH` | Install `rg` on your system or bundle it alongside the CLI |
| Path traversal allowed | Forgot to provide PathValidation.layer | Ensure ToolRegistry runtime composes PathValidation.layer before SearchTools.layer |
Connections
Builds on:
- 03 First Tool: reuses `PathValidation`
Sets up:
- 07 Edit with Preview and Diff Gating: uses search results to focus edits
- 08 Markdown Rendering: often triggered after `glob`
Related code:
- `src/tools/SearchTools.ts`
- `src/tools/search/GrepParser.ts`
- `src/adapters/SearchToolAdapters.ts`
- `src/services/ToolRegistry.ts`