# gitlab-zoekt

A fast, precise code search solution that integrates Zoekt with GitLab. This project provides the infrastructure for indexing and searching code across GitLab repositories with high performance and accuracy.

## Overview

gitlab-zoekt is built on top of [Zoekt](https://github.com/sourcegraph/zoekt), a fast code search engine maintained by Sourcegraph. It creates specialized search indexes of GitLab repositories by communicating with Gitaly (GitLab's Git repository storage service) and provides a search API for performing fast, exact and regex searches across indexed code.

Key features:
- Fast and accurate code search across GitLab repositories
- Incremental indexing for efficiency (only updating changed files)
- Scalable architecture supporting distributed searching across multiple nodes

This repository includes three binaries:
- `gitlab-zoekt-indexer`: Original indexer binary (deprecated)
- `gitlab-zoekt-webserver`: Original webserver binary (deprecated)
- `gitlab-zoekt`: New unified binary that can run in either indexer or webserver mode

## Documentation overview

- [How to run tests with Gitaly running locally](doc/how-to-gitaly-local.md)

## Compiling the binaries

```bash
# Build the original indexer (deprecated)
$ make build
$ ./bin/gitlab-zoekt-indexer
2023/08/14 11:07:29 Usage: ./bin/gitlab-zoekt-indexer [ --version | --index_dir=<DIR> | --path_prefix=<PREFIX> | --listen=:<PORT> ]

# Build the original webserver (deprecated)
$ make build-web

# Build the unified binary
$ make build-unified
$ ./bin/gitlab-zoekt
Usage: ./bin/gitlab-zoekt <command> [options]

Commands:
  indexer     Run in indexer mode
  webserver   Run in webserver mode
  version     Print version information

For command specific help:
  ./bin/gitlab-zoekt <command> -help
```

## Using the unified binary

The unified binary supports both indexer and webserver modes:

```bash
# Run in indexer mode
$ ./bin/gitlab-zoekt indexer -index_dir=/data/index -listen=:6060

# Run in CI/testing mode (skips gitlab_url and self_url requirements)
$ ./bin/gitlab-zoekt indexer -index_dir=/data/index -listen=:6060 -ci

# Run in webserver mode
$ ./bin/gitlab-zoekt webserver -index_dir=/data/index -listen=:6070

# Show help for a specific mode
$ ./bin/gitlab-zoekt indexer -help
$ ./bin/gitlab-zoekt webserver -help

# Show version information
$ ./bin/gitlab-zoekt version
```
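
As a rough sketch of how the two modes might be run side by side on one machine, the function below starts the indexer and the webserver against the same index directory, using only the flags and ports from the examples above. `ZOEKT_BIN`, `INDEX_DIR`, and `start_zoekt_pair` are hypothetical names introduced for illustration, and sharing one index directory between the two processes is an assumption based on both examples pointing at `/data/index`.

```shell
# Hypothetical variables; override them to match your checkout and data path.
ZOEKT_BIN="${ZOEKT_BIN:-./bin/gitlab-zoekt}"
INDEX_DIR="${INDEX_DIR:-/data/index}"

start_zoekt_pair() {
  # -ci skips the gitlab_url and self_url requirements (CI/testing mode).
  "$ZOEKT_BIN" indexer -index_dir="$INDEX_DIR" -listen=:6060 -ci &
  indexer_pid=$!

  "$ZOEKT_BIN" webserver -index_dir="$INDEX_DIR" -listen=:6070 &
  webserver_pid=$!

  # Stop both processes when the shell is interrupted or exits.
  trap 'kill "$indexer_pid" "$webserver_pid" 2>/dev/null' INT TERM EXIT

  # Block until both background processes have finished.
  wait
}
```

Calling `start_zoekt_pair` blocks until both processes exit; interrupting it with Ctrl+C stops them together.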

## Running indexer in GDK mode

1. Set the `GDK_DIR` environment variable (for example, `export GDK_DIR="$HOME/projects/gdk"`).
1. Stop the GDK Zoekt processes if they are running:
   ```shell
   gdk stop gitlab-zoekt-indexer-development-1 gitlab-zoekt-indexer-development-2 gitlab-zoekt-webserver-development-1 gitlab-zoekt-webserver-development-2
   ```
1. Execute `make gdk`. This will replace GDK processes with the unified binary from this repo.

> [!note]
> If your GitLab URL differs from `http://localhost:3000`, also set `GDK_GITLAB_URL`. For example:
> `export GDK_GITLAB_URL="https://gdk.test:3443"`
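
The settings above can be checked before running `make gdk`. The following is a minimal sketch: `gdk_env_check` is a hypothetical helper that is not part of this repository, and it assumes `http://localhost:3000` is the URL in effect when `GDK_GITLAB_URL` is unset, as the note implies.

```shell
# Hypothetical pre-flight check for the GDK setup steps.
gdk_env_check() {
  if [ -z "${GDK_DIR:-}" ]; then
    echo "GDK_DIR is not set (for example: export GDK_DIR=\"\$HOME/projects/gdk\")" >&2
    return 1
  fi

  if [ ! -d "$GDK_DIR" ]; then
    echo "GDK_DIR does not point to a directory: $GDK_DIR" >&2
    return 1
  fi

  # Print the effective settings; the fallback URL is an assumption.
  echo "GDK_DIR=$GDK_DIR"
  echo "GDK_GITLAB_URL=${GDK_GITLAB_URL:-http://localhost:3000}"
}
```

A typical invocation would then be `gdk_env_check && make gdk`.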

## Running indexer with docker-compose

This setup is intended for trying out Zoekt. It is not an official installation method.

See the [docker-compose example](example/docker-compose/README.md).

## Running tests

1. Install a [suitable Docker client](https://handbook.gitlab.com/handbook/tools-and-tips/mac/#docker-desktop).
1. Compile the unified binary:
   ```shell
   make build-unified
   ```
1. Run the dependencies:
   ```shell
   docker-compose up
   ```
1. Run the tests:
   ```shell
   # One time
   make test

   # On every change (requires https://github.com/watchexec/watchexec installed)
   make watch-test
   ```
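
The steps above can be chained into a single function. This is only a sketch: `run_step`, `run_zoekt_tests`, and the `DRY_RUN` variable are hypothetical conveniences, not targets or options provided by this repository. It also swaps `docker-compose up` for `docker-compose up -d` so that the dependencies run in the background and the tests can start from the same shell; the dependencies may still need a moment to become ready.

```shell
# Run a command, or only print it when DRY_RUN is set to a non-empty value.
run_step() {
  if [ -n "${DRY_RUN:-}" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Build, start the dependencies, then run the test suite once.
run_zoekt_tests() {
  run_step make build-unified || return 1
  # -d detaches the containers so this shell stays free for the tests.
  run_step docker-compose up -d || return 1
  run_step make test
}
```

Setting `DRY_RUN=1` before calling `run_zoekt_tests` prints the commands without executing them, which is a quick way to confirm the order.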

## Updating the shard test fixture

1. Build the `zoekt-index` binary in the [sourcegraph/zoekt](https://github.com/sourcegraph/zoekt) project:
   ```shell
   go build -o zoekt-index ./cmd/zoekt-index
   ```
1. Index a code repository using the binary. Note that the command passes a meta file as an argument:
   ```shell
   ./zoekt-index -index /tmp/.zoekt -meta ~/code/gitlab-zoekt/_support/test/309.meta.json ~/code/gitlab-zoekt
   ```

## Release process

To release changes to GitLab.com production, follow these steps:

1. Bump the [VERSION file](https://gitlab.com/gitlab-org/gitlab-zoekt-indexer/-/blob/main/VERSION) in this repository following [semantic versioning](https://semver.org/).
2. Update the version file [in the monolith](https://gitlab.com/gitlab-org/gitlab/-/blob/master/GITLAB_ZOEKT_VERSION). Note: an **MR is created automatically** by the [Renovate bot](https://gitlab.com/gitlab-org/frontend/renovate-gitlab-bot).

   Changing this file allows the monolith to run specs against the newer Zoekt indexer. The file is also planned to be used for Omnibus in the future, to indicate which Zoekt indexer to run.
3. Prepare a CNG MR to release a new image. Example: [!2533](https://gitlab.com/gitlab-org/build/CNG/-/merge_requests/2533)
4. Deploy to GSTG by updating [`gstg.zoekt-versions.yaml`](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/master/releases/gitlab/values/gstg.zoekt-versions.yaml). Example MR: [!4607](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/merge_requests/4607)
5. Deploy to GPRD by updating [`gprd.zoekt-versions.yaml`](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/blob/master/releases/gitlab/values/gprd.zoekt-versions.yaml). Example MR: [!4616](https://gitlab.com/gitlab-com/gl-infra/k8s-workloads/gitlab-com/-/merge_requests/4616)
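
For the version bump in step 1, the arithmetic of a semantic-version increment can be sketched as follows. `bump_version` is a hypothetical helper, not a script shipped with this repository, and it assumes the `VERSION` file holds a plain `MAJOR.MINOR.PATCH` string without pre-release or build suffixes.

```shell
# Print the next version for a given part: major, minor, or patch.
# Runs in a subshell so its variables do not leak into the caller.
bump_version() (
  version="$1"
  part="$2"

  # Require at least two dots before splitting.
  case "$version" in
    *.*.*) ;;
    *) echo "invalid version: $version" >&2; exit 1 ;;
  esac

  major="${version%%.*}"
  rest="${version#*.}"
  minor="${rest%%.*}"
  patch="${rest#*.}"

  # Each component must be a non-empty run of digits.
  for n in "$major" "$minor" "$patch"; do
    case "$n" in
      "" | *[!0-9]*) echo "invalid version: $version" >&2; exit 1 ;;
    esac
  done

  # Bumping a component resets every component to its right.
  case "$part" in
    major) major=$((major + 1)); minor=0; patch=0 ;;
    minor) minor=$((minor + 1)); patch=0 ;;
    patch) patch=$((patch + 1)) ;;
    *) echo "unknown part: $part" >&2; exit 1 ;;
  esac

  echo "$major.$minor.$patch"
)
```

For example, `bump_version 1.2.3 minor` prints `1.3.0`, and `bump_version "$(cat VERSION)" patch` prints the next patch release for the current file.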

After verifying for a few days that everything works as expected, release the same change to self-managed (SM) customers:

1. Create an MR in [`gitlab-zoekt`](https://gitlab.com/gitlab-org/cloud-native/charts/gitlab-zoekt). Example: [!121](https://gitlab.com/gitlab-org/cloud-native/charts/gitlab-zoekt/-/merge_requests/121)
2. Bump the Zoekt chart version in the [main helm chart](https://gitlab.com/gitlab-org/charts/gitlab). Example: [!4422](https://gitlab.com/gitlab-org/charts/gitlab/-/merge_requests/4422). Note: an **MR is created automatically**.
