Telemetry

The CLI collects anonymous usage analytics via Amplitude to help the DataRobot team understand how the tool is used. Telemetry is implemented in internal/telemetry/. On each CLI invocation a Client is created with a set of CommonProperties, events are queued via Client.Track(), and the queue is flushed at process exit via Client.Flush().

When telemetry is disabled or the Amplitude API key is absent (all dev builds), every operation is a safe no-op — events are logged to the debug logger instead of being sent over the network.

Opting out

Users can disable telemetry in three ways, in order of precedence:

| Method | How |
|---|---|
| Flag | dr --disable-telemetry <command> |
| Environment variable | DATAROBOT_CLI_DISABLE_TELEMETRY=true |
| Config file | disable-telemetry: true in drconfig.yaml |
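
The precedence order above can be sketched as a small resolver. This is a minimal, stdlib-only illustration: the function name telemetryDisabled and its parameters are hypothetical, and the real CLI may resolve these values through its configuration library instead.

```go
package main

import (
	"fmt"
	"strconv"
)

// telemetryDisabled is a hypothetical helper that resolves the opt-out
// setting in the documented order: flag, then environment variable, then
// config file. A nil pointer means "not set at this level".
func telemetryDisabled(flag *bool, env string, config *bool) bool {
	if flag != nil {
		return *flag
	}
	// An empty or unparsable environment value is treated as "not set".
	if v, err := strconv.ParseBool(env); err == nil {
		return v
	}
	if config != nil {
		return *config
	}
	return false // telemetry stays enabled unless something opts out
}

func main() {
	yes, no := true, false
	fmt.Println(telemetryDisabled(&yes, "false", &no)) // flag wins over env and config
	fmt.Println(telemetryDisabled(nil, "true", &no))   // env wins over config
	fmt.Println(telemetryDisabled(nil, "", &yes))      // config is consulted last
	fmt.Println(telemetryDisabled(nil, "", nil))       // nothing set: not disabled
}
```

Using pointers for the flag and config values lets the sketch distinguish "explicitly false" from "not set", which matters when a higher-precedence source should override a lower one in either direction.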

Device ID

Amplitude requires a device_id or user_id on every event. The CLI uses a stable device identifier obtained in this order:

  1. OS-provided machine ID, via github.com/denisbrodbeck/machineid, which reads:
     • IOPlatformUUID on macOS
     • /etc/machine-id on Linux
     • HKLM\SOFTWARE\Microsoft\Cryptography\MachineGuid on Windows

     The raw value is HMAC-SHA256'd with the app ID "dr" before use, so the actual system identifier is never sent to Amplitude.

  2. Persisted random UUID: if the OS identifier is unavailable, a random UUID is generated and written to ~/.config/datarobot/device_id (respects $XDG_CONFIG_HOME). The same value is reused on subsequent invocations.

  3. Session-scoped fallback: if the config directory is also inaccessible, a fresh ID prefixed with "fallback-" is generated for that session only.
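
The three-step lookup above can be sketched with the standard library alone. Everything named here (deviceID, hashMachineID, randomID) is hypothetical. The machine-ID reader is injected rather than calling the machineid package, the random value is 16 hex-encoded bytes standing in for a real UUID, and the HMAC key/message ordering is an assumption.

```go
package main

import (
	"crypto/hmac"
	"crypto/rand"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"os"
	"path/filepath"
	"strings"
)

// hashMachineID derives an app-scoped identifier so the raw machine ID never
// leaves the host. Assumption: the raw ID is the HMAC key, the app ID the message.
func hashMachineID(rawID, appID string) string {
	mac := hmac.New(sha256.New, []byte(rawID))
	mac.Write([]byte(appID))
	return hex.EncodeToString(mac.Sum(nil))
}

// randomID returns 16 random bytes as hex (a stand-in for a real UUID).
func randomID() string {
	b := make([]byte, 16)
	_, _ = rand.Read(b)
	return hex.EncodeToString(b)
}

// deviceID walks the three documented sources in order. readMachineID and
// path are injected so the sketch contains no platform-specific code.
func deviceID(readMachineID func() (string, error), path string) string {
	// 1. OS-provided machine ID, hashed before use.
	if raw, err := readMachineID(); err == nil && raw != "" {
		return hashMachineID(raw, "dr")
	}
	// 2. Previously persisted random ID, if one exists.
	if data, err := os.ReadFile(path); err == nil {
		if id := strings.TrimSpace(string(data)); id != "" {
			return id
		}
	}
	// 2 (continued). Otherwise generate one and try to persist it.
	id := randomID()
	if err := os.MkdirAll(filepath.Dir(path), 0o700); err == nil {
		if err := os.WriteFile(path, []byte(id), 0o600); err == nil {
			return id
		}
	}
	// 3. Config directory unusable: session-scoped fallback.
	return "fallback-" + randomID()
}

func main() {
	noMachineID := func() (string, error) { return "", errors.New("unavailable") }

	dir, err := os.MkdirTemp("", "dr-device-id")
	if err != nil {
		panic(err)
	}
	defer os.RemoveAll(dir)

	path := filepath.Join(dir, "datarobot", "device_id")
	first := deviceID(noMachineID, path)
	second := deviceID(noMachineID, path)
	fmt.Println("persisted ID reused:", first == second)
}
```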

User ID

When the user is authenticated, the CLI sends a real DataRobot uid as the top-level Amplitude user_id field. If the user is unauthenticated (no API token, invalid token, or network failure with no valid cache), the field is left empty and Amplitude falls back to device_id-only anonymous tracking.

The uid is fetched from GET /api/v2/account/info/, which returns an AccountInfo response containing the user's unique identifier. The uid is stable per DataRobot instance and is not PII (email is deliberately excluded from telemetry to avoid transmitting personally identifiable information).

Caching

To avoid an API call on every CLI invocation, the uid is cached to disk alongside device_id and drconfig.yaml:

  • Cache file: $CONFIG_DIR/datarobot/user_id (respects $XDG_CONFIG_HOME)
  • File permissions: 0600 (owner read/write only), consistent with device_id and drconfig.yaml
  • Cache format (JSON):
{"uid":"...","endpoint":"https://app.datarobot.com","token_fingerprint":"sha256hex"}
      • uid: the DataRobot user identifier
      • endpoint: the scheme+host of the DataRobot instance (e.g., https://app.datarobot.com)
      • token_fingerprint: SHA-256 hex digest of the current API token
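
As an illustration of the cache layout, the following sketch builds one entry and serializes it. The Go type and function names (userIDCache, tokenFingerprint, marshalCache) are hypothetical; only the JSON keys and the use of SHA-256 come from the description above.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"encoding/json"
	"fmt"
)

// userIDCache mirrors the documented JSON layout; the type name is hypothetical.
type userIDCache struct {
	UID              string `json:"uid"`
	Endpoint         string `json:"endpoint"`
	TokenFingerprint string `json:"token_fingerprint"`
}

// tokenFingerprint returns the SHA-256 hex digest of the API token, so the
// token itself is never written to the cache file.
func tokenFingerprint(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// marshalCache serializes an entry to the compact JSON form shown above.
func marshalCache(c userIDCache) (string, error) {
	data, err := json.Marshal(c)
	return string(data), err
}

func main() {
	entry := userIDCache{
		UID:              "example-uid", // placeholder value
		Endpoint:         "https://app.datarobot.com",
		TokenFingerprint: tokenFingerprint("example-token"),
	}

	out, err := marshalCache(entry)
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
	// On disk, the bytes would be written with owner-only permissions,
	// for example os.WriteFile(path, []byte(out), 0o600).
}
```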

Cache validation and invalidation

On subsequent invocations, when no fresh API uid is available, the cache is validated against both the current endpoint and the current token fingerprint:

  • Endpoint match: the cached endpoint must equal the current viperx.GetString(config.DataRobotURL) (scheme+host only)
  • Token fingerprint match: the cached token_fingerprint must equal the SHA-256 hex of the current API token

If either check fails, the cache is treated as stale and the user_id is left empty (anonymous tracking). This ensures correct behavior in shared environments (e.g., Codespaces) where two users may authenticate sequentially with different tokens — the token fingerprint prevents incorrectly attributing User B's activity to User A's cached uid.
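
The two checks can be sketched as a single function that returns the cached uid only when both pass. The names cachedUser, baseURL, and cachedUID are hypothetical, and the real implementation may normalize the endpoint differently.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"net/url"
)

// cachedUser holds the three cached fields; the type name is hypothetical.
type cachedUser struct {
	UID, Endpoint, TokenFingerprint string
}

func fingerprint(token string) string {
	sum := sha256.Sum256([]byte(token))
	return hex.EncodeToString(sum[:])
}

// baseURL reduces an endpoint to scheme+host, or "" if it cannot be parsed.
func baseURL(endpoint string) string {
	u, err := url.Parse(endpoint)
	if err != nil || u.Scheme == "" || u.Host == "" {
		return ""
	}
	return u.Scheme + "://" + u.Host
}

// cachedUID returns the cached uid only when the endpoint and the token
// fingerprint both match; otherwise it returns "" (anonymous tracking).
func cachedUID(c cachedUser, endpoint, token string) string {
	if token == "" {
		return ""
	}
	if c.Endpoint == "" || c.Endpoint != baseURL(endpoint) {
		return ""
	}
	if c.TokenFingerprint != fingerprint(token) {
		return ""
	}
	return c.UID
}

func main() {
	// User A authenticated earlier and left a cache entry behind.
	cache := cachedUser{
		UID:              "uid-a",
		Endpoint:         "https://app.datarobot.com",
		TokenFingerprint: fingerprint("token-a"),
	}

	fmt.Printf("%q\n", cachedUID(cache, "https://app.datarobot.com/api/v2", "token-a"))
	fmt.Printf("%q\n", cachedUID(cache, "https://app.datarobot.com/api/v2", "token-b"))
}
```

In the shared-environment case, the second call represents User B with a different token: the fingerprint mismatch yields an empty user_id rather than User A's uid.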

Behavior summary

| Scenario | user_id behavior |
|---|---|
| Authenticated, API succeeds | uid from API, cached to disk |
| Authenticated, cache hit (same endpoint + token) | Cached uid (no API call) |
| Endpoint changed | Re-fetch from API, update cache |
| Token changed (rotation / new user) | Re-fetch from API, update cache |
| No API token / invalid token | Empty user_id, anonymous tracking |
| Network error, same endpoint + token | Return cached uid |
| Network error, endpoint/token changed | Empty user_id, anonymous tracking |

Common Properties

The following are attached to every event:

Top-level event fields

These map to Amplitude's built-in fields and power native segmentation (version filters, OS breakdowns, language charts, etc.).

| Field | Source |
|---|---|
| user_id | DataRobot uid from GET /api/v2/account/info/, cached to disk with endpoint + token fingerprint validation; empty (anonymous) if unauthenticated or cache miss (see User ID) |
| device_id | OS machine ID (hashed) or persisted UUID (see Device ID above) |
| session_id | Unix millisecond timestamp generated once per process invocation; Amplitude uses this as the built-in Session ID for session-based analysis |
| app_version | CLI version set at build time via ldflags |
| platform | Always "CLI" |
| os_name | OS name (e.g. "macOS") |
| os_version | OS version (e.g. "15.7.5") |
| language | User locale tag (e.g. "en-US"), via go-locale; Amplitude maps to a display language name |
| ip | Always "$remote"; Amplitude resolves location server-side |

Event properties

| Property | Source |
|---|---|
| install_method | Set at build time via ldflags (release, source, etc.) |
| os_arch | CPU architecture from runtime.GOARCH |
| go_version | Go runtime version (e.g. go1.26.4) from runtime.Version() |
| environment | US, EU, JP, or custom, derived from endpoint URL |
| datarobot_instance | Base URL of the configured DataRobot instance |
| template_name | Best-effort from .datarobot/answers/ in the current repo |
| command_kind | "core" or "plugin", set automatically by the root command dispatcher |

Event Wiring

Telemetry events are wired declaratively at command-construction time using a small API exported by internal/telemetry:

| Helper | Use when… |
|---|---|
| telemetry.Track(cmd) | The command needs no extra event properties beyond the common ones. |
| telemetry.TrackWith(cmd, extract) | The command needs dynamic event properties from flags or args at firing time. |
| telemetry.TrackPlugin(cmd, ver) | The command comes from a plugin. Adds plugin_version and sets command_kind. |

Each helper sets a "telemetry" annotation on the cobra command. After RunE completes, the hook registered with cobra.OnFinalize calls telemetry.EventFor(cmd, args), which returns an Amplitude event with EventType == cmd.CommandPath() and any properties the registered extractor produced.

This approach has four properties:

  • Local: Wiring lives next to the command it tracks, not in a central map.
  • Late-bound: Events fire after RunE, so PropExtractors can read results computed during command execution (see Reading RunE results in a PropExtractor).
  • Extensible: Adding a new event requires one call where the command is built.
  • Self-documenting: The cobra command itself carries its telemetry intent.
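
To make the wiring model concrete, here is a dependency-free sketch of the annotate-then-look-up pattern. It uses a stand-in command type instead of *cobra.Command, and every name in it (command, propExtractor, trackWith, eventFor) is hypothetical. In particular, storing the extractor as a struct field and using "true" as the annotation value are simplifications for illustration, not a description of how internal/telemetry actually stores them.

```go
package main

import "fmt"

// command is a stand-in for *cobra.Command with only what the sketch needs.
type command struct {
	path        string
	annotations map[string]string
	extract     propExtractor // simplification: real storage is not shown here
}

// propExtractor computes extra event properties at firing time.
type propExtractor func(c *command, args []string) map[string]any

type event struct {
	EventType  string
	Properties map[string]any
}

// trackWith marks the command as tracked and remembers its extractor.
func trackWith(c *command, extract propExtractor) {
	if c.annotations == nil {
		c.annotations = map[string]string{}
	}
	c.annotations["telemetry"] = "true" // annotation value is an assumption
	c.extract = extract
}

// track marks the command as tracked with no extra properties.
func track(c *command) { trackWith(c, nil) }

// eventFor returns an event only for commands that were wired.
func eventFor(c *command, args []string) (event, bool) {
	if _, ok := c.annotations["telemetry"]; !ok {
		return event{}, false
	}
	ev := event{EventType: c.path, Properties: map[string]any{}}
	if c.extract != nil {
		for k, v := range c.extract(c, args) {
			ev.Properties[k] = v
		}
	}
	return ev, true
}

func main() {
	foo := &command{path: "dr foo"}
	trackWith(foo, func(_ *command, args []string) map[string]any {
		name := ""
		if len(args) > 0 {
			name = args[0]
		}
		return map[string]any{"component_name": name}
	})

	if ev, ok := eventFor(foo, []string{"my-component"}); ok {
		fmt.Println(ev.EventType, ev.Properties)
	}
}
```

Because the extractor runs only when the event is built, it sees whatever state exists after the command has executed, which is what makes the late-bound property possible.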

Process exit and telemetry flush

Amplitude events are queued in-process and flushed asynchronously. If a command calls os.Exit directly (plugins, task runner exit-code propagation) or if RunE returns an error that causes cobra to skip PersistentPostRunE, the queue would be silently dropped. Two mechanisms handle this:

cmd.Exit(code int) — for main.go's error path

cmd.Exit lives in cmd/exit.go alongside the telemetryClient package-level variable that PersistentPreRunE sets. Use it only from main.go when ExecuteContext returns an error:

if err := cmd.ExecuteContext(ctx); err != nil {
    log.Stop()
    cmd.Exit(1) // flushes telemetry then calls os.Exit(1)
}

cmd.Exit is nil-safe: if PersistentPreRunE never ran (e.g. flag parse failure before any command executes) there are no queued events and it falls straight through to os.Exit.
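
The nil-safety described above follows from a common Go idiom: a method with a pointer receiver can be called on a nil pointer as long as it checks for nil first. The sketch below illustrates that idiom with hypothetical names (client, exitWith) and an injected exit function so it can run without terminating the process; it is not the CLI's actual cmd.Exit.

```go
package main

import "fmt"

// client is a stand-in for the telemetry client; the type is hypothetical.
type client struct {
	queued  []string
	flushed bool
}

// Flush is safe to call on a nil *client, which is what makes the exit
// helper below nil-safe.
func (c *client) Flush() {
	if c == nil {
		return // nothing was ever queued
	}
	c.flushed = true
	c.queued = nil
}

// telemetryClient plays the role of the package-level variable that would be
// set before any command runs. It stays nil if setup never happened.
var telemetryClient *client

// exitWith flushes any queued events and then exits. The exit function is
// injected for the sketch; real code would pass os.Exit.
func exitWith(code int, exit func(int)) {
	telemetryClient.Flush()
	exit(code)
}

func main() {
	report := func(code int) { fmt.Println("would exit with code", code) }

	// Setup never ran: Flush on a nil client is a no-op.
	exitWith(1, report)

	// Setup ran and an event was queued: it is flushed before exiting.
	telemetryClient = &client{queued: []string{"dr foo"}}
	exitWith(1, report)
	fmt.Println("flushed:", telemetryClient.flushed)
}
```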

telemetry.ExitWithContext(ctx context.Context, code int) — for cobra sub-commands

Commands that must propagate a subprocess exit code (plugin dispatch, task run --exit-code) cannot use return err because Go errors carry no integer code. They call telemetry.ExitWithContext with the command's cobra context — the client stored there by PersistentPreRunE is flushed before os.Exit:

RunE: func(cmd *cobra.Command, args []string) error {
    exitCode := runSubprocess(...)
    telemetry.ExitWithContext(cmd.Context(), exitCode) // never returns
    return nil // unreachable
},

State ownership

internal/telemetry is stateless — it defines Client, helpers, and ClientContextKey but holds no global variables. The two places that own state are:

| Owner | What | Why |
|---|---|---|
| cmd package (root.go + exit.go) | var telemetryClient *telemetry.Client | Needed by main.go's error path, which only has the signal context (no cobra context). |
| cobra command context | *telemetry.Client stored under telemetry.ClientContextKey{} | Accessible to any sub-package that has a *cobra.Command without importing cmd (which would be circular). |

Both are set in PersistentPreRunE; the context value is consumed by the cobra.OnFinalize hook (normal path, see Execution flow) or telemetry.ExitWithContext (exit-code path).

Silent errors and SilenceErrors

cli.ErrSilent (internal/cli/command.go) is a sentinel returned by RunE implementations that have already printed their own user-facing error. Returning it tells cobra "don't add a second Error: … line", but only if the command (or an ancestor) has SilenceErrors: true set.

Rules:

  1. Any command whose RunE can return cli.ErrSilent — either directly or by calling a sub-command's RunE — must set SilenceErrors: true on its own cobra.Command.
  2. When composing commands (e.g. calling compose.Cmd().RunE(nil, nil) inside another command's RunE), the caller must also carry SilenceErrors: true, because the sentinel propagates up the call stack.
  3. main.go always calls cmd.Exit(1) for any non-nil error, so telemetry is recorded even for silent failures.

Example:

func Cmd() *cobra.Command {
    return &cobra.Command{
        Use:           "add [...]",
        RunE:          RunE,
        SilenceErrors: true, // required: RunE can return cli.ErrSilent via compose.Cmd().RunE
    }
}

func RunE(_ *cobra.Command, args []string) error {
    if err := addComponents(args); err != nil {
        return err
    }

    // compose prints its own error and returns cli.ErrSilent on failure;
    // SilenceErrors: true above prevents cobra from echoing "Error: silent error".
    return compose.Cmd().RunE(nil, nil)
}

Execution flow

User invokes command
Cobra parses flags
PersistentPreRunE (root.go)
    ├─ Initialize CommonProperties (session ID, user ID, env, ...)
    ├─ Stamp props.CommandKind = "core" or "plugin"
    │   based on telemetry.IsPluginCommand(cmd)
    ├─ Build telemetry.Client
    └─ Register cobra.OnFinalize (closes over cmd, args, client)
RunE / Run executes
cobra.OnFinalize (via cobra's deferred postRun, fires on success and error paths — unlike PersistentPostRunE, see [gotcha](https://www.jvt.me/posts/2024/11/29/gotcha-cobra-persistentpostrune/))
    ├─ telemetry.EventFor(cmd, args) → if tracked, client.Track(event)
    ├─ Flush telemetry (3-second timeout)
    └─ log.Stop()
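
The bounded flush at the end of the flow can be sketched as follows. The helper name flushWithTimeout is hypothetical and the flush function is passed in as a plain closure; the only detail taken from the flow above is that the wait is capped (3 seconds in the CLI).

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// flushWithTimeout runs flush in a goroutine and waits at most timeout for it
// to finish, so a slow network cannot delay process exit indefinitely.
func flushWithTimeout(flush func(), timeout time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	done := make(chan struct{})
	go func() {
		defer close(done)
		flush()
	}()

	select {
	case <-done:
		return nil
	case <-ctx.Done():
		return ctx.Err() // the flush goroutine is abandoned, not cancelled
	}
}

func main() {
	// A fast flush completes well inside the budget.
	err := flushWithTimeout(func() {}, 3*time.Second)
	fmt.Println("fast flush error:", err)

	// A slow flush is cut off once the budget is spent.
	slow := func() { time.Sleep(200 * time.Millisecond) }
	err = flushWithTimeout(slow, 10*time.Millisecond)
	fmt.Println("timed out:", errors.Is(err, context.DeadlineExceeded))
}
```

Note that the timeout only bounds the wait. If the flush function itself does not observe cancellation, its goroutine keeps running until the process exits, which is acceptable here because exit follows immediately.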

Reading RunE results in a PropExtractor

Because cobra.OnFinalize fires after RunE on both success and error paths (unlike PersistentPostRunE), a PropExtractor can read data that RunE computed. The recommended pattern is to declare closure variables in Cmd() that RunE writes and the PropExtractor reads:

func Cmd() *cobra.Command {
    var (
        checkResult    tools.CheckResult
        installSuccess []string
        installError   string
    )

    cmd := &cobra.Command{
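        RunE: func(cmd *cobra.Command, _ []string) error {
            checkResult = tools.CheckPrerequisites()       // written by RunE

            // Declare err separately: using := here would shadow the
            // outer installSuccess that the PropExtractor reads.
            var err error

            installSuccess, err = dependencies.Install()
            if err != nil {
                installError = err.Error()
                return err
            }
            return nil
        },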
    }

    telemetry.TrackWith(cmd, func(_ *cobra.Command, _ []string) map[string]any {
        return map[string]any{                             // read by PropExtractor
            "validation_violations": checkResult.ValidationViolations,
            "install_success":       installSuccess,
            "install_error":         installError,
        }
    })

    return cmd
}

Both RunE and the PropExtractor close over the same local variables. No context keys or package-level variables are needed.

How to add telemetry to a new command

1. Decide what (if anything) to extract

Inspect the command's flags and args. Decide which (if any) should be exposed as event properties.

2. Wire the command at construction

Find the function (or init) that builds the cobra command and add a telemetry.Track* call before returning.

Simple command, no extra properties:

import "github.com/datarobot/cli/internal/telemetry"

func Cmd() *cobra.Command {
    cmd := &cobra.Command{
        Use:   "foo",
        Short: "Do foo",
        // ...
    }

    telemetry.Track(cmd)

    return cmd
}

Command that contributes properties from positional args:

telemetry.TrackWith(cmd, func(_ *cobra.Command, args []string) map[string]any {
    return map[string]any{
        "component_name": telemetry.FirstArg(args),
    }
})

Command that contributes a property from a flag:

telemetry.TrackWith(cmd, func(c *cobra.Command, args []string) map[string]any {
    ver, _ := c.Flags().GetString("version")

    return map[string]any{
        "plugin_name":    telemetry.FirstArg(args),
        "plugin_version": ver,
    }
})

3. Add the command's path to the wiring test

IMPORTANT: Edit cmd/telemetry_wiring_test.go and add the new cmd.CommandPath() to expectedTrackedCommands. The test will fail loudly if anyone later removes the wiring.

4. Test it

task test
task lint

Run the CLI with telemetry disabled (the dev default) and check the debug log to see your event:

dr foo --debug
# .dr-tui-debug.log will include "Telemetry event (dry-run)" entries

Plugin Commands

Plugin commands are discovered at runtime by cmd/plugin/discovery.go::createPluginCommand, which calls telemetry.TrackPlugin(cmd, manifest.Version). This:

  • Sets the "telemetry" annotation so EventFor will fire an event.
  • Sets the "telemetry:plugin" annotation so IsPluginCommand returns true, which causes the root to stamp command_kind = "plugin" on the common properties.
  • Registers an extractor that adds plugin_version to the event.

The event type is cmd.CommandPath() — for example dr assist. There is no longer a synthetic "dr plugin execute" event.

Dev builds

AmplitudeAPIKey is empty in dev builds (it is injected via ldflags in release builds only). When the key is empty, IsEnabled() returns false and all Track calls log to the debug logger.

SDK log routing

The Amplitude SDK emits its own internal logs (HTTP responses, client lifecycle, etc.) via a custom logger adapter in amplitudeLogger. All Amplitude SDK log entries are prefixed with [amplitude] for traceability in debug log files.

The adapter demotes Amplitude's INFO-level logs (e.g. HTTP response code, HTTP response body) to DEBUG when the app's log level is above INFO, that is, less verbose than INFO, as in the default configuration. This keeps them off stderr by default while still capturing them in the debug log file (see Logging).

| CLI flags | Amplitude INFO appears as | Visible on stderr? |
|---|---|---|
| (default) | DEBUG | No |
| --verbose | INFO | Yes |
| --debug | INFO | Yes |

WARN and ERROR messages from the SDK always pass through at their original level.
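
The routing rule can be expressed as a small pure function. This is a sketch under stated assumptions: the level type, its constants, and routeSDKLog are hypothetical, and the default app level is assumed to be WARN (the text above only says it is above INFO).

```go
package main

import "fmt"

// level is a minimal severity scale; higher values are less verbose.
type level int

const (
	levelDebug level = iota
	levelInfo
	levelWarn
	levelError
)

// routeSDKLog returns the level and message at which an Amplitude SDK log
// entry would be re-emitted. INFO is demoted to DEBUG only when the app's own
// level is above INFO; every other level passes through unchanged.
func routeSDKLog(appLevel, sdkLevel level, msg string) (level, string) {
	out := sdkLevel
	if sdkLevel == levelInfo && appLevel > levelInfo {
		out = levelDebug
	}
	return out, "[amplitude] " + msg
}

func main() {
	names := map[level]string{
		levelDebug: "DEBUG", levelInfo: "INFO", levelWarn: "WARN", levelError: "ERROR",
	}

	// Assumed mapping of CLI flags to app log levels.
	apps := []struct {
		flag string
		lvl  level
	}{
		{"(default)", levelWarn},
		{"--verbose", levelInfo},
		{"--debug", levelDebug},
	}

	for _, app := range apps {
		lvl, msg := routeSDKLog(app.lvl, levelInfo, "HTTP response code: 200")
		fmt.Printf("%-10s %-5s %s\n", app.flag, names[lvl], msg)
	}
}
```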

Testing

Run the telemetry test suite:

task test -- ./internal/telemetry/... ./cmd/...

Key tests:

  • internal/telemetry/wire_test.go — exercises Track, TrackWith, TrackPlugin, EventFor, IsPluginCommand, FirstArg.
  • internal/telemetry/properties_test.go — exercises common properties including command_kind.
  • cmd/telemetry_wiring_test.go — verifies that every expected core command path is wired in the static command tree.

Maintenance checklist

  • Renaming a command? The event type follows cmd.CommandPath() automatically, but you must update expectedTrackedCommands in cmd/telemetry_wiring_test.go.
  • Removing a command? Remove its expectedTrackedCommands entry.
  • Changing event properties? Update the closure passed to TrackWith.