Skip to content

Logging Standards & Architecture

This document describes the logging architecture, patterns, and best practices established during the logging refactor. It serves as a guide for developers to maintain consistency across the codebase.

Overview

Our logging system is built on Go's log/slog structured logging package, wrapped with custom utilities to provide:

  • Context propagation for request tracing (trace_id, request_id, user_id, org_id)
  • Type-safe attribute helpers for consistent field naming
  • Layer-specific logging functions (API, Service, Client, Repository)
  • Automatic correlation with external observability tools (Better Stack, Grafana Loki)

Key Principles

  1. Context-first: Always use *Ctx functions when context is available
  2. Structured attributes: Use attribute helpers instead of raw slog.String()
  3. Consistent messages: Follow the "[Resource] [operation] [outcome]: [reason]" pattern
  4. No constructor logging: Constructors should return errors, not log them
  5. Minimal logging: Log what's necessary, avoid verbose/debug logs in production paths

Architecture

Package Structure

pkg/logger/
├── slogger.go      # Core logger setup, *Ctx functions, layer-specific functions
├── attrs.go        # Attribute constants and helper functions
└── context.go      # Context enrichment utilities (WithTraceID, WithUserID, etc.)

Logging Flow

%%{init: {"flowchart": {"wrappingWidth": 1000}} }%%
flowchart TD
    Node1["HTTP Request"]

    Node2["LogContextMiddleware<br/>- Generates/extracts trace_id (X-Trace-Id header)<br/>- Generates request_id<br/>- Enriches context: ctx = logger.WithTraceID(ctx, traceID)<br/>- Enriches context: ctx = logger.WithRequestID(ctx, requestID)"]

    Node3["AuthMiddleware<br/>- Validates token<br/>- Enriches context: ctx = logger.WithUserID(ctx, userID)<br/>- Enriches context: ctx = logger.WithOrgID(ctx, orgID)"]

    Node4["API Handler<br/>ctx := c.Request.Context()<br/>logger.InfoCtx(ctx, '...', []slog.Attr{...})<br/>logger.ErrorAPICtx(ctx, '...', err, []slog.Attr{...})"]

    Node5["Service<br/>logger.ErrorServiceCtx(ctx, '...', err, []slog.Attr{...})"]

    Node6["Platform Client<br/>logger.ErrorClientCtx(ctx, '...', err, []slog.Attr{...})"]

    Node7["Repository<br/>logger.ErrorRepoCtx(ctx, '...', err, []slog.Attr{...})"]

    %% Connections
    Node1 --> Node2
    Node2 --> Node3
    Node3 --> Node4
    Node4 --> Node5
    Node5 --> Node6
    Node6 --> Node7

    %% Styling to match the ASCII look
    classDef default font-family:monospace,text-align:left;

Log Output

Every log entry automatically includes context values when using *Ctx functions:

{
  "time": "2024-01-15T10:30:00Z",
  "level": "ERROR",
  "msg": "WorkOS authentication failed: invalid credentials",
  "layer": "service",
  "trace_id": "abc-123-def",
  "request_id": "req-456-ghi",
  "user_id": "user_01ABC",
  "org_id": "org_01XYZ",
  "error": "invalid_credentials: The provided credentials are incorrect"
}

Logging Patterns

When to Use Each Function

Function Layer Use Case
logger.InfoCtx / logger.ErrorCtx Generic Cross-cutting concerns, middleware
logger.ErrorAPICtx API Handlers Request validation, response errors
logger.ErrorServiceCtx Services Business logic errors
logger.ErrorClientCtx Platform Clients External API calls (WorkOS, LiveKit, Google)
logger.ErrorRepoCtx Repositories Database operations

When NOT to Use Context Functions

Use non-Ctx variants only when context is genuinely unavailable:

  • Startup/bootstrap code (cmd/main.go, internal/app/app.go)
  • Background workers without request context
  • Utility functions that don't receive context as parameter
// Startup code - no context available
func SetupDependencies() (*AppDependencies, error) {
    logger.Info("Starting application module initialization...", nil)
    // ...
}

// Background worker - no request context
func (r *recorderService) Handle(ev *lk.WebhookEvent) {
    logger.Warn("Webhook recorder: queue full, dropping event", []slog.Attr{
        slog.String("event_type", ev.Event),
    })
}

Attribute System

Why Attribute Helpers?

Instead of using raw slog.String("user_id", userID), we use helper functions like logger.UserID(userID). This provides:

  1. Consistency: Same attribute name across the entire codebase
  2. Type safety: Compile-time checks for attribute values
  3. Discoverability: IDE autocomplete shows available attributes
  4. Refactoring: Change attribute name in one place

Attribute Categories

All attributes are defined in pkg/logger/attrs.go:

// Identity attributes
const (
    AttrUserID    = "user_id"
    AttrOrgID     = "org_id"
    AttrSessionID = "session_id"
    AttrRole      = "role"
)

// Request tracing attributes
const (
    AttrTraceID   = "trace_id"
    AttrRequestID = "request_id"
)

// Business entity attributes
const (
    AttrRoomName        = "room_name"
    AttrRoomID          = "room_id"
    AttrCalendarID      = "calendar_id"
    AttrInvitationID    = "invitation_id"
    AttrMembershipID    = "membership_id"
)

// Recording/Media attributes
const (
    AttrEgressID        = "egress_id"
    AttrFilepath        = "filepath"
    AttrCustomBaseURL   = "custom_base_url"
)

Helper Functions

Each constant has a corresponding helper function:

func UserID(id string) slog.Attr {
    return slog.String(AttrUserID, id)
}

func OrgID(id string) slog.Attr {
    return slog.String(AttrOrgID, id)
}

func Count(n int) slog.Attr {
    return slog.Int(AttrCount, n)
}

When to Use Raw slog vs Attribute Helpers

Scenario Approach Example
Recurring attributes (user_id, org_id, room_name) Use helper logger.UserID(userID)
One-off debug attributes Raw slog acceptable slog.String("response_body", body)
Internal-only debugging (S3 bucket, endpoint) Raw slog acceptable slog.String("bucket", bucket)
HTTP status codes in error responses Raw slog acceptable slog.Int("status_code", resp.StatusCode)

Adding New Attributes

When you need a new recurring attribute:

  1. Add the constant to pkg/logger/attrs.go:

    AttrNewAttribute = "new_attribute"
    

  2. Add the helper function:

    func NewAttribute(value string) slog.Attr {
        return slog.String(AttrNewAttribute, value)
    }
    

  3. Use it in your code:

    logger.ErrorServiceCtx(ctx, "Operation failed", err, []slog.Attr{
        logger.NewAttribute(value),
    })
    

Context-Aware Logging

Context Enrichment

The context is progressively enriched as the request flows through middleware:

// In LogContextMiddleware
ctx = logger.WithTraceID(ctx, traceID)
ctx = logger.WithRequestID(ctx, requestID)

// In AuthMiddleware (after successful authentication)
ctx = logger.WithUserID(ctx, accessTokenData.UserID)
ctx = logger.WithOrgID(ctx, accessTokenData.OrgID)

// Update the request with enriched context
c.Request = c.Request.WithContext(ctx)

Extracting Context in Handlers

Always extract context at the beginning of your handler:

func (h *Handler) Handle(c *gin.Context) {
    ctx := c.Request.Context()

    // All subsequent *Ctx calls will include trace_id, request_id, user_id, org_id
    logger.InfoCtx(ctx, "Processing request", nil)
}

Performance Tracing

Use logger.Trace to measure operation duration:

func (s *Service) DoSomething(ctx context.Context) error {
    // Returns a stop function that logs duration when called
    stop := logger.Trace(ctx, "WorkOS: ListOrganizationMemberships")
    defer stop()

    // ... operation ...
    return nil
}

// With additional attributes
stop := logger.Trace(ctx, "WorkOS: GetOrganization", logger.OrgID(orgID))
defer stop()

Message Format

Pattern

"[Resource] [operation] [outcome]: [brief reason]"

Components

Component Description Examples
Resource The system/service being operated on WorkOS, LiveKit, Google Calendar, Database
Operation What action is being performed authentication, create room, list events
Outcome Result of the operation failed, succeeded, completed
Reason Brief explanation (optional) invalid credentials, network error, not found

Good Examples

// Platform client errors
"WorkOS authentication failed: invalid credentials"
"LiveKit create room failed: room already exists"
"Google Calendar list events failed: token expired"

// Service layer errors
"Membership service: send invitation failed"
"Session service: organization switch failed"

// API layer errors
"Auth middleware: token validation failed"
"Permission middleware: access denied"

// Startup/setup errors
"WorkOS setup: missing credentials"
"Feature setup: Authentication service initialization failed"
"Database: connection failed"

Bad Examples (Avoid)

// Too verbose
"Failed to authenticate user with WorkOS because the credentials provided were invalid"

// Missing context
"Error occurred"
"Failed"

// Using %v in message (error is passed separately)
"Failed to connect: %v"

// Debug prefixes in production code
"[DEBUG] User login failed"

Layer-Specific Guidelines

API Handlers

func (h *Handler) Handle(c *gin.Context) {
    ctx := c.Request.Context()

    // Validation errors
    if err := c.ShouldBindJSON(&req); err != nil {
        logger.ErrorAPICtx(ctx, "Request validation failed", err, nil)
        c.Error(httperror.NewStatusBadRequestError(...))
        return
    }

    // Service call errors
    result, err := h.service.DoSomething(ctx, req)
    if err != nil {
        // Don't log here - service already logged it
        c.Error(err)
        return
    }

    c.JSON(http.StatusOK, result)
}

Services

func (s *Service) DoSomething(ctx context.Context, input Input) (*Output, error) {
    // Log at meaningful business events
    logger.InfoCtx(ctx, "Processing payment", []slog.Attr{
        logger.UserID(input.UserID),
        logger.Amount(input.Amount),
    })

    // Call external client
    result, err := s.client.ExternalCall(ctx, input)
    if err != nil {
        // Client already logged the error, just return
        return nil, err
    }

    return result, nil
}

Platform Clients

func (c *Client) ExternalCall(ctx context.Context, input Input) (*Output, error) {
    result, err := c.sdkClient.DoSomething(input)
    if err != nil {
        httperr := httperror.NewStatusInternalServerError(httperror.ErrorConfig{Cause: err})
        logger.ErrorClientCtx(ctx, "ExternalService operation failed", httperr, []slog.Attr{
            logger.UserID(input.UserID),
        })
        return nil, httperr
    }
    return result, nil
}

Constructors

Do not log in constructors. Return errors instead:

// GOOD
func NewClient(config *Config) (Client, error) {
    if config.APIKey == "" {
        return nil, errors.New("API key cannot be empty")
    }
    return &clientImpl{config: config}, nil
}

// BAD - Don't do this
func NewClient(config *Config) (Client, error) {
    if config.APIKey == "" {
        logger.Error("API key is empty", nil, nil)  // Don't log here
        return nil, errors.New("API key cannot be empty")
    }
    return &clientImpl{config: config}, nil
}

Examples

Complete API Handler Example

func (h *CreateRoomHandler) Handle(c *gin.Context) {
    ctx := c.Request.Context()

    var req CreateRoomRequest
    if err := c.ShouldBindJSON(&req); err != nil {
        logger.ErrorAPICtx(ctx, "Create room: request validation failed", err, nil)
        c.Error(httperror.NewStatusBadRequestError(httperror.ErrorConfig{Cause: err}))
        return
    }

    room, err := h.roomService.CreateRoom(ctx, req.Name, req.Metadata)
    if err != nil {
        // Service already logged - just propagate
        c.Error(err)
        return
    }

    logger.InfoCtx(ctx, "Room created successfully", []slog.Attr{
        logger.RoomName(room.Name),
        logger.RoomID(room.ID),
    })

    c.JSON(http.StatusCreated, room)
}

Complete Service Example

func (s *RoomService) CreateRoom(ctx context.Context, name, metadata string) (*Room, error) {
    stop := logger.Trace(ctx, "LiveKit: CreateRoom")
    defer stop()

    room, err := s.livekitClient.CreateRoom(ctx, name, metadata)
    if err != nil {
        // Client already logged
        return nil, err
    }

    // Persist to database
    if err := s.roomRepo.Create(ctx, room); err != nil {
        logger.ErrorServiceCtx(ctx, "Room service: database persist failed", err, []slog.Attr{
            logger.RoomName(name),
        })
        return nil, httperror.NewStatusInternalServerError(httperror.ErrorConfig{Cause: err})
    }

    return room, nil
}

Complete Client Example

func (c *livekitClient) CreateRoom(ctx context.Context, name, metadata string) (*livekit.Room, error) {
    room, err := c.roomService.CreateRoom(ctx, &livekit.CreateRoomRequest{
        Name:     name,
        Metadata: metadata,
    })
    if err != nil {
        httperr := httperror.NewStatusInternalServerError(httperror.ErrorConfig{Cause: err})
        logger.ErrorClientCtx(ctx, "LiveKit create room failed", httperr, []slog.Attr{
            logger.RoomName(name),
        })
        return nil, httperr
    }

    return room, nil
}

Anti-Patterns

1. Logging Without Context When Context is Available

// BAD
func (h *Handler) Handle(c *gin.Context) {
    logger.Error("Something failed", err, nil)  // Missing context!
}

// GOOD
func (h *Handler) Handle(c *gin.Context) {
    ctx := c.Request.Context()
    logger.ErrorCtx(ctx, "Something failed", err, nil)
}

2. Using Raw slog for Recurring Attributes

// BAD
logger.ErrorCtx(ctx, "Failed", err, []slog.Attr{
    slog.String("user_id", userID),  // Use helper instead
})

// GOOD
logger.ErrorCtx(ctx, "Failed", err, []slog.Attr{
    logger.UserID(userID),
})

3. Logging in Constructors

// BAD
func NewService(client Client) (*Service, error) {
    if client == nil {
        logger.Error("Client is nil", nil, nil)
        return nil, errors.New("client cannot be nil")
    }
    return &Service{client: client}, nil
}

// GOOD
func NewService(client Client) (*Service, error) {
    if client == nil {
        return nil, errors.New("client cannot be nil")
    }
    return &Service{client: client}, nil
}

4. Double Logging

// BAD - Error logged twice
func (s *Service) DoSomething(ctx context.Context) error {
    result, err := s.client.ExternalCall(ctx)
    if err != nil {
        logger.ErrorServiceCtx(ctx, "External call failed", err, nil)  // Client already logged!
        return err
    }
    return nil
}

// GOOD - Let the client log, just propagate
func (s *Service) DoSomething(ctx context.Context) error {
    result, err := s.client.ExternalCall(ctx)
    if err != nil {
        return err  // Client already logged the error
    }
    return nil
}

5. Including Error Details in Message

// BAD
logger.ErrorCtx(ctx, fmt.Sprintf("Failed: %v", err), err, nil)

// GOOD
logger.ErrorCtx(ctx, "Operation failed", err, nil)  // Error passed separately

6. Verbose Messages

// BAD
logger.InfoCtx(ctx, "Successfully completed the user authentication process and validated all credentials", nil)

// GOOD
logger.InfoCtx(ctx, "User authenticated successfully", nil)

Checklist for New Code

  • Using *Ctx functions when context is available
  • Using attribute helpers for recurring attributes (user_id, org_id, room_name, etc.)
  • Following message format: "[Resource] [operation] [outcome]: [reason]"
  • Not logging in constructors
  • Not double-logging (check if called function already logs)
  • Extracting context at handler start: ctx := c.Request.Context()
  • Using appropriate layer function (ErrorAPICtx, ErrorServiceCtx, ErrorClientCtx, ErrorRepoCtx)

References

  • Go slog package documentation
  • pkg/logger/slogger.go - Core logger implementation
  • pkg/logger/attrs.go - Attribute definitions
  • pkg/requestmiddleware/middleware.go - Context enrichment middleware