Chat - Verlon AI

The chat() and chatStream() methods provide text generation with support for multimodal inputs, function calling, and streaming.

Basic Chat

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'system', content: 'You are a helpful assistant.' },
      { role: 'user', content: 'Explain quantum computing in simple terms' }
    ]
  }
});

console.log(response.content);
console.log('Cost:', response.cost);
console.log('Model:', response.model);

Streaming

Stream responses token by token for better UX:

const stream = verlon.chatStream({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: 'Write a poem about the ocean' }
    ]
  }
});

for await (const chunk of stream) {
  process.stdout.write(chunk.content || '');
}
console.log('\n');

Use streaming for long responses to show progress to users instead of making them wait.
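
If you also need the full text once the stream ends (for logging or storing the reply), one option is to accumulate chunks as they arrive. The sketch below is illustrative only: `collectStream` and `fakeStream` are hypothetical helpers, not part of the SDK, and the only assumption is that each chunk has an optional `content` string as in the example above.

```javascript
// Collects a chunk stream into one string while reporting each token.
// `stream` is any async iterable of { content?: string } chunks.
async function collectStream(stream, onToken = () => {}) {
  let fullText = '';
  for await (const chunk of stream) {
    const token = chunk.content || '';
    if (token) {
      fullText += token;
      onToken(token); // e.g. append to the UI or write to stdout
    }
  }
  return fullText;
}

// Stand-in for verlon.chatStream(...) so the sketch runs without the SDK.
async function* fakeStream() {
  yield { content: 'Hello' };
  yield {}; // chunks without content are skipped
  yield { content: ', world' };
}

collectStream(fakeStream(), (token) => process.stdout.write(token))
  .then((text) => console.log(`\n${text.length} characters received`));
```

In real use you would pass the result of `verlon.chatStream({...})` in place of `fakeStream()`.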

Message Roles

System Messages

Set behavior and context:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'You are a senior software engineer who explains concepts clearly and concisely.'
      },
      {
        role: 'user',
        content: 'What is dependency injection?'
      }
    ]
  }
});

Conversation History

Include previous messages for context:

const conversation = [
  { role: 'user', content: 'What is 2+2?' },
  { role: 'assistant', content: '2+2 equals 4.' },
  { role: 'user', content: 'What about 3+3?' }
];

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: { messages: conversation }
});
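
Because the full history is sent with every request, long conversations grow the prompt. A minimal sketch of one way to manage this follows; `appendTurn` and `trimHistory` are hypothetical helpers that work on plain message arrays, and the cut-off of three messages is an arbitrary choice for illustration.

```javascript
// Returns a new history with the assistant reply and the next user message appended.
function appendTurn(history, assistantContent, nextUserContent) {
  return [
    ...history,
    { role: 'assistant', content: assistantContent },
    { role: 'user', content: nextUserContent }
  ];
}

// Keeps every system message plus the most recent `maxMessages` other messages.
function trimHistory(messages, maxMessages) {
  const system = messages.filter((m) => m.role === 'system');
  const rest = messages.filter((m) => m.role !== 'system');
  const recent = maxMessages > 0 ? rest.slice(-maxMessages) : [];
  return [...system, ...recent];
}

const history = [
  { role: 'system', content: 'You are a helpful assistant.' },
  { role: 'user', content: 'What is 2+2?' },
  { role: 'assistant', content: '2+2 equals 4.' },
  { role: 'user', content: 'What about 3+3?' }
];

const next = appendTurn(history, '3+3 equals 6.', 'And 4+4?');
const trimmed = trimHistory(next, 3);
console.log(trimmed.map((m) => m.role)); // [ 'system', 'user', 'assistant', 'user' ]
```

Note that trimming by message count can separate a tool call from its result, so conversations that use function calling may need a more careful cut.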

Vision (Multimodal)

Send images in messages:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: "What's in this image?" },
          {
            type: 'image_url',
            image_url: {
              url: 'https://example.com/image.jpg',
              detail: 'high'  // 'auto', 'low', or 'high'
            }
          }
        ]
      }
    ]
  }
});

Image Detail Levels

Level	Use Case	Tokens
`low`	Simple identification	~85 tokens
`high`	Detailed analysis	~765-2000 tokens
`auto`	Model decides	Varies

Multiple Images

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Compare these two images' },
          {
            type: 'image_url',
            image_url: { url: 'https://example.com/image1.jpg' }
          },
          {
            type: 'image_url',
            image_url: { url: 'https://example.com/image2.jpg' }
          }
        ]
      }
    ]
  }
});
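
When the number of images varies, building the `content` array by hand gets repetitive. The helper below is a hypothetical convenience, not an SDK function; it only assembles the message shape shown in the examples above and checks `detail` against the three documented levels.

```javascript
const DETAIL_LEVELS = ['auto', 'low', 'high'];

// Builds one user message containing a text part followed by one part per image URL.
function buildImageMessage(text, imageUrls, detail) {
  if (detail !== undefined && !DETAIL_LEVELS.includes(detail)) {
    throw new Error(`detail must be one of: ${DETAIL_LEVELS.join(', ')}`);
  }
  const parts = [{ type: 'text', text }];
  for (const url of imageUrls) {
    // Only include `detail` when the caller asked for a specific level.
    const image_url = detail ? { url, detail } : { url };
    parts.push({ type: 'image_url', image_url });
  }
  return { role: 'user', content: parts };
}

const message = buildImageMessage(
  'Compare these two images',
  ['https://example.com/image1.jpg', 'https://example.com/image2.jpg'],
  'low'
);
console.log(JSON.stringify(message, null, 2));
```

The result can be placed directly in `data.messages`.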

Function Calling

Define tools the model can use:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: "What's the weather in San Francisco?" }
    ],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Get the current weather for a location',
          parameters: {
            type: 'object',
            properties: {
              location: {
                type: 'string',
                description: 'City and state, e.g. San Francisco, CA'
              },
              unit: {
                type: 'string',
                enum: ['celsius', 'fahrenheit'],
                description: 'Temperature unit'
              }
            },
            required: ['location']
          }
        }
      }
    ],
    toolChoice: 'auto'
  }
});

// Check if model wants to call a function
if (response.toolCalls) {
  for (const toolCall of response.toolCalls) {
    console.log('Function:', toolCall.function.name);
    console.log('Arguments:', toolCall.function.arguments);

    // Execute the function
    const args = JSON.parse(toolCall.function.arguments);
    const result = await getWeather(args.location, args.unit);

    // Send result back to model
    // (add to conversation and make another request)
  }
}

Tool Choice Options

Value	Behavior
`auto`	Model decides whether to call functions
`required`	Model must call at least one function
`none`	Model cannot call functions
`{type: 'function', function: {name: 'func'}}`	Force specific function

Complete Function Calling Example

// Initial request
const response1 = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: "What's the weather in Tokyo?" }
    ],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Get weather for a location',
          parameters: {
            type: 'object',
            properties: {
              location: { type: 'string' },
              unit: { type: 'string', enum: ['C', 'F'] }
            },
            required: ['location']
          }
        }
      }
    ]
  }
});

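// Execute the function (stubbed here with static data; a real handler would use args.location)
const toolCall = response1.toolCalls?.[0];
if (!toolCall) throw new Error('Model did not request a tool call');
const args = JSON.parse(toolCall.function.arguments);
const weatherData = { temp: 22, condition: 'sunny' };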

// Send result back
const response2 = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: "What's the weather in Tokyo?" },
      {
        role: 'assistant',
        content: null,
        toolCalls: response1.toolCalls
      },
      {
        role: 'tool',
        toolCallId: toolCall.id,
        content: JSON.stringify(weatherData)
      }
    ]
  }
});

console.log(response2.content);
// "It's currently 22°C and sunny in Tokyo."
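
The example above handles a single, known tool call. With several tools, a small dispatcher can map each call to a handler and produce the `role: 'tool'` messages for the follow-up request. This is a sketch under assumptions: `runToolCalls` and the handler map are hypothetical, and reporting failures back to the model as an `error` field is one possible design choice, not documented behavior.

```javascript
// Runs each tool call against a map of handlers and returns tool messages
// in the { role, toolCallId, content } shape used in the example above.
async function runToolCalls(toolCalls, handlers) {
  const messages = [];
  for (const call of toolCalls || []) {
    const name = call.function.name;
    let result;
    try {
      if (!Object.prototype.hasOwnProperty.call(handlers, name)) {
        throw new Error(`Unknown tool: ${name}`);
      }
      const args = JSON.parse(call.function.arguments); // arguments arrive as a JSON string
      result = await handlers[name](args);
    } catch (err) {
      result = { error: err.message }; // let the model see what went wrong
    }
    messages.push({ role: 'tool', toolCallId: call.id, content: JSON.stringify(result) });
  }
  return messages;
}

// Stand-in data so the sketch runs without the SDK.
const exampleCalls = [
  { id: 'call_1', function: { name: 'get_weather', arguments: '{"location":"Tokyo"}' } },
  { id: 'call_2', function: { name: 'get_time', arguments: '{}' } }
];
const exampleHandlers = {
  get_weather: async ({ location }) => ({ location, temp: 22, condition: 'sunny' })
};

runToolCalls(exampleCalls, exampleHandlers).then((messages) => console.log(messages));
```

Append the returned messages after the assistant message that carries `toolCalls`, then send the conversation back with `verlon.chat()`.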

JSON Mode

Force structured JSON output:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: 'Extract name, email, and age as JSON from: John Doe, john@example.com, 30 years old'
      }
    ],
    responseFormat: { type: 'json_object' }
  }
});

const data = JSON.parse(response.content);
console.log(data);
// { name: "John Doe", email: "john@example.com", age: 30 }

When using JSON mode, include “JSON” in your prompt to improve reliability.
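
Even in JSON mode it is prudent to guard the parse and check for the fields you expect, since a truncated response may not be valid JSON. The helper below is a hypothetical sketch, not part of the SDK; it returns a result object instead of throwing so callers can decide whether to retry.

```javascript
// Parses a response body and verifies that it is an object with the required keys.
function parseJsonContent(content, requiredKeys = []) {
  let data;
  try {
    data = JSON.parse(content);
  } catch (err) {
    return { ok: false, error: `Invalid JSON: ${err.message}` };
  }
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    return { ok: false, error: 'Expected a JSON object' };
  }
  const missing = requiredKeys.filter((key) => !(key in data));
  if (missing.length > 0) {
    return { ok: false, error: `Missing keys: ${missing.join(', ')}` };
  }
  return { ok: true, data };
}

const good = parseJsonContent(
  '{"name":"John Doe","email":"john@example.com","age":30}',
  ['name', 'email', 'age']
);
console.log(good.ok, good.data.age); // true 30

const truncated = parseJsonContent('{"name":"John Doe","email":', ['name']);
console.log(truncated.ok); // false
```

In real use, pass `response.content` as the first argument.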

Parameters

Request Parameters

Parameter	Type	Description
`gateId`	`string`	Gate UUID (required)
`gateName`	`string`	Display name (optional)
`model`	`string`	Override gate’s model (optional)
`metadata`	`object`	Custom metadata (optional)
`sessionId`	`string`	Session identifier for multi-turn context (optional)
`endUserId`	`string`	Stable end-user identifier used for experiment routing (optional)

Data Parameters

Parameter	Type	Default	Description
`messages`	`Message[]`	Required	Conversation history
`temperature`	`number`	1.0	Randomness (0-2)
`maxTokens`	`number`	Model max	Maximum response length
`topP`	`number`	1.0	Nucleus sampling
`stop`	`string[]`	None	Stop sequences
`tools`	`Tool[]`	None	Available functions
`toolChoice`	`string | object`	None	Tool selection strategy
`responseFormat`	`object`	None	Force JSON output

Temperature Guide

Value	Use Case	Behavior
0	Code, math, facts	Deterministic, focused
0.7	General conversation	Balanced
1.0	Creative writing	More random
1.5+	Poetry, brainstorming	Very creative
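
One way to apply this guide consistently is to keep the values in a single lookup rather than scattering numbers across call sites. The preset names below are hypothetical labels for the rows of the table, and the clamp reflects the documented 0-2 range.

```javascript
// Preset names are illustrative; the values come from the guide above.
const TEMPERATURE_PRESETS = new Map([
  ['code', 0],
  ['conversation', 0.7],
  ['creative', 1.0],
  ['brainstorm', 1.5]
]);

// Returns the preset for a task, or the fallback, clamped to the 0-2 range.
function temperatureFor(task, fallback = 0.7) {
  const value = TEMPERATURE_PRESETS.has(task) ? TEMPERATURE_PRESETS.get(task) : fallback;
  return Math.min(2, Math.max(0, value));
}

console.log(temperatureFor('code'));         // 0
console.log(temperatureFor('brainstorm'));   // 1.5
console.log(temperatureFor('unknown-task')); // 0.7
```

The returned number can be passed as `data.temperature`.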

Response

interface ChatResponse {
  id: string;                    // Request ID
  model: string;                 // Model that generated response
  content: string;               // Response text
  finishReason: string;          // Why generation stopped
  cost: number;                  // Request cost in USD
  latency: number;               // Response time in ms
  toolCalls?: ToolCall[];        // Function calls (if any)
  usage?: {
    promptTokens: number;
    completionTokens: number;
    totalTokens: number;
  };
}

Finish Reasons

Reason	Meaning
`completed`	Natural completion
`length_limit`	Hit `maxTokens`
`tool_call`	Model called a function
`filtered`	Content filtered
`error`	Request failed
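
Checking `finishReason` before using `content` helps catch truncated or filtered replies. The sketch below maps each documented reason to a suggested next step; the `action` labels are hypothetical and only illustrate how an application might branch.

```javascript
// Maps a response's finishReason to a coarse next step for the caller.
function checkFinish(response) {
  switch (response.finishReason) {
    case 'completed':
      return { ok: true, action: 'use_content' };
    case 'tool_call':
      return { ok: true, action: 'run_tools' };
    case 'length_limit':
      return { ok: false, action: 'raise_max_tokens_or_continue' };
    case 'filtered':
      return { ok: false, action: 'show_fallback_message' };
    case 'error':
      return { ok: false, action: 'retry_or_report' };
    default:
      // Treat anything undocumented as a failure rather than guessing.
      return { ok: false, action: 'unknown_reason' };
  }
}

console.log(checkFinish({ finishReason: 'completed' }));    // { ok: true, action: 'use_content' }
console.log(checkFinish({ finishReason: 'length_limit' })); // { ok: false, action: 'raise_max_tokens_or_continue' }
```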

Advanced

Override Model

Override the gate’s configured model:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  model: 'gpt-4o',  // Use specific model
  data: {
    messages: [{ role: 'user', content: 'Hello!' }]
  }
});

Custom Metadata

Track requests with metadata:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  metadata: {
    userId: 'user-123',
    sessionId: 'session-456',
    feature: 'chat-support'
  },
  data: {
    messages: [{ role: 'user', content: 'Hello!' }]
  }
});

Stop Sequences

Stop generation at specific strings:

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      { role: 'user', content: 'List 3 fruits' }
    ],
    stop: ['\n4.', "That's all"]
  }
});

End-user ID (experiment routing)

Pass endUserId when you want a stable end-user to consistently land in the same experiment variant across every request they make.

const response = await verlon.chat({
  gateId: 'your-gate-id',
  endUserId: currentUser.id, // stable per end-user
  data: {
    messages: [
      { role: 'user', content: 'Hello' }
    ]
  }
});

Any stable string works: an end-user ID, a session token, or a device fingerprint. Two requests with the same endUserId always hash to the same variant; different values get independent assignments.

If an active experiment on the gate is configured with randomization_unit = 'user' and endUserId is missing, the request will bypass the experiment (and will not be counted as either variant). The experiment’s coverage metric in the dashboard shows how much traffic had the identifier. We’ll email you if 10 consecutive requests miss it.

Don’t use PII directly. Hash your user IDs or use opaque tokens. We hash whatever you pass and never expose it back externally, but it’s good hygiene.
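
A minimal sketch of deriving an opaque identifier before passing it as `endUserId`, using Node's built-in `crypto` module. The helper name and the secret are placeholders; a keyed HMAC is used here instead of a bare hash so that guessable IDs such as email addresses cannot be recovered by hashing a dictionary of candidates.

```javascript
const { createHmac } = require('node:crypto');

// Derives a stable, opaque token from a raw user ID.
// The same (rawId, secret) pair always yields the same 64-character hex string.
function toOpaqueUserId(rawId, secret) {
  return createHmac('sha256', secret).update(String(rawId)).digest('hex');
}

// Placeholder secret for illustration; load a real one from configuration.
const token = toOpaqueUserId('user-123', 'replace-with-a-secret');
console.log(token.length); // 64
```

Keep the secret stable: changing it changes every token, which reassigns users to variants.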

Best Practices

1. Use Streaming for Long Responses

// Good ✅ - Better UX
for await (const chunk of verlon.chatStream({...})) {
  process.stdout.write(chunk.content || '');
}

// Less ideal - User waits
const response = await verlon.chat({...});
console.log(response.content);

2. Always Handle Errors

// Good ✅
try {
  const response = await verlon.chat({...});
  return response.content;
} catch (error) {
  logger.error('Chat failed:', error);
  return 'Sorry, I encountered an error.';
}

// Bad ❌
const response = await verlon.chat({...}); // Unhandled errors
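
For failures that may be transient, a retry with exponential backoff can sit between the call and the fallback message. This page does not say which errors are retryable, so the sketch below retries on any error; `withRetry` is a hypothetical wrapper, and a real implementation would likely inspect the error before retrying.

```javascript
// Calls fn(), retrying up to `retries` times with delays of
// baseDelayMs, 2x baseDelayMs, 4x baseDelayMs, and so on.
async function withRetry(fn, { retries = 2, baseDelayMs = 250 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= retries; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error;
      if (attempt < retries) {
        const delay = baseDelayMs * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError; // all attempts failed
}

// Stand-in for a flaky call so the sketch runs without the SDK.
let attempts = 0;
const flakyCall = async () => {
  attempts += 1;
  if (attempts < 3) throw new Error('temporary failure');
  return { content: 'Hello!' };
};

withRetry(flakyCall, { retries: 2, baseDelayMs: 10 })
  .then((response) => console.log(response.content, `after ${attempts} attempts`));
```

In real use, wrap the request as `withRetry(() => verlon.chat({...}))` inside the `try` block.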

3. Provide System Context

// Good ✅ - Clear behavior
{
  messages: [
    { role: 'system', content: 'You are a helpful math tutor for high school students.' },
    { role: 'user', content: 'Explain calculus' }
  ]
}

// Less effective
{
  messages: [
    { role: 'user', content: 'Explain calculus' }
  ]
}

4. Monitor Costs

const response = await verlon.chat({...});
console.log(`Cost: $${response.cost.toFixed(4)}`);
console.log(`Tokens: ${response.usage?.totalTokens}`);
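
To watch spend across many requests rather than one, the per-response values can be summed in a small tracker. This is an illustrative sketch: `createCostTracker` is a hypothetical helper that relies only on the `cost` and `usage.totalTokens` fields from the response interface above.

```javascript
// Accumulates cost and token usage across responses.
function createCostTracker() {
  let totalCost = 0;
  let totalTokens = 0;
  let requests = 0;
  return {
    // Records one response and returns it so calls can be chained.
    record(response) {
      totalCost += response.cost ?? 0;
      totalTokens += response.usage?.totalTokens ?? 0; // usage is optional
      requests += 1;
      return response;
    },
    summary() {
      const averageCost = requests > 0 ? totalCost / requests : 0;
      return { requests, totalCost, totalTokens, averageCost };
    }
  };
}

const tracker = createCostTracker();
// Stand-in responses so the sketch runs without the SDK.
tracker.record({ cost: 0.25, usage: { promptTokens: 40, completionTokens: 60, totalTokens: 100 } });
tracker.record({ cost: 0.5 }); // no usage reported
console.log(tracker.summary());
// { requests: 2, totalCost: 0.75, totalTokens: 100, averageCost: 0.375 }
```

In real use, wrap each call as `tracker.record(await verlon.chat({...}))`.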

Examples

Customer Support Bot

const response = await verlon.chat({
  gateId: 'your-gate-id',
  metadata: { userId: 'user-123', type: 'support' },
  data: {
    messages: [
      {
        role: 'system',
        content: 'You are a friendly customer support agent. Be helpful and concise.'
      },
      {
        role: 'user',
        content: 'How do I reset my password?'
      }
    ],
    temperature: 0.7,
    maxTokens: 300
  }
});

Code Explanation

const response = await verlon.chat({
  gateId: 'your-gate-id',
  data: {
    messages: [
      {
        role: 'system',
        content: 'You are an expert programmer who explains code clearly.'
      },
      {
        role: 'user',
        content: 'Explain this code:\n```python\n[x**2 for x in range(10)]\n```'
      }
    ],
    temperature: 0.3  // Lower for more focused technical responses
  }
});

Image Analysis

const response = await verlon.chat({
  gateId: 'your-vision-gate-id',
  data: {
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Describe this image in detail' },
          {
            type: 'image_url',
            image_url: {
              url: 'https://example.com/photo.jpg',
              detail: 'high'
            }
          }
        ]
      }
    ]
  }
});

Next Steps

Images

Generate images from text

Embeddings

Create text embeddings

Gates & Routing

How Verlon routes requests

Cost Tracking

Monitor spending
