Streaming
What Is Streaming?
Streaming is a way to make your agent feel more responsive. Instead of waiting for the complete response, your application can consume and display partial results as they arrive.
Streaming Support
Railtracks supports streaming responses from your agent. To enable it, set stream=True when creating your LLM.
When you call the LLM, it returns a generator that you can iterate over:
from railtracks import llm

model = llm.OpenAILLM(model_name="gpt-4o", stream=True)

response = model.chat(llm.MessageHistory([
    llm.UserMessage("Tell me who you are"),
]))

# The response object acts as an iterator: it yields string chunks and
# terminates with the complete message.
for chunk in response:
    print(chunk)
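Since the stream terminates with the complete message, you can capture it by keeping the last item yielded. A minimal sketch; the exact shape of that final item is an assumption, not something the example above confirms:

complete_message = None
for item in response:
    complete_message = item  # after the loop, this holds the final, complete message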
Agent Support
Agents in Railtracks also support streamed responses. When creating your agent, you provide an LLM with streaming enabled:
import railtracks as rt

agent = rt.agent_node(
    llm=rt.llm.OpenAILLM(model_name="gpt-4o", stream=True),
)
The output of the agent is a generator that yields a sequence of string chunks, followed by the complete message.
Usage
import railtracks as rt

agent = rt.agent_node(
    llm=rt.llm.OpenAILLM(model_name="gpt-4o", stream=True),
)

@rt.session
async def main():
    result = await rt.call(agent, rt.llm.MessageHistory([
        rt.llm.UserMessage("Tell me who you are"),
    ]))

    # The result acts as an iterator: it yields string chunks and
    # terminates with the complete message.
    for chunk in result:
        print(chunk)
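To actually run the session you need an entry point. A hypothetical sketch, assuming the @rt.session-decorated coroutine can still be driven by Python's standard event loop; check the Railtracks session docs for the supported entry point:

import asyncio

# Assumption: the decorated function remains awaitable like a normal coroutine.
asyncio.run(main())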
Warning
When using streaming, fully exhaust the returned object within the session. If you consume it outside the session instead, the visualizer suite will not work as expected.
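If you need the streamed output after the session ends, materialize it inside the session first. A minimal sketch, assuming the stream can be exhausted with list() and the session's return value is available to the caller:

@rt.session
async def main():
    result = await rt.call(agent, rt.llm.MessageHistory([
        rt.llm.UserMessage("Tell me who you are"),
    ]))
    # Exhaust the stream here so the visualizer suite behaves correctly;
    # the chunks (ending with the complete message) are then safe to use later.
    return list(result)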
Warning
Streaming is not currently supported for tool-calling agents. See issue #756.