Number Representations & States

"how numbers are stored and used in computers"

SGLang streaming

SGLang supports streaming responses. Here's an example using curl:

code.txt
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "meta-llama/Llama-2-7b-chat-hf",
    "messages": [{"role": "user", "content": "Write a short poem about robots"}],
    "stream": true
  }'

Or in Python:

code.py
for chunk in sg.chat_func().run(stream=True):
    print(chunk.text, end="", flush=True)
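With "stream": true, the curl request above returns the reply as a sequence of server-sent events rather than one JSON body. The following is a minimal sketch of how a client might decode that stream, assuming the endpoint follows the OpenAI-compatible chunk format (lines prefixed with `data:`, text under `choices[].delta.content`, and a final `[DONE]` marker). The helper name `iter_stream_text` and the sample lines are hypothetical and only illustrate the idea; they are not part of SGLang.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from OpenAI-style server-sent event lines.

    Assumption: each event is a single line of the form `data: <json>`,
    and the stream ends with the literal sentinel `data: [DONE]`.
    """
    for raw in lines:
        line = raw.strip()
        # Skip blank separator lines and anything that is not a data event.
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        # Stop at the end-of-stream sentinel.
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # Some chunks carry only a role or a finish reason, no text.
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hypothetical sample of what a server might send, for illustration only.
sample_lines = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Steel "}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "hearts hum."}}]}',
    "",
    "data: [DONE]",
]

poem = "".join(iter_stream_text(sample_lines))
print(poem)
```

In practice the lines would come from the HTTP response instead of a list, for example from `requests.post(url, json=body, stream=True).iter_lines(decode_unicode=True)`. Keeping the parsing separate from the network call makes it easy to check against recorded output.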