Issues: triton-inference-server/server

Issues list

Triton server crash when running a large model with an ONNX/CPU backend investigating The development team is investigating this issue
#7337 opened Jun 10, 2024 by LucasAudebert
Does Triton Server support Dynamic Request Batching for models which have sparse tensors as inputs? enhancement New feature or request investigating The development team is investigating this issue
#7333 opened Jun 7, 2024 by MorrisMLZ
Segmentation fault with multiple requests to triton-vllm bug Something isn't working question Further information is requested
#7332 opened Jun 7, 2024 by tricky61
Segmentation fault (core dumped) - Server version 2.46.0 question Further information is requested
#7330 opened Jun 6, 2024 by rahchuenmonroe
CUDA runtime API error raised when using only CPU on Mac M3 investigating The development team is investigating this issue
#7324 opened Jun 5, 2024 by SunXuan90
When the request is large, the Triton server has a very high TTFT investigating The development team is investigating this issue
#7316 opened Jun 4, 2024 by Godlovecui
Memory usage over 100% with decoupled DALI video model investigating The development team is investigating this issue
#7315 opened Jun 3, 2024 by wq9
Single Docker layer is too large investigating The development team is investigating this issue
#7314 opened Jun 3, 2024 by ShuaiShao93
Triton malloc failure question Further information is requested
#7308 opened May 31, 2024 by MouseSun846
Add TT-Metalium as a backend enhancement New feature or request
#7305 opened May 30, 2024 by jvasilje
Why is my model in an ensemble receiving out-of-order input? question Further information is requested
#7303 opened May 30, 2024 by Joenhle
How does Triton implement one instance to handle multiple requests simultaneously? investigating The development team is investigating this issue
#7295 opened May 29, 2024 by SeibertronSS
Backend support for .keras files?
#7289 opened May 28, 2024 by chriscarollo
Support histogram custom metric in Python backend enhancement New feature or request
#7287 opened May 28, 2024 by ShuaiShao93