Efficient, Flexible and Portable Structured Generation
Fast inference engine for Transformer models
LLM inference in C/C++