I’ve been experimenting with MLflow’s Prompt Engineering UI, which lets you do no-code prompt tuning across multiple LLMs. While it supports providers like OpenAI out of the box, I wanted to try it with Japanese open-source models from the LLM-jp project.

This repo shows how to serve these models locally using MLflow’s pyfunc model interface, expose them via the MLflow AI Gateway, and compare prompt performance through the UI.
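For context, wrapping a Hugging Face model in MLflow’s pyfunc interface generally looks something like the sketch below. This is a minimal illustration, not the repo’s actual wrapper: the `llm-jp/llm-jp-3-3.7b-instruct` checkpoint name, the `prompt` input column, and the generation settings are my assumptions.

```python
# Minimal sketch: serving an LLM-jp model as an MLflow pyfunc model.
# Checkpoint name, input schema, and generation settings are assumptions.
import mlflow.pyfunc
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


class LLMJPModel(mlflow.pyfunc.PythonModel):
    def load_context(self, context):
        # Assumed Hugging Face checkpoint; any LLM-jp instruct model should work.
        name = "llm-jp/llm-jp-3-3.7b-instruct"
        self.tokenizer = AutoTokenizer.from_pretrained(name)
        self.model = AutoModelForCausalLM.from_pretrained(
            name, torch_dtype=torch.bfloat16, device_map="auto"
        )

    def predict(self, context, model_input):
        # Assumes the serving endpoint sends a DataFrame with a "prompt" column.
        outputs = []
        for prompt in model_input["prompt"]:
            ids = self.tokenizer(prompt, return_tensors="pt").to(self.model.device)
            generated = self.model.generate(**ids, max_new_tokens=256)
            outputs.append(
                self.tokenizer.decode(generated[0], skip_special_tokens=True)
            )
        return outputs


# Log the model so `mlflow models serve` can host it locally.
with mlflow.start_run():
    mlflow.pyfunc.log_model(
        "model",
        python_model=LLMJPModel(),
        pip_requirements=["torch", "transformers"],
    )
```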

It includes a working setup with:

- Hugging Face LLM-jp models (e.g. llm-jp-3-3.7b-instruct)
- MLflow Model Serving
- MLflow AI Gateway
- Prompt Engineering UI
- Streamlit UI for experiment tracking
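Once the served model is registered as a gateway endpoint, you can also query it programmatically with MLflow’s deployments client. A hedged sketch: the gateway URL, endpoint name, and prompt here are illustrative assumptions, not values from the repo.

```python
# Sketch of querying a gateway endpoint via the MLflow deployments client.
# The URL, endpoint name, and prompt are placeholder assumptions.
from mlflow.deployments import get_deploy_client

client = get_deploy_client("http://localhost:7000")  # assumed gateway address
response = client.predict(
    endpoint="llm-jp-completions",  # hypothetical endpoint name from the gateway config
    inputs={"prompt": "日本の首都はどこですか？"},  # "Where is the capital of Japan?"
)
print(response)
```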

GitHub: https://github.com/suzuki-2001/mlflow-llm-jp-integration

Japanese article explaining the project: https://zenn.dev/shosuke_13/articles/21d304b5f80e00