[MLOps] MLflow Hands-On

Hands-on practice with MLflow for MLOps

Setting Up MLflow

# Install MLflow
!pip install mlflow

Run the following in a terminal to start the MLflow UI:

mlflow ui

INFO:waitress:Serving on http://127.0.0.1:5000

Open the address above in a browser to access the UI.

Machine Learning

Automatic Logging

When training with scikit-learn, you can call mlflow.sklearn.autolog(). This autologging integration automatically records parameters, metrics, and the fitted model.

See mlflow.autolog() in the MLflow Python API documentation for details.

import mlflow
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.ensemble import RandomForestRegressor

# Tracking store path
mlflow.set_tracking_uri("file:./mlruns")

# Set the experiment
mlflow.set_experiment("diabetes-rf")

# Enable autologging
mlflow.sklearn.autolog()

data_diabetes = load_diabetes()
X_train, X_test, y_train, y_test = train_test_split(data_diabetes.data, data_diabetes.target, test_size = 0.2, random_state = 1004)

reg = GridSearchCV(RandomForestRegressor(), {"n_estimators": [50,100], "max_depth": [2,5,10], "max_features": [5,8,10]})
reg.fit(X_train, y_train)

predictions = reg.predict(X_test)
2025/12/22 17:27:09 INFO mlflow.utils.autologging_utils: Created MLflow autologging run with ID 'b854a7f39b034877969c0579904195f9', which will track hyperparameters, performance metrics, model artifacts, and lineage information for the current sklearn workflow
2025/12/22 17:27:20 INFO mlflow.sklearn.utils: Logging the 5 best runs, 13 runs will be omitted.

Manual Logging

  • mlflow.log_metric(), mlflow.log_metrics(): log a named metric; the plural form takes a dict of metrics.
  • mlflow.log_param(), mlflow.log_params(): log a named parameter; the plural form takes a dict of parameters.

from sklearn.metrics import mean_absolute_error, mean_squared_error

def train_model_with_hyperparameters(n_estimator, max_depth, max_feature):
    with mlflow.start_run():
        model = RandomForestRegressor(n_estimators=n_estimator, max_depth=max_depth, max_features=max_feature)
        model.fit(X_train, y_train)
        y_pred = model.predict(X_test)
        mae = mean_absolute_error(y_test, y_pred)
        mse = mean_squared_error(y_test, y_pred)
        mlflow.log_param("n_estimators", n_estimator)
        mlflow.log_param("max_depth", max_depth)
        mlflow.log_param("max_features", max_feature)
        mlflow.log_metric("mae_on_test", mae)
        mlflow.log_metric("mse_on_test", mse)
        mlflow.log_metric("rmse_on_test", mse**0.5)
n_estimators = [50,100]
max_depths = [2,5,10]
max_features = [5,8,10]

for n_estimator in n_estimators:
    for max_depth in max_depths:
        for max_feature in max_features:
            train_model_with_hyperparameters(n_estimator, max_depth, max_feature)

Experiments

[Screenshot: log]

  • Review the generated logs and select a model.

[Screenshots: register, create_register]

  • Save the selected model to the Model Registry.

[Screenshot: models_page]

  • A new model is created as Version 1.

※ Note: the workflow differs between older and newer versions of MLflow.

New versions

  • Deploy models using Tags and Aliases.

Old versions

  • Manage and deploy via stages: Archived, Staging, Production.
  • Archived: stored state
  • Staging: QA/testing state (staged for review)
  • Production: deployed state

Model Serving

  • The process by which a deployed model receives prediction requests, runs inference, and returns the results.
  • Covers both request handling and model inference.

Method 1

import mlflow

mlflow.set_tracking_uri("http://127.0.0.1:5000")
model_url = "models:/RandomForestRegressor/1"

loaded_model = mlflow.sklearn.load_model(model_url)
loaded_model
RandomForestRegressor(max_depth=5, max_features=8, n_estimators=50)
y_pred = loaded_model.predict(X_test)
mean_squared_error(y_pred, y_test)
3041.9044368258315

Method 2

Serve the model over HTTP on a given port via the CLI:

mlflow models serve -m models:/RandomForestRegressor/1 -p 5001 --no-conda
import pandas as pd
import requests

host = "127.0.0.1"
url = f"http://{host}:5001/invocations"

X_test_df = pd.DataFrame(X_test, columns=data_diabetes.feature_names)
json_data = {"dataframe_split": X_test_df[:10].to_dict(orient="split")}

response = requests.post(url, json=json_data)
print(f"\nPyfunc 'predict_interval':\n${response.json()}")
Pyfunc 'predict_interval':
${'predictions': [116.08085576971826, 100.81009133456911, 184.35325437343346, 104.71531717642706, 156.09652462733177, 174.07622265563091, 125.68893595239533, 140.11980756904265, 81.71646235903134, 200.24981584985514]}

Cloud Deployment

AWS Sagemaker

mlflow sagemaker build-and-push-container -m models:/<model_id>
mlflow sagemaker deploy {parameters}

AzureML

mlflow azureml export -m {model_path} -o test-output

Deep Learning

mlflow.pytorch.autolog() is also available, but manual logging is recommended.

import mlflow

from torch.utils.data import Dataset, DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
from torchvision.io import read_image
import torch
import torch.nn as nn

import pandas as pd
import os
def seed_everything(seed=42):
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False
mlflow.set_tracking_uri("file:./mlruns")

mlflow.set_experiment("FashionCNN")

# mlflow.pytorch.autolog()
<Experiment: artifact_location=('file:///c:/Users/PC/Desktop/실습/mlruns/107038616652530463'), creation_time=1766485896982, experiment_id='107038616652530463', last_update_time=1766485896982, lifecycle_stage='active', name='FashionCNN', tags={'mlflow.experimentKind': 'custom_model_development'}>
training_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=False,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=False,
    transform=ToTensor()
)
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
print(device)
cuda:0
class FashionDataset(Dataset):
    def __init__(self, annotations_file, img_dir, transform=None, target_transform=None):
        self.img_labels = pd.read_csv(annotations_file, names=["file_name", "label"])
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        return len(self.img_labels)
    
    def __getitem__(self, idx):
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label
train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)
a = next(iter(train_dataloader))
a[0].size()
torch.Size([64, 1, 28, 28])
class FashionCNN(nn.Module):
    def __init__(self):
        super(FashionCNN, self).__init__()

        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=32, kernel_size=3, padding=1),
            nn.BatchNorm2d(32),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3),
            nn.BatchNorm2d(64),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2)
        )
        self.fc1 = nn.Linear(in_features=64*6*6, out_features=600)
        self.drop = nn.Dropout(0.25)
        self.fc2 = nn.Linear(in_features=600, out_features=120)
        self.fc3 = nn.Linear(in_features=120, out_features=10)

    def forward(self, x):
        out = self.layer1(x)
        out = self.layer2(out)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = self.drop(out)
        out = self.fc2(out)
        out = self.fc3(out)

        return out
params = {
    "epochs": 5,
    "learning_rate": 0.001,
    "batch_size": 64
}
model = FashionCNN()
model.to(device)

criterion = nn.CrossEntropyLoss()

optimizer = torch.optim.Adam(model.parameters(), lr=params["learning_rate"])
print(model)

FashionCNN(
  (layer1): Sequential(
    (0): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (1): BatchNorm2d(32, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (layer2): Sequential(
    (0): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
    (1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU()
    (3): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  )
  (fc1): Linear(in_features=2304, out_features=600, bias=True)
  (drop): Dropout(p=0.25, inplace=False)
  (fc2): Linear(in_features=600, out_features=120, bias=True)
  (fc3): Linear(in_features=120, out_features=10, bias=True)
)
@torch.no_grad()
def compute_accuracy_from_loader(model, dataloader, device):
    model.eval()
    correct, total = 0, 0
    for data, target in dataloader:
        data, target = data.to(device), target.to(device)
        logits = model(data)
        pred = logits.argmax(dim=1)
        correct += (pred == target).sum().item()
        total += target.size(0)
    return correct / max(total, 1)
best_model_id = None
best_test_acc = -1.0

with mlflow.start_run() as run:
    mlflow.log_params(params)

    for epoch in range(params["epochs"]):
        model.train()
        for data, target in train_dataloader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            loss = criterion(output, target)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()

        model_info = mlflow.pytorch.log_model(
                pytorch_model=model,
                name=f"checkpoint-{epoch}",
                step=epoch,
                pip_requirements=[
                    "torch==2.6.0+cu124",
                    "torchvision==0.21.0+cu124",
                    "numpy",
                ],
            )
        
        train_acc = compute_accuracy_from_loader(model, train_dataloader, device)
        test_acc = compute_accuracy_from_loader(model, test_dataloader, device)

        mlflow.log_metric(
                key="train_accuracy",
                value=train_acc,
                step=epoch,
                model_id=model_info.model_id,
            )
        mlflow.log_metric(
            key="test_accuracy",
            value=test_acc,
            step=epoch,
            model_id=model_info.model_id,
        )

        if test_acc > best_test_acc:
            best_test_acc = test_acc
            best_model_id = model_info.model_id

    ranked_checkpoints = mlflow.search_logged_models(
        filter_string=f"source_run_id='{run.info.run_id}'",
        order_by=[{"field_name": "metrics.test_accuracy", "ascending": False}],
        output_format="list",
    )

    best_checkpoint = ranked_checkpoints[0]
    print(f"Best checkpoint: {best_checkpoint.name}")
    print(f"Accuracy: {best_checkpoint.metrics[0].value}")

    best_model_uri = f"models:/{best_model_id}"

    try:
        mv = mlflow.register_model(model_uri=best_model_uri, name="FasionCNN_Model")
        print("FashionCNN_model : ", mv.version)
    except Exception as e:
        print("error : ", e)
Best checkpoint: checkpoint-4
Accuracy: 0.9048
FashionCNN_model :  2


Registered model 'FasionCNN_Model' already exists. Creating a new version of this model...
Created version '2' of model 'FasionCNN_Model'.
This article is licensed under the copyright holder's CC BY 4.0 license.