Using for Loops with Pandas in Python

August 25, 2025

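Pandas works with rows and columns, so for loops tend to appear in situations like these:

When you want to process DataFrame rows one at a time
When you want to create a new column
When you want to classify data with conditional branches
When complex logic requires per-row processing

However, Pandas and NumPy are best at processing whole columns at once through "vectorization", so as a rule for loops should be limited to learning purposes, small datasets, and cases where the logic is too complex to express any other way.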

Common ways to write for loops in Pandas

Using `iterrows()`

This yields each row as an (index, Series) pair.

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Row loop with iterrows() ===")
for index, row in df.iterrows():
    print(index, row["name"], row["age"])

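Characteristics

Simple and easy to understand
But slow (inefficient beyond tens of thousands of rows)
Each row becomes a Series, so values can be upcast to a common dtype (for example, int to float)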
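
The dtype change is easiest to see with a small example. The sketch below uses a made-up DataFrame (the `id` and `score` columns are assumptions for illustration) with one integer column and one float column, and compares what `iterrows()` and `itertuples()` return for the first row.

```python
import pandas as pd

# Hypothetical data: one integer column and one float column.
df = pd.DataFrame({"id": [1, 2], "score": [1.5, 2.5]})

# iterrows() packs each row into a single Series, so the integer "id"
# is upcast to float to share one dtype with "score".
_, first_row = next(df.iterrows())
id_from_iterrows = first_row["id"]

# itertuples() keeps each column's value separate, so "id" stays an integer.
first_tuple = next(df.itertuples())
id_from_itertuples = first_tuple.id

print(type(id_from_iterrows).__name__, id_from_iterrows)
print(type(id_from_itertuples).__name__, id_from_itertuples)
```

If the original dtypes matter inside the loop, this is one reason to reach for `itertuples()` instead.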

Using `itertuples()`

This yields each row as a namedtuple. It is faster than iterrows().

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Row loop with itertuples() ===")
for row in df.itertuples():
    print(row.Index, row.name, row.age)

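Characteristics

Fast, and column dtypes tend to be preserved
Columns are accessible as attributes (e.g. row.age)
The better choice when you must loop over a large dataset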
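
Attribute access has one catch worth knowing: a column name that is not a valid Python identifier is renamed to a positional field. The following sketch assumes a made-up column called `first name` to show this, along with the `index=False` and `name=None` options of `itertuples()`.

```python
import pandas as pd

# Hypothetical data: "first name" is not a valid Python identifier.
df = pd.DataFrame({"first name": ["Alice", "Bob"], "age": [25, 30]})

for row in df.itertuples():
    # Fields are Index, then the columns; the invalid name becomes "_1".
    print(row._fields, row._1, row.age)

# index=False drops the index field; name=None yields plain tuples.
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows)
```

Plain tuples are handy when you only need to unpack values and do not care about field names.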

`iloc` / `loc` with a for loop

This accesses rows directly by position (`iloc`) or by index label (`loc`).

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

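print("=== Row loop with iloc / loc ===")
for i in range(len(df)):
    # iloc takes the integer position; loc takes the index label.
    # The two coincide here only because the index is the default 0, 1, 2.
    print(df.iloc[i]["name"], df.loc[df.index[i], "age"])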

Characteristics

Lets you index rows much like a Python list
But tends to be slow
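
The difference between `loc` and `iloc` only shows up once the index is no longer 0, 1, 2. As a minimal sketch, assuming a made-up DataFrame whose index labels are 10, 20, 30, looping over `range(len(df))` works with `iloc` but fails with `loc`:

```python
import pandas as pd

# Hypothetical data with index labels 10, 20, 30 instead of 0, 1, 2.
df = pd.DataFrame(
    {"name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35]},
    index=[10, 20, 30],
)

# iloc is position-based, so 0..len(df)-1 is always valid.
names_by_position = [df.iloc[i]["name"] for i in range(len(df))]

# loc is label-based, so position 0 is not a valid key here.
try:
    df.loc[0, "name"]
    loc_failed = False
except KeyError:
    loc_failed = True

# Iterating over df.index gives labels that loc accepts.
ages_by_label = [df.loc[label, "age"] for label in df.index]

print(names_by_position, loc_failed, ages_by_label)
```

This situation is common after filtering or sorting, because the surviving rows keep their original labels.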

Practical for loop examples

Creating a new column

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Adding a new column with a for loop ===")
df["age_plus_10"] = None  # placeholder column; note that this leaves the column with object dtype
for i in range(len(df)):
    df.loc[i, "age_plus_10"] = df.loc[i, "age"] + 10

print(df)
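
If a loop really is needed to build a column, one common alternative is to collect the values in a Python list and assign the list once, rather than writing into the DataFrame cell by cell. The sketch below is one way to do that under the same sample data; the `values` list is just an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

# Collect results in a plain Python list while looping.
values = []
for row in df.itertuples(index=False):
    values.append(row.age + 10)

# Assign the whole list once; pandas infers an integer dtype.
df["age_plus_10"] = values

print(df)
print(df["age_plus_10"].dtype)
```

This avoids the `None` placeholder, so the new column ends up with an integer dtype instead of object.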

Adding a column with conditional branching

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Conditional branching with a for loop ===")
df["status"] = None
for index, row in df.iterrows():
    if row["age"] > 30:
        df.loc[index, "status"] = "senior"
    else:
        df.loc[index, "status"] = "junior"

print(df)

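Avoiding for loops (for speed)

In Pandas, the standard approach is to rewrite loops as vectorized operations.

Using `apply()`

`apply()` still calls a Python function once per element, so it is tidier than an explicit loop but not truly vectorized.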

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Processing with apply ===")
df["status"] = df["age"].apply(lambda x: "senior" if x > 30 else "junior")

print(df)
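
For logic that needs more than one column of the same row, `apply()` can also be called on the DataFrame with `axis=1`. The following is a rough sketch; `label_row` and the `label` column are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

def label_row(row):
    """Hypothetical rule combining two columns of one row."""
    if row["age"] > 30:
        return f"{row['name']} (senior)"
    return f"{row['name']} (junior)"

# axis=1 passes each row to the function as a Series.
df["label"] = df.apply(label_row, axis=1)

print(df)
```

Row-wise `apply()` keeps the code compact, but it runs the function once per row, so it should not be expected to be much faster than an explicit loop.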

Using `np.where()` (faster still)

import pandas as pd
import numpy as np

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Processing with np.where ===")
df["status"] = np.where(df["age"] > 30, "senior", "junior")

print(df)
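
`np.where()` handles a two-way split. When there are three or more categories, `np.select()` is one possible extension of the same idea. The sketch below assumes a made-up third category, "middle", that is not part of the examples above.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

# Conditions are checked in order; the first match wins.
conditions = [df["age"] >= 35, df["age"] >= 30]
choices = ["senior", "middle"]  # "middle" is a hypothetical extra category

df["status"] = np.select(conditions, choices, default="junior")
print(df)
```

Rows that match none of the conditions fall through to the `default` value.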

Vectorize numeric calculations

import pandas as pd

df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35]
})

print("=== Numeric calculation with vectorization ===")
df["age_plus_10"] = df["age"] + 10

print(df)

This is often tens of times faster than a for loop, and the gap generally grows with the number of rows.
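
To check the difference on your own machine, a rough timing comparison along these lines can help. The data size, the random ages, and the function names are all assumptions for illustration, and the measured times will vary by environment.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical test data: 10,000 random ages.
rng = np.random.default_rng(0)
df = pd.DataFrame({"age": rng.integers(18, 80, size=10_000)})

def with_loop():
    # Row-by-row loop using itertuples().
    return [row.age + 10 for row in df.itertuples(index=False)]

def with_vectorization():
    # One operation over the whole column.
    return df["age"] + 10

# Both approaches should produce identical values.
same_result = with_loop() == with_vectorization().tolist()

loop_time = timeit.timeit(with_loop, number=10)
vector_time = timeit.timeit(with_vectorization, number=10)
print(f"same result: {same_result}")
print(f"loop: {loop_time:.4f}s, vectorized: {vector_time:.4f}s")
```

Confirming that both versions return the same values before comparing times guards against "optimizing" into a different result.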