I am trying to use TensorFlow and Keras for a prediction model.
I first read a dataset with shape (7709, 58), then normalize it:
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(np.array(dataset))
Then I split the data into training and test sets:
train_dataset = dataset[:5000]
test_dataset = dataset[5000:]
I prepare these datasets:
train_dataset.describe().transpose()
test_dataset.describe().transpose()
train_features = train_dataset.copy()
test_features = test_dataset.copy()
train_labels = train_features.pop('outcome')
test_labels = test_features.pop('outcome')
Then I build the model:
def build_and_compile_model(norm):
    model = keras.Sequential([
        norm,
        layers.Dense(64, activation='relu'),
        layers.Dense(64, activation='relu'),
        layers.Dense(1)
    ])
    model.compile(loss='mean_squared_error', metrics=['mean_squared_error'],
                  optimizer=tf.keras.optimizers.Adam(0.001))
    return model
dnn_model = build_and_compile_model(normalizer)
Then, when I try to fit the model, it fails:
history = dnn_model.fit(
    test_features,
    test_labels,
    validation_split=0.2, epochs=50)
It gives the following error:
ValueError: in user code:
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1021, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1010, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 1000, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py", line 859, in train_step
y_pred = self(x, training=True)
File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py", line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
ValueError: Exception encountered when calling layer "normalization_7" (type Normalization).
Dimensions must be equal, but are 57 and 58 for '{{node sequential_7/normalization_7/sub}} = Sub[T=DT_FLOAT](sequential_7/Cast, sequential_7/normalization_7/sub/y)' with input shapes: [?,57], [1,58].
Does anyone know what the problem is and how I can fix it? Thanks!
Answer 1
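Because of pop (https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.pop.html), you lose the outcome column from the DataFrame: the features end up with 57 columns, while the normalizer was adapted to 58. Try extracting the column with indexing instead: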
train_labels = train_features['outcome']
test_labels = test_features['outcome']
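To make the difference between pop and plain indexing concrete, here is a minimal sketch on a tiny made-up DataFrame (the column names a and b and the values are illustrative assumptions, not the asker's data):

```python
import pandas as pd

# Hypothetical toy frame standing in for the real dataset.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "outcome": [0, 1]})

by_index = df.copy()
labels_indexed = by_index["outcome"]   # indexing: the column stays in the frame

by_pop = df.copy()
labels_popped = by_pop.pop("outcome")  # pop: the column is removed in place

print(by_index.shape)  # (2, 3)
print(by_pop.shape)    # (2, 2)
```

Note that with indexing the outcome column remains in the feature frame, so a model trained on it would receive the label as one of its inputs.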
Answer 2
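See https://stackoverflow.com/a/72256863/16350154. However, I would keep pop and move the normalizer.adapt call to after the pop. That way you do not fit the normalizer on the labels (which makes no sense), and you do not use the labels as features (which could be bad).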
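A rough sketch of that reordering might look like the following. The data here is random and only mimics the 57 features plus outcome layout from the question, the epoch count is shortened, and adapting on the training features alone (rather than the whole dataset) is my own assumption about good practice, not something stated in the question:

```python
import numpy as np
import pandas as pd
import tensorflow as tf

# Synthetic stand-in for the real data: 57 feature columns plus 'outcome'.
rng = np.random.default_rng(0)
columns = [f"f{i}" for i in range(57)] + ["outcome"]
dataset = pd.DataFrame(rng.normal(size=(200, 58)), columns=columns)

train_features = dataset[:150].copy()
test_features = dataset[150:].copy()

# pop removes 'outcome' in place, leaving 57 feature columns.
train_labels = train_features.pop("outcome")
test_labels = test_features.pop("outcome")

# Convert to float32 arrays so Keras receives plain NumPy input.
x_train = np.array(train_features, dtype="float32")
y_train = np.array(train_labels, dtype="float32")
x_test = np.array(test_features, dtype="float32")

# Adapt only AFTER the pop, so the statistics cover 57 columns, not 58.
normalizer = tf.keras.layers.Normalization(axis=-1)
normalizer.adapt(x_train)

model = tf.keras.Sequential([
    normalizer,
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(1),
])
model.compile(loss="mean_squared_error",
              optimizer=tf.keras.optimizers.Adam(0.001))

# Fit on the training split; 2 epochs just to show that it runs.
history = model.fit(x_train, y_train,
                    validation_split=0.2, epochs=2, verbose=0)
predictions = model.predict(x_test, verbose=0)
```

With this ordering the normalizer and the model input both have 57 columns, so the shape mismatch from the traceback should not occur.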