Description
System information
macOS 11.6 (MacBook)
numpy==1.19.5
pathlib2==2.3.5
python==3.8.8
tensorflow==2.7.0
transformers==4.2.0
Describe the current behavior
During `.fit()`, I receive the following warnings:
WARNING:tensorflow:AutoGraph could not transform <bound method Socket.send of <zmq.Socket(zmq.PUSH) at XXX >> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: module, class, method, function, traceback, frame, or code object was expected, got cython_function_or_method
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
and
The parameters `output_attentions`, `output_hidden_states` and `use_cache` cannot be updated when calling a model.They have to be set to True/False in the config object (i.e.: `config=XConfig.from_pretrained('name', output_attentions=True)`).
The parameter `return_dict` cannot be set in graph mode and will always be set to `True`.
I would not have submitted this except for the "please report" phrase in the first warning. Happy to close quickly if this is already resolved or trivial.
Describe the expected behavior
Neither of these warnings should appear before training starts.
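As a stopgap for the first warning, a logging filter can keep it out of the training output. The `WARNING:tensorflow:` prefix in the message above suggests it is emitted through the standard-library logger named `tensorflow`. The sketch below relies on that assumption; the class and function names are hypothetical, and it only hides the message rather than addressing its cause.

```python
import logging


class DropAutoGraphWarning(logging.Filter):
    """Discard log records that contain the AutoGraph 'could not transform' text."""

    # Marker copied from the warning text quoted in this report.
    MARKER = "AutoGraph could not transform"

    def filter(self, record: logging.LogRecord) -> bool:
        # Returning False tells the logging module to drop the record.
        return self.MARKER not in record.getMessage()


def silence_autograph_warning(logger_name: str = "tensorflow") -> logging.Filter:
    """Attach the filter to the named logger and return it so it can be removed later.

    Assumption: the warning is logged directly on the logger called "tensorflow".
    """
    flt = DropAutoGraphWarning()
    logging.getLogger(logger_name).addFilter(flt)
    return flt
```

Calling `silence_autograph_warning()` once before `model.fit(...)` should be enough; other messages on the same logger still pass through, and `logging.getLogger("tensorflow").removeFilter(flt)` undoes it.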
Standalone code to reproduce the issue
# train_texts is a list of strings
# train_labels is a list of integers (0 or 1)
# test_texts and test_labels are assumed to be defined the same way
import tensorflow as tf
from transformers import AutoTokenizer, TFAutoModelForSequenceClassification
from sklearn.model_selection import train_test_split
train_texts, val_texts, train_labels, val_labels = train_test_split(train_texts, train_labels, test_size=.2)
checkpoint = "cardiffnlp/twitter-xlm-roberta-base-sentiment"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
train_encodings = tokenizer(train_texts, truncation=True, padding=True, max_length=512, return_tensors='tf')['input_ids']
val_encodings = tokenizer(val_texts, truncation=True, padding=True, max_length=512, return_tensors='tf')['input_ids']
test_encodings = tokenizer(test_texts, truncation=True, padding=True, max_length=512, return_tensors='tf')['input_ids']
train_dataset = tf.data.Dataset.from_tensor_slices((
    train_encodings,
    train_labels
))
val_dataset = tf.data.Dataset.from_tensor_slices((
    val_encodings,
    val_labels
))
test_dataset = tf.data.Dataset.from_tensor_slices((
    test_encodings,
    test_labels
))
model = TFAutoModelForSequenceClassification.from_pretrained(checkpoint)
optimizer = tf.keras.optimizers.Adam(learning_rate=5e-5)
model.compile(optimizer=optimizer, loss=model.compute_loss)
## Warnings occur here:
model.fit(train_dataset.shuffle(1000).batch(16), epochs=3, batch_size=16)
Other info / logs
Logs with `AUTOGRAPH_VERBOSITY=10` leading up to the warning:
tf_verbose_log.txt