Questions tagged [bert]
BERT stands for Bidirectional Encoder Representations from Transformers. It is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.
345 questions
2 votes · 2 answers · 53 views
BERT + CNN Model Underfitting for Binary Text Classification: How to Improve?
I'm working on a binary text classification task using a BERT + CNN model. However, based on the loss and accuracy graphs during training, it seems that the model is underfitting, and I'm not seeing ...
0 votes · 1 answer · 49 views
Find the correlation between two lists of texts
Let's say that I have some lists of texts such as:
...
0 votes · 2 answers · 105 views
Calculate the correlation of two lists of embeddings
I have two lists of sentences
...
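A common first step for comparing two lists of sentence embeddings is a pairwise cosine-similarity matrix. A minimal sketch with NumPy, using tiny toy vectors as stand-ins for real sentence embeddings:

```python
import numpy as np

def cosine_similarity_matrix(a, b):
    """Pairwise cosine similarity between rows of a and rows of b."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a_norm = a / np.linalg.norm(a, axis=1, keepdims=True)
    b_norm = b / np.linalg.norm(b, axis=1, keepdims=True)
    return a_norm @ b_norm.T  # shape: (len(a), len(b))

# Toy 3-dimensional "embeddings" standing in for real sentence vectors.
emb_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
emb_b = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
sims = cosine_similarity_matrix(emb_a, emb_b)
```

Entry `sims[i][j]` is the similarity between sentence `i` of the first list and sentence `j` of the second; the diagonal is what most "correlation of paired lists" setups need.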
0 votes · 1 answer · 35 views
Handle text column with PyTorch
I'm new to ML, so this question may be naive.
I have a dataset with multiple numeric columns and one text column. The text is just one sentence.
So I want to use all the data available for classification. But I don'...
0 votes · 0 answers · 16 views
How does token_type_id affect self-attention and other mechanisms in BERT?
I know that BERT uses an NSP (Next Sentence Prediction) task for pre-training. Its inputs are two sentences separated by a [SEP] token, with a different token type ID for each sentence.
My downstream task is to build a ...
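For reference, token_type_ids do not change the attention mechanism itself: in the original BERT they select a segment embedding that is simply added to the token and position embeddings before the first self-attention layer. A minimal sketch of how a sentence pair and its segment IDs are assembled (plain Python, no tokenizer dependency):

```python
def build_bert_inputs(tokens_a, tokens_b):
    """Assemble a BERT-style sentence pair: [CLS] A [SEP] B [SEP].
    token_type_id is 0 for the first segment (including [CLS] and the
    first [SEP]) and 1 for the second segment (including its [SEP])."""
    tokens = ["[CLS]"] + tokens_a + ["[SEP]"] + tokens_b + ["[SEP]"]
    token_type_ids = [0] * (len(tokens_a) + 2) + [1] * (len(tokens_b) + 1)
    return tokens, token_type_ids

tokens, type_ids = build_bert_inputs(["how", "are", "you"], ["fine", "thanks"])
```

With a Hugging Face tokenizer, `tokenizer(sentence_a, sentence_b)` produces the equivalent `token_type_ids` field automatically.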
1 vote · 0 answers · 29 views
Sampling multiple masked tokens through Metropolis–Hastings
I'm trying to replicate the findings of the publication "Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings" for obtaining the joint distribution ...
0 votes · 0 answers · 66 views
Using multiple text inputs for one output with RoBERTa/DistilBERT
In a current project I want to fine-tune a RoBERTa/DistilBERT model for text classification.
The model should take two text input features, each limited to around 280 characters, and generate a ...
0 votes · 0 answers · 33 views
Best model for enforcing corporate naming conventions
I'm working on a project (Python) to enforce the company naming convention for products on product lists provided by clients/suppliers. I have a list of company names (standardised names), and those ...
0 votes · 0 answers · 27 views
NLP model for word recovery (analogous to BERT, but for letters)
I am working on restoring words in text where some letters are missing. For example (restore words whose vowels were removed): Hll wrld -> Hello world; n ltrntv ssssmnt sggsts -...
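The corruption step this question describes is easy to pin down, which helps when generating training pairs for such a model. A tiny illustration (the recovery direction is the hard, model-driven part; this only shows the forward corruption):

```python
def drop_vowels(text):
    """Corruption step for the word-recovery task: delete vowels,
    keeping spaces, consonants, and punctuation unchanged."""
    return "".join(ch for ch in text if ch.lower() not in "aeiou")

corrupted = drop_vowels("Hello world")  # -> "Hll wrld"
```

Pairs of (corrupted, original) strings produced this way can serve as the (input, target) examples for a character-level denoising model.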
0 votes · 0 answers · 32 views
How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?
I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized
Unfortunately, CXR-BERT-...
0 votes · 0 answers · 26 views
Fine-tuning pretrained model on 2 tasks with 2 labeled dataset
I am having difficulty using BERT for a sentiment analysis task that handles both aspect-based sentiment analysis (ABSA) and comment sentiment analysis. I know that using two separate classification ...
3 votes · 0 answers · 69 views
Weird behaviour when using RoBERTa for text classification
I have a dataset with around 70 classes, and the dataset is largely balanced, with ~150 samples per class. I am fine-tuning RoBERTa-base for 4 epochs with a ...
2 votes · 0 answers · 162 views
Use text embeddings to map job descriptions to ESCO occupations
I'm trying to build a model to map job descriptions to ESCO occupations, a taxonomy for job titles. Every ESCO occupation has a title, a description, and some essential skills.
Ideally I ...
1 vote · 1 answer · 121 views
Reducing email token counts when preprocessing large email datasets for LLMs
I have a large email dataset in .txt format and want to feed it to LLMs (like Gemini and ChatGPT) to provide answers based on the email content.
The token count for my email data is very high (~1M for 1K emails)...
1 vote · 1 answer · 413 views
How can I use contextual embeddings with BERT for sentiment analysis/classification
I have a BERT model which I want to use for sentiment analysis/classification. E.g. I have some tweets that need to get a POSITIVE, NEGATIVE, or NEUTRAL label. I can't understand how contextual ...
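One common recipe for this kind of question (not the only one) is to pool BERT's per-token contextual vectors into a single sentence vector and feed that to a classifier head. The sketch below uses random arrays as stand-ins for a real model's last hidden state and an untrained classifier, purely to show the shapes involved:

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder for BERT's last hidden state: one contextual vector per token.
# In practice these would come from something like model(**inputs).last_hidden_state.
num_tokens, hidden = 7, 16
token_embeddings = rng.normal(size=(num_tokens, hidden))

# Mean-pool the contextual token vectors into one sentence vector, then score
# it against three class weight vectors (POSITIVE / NEGATIVE / NEUTRAL).
sentence_vec = token_embeddings.mean(axis=0)        # (hidden,)
class_weights = rng.normal(size=(3, hidden))        # untrained, illustration only
logits = class_weights @ sentence_vec               # (3,)
predicted = int(np.argmax(logits))
```

In a real setup the classifier weights are learned by fine-tuning on labelled tweets; using the [CLS] token's vector instead of mean pooling is an equally common choice.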