
Questions tagged [bert]

BERT stands for Bidirectional Encoder Representations from Transformers and is designed to pre-train deep bidirectional representations by jointly conditioning on both left and right context in all layers.

2 votes · 2 answers · 53 views

BERT + CNN Model Underfitting for Binary Text Classification: How to Improve?

I'm working on a binary text classification task using a BERT + CNN model. However, based on the loss and accuracy graphs during training, it seems that the model is underfitting, and I'm not seeing ...
DMabulage · 121
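
The question doesn't include code, so here is a minimal sketch of a typical BERT + CNN setup (PyTorch + Hugging Face transformers; the model name, filter sizes, and dropout are illustrative assumptions). When such a model underfits, the usual first checks are that the BERT weights aren't frozen, the learning rate isn't too small, and training runs long enough.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertCnnClassifier(nn.Module):
    def __init__(self, model_name="bert-base-uncased", n_filters=100):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)  # keep unfrozen
        hidden = self.bert.config.hidden_size
        # Convolutions slide over the token dimension of the last hidden states.
        self.convs = nn.ModuleList(
            [nn.Conv1d(hidden, n_filters, kernel_size=k) for k in (3, 4, 5)]
        )
        self.dropout = nn.Dropout(0.2)
        self.fc = nn.Linear(3 * n_filters, 1)  # one logit for the binary task

    def forward(self, input_ids, attention_mask):
        h = self.bert(input_ids=input_ids,
                      attention_mask=attention_mask).last_hidden_state
        h = h.transpose(1, 2)  # (batch, hidden, seq_len) for Conv1d
        pooled = [torch.relu(c(h)).max(dim=2).values for c in self.convs]
        return self.fc(self.dropout(torch.cat(pooled, dim=1))).squeeze(-1)

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = BertCnnClassifier()
batch = tokenizer(["great product", "terrible service"],
                  padding="max_length", max_length=32, return_tensors="pt")
logits = model(batch["input_ids"], batch["attention_mask"])
loss = nn.BCEWithLogitsLoss()(logits, torch.tensor([1.0, 0.0]))
```
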
0 votes · 1 answer · 49 views

Find the correlation between two lists of texts

Let's say that I have some lists of texts such as: ...
Leon · 1
0 votes · 2 answers · 105 views

Calculate the correlation of two lists of embeddings

I have two lists of sentences ...
Leon · 1
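
These two adjacent questions (raw texts vs. precomputed embeddings) reduce to the same recipe: embed each text, then compare vectors. A sketch with sentence-transformers; the model name and sentences are placeholders, and whether cosine similarity or Pearson correlation is the right comparison depends on what "correlation" means here.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

texts_a = ["The cat sat on the mat.", "Stocks fell sharply today."]
texts_b = ["A cat is resting on a rug.", "The market dropped this morning."]

model = SentenceTransformer("all-MiniLM-L6-v2")
emb_a = model.encode(texts_a, normalize_embeddings=True)
emb_b = model.encode(texts_b, normalize_embeddings=True)

# Cosine similarity of each aligned pair (rows of A vs. rows of B).
cosine = (emb_a * emb_b).sum(axis=1)

# Pearson correlation between the two embedding vectors of each pair.
pearson = [np.corrcoef(a, b)[0, 1] for a, b in zip(emb_a, emb_b)]
print(cosine, pearson)
```
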
0 votes · 1 answer · 35 views

Handle text column with PyTorch

I'm new to ML, so the question may be naive. I have a data set with multiple numeric columns and one text column; the text is just one sentence. I want to use all available data for classification, but I don'...
Kliver Max
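
A common pattern for this, sketched below: encode the text column into a fixed-size vector with BERT, concatenate it with the numeric columns, and classify the combined vector. The feature count, layer sizes, and model name are illustrative.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class MixedInputClassifier(nn.Module):
    def __init__(self, n_numeric, n_classes, model_name="bert-base-uncased"):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(model_name)
        self.head = nn.Sequential(
            nn.Linear(self.encoder.config.hidden_size + n_numeric, 128),
            nn.ReLU(),
            nn.Linear(128, n_classes),
        )

    def forward(self, input_ids, attention_mask, numeric):
        # [CLS] vector as a fixed-size sentence representation.
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        return self.head(torch.cat([cls, numeric], dim=1))

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = MixedInputClassifier(n_numeric=3, n_classes=2)
enc = tokenizer(["one short sentence"], return_tensors="pt")
logits = model(enc["input_ids"], enc["attention_mask"],
               torch.tensor([[0.1, 2.0, -1.0]]))  # the 3 numeric features
```
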
0 votes · 0 answers · 16 views

How does token_type_id affect self-attention and other mechanisms in BERT?

I know that BERT uses the NSP (Next Sentence Prediction) task for pre-training: two sentences are separated by a [SEP] token, and each sentence gets a different token type id. My downstream task is to build a ...
jupyter · 101
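
For context (this matches the BERT paper and the transformers implementation): token_type_ids select a learned segment embedding that is added to the token and position embeddings before the first layer; they do not mask or otherwise gate self-attention. A quick way to inspect this:

```python
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
enc = tok("first sentence", "second sentence", return_tensors="pt")
print(enc["token_type_ids"])  # 0s for [CLS] + sentence A, 1s for sentence B

model = AutoModel.from_pretrained("bert-base-uncased")
# One learned embedding row per segment id, added to the input embeddings.
print(model.embeddings.token_type_embeddings)  # Embedding(2, 768)
```
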
1 vote · 0 answers · 29 views

Sampling multiple masked tokens through Metropolis–Hastings

I'm trying to replicate the findings of the publication "Exposing the Implicit Energy Networks behind Masked Language Models via Metropolis-Hastings" for obtaining the joint distribution ...
Chris · 11
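
Lacking the asker's code, here is a stripped-down sampler in the spirit of that paper: repeatedly mask one position and redraw it from BERT's conditional. With the conditional used as both proposal and target, the MH acceptance ratio cancels to 1 (i.e. a Gibbs step); the paper's actual MH correction targets a pseudo-likelihood energy instead, which this sketch does not implement.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased").eval()

def conditional(ids, pos):
    """BERT's distribution over the vocabulary at `pos`, with `pos` masked."""
    masked = ids.clone()
    masked[0, pos] = tok.mask_token_id
    with torch.no_grad():
        logits = mlm(input_ids=masked).logits[0, pos]
    return torch.softmax(logits, dim=-1)

ids = tok("the capital of france is paris", return_tensors="pt")["input_ids"]
for _ in range(50):
    pos = int(torch.randint(1, ids.shape[1] - 1, (1,)))  # skip [CLS]/[SEP]
    probs = conditional(ids, pos)
    ids[0, pos] = int(torch.multinomial(probs, 1))  # resample this position
print(tok.decode(ids[0], skip_special_tokens=True))
```
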
0 votes · 0 answers · 66 views

Using multiple text inputs for one output with RoBERTa/DistilBERT

In a current project I want to fine-tune a RoBERTa/DistilBERT model for text classification. The model should take two text input features, each limited to a length of around 280 characters, and generate a ...
Mime · 101
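
The simplest route, sketched below, is to let the tokenizer encode the two fields as a text pair so the model sees both in one sequence; the model name and lengths are illustrative.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base",
                                                           num_labels=2)
enc = tok(
    "first ~280-character text field",   # input A
    "second ~280-character text field",  # input B
    truncation=True, max_length=256, return_tensors="pt",
)
logits = model(**enc).logits  # RoBERTa joins the pair with </s></s>
```
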
0 votes · 0 answers · 33 views

Best model for enforcing corporate naming conventions

I'm working on a project (Python) to enforce the company naming convention for products on product lists provided by clients/suppliers. I have a list of company names (standardised names) and those ...
Secret Ambush
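
Before reaching for BERT, a dependency-free fuzzy-matching baseline against the standardised list is worth trying; the names below are made up.

```python
from difflib import get_close_matches

standard_names = ["Acme Widget 500", "Acme Widget 700", "Acme Gadget Pro"]
lookup = {s.lower(): s for s in standard_names}  # map back to original casing

supplier_name = "ACME widget-500"
hits = get_close_matches(supplier_name.lower(), list(lookup), n=1, cutoff=0.6)
print(lookup[hits[0]] if hits else "no match")  # -> Acme Widget 500
```
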
0 votes · 0 answers · 27 views

NLP model for word recovery (analogous to BERT, but at the letter level)

I am working on the problem of restoring words in text where some letters are missing. For example (restore words whose vowels were removed): Hll wrld -> Hello world; n ltrntv ssssmnt sggsts -...
SoH · 119
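
One plausible framing (an assumption, not the only option): treat restoration as character-level sequence-to-sequence, e.g. with ByT5, which operates on raw bytes so no devoweled word falls outside a subword vocabulary. A fine-tuning sketch with placeholder data:

```python
from transformers import AutoTokenizer, T5ForConditionalGeneration

tok = AutoTokenizer.from_pretrained("google/byt5-small")
model = T5ForConditionalGeneration.from_pretrained("google/byt5-small")

# (devoweled, original) training pairs -- replace with a real dataset
pairs = [("Hll wrld", "Hello world")]
enc = tok([src for src, _ in pairs], return_tensors="pt", padding=True)
labels = tok([tgt for _, tgt in pairs],
             return_tensors="pt", padding=True)["input_ids"]
loss = model(**enc, labels=labels).loss  # optimise this in a training loop
loss.backward()
```
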
0 votes · 0 answers · 32 views

How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?

I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized Unfortunately, CXR-BERT-...
Pablo Messina
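
I'm not aware of a config.json field that pins the parent repo's commit; the usual workaround is to record the commit SHA and pass it via from_pretrained's revision argument, which pins an exact commit at load time:

```python
from transformers import AutoModel, AutoTokenizer

REV = "main"  # replace with the full commit SHA you want to freeze
model = AutoModel.from_pretrained(
    "microsoft/BiomedVLP-CXR-BERT-specialized",
    revision=REV,
    trust_remote_code=True,  # this repo ships custom model code
)
tok = AutoTokenizer.from_pretrained(
    "microsoft/BiomedVLP-CXR-BERT-specialized",
    revision=REV,
    trust_remote_code=True,
)
```
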
0 votes · 0 answers · 26 views

Fine-tuning a pretrained model on 2 tasks with 2 labeled datasets

I am having difficulty using BERT for a sentiment analysis task that handles both aspect-based sentiment analysis (ABSA) and comment sentiment analysis. I know that using two separate classification ...
ndycuong
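
A common multi-task pattern for this: one shared BERT encoder with a separate classification head per task, alternating batches from the two labeled datasets. A sketch with placeholder label counts:

```python
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class TwoTaskModel(nn.Module):
    def __init__(self, n_absa_labels=3, n_comment_labels=3):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("bert-base-uncased")
        h = self.encoder.config.hidden_size
        self.absa_head = nn.Linear(h, n_absa_labels)
        self.comment_head = nn.Linear(h, n_comment_labels)

    def forward(self, task, **enc):
        cls = self.encoder(**enc).last_hidden_state[:, 0]  # [CLS] vector
        head = self.absa_head if task == "absa" else self.comment_head
        return head(cls)

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = TwoTaskModel()
enc = tok(["the battery life is great"], return_tensors="pt")
absa_logits = model("absa", **enc)        # step on a batch from the ABSA set
comment_logits = model("comment", **enc)  # step on a batch from the comment set
```
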
3 votes · 0 answers · 69 views

Weird behaviour when using RoBERTa for text classification

I have a dataset with around 70 classes, largely balanced at ~150 samples per class. I am fine-tuning RoBERTa-base for 4 epochs with a ...
user1274878
2 votes · 0 answers · 162 views

Use text embeddings to map job descriptions to ESCO occupations

I'm trying to build a model to map job descriptions to ESCO occupations, which is a taxonomy of job titles. Every ESCO occupation has a title, a description, and some essential skills. Ideally I ...
GanaelD · 21
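
A retrieval baseline for this kind of taxonomy mapping: embed each ESCO occupation (title plus description) once, embed each job description, and take the nearest occupation by cosine similarity. The model name and data below are illustrative.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
occupations = [
    "data scientist: analyses large datasets to extract insights",
    "software developer: designs and implements software systems",
]
occ_emb = model.encode(occupations, convert_to_tensor=True,
                       normalize_embeddings=True)

job = "We are hiring someone to build and maintain our web backend."
job_emb = model.encode(job, convert_to_tensor=True, normalize_embeddings=True)
scores = util.cos_sim(job_emb, occ_emb)[0]
print(occupations[int(scores.argmax())])
```
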
1 vote · 1 answer · 121 views

Reducing email token counts: preprocessing large email datasets for feeding LLMs

I have a large email dataset in .txt format and want to feed LLMs (like Gemini and ChatGPT) to provide answers based on email content. The token count for my email data is very high (~1M for 1K emails)...
Rafael Borja
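
Typical first-pass reductions drop quoted reply chains and signatures before anything reaches the LLM. A regex-based sketch; the patterns are heuristics, not a full email parser.

```python
import re

def shrink_email(text: str) -> str:
    # Remove quoted replies ("> ..." lines and "On ... wrote:" blocks).
    text = re.sub(r"(?m)^>.*$", "", text)
    text = re.sub(r"(?s)On .{0,80} wrote:.*", "", text)
    # Cut a trailing signature introduced by the conventional "-- " marker.
    text = re.split(r"(?m)^-- $", text)[0]
    # Collapse runs of blank lines.
    return re.sub(r"\n{3,}", "\n\n", text).strip()

raw = "Thanks!\n\n-- \nJane Doe\n> earlier message\n> more quoting\n"
print(shrink_email(raw))  # -> "Thanks!"
```
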
1 vote · 1 answer · 413 views

How can I use contextual embeddings with BERT for sentiment analysis/classification

I have a BERT model which I want to use for sentiment analysis/classification, e.g. I have some tweets that need to get a POSITIVE, NEGATIVE, or NEUTRAL label. I can't understand how contextual ...
average_discrete_math_enjoyer
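
The usual recipe is that BERT's contextual embeddings are consumed implicitly: add a classification head and fine-tune, rather than extracting embeddings by hand. A sketch with an assumed 3-class label scheme:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=3
)

enc = tok(["I love this!", "This is awful."], padding=True, return_tensors="pt")
labels = torch.tensor([2, 0])  # assumed mapping: 0=NEGATIVE, 1=NEUTRAL, 2=POSITIVE
out = model(**enc, labels=labels)
out.loss.backward()  # one fine-tuning step; wrap in an optimizer loop in practice
```
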
