
Questions tagged [huggingface]

1 vote
0 answers
11 views

Problem creating a custom DeBERTa model

I'm trying to create a custom model that takes a sentence and its POS tags as input, but the model predicts the same label over and over for each token. I tried different parameters (e.g., ...
Animy
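A minimal sketch (not the asker's code) of one common design for this setup: embed the POS-tag IDs and concatenate them with DeBERTa's contextual token embeddings before the token-classification head. The checkpoint, embedding width, and argument names are illustrative assumptions.

```python
import torch
import torch.nn as nn
from transformers import AutoModel

class DebertaWithPos(nn.Module):
    def __init__(self, num_pos_tags: int, num_labels: int):
        super().__init__()
        # Checkpoint is an illustrative assumption
        self.encoder = AutoModel.from_pretrained("microsoft/deberta-v3-base")
        hidden = self.encoder.config.hidden_size
        self.pos_emb = nn.Embedding(num_pos_tags, 32)  # 32 is arbitrary
        self.classifier = nn.Linear(hidden + 32, num_labels)

    def forward(self, input_ids, attention_mask, pos_ids):
        out = self.encoder(input_ids=input_ids, attention_mask=attention_mask)
        # One POS id per wordpiece, embedded and concatenated per token
        feats = torch.cat([out.last_hidden_state, self.pos_emb(pos_ids)], dim=-1)
        return self.classifier(feats)  # (batch, seq_len, num_labels)
```

If every token still gets the same label, a common next check is that padding positions are excluded from the loss (label -100).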
5 votes
1 answer
50 views

Running DeepSeek-V3 inference without GPU (on CPU only)

I am trying to run DeepSeek-V3 model inference on a remote machine (over SSH). This machine has no GPU but many CPU cores. First method: I try to run the model inference using the ...
The_Average_Engineer
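A hedged sketch of CPU-only loading with transformers, assuming the machine has enough RAM (DeepSeek-V3 is very large, so this is a sketch of the mechanics, not a fitness guarantee):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3"  # repo named in the question
tok = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    trust_remote_code=True,      # DeepSeek-V3 ships custom modeling code
    torch_dtype=torch.bfloat16,  # halves memory vs. float32
    device_map="cpu",            # keep every weight on CPU (needs accelerate)
)
```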
0 votes
0 answers
26 views

Error while loading multimodal models from the Hugging Face hub

I am trying to use a multimodal model from the Hugging Face hub. I tried the "maya-multimodal/maya" model. (The following is the code to load the model): from llama_index.multi_modal_llms.huggingface ...
Swagat Mishra
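Without the traceback it is hard to pin down, but one hedged diagnostic is to bypass the llama_index wrapper and load the same checkpoint with plain transformers; if this also fails, the problem is in the hub repo rather than in llama_index (this assumes the repo exposes standard auto classes, which not every multimodal repo does):

```python
from transformers import AutoModel, AutoProcessor

repo = "maya-multimodal/maya"  # repo named in the question
processor = AutoProcessor.from_pretrained(repo, trust_remote_code=True)
model = AutoModel.from_pretrained(repo, trust_remote_code=True)
```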
0 votes
0 answers
21 views

Formatting a numbered list into a cohesive prose paragraph using Hugging Face Inference API

I am playing with the Hugging Face Inference API, trying to convert a numbered list into a cohesive prose paragraph. I have tried multiple models but am not able to get things working. I have ...
Sandeep
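A minimal sketch using huggingface_hub's InferenceClient with an instruction-tuned model; the model id and prompt are illustrative assumptions:

```python
from huggingface_hub import InferenceClient

# Model id is an illustrative choice of instruction-tuned model
client = InferenceClient("mistralai/Mistral-7B-Instruct-v0.2")
items = "1. Gather the data. 2. Clean it. 3. Train the model."
prompt = f"Rewrite the following numbered list as one cohesive paragraph:\n{items}"
print(client.text_generation(prompt, max_new_tokens=200))
```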
1 vote
0 answers
20 views

How to fine-tune time series transformer hyperparameters to beat LSTM performance?

I am trying to train an ML model on time series data. The input is 10 time series, which are essentially sensor data. The output is another set of three time series. I feed the model with the window ...
Mahesha999
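A generic sketch of a hyperparameter sweep with Optuna (an assumption; the question names no tuning library). `train_and_eval` is a hypothetical helper, stubbed with a dummy value so the sketch runs end to end:

```python
import optuna

def train_and_eval(params: dict) -> float:
    # Hypothetical stand-in: train the transformer on the windowed
    # 10-sensor inputs and return validation loss. Dummy value here.
    return params["lr"] * params["n_layers"]

def objective(trial: optuna.Trial) -> float:
    params = {
        "d_model": trial.suggest_categorical("d_model", [64, 128, 256]),
        "n_heads": trial.suggest_categorical("n_heads", [2, 4, 8]),
        "n_layers": trial.suggest_int("n_layers", 1, 4),
        "lr": trial.suggest_float("lr", 1e-5, 1e-3, log=True),
        "dropout": trial.suggest_float("dropout", 0.0, 0.3),
    }
    return train_and_eval(params)

study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```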
0 votes
0 answers
49 views

Hugging Face Real Time Object Detection Deployment

I'm developing a live object detection app using Streamlit and the YOLOv8 model. The app runs smoothly with real-time inference on my local machine. However, when I deploy it to Hugging Face Spaces, ...
Shah Zeb
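One likely culprit (an assumption, since the excerpt is truncated): on Spaces the server has no webcam, so frames must come from the visitor's browser, e.g. via streamlit-webrtc. A hedged sketch, assuming ultralytics and streamlit-webrtc are in requirements.txt:

```python
import av
import streamlit as st
from streamlit_webrtc import webrtc_streamer
from ultralytics import YOLO

model = YOLO("yolov8n.pt")  # illustrative weights file

def callback(frame: av.VideoFrame) -> av.VideoFrame:
    img = frame.to_ndarray(format="bgr24")  # browser frame -> numpy
    annotated = model(img)[0].plot()        # run YOLO, draw boxes
    return av.VideoFrame.from_ndarray(annotated, format="bgr24")

st.title("Live object detection")
webrtc_streamer(key="detect", video_frame_callback=callback)
```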
0 votes
0 answers
32 views

How can I make my Hugging Face fine-tuned model's config.json file reference a specific revision/commit from the original pretrained model?

I uploaded this model: https://huggingface.co/pamessina/CXRFE, which is a fine-tuned version of this model: https://huggingface.co/microsoft/BiomedVLP-CXR-BERT-specialized Unfortunately, CXR-BERT-...
Pablo Messina
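To my knowledge config.json has no standard field that pins a base model's commit, so the usual workaround is to document the commit hash and pin it at load time with the `revision` argument; a hedged sketch:

```python
from transformers import AutoModel, AutoTokenizer

base = "microsoft/BiomedVLP-CXR-BERT-specialized"
sha = "main"  # replace with the exact commit hash to pin
tok = AutoTokenizer.from_pretrained(base, revision=sha, trust_remote_code=True)
model = AutoModel.from_pretrained(base, revision=sha, trust_remote_code=True)
```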
0 votes
0 answers
27 views

Is it common for an LM (hundreds of millions of parameters) to beat an LLM (billions of parameters) on a binary classification task?

Preface: I am trying to fine-tune transformer-based models (an LM and an LLM). The LM I used is DeBERTa, and the LLM is LLaMA 3. The task is to classify whether a text contains condescending language ...
sempraEdic
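A minimal sketch of the smaller-model baseline the question describes: DeBERTa as a binary sequence classifier via transformers (checkpoint and example text are illustrative):

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "microsoft/deberta-v3-base"  # illustrative checkpoint
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)
batch = tok(["You probably wouldn't understand this."], return_tensors="pt")
logits = model(**batch).logits  # shape (1, 2): condescending vs. not
```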
0 votes
0 answers
31 views

NLP: how to handle bad tokenization

I get nonsense when trying to translate the following German sentence to Swedish using google/madlad400-3b-mt: a. Natürliche Personen: BundID mit ELSTER-Zertifikat oder nPA/eID/eAT-Authentifizierung ...
Mathermind
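A hedged first diagnostic for this kind of failure: inspect how the tokenizer splits the problem sentence, since heavy fragmentation or unknown tokens often explain nonsense output:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/madlad400-3b-mt")
# "<2sv>" is MADLAD's target-language prefix for Swedish (per model card)
text = "<2sv> Natürliche Personen: BundID mit ELSTER-Zertifikat"
ids = tok(text).input_ids
print(tok.convert_ids_to_tokens(ids))  # look for <unk> and odd splits
```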
1 vote
1 answer
191 views

attentions not returned from transformers ViT model when using output_attentions=True

I'm using this code snippet from the docs of the HuggingFace ViT classification model, with one addition: the output_attentions=True parameter. Nevertheless, ...
OfirD
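A minimal sketch of the setup described, plus one commonly suggested cause (an assumption, not confirmed by the question): fused SDPA attention kernels that cannot expose per-head weights, fixed by forcing the eager implementation:

```python
import torch
from transformers import ViTForImageClassification

model = ViTForImageClassification.from_pretrained(
    "google/vit-base-patch16-224",
    attn_implementation="eager",  # assumption: SDPA may not expose attentions
)
pixels = torch.randn(1, 3, 224, 224)  # stand-in for a processed image
out = model(pixel_values=pixels, output_attentions=True)
print(len(out.attentions), out.attentions[0].shape)  # layers, (1, heads, 197, 197)
```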
0 votes
0 answers
111 views

Instruction LLM for extracting data from text wrongly continues generating

I'm trying to fine-tune open-source LLMs; for now let's stick with the Mistral-7b-instruct model. My task is as follows: I have emails that represent "price requests" for shipments sent by our ...
sagi
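One common cause of this failure mode (a hedged guess, since the excerpt is truncated): fine-tuning examples that do not end with the EOS token, so the model never learns where the answer stops. A sketch with an illustrative, hypothetical training example:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
# Hypothetical training example; the fields are illustrative
example = '[INST] Extract the price request fields. [/INST] {"origin": "..."}'
text = example + tok.eos_token  # mark where the answer ends
```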
4 votes
1 answer
144 views

Since LoRA parameters are randomly initialized, shouldn't that initially break a model's output?

I have just tried using LoRA on Llama 3 8B and found that, without doing any fine-tuning, it performed pretty well on my dataset. But then I realized that surely the LoRA parameters are randomly ...
Ameen Izhac
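A minimal sketch of the check implied by the question, using a small stand-in model: with peft's defaults only the A matrix is random while B is zero-initialized, so the BA update starts at zero and the wrapped model's outputs match the base model exactly before training:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("gpt2")  # small stand-in model
ids = torch.tensor([[464, 3290]])  # arbitrary token ids
with torch.no_grad():
    before = base(input_ids=ids).logits
cfg = LoraConfig(task_type="CAUSAL_LM", r=8, target_modules=["c_attn"])
lora = get_peft_model(base, cfg)
with torch.no_grad():
    after = lora(input_ids=ids).logits
print(torch.allclose(before, after))  # True: A is random, but B starts at zero
```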
1 vote
0 answers
1k views

"No sentence-transformers model found with name" on huggingface even though it exists

I am trying to use infgrad/stella-base-en-v2 on Hugging Face to generate embeddings using LangChain. The model exists on the Hugging Face hub; the model is listed on the MTEB leaderboard; the model has ...
figs_and_nuts
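A hedged note: this warning usually means the repo lacks a sentence-transformers config, so the library falls back to plain transformers weights plus pooling. Loading the model directly with sentence-transformers reproduces the behavior outside LangChain:

```python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("infgrad/stella-base-en-v2")
print(model.encode(["hello world"]).shape)
```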
0 votes
0 answers
78 views

Not able to use the Hugging Face Inference API to get text embeddings

I am using the example provided in the tutorials ...
figs_and_nuts
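A minimal sketch of the embeddings path via huggingface_hub's InferenceClient; the model id is an illustrative choice of embedding model on the serverless API:

```python
from huggingface_hub import InferenceClient

client = InferenceClient()
vec = client.feature_extraction(
    "What is the capital of France?",
    model="sentence-transformers/all-MiniLM-L6-v2",  # illustrative model
)
print(vec.shape)
```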
4 votes
1 answer
2k views

How do I get model.generate() to omit the input sequence from the generation?

I'm using Huggingface to do inference on llama-3-B. Here is my model: ...
Ameen Izhac
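A minimal sketch of the standard trick: generate() returns prompt plus continuation, so slice off the prompt tokens before decoding. Shown on a small stand-in model; the same pattern applies to Llama checkpoints:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "gpt2"  # small stand-in model
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name)
inputs = tok("The capital of France is", return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=10)
new_tokens = out[0][inputs["input_ids"].shape[1]:]  # drop the prompt tokens
print(tok.decode(new_tokens, skip_special_tokens=True))
```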
