AI-Powered PDF Summarizer is a tool that extracts and summarizes research papers from ArXiv PDFs using Ollama (Gemma 3 LLM). The system provides structured, downloadable summaries to help researchers and professionals quickly grasp key findings.
- Input an ArXiv PDF URL to fetch and summarize papers.
- Extracts technical content (architecture, implementation, results).
- Optimized for large text processing with parallel summarization.
- Modern UI built with Streamlit.
- Download summary as a Markdown file.
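
The parallel summarization mentioned above could look roughly like the sketch below. It is only an illustration using the standard library: `split_into_chunks` is a plain fixed-window splitter standing in for LangChain's `RecursiveCharacterTextSplitter`, and `summarize_chunk` is a hypothetical callable standing in for the call to Gemma 3 through Ollama. The function names, chunk sizes, and worker count are assumptions, not the project's actual code.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_into_chunks(text: str, chunk_size: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into character windows of chunk_size that overlap by `overlap`.

    Simplified stand-in for a recursive text splitter (assumption).
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    step = chunk_size - overlap
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


def summarize_in_parallel(
    text: str,
    summarize_chunk: Callable[[str], str],  # hypothetical: would wrap the LLM call
    chunk_size: int = 1000,
    overlap: int = 100,
    max_workers: int = 4,
) -> str:
    """Summarize every chunk concurrently and join the partial summaries."""
    chunks = split_into_chunks(text, chunk_size, overlap)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map returns results in the same order as the input chunks.
        partial_summaries = list(pool.map(summarize_chunk, chunks))
    return "\n\n".join(partial_summaries)
```

Threads suit this workload because each chunk spends most of its time waiting on the model server rather than on Python computation. The overlap keeps a sentence that straddles a chunk boundary visible to both neighbouring chunks.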
Component | Technology |
---|---|
Frontend | Streamlit |
Backend | FastAPI |
LLM Platform | Ollama |
LLM Model | Google Gemma 3 |
PDF Processing | PyMuPDF (fitz) |
Text Chunking | LangChain RecursiveCharacterTextSplitter |
1. Enter an ArXiv PDF URL
2. Click "Summarize PDF"
3. Get a structured summary with technical insights
4. Download as Markdown
git clone https://github.com/arjunprahulal/gemma3_pdf_summarizer.git
cd gemma3_pdf_summarizer
pip install -r requirements.txt
Install Ollama - macOS/Linux
curl -fsSL https://ollama.com/install.sh | sh
Download Gemma 3 Model
ollama pull gemma3:27b
uvicorn main:app --host 0.0.0.0 --port 8000 --reload
streamlit run frontend.py
GET /health
Response:
{"status": "ok", "message": "FastAPI backend is running!"}
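
A quick way to confirm the backend is up before launching the frontend is to call the health endpoint from Python. The sketch below uses only the standard library. The base URL `http://localhost:8000` is an assumption derived from the `uvicorn` command above, and `parse_health` and `backend_is_healthy` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: matches the uvicorn --port 8000 command


def parse_health(body: bytes) -> bool:
    """Return True if the response body reports status "ok"."""
    payload = json.loads(body)
    return isinstance(payload, dict) and payload.get("status") == "ok"


def backend_is_healthy(base_url: str = BASE_URL, timeout: float = 5.0) -> bool:
    """Call GET /health and report whether the backend answered with status "ok"."""
    try:
        with urllib.request.urlopen(f"{base_url}/health", timeout=timeout) as resp:
            return parse_health(resp.read())
    except (OSError, ValueError):
        # OSError covers connection failures and timeouts; ValueError covers bad JSON.
        return False
```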
Summarize an ArXiv Paper
POST /summarize_arxiv/
Request Body:
{
  "url": "https://arxiv.org/pdf/2401.02385.pdf"
}
Response:
{
  "summary": "Structured summary of the research paper..."
}
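
For scripted use without the Streamlit UI, the endpoint can be called directly. The following is a minimal sketch using the standard library. It assumes the backend listens on `http://localhost:8000` and accepts the request body as JSON; the helper names `build_summarize_request` and `summarize_arxiv` and the long default timeout are illustrative choices, not part of the project.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # assumption: matches the uvicorn --port 8000 command


def build_summarize_request(pdf_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST /summarize_arxiv/ request with the documented JSON body."""
    body = json.dumps({"url": pdf_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/summarize_arxiv/",
        data=body,
        headers={"Content-Type": "application/json"},  # assumption: JSON body expected
        method="POST",
    )


def summarize_arxiv(pdf_url: str, base_url: str = BASE_URL, timeout: float = 600.0) -> str:
    """Send the request and return the "summary" field of the JSON response."""
    request = build_summarize_request(pdf_url, base_url)
    # Summarizing a long paper with a large local model can take minutes,
    # hence the generous (assumed) timeout.
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read())["summary"]
```

With the backend running, `summarize_arxiv("https://arxiv.org/pdf/2401.02385.pdf")` would return the summary text, which can then be written to a `.md` file.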