-
Updated
Jan 4, 2022 - Go
# html-parsing
Here are 87 public repositories matching this topic...
A little like that j-thing, only in Go.
HTML parsing/serialization toolset for Node.js. WHATWG HTML Living Standard (aka HTML5)-compliant.
-
Updated
Mar 25, 2022 - HTML
A fast & lightweight XML & HTML parser in Swift with XPath & CSS support
-
Updated
Jan 4, 2022 - Swift
-
Updated
Mar 15, 2022 - TypeScript
A Scala library for scraping content from HTML pages
-
Updated
Mar 21, 2022 - Scala
Heuristic-based boilerplate removal tool
-
Updated
Oct 21, 2021 - Python
Reworked https://www.readability.com/ parsing library (https://mercury.postlight.com/ is now the living alternative)
-
Updated
Aug 2, 2019 - HTML
htmlparsing.com, a website devoted to helping people parse HTML correctly
-
Updated
Mar 13, 2019 - CSS
Delphi Dom HTML Parser and Converter. Fork (not from the original author): https://sourceforge.net/projects/htmlp/
-
Updated
Feb 2, 2020 - Pascal

-
Updated
Mar 7, 2022 - JavaScript
An HTML5-compliant parser for Java
-
Updated
Mar 13, 2022 - Java
2
xBioDreadx commented Oct 31, 2019
Hi, I'm currently parsing a website that requires authentication for access to some content.
So I log in with a browser, copy all the created cookies, and add them to each parse request.
void editRequest(BoundRequestBuilder req, * *, * *) {
cookieService.cookies.each { req.addCookie(it) }
}
That works like a charm until the website changes the cookies in a response - shortly after this my c
good first issue
Good for newcomers
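The problem described in the comment above, where copied login cookies go stale once the site issues new ones, is commonly handled by merging each response's `Set-Cookie` headers back into the stored cookies before the next request. The following is only a minimal Python sketch of that idea, not the project's own code: `merge_set_cookie` and `cookie_header` are hypothetical helper names, and the cookie names and values are made up for illustration.

```python
from http.cookies import SimpleCookie
from typing import Dict, List


def merge_set_cookie(stored: Dict[str, str], set_cookie_headers: List[str]) -> Dict[str, str]:
    """Return a copy of `stored` updated with cookies from Set-Cookie header values.

    Hypothetical helper: attributes such as Path or HttpOnly are parsed but
    ignored here; only the cookie name and value are kept.
    """
    updated = dict(stored)
    for header in set_cookie_headers:
        parsed = SimpleCookie()
        parsed.load(header)
        for name, morsel in parsed.items():
            updated[name] = morsel.value
    return updated


def cookie_header(stored: Dict[str, str]) -> str:
    """Build the value of a Cookie request header from the stored cookies."""
    return "; ".join(f"{name}={value}" for name, value in stored.items())


# Cookies copied from the browser after logging in (made-up values).
stored = {"session": "old123", "lang": "en"}

# The site rotates the session cookie in a later response.
updated = merge_set_cookie(stored, ["session=new456; Path=/; HttpOnly"])
next_cookie_header = cookie_header(updated)
```

A real crawler would also need to honour expiry, domain, and path rules; a full cookie jar implementation handles those, which this sketch deliberately leaves out.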
A Node.js XML DOM, Parser & Stringifier.
-
Updated
Feb 1, 2022 - JavaScript
Source code for the SCP Foundation app - https://play.google.com/store/apps/details?id=ru.dante.scpfoundation
-
Updated
Nov 3, 2017 - Java
Fully featured, highly pluggable, and customizable Java HTML-to-POJO converter.
-
Updated
Aug 23, 2021 - Java
A Java tool for detecting the charset encoding of HTML web pages
-
Updated
Aug 23, 2021 - Java
An HTML parser written in Rust that parses HTML into node trees.
-
Updated
Feb 8, 2022 - Rust
BeautifulSoup4 packaged into a command line tool
-
Updated
May 24, 2015 - Python
Scrape Facebook posts and extract data
-
Updated
Jul 4, 2018 - Go
Summarizes text and websites and optionally saves the data to a local file
-
Updated
Feb 19, 2017 - Go
Add, delete, modify, and get HTML tags, text, and links using CSS selectors
-
Updated
Nov 6, 2018 - PHP
django-janitor allows you to use bleach to clean HTML stored in a Model's field.
-
Updated
Oct 30, 2017 - Python
Apache Drill UDFs for retrieving and working with HTML text
-
Updated
Jul 28, 2018 - Java
Swift wrapper around libxml2 HTML Parser to provide SAX style HTML Parsing
-
Updated
Nov 11, 2019 - Swift
security-audit
penetration-testing
http-server
fuzz-testing
infosec
html-parsing
html-form
form-input
http-request-test
spiders
autotools
hacking-tool
webappsec
dirbuster
web-security-research
attack-surface
cgi-bin
-
Updated
Jan 7, 2019 - C
Substitution plan and timetable of the Wilhelm-Gymnasium
android
synchronization
schedule
material-design
authentication
material-ui
school-project
timetable
html-parsing
android-app
-
Updated
Mar 30, 2017 - Java
A simple package for parsing HTML files into DOM trees
-
Updated
Aug 24, 2021 - Go
An XML/HTML parser and serializer for JavaScript.
javascript
html
parser
json
typescript
serializer
xml
xml-parsing
html-parser
xml2json
html-parsing
xml-parser
transformation
xml2js
html2js
html2json
forgiving-xml-parser
-
Updated
Apr 9, 2021 - TypeScript
CAP (Common Alerting Protocol) XML alert format parsing, HTML parsing, insertion of new alerts into a database, and notifications via OneSignal (possible Android and iOS push notifications), Twitter, Facebook, and MailChimp (e-mail notifications), for an open-source early-warning solution for natural disasters.
social-media
twitter-api
mailchimp
xml-parsing
facebook-api
html-parsing
common-alerting-protocol
mailchimp-api
onesignal
onesignal-notifications
early-warning-systems
early-warning-signals
-
Updated
Mar 8, 2022 - Python
I have mostly tested `htmldate` on a set of English, German and French web pages I had run into by surfing or during web crawls. There are definitely further web pages and cases in other languages for which the extraction of a date doesn't work so far.

Please install the `dateparser` library beforehand as it significantly extends linguistic coverage: `pip` or `pip3 install -U dateparser` or `pi