#

tokenization

Here are 365 public repositories matching this topic...

Ekphrasis is a text processing tool geared towards text from social networks such as Twitter or Facebook. It performs tokenization, word normalization, word segmentation (for splitting hashtags), and spell correction, using word statistics from two large corpora: English Wikipedia and a Twitter corpus of 330 million English tweets.

  • Updated Feb 8, 2021
  • Python
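
To make the idea of statistics-driven hashtag segmentation concrete, here is a minimal, self-contained sketch. It is not Ekphrasis's API or implementation: the `COUNTS` table is a tiny invented stand-in for real corpus word counts, and `segment_hashtag`, `segment`, and `word_logprob` are hypothetical names chosen for illustration. It picks the split with the highest unigram log-probability using dynamic programming.

```python
import math
from functools import lru_cache

# Toy unigram counts, invented for illustration only. A real tool would
# derive these from large corpora (e.g. Wikipedia or tweets).
COUNTS = {
    "the": 500, "at": 400, "best": 120, "day": 150, "ever": 90,
    "stay": 60, "home": 110, "love": 130, "new": 200, "york": 70,
}
TOTAL = sum(COUNTS.values())
MAX_WORD_LEN = 20  # assumed upper bound on the length of a single word


def word_logprob(word):
    """Log-probability of a word under the toy unigram model."""
    if word in COUNTS:
        return math.log(COUNTS[word] / TOTAL)
    # Unknown words get a penalty that grows with their length, so long
    # unseen strings are strongly disfavoured.
    return math.log(1.0 / (TOTAL * 10 ** len(word)))


def segment(text):
    """Split text into the most probable sequence of words."""
    text = text.lower()

    @lru_cache(maxsize=None)
    def best(i):
        # Returns (score, words) for the best segmentation of text[i:].
        if i == len(text):
            return 0.0, ()
        candidates = []
        for j in range(i + 1, min(len(text), i + MAX_WORD_LEN) + 1):
            tail_score, tail_words = best(j)
            head = text[i:j]
            candidates.append((word_logprob(head) + tail_score,
                               (head,) + tail_words))
        return max(candidates)

    return list(best(0)[1])


def segment_hashtag(tag):
    """Strip the leading '#' and segment the remaining characters."""
    return segment(tag.lstrip("#"))


if __name__ == "__main__":
    print(segment_hashtag("#bestdayever"))
    print(segment_hashtag("#StayAtHome"))
```

With real corpus counts the same scoring scheme scales to a full vocabulary; the toy table here only covers the words in the two example hashtags.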
firthmj commented Jan 21, 2021

How easy would it be to change the library to have versions of the encode and decode functions where the payload JSON was provided / returned just as the JSON text?

There are other good JSON generation / parsing libraries available, and some people may wish to use them to generate or process the payload, rather than the built-in claim processing.
