
Questions tagged [computational-linguistics]

A branch of science that uses computers and mathematical methods to construct and investigate linguistic theory. Its technological and algorithmic implementation is called NLP.

1 vote
3 answers
188 views

Languages with Nearly Uniform Character Frequencies

I am a statistician working on a curriculum with a chapter on randomness. To illustrate some concepts of randomness (namely Shannon entropy and MDL), I set up a hypothetical scenario with deciphering ...
NotAGroupTheorist
4 votes
1 answer
145 views

Measure of efficiency of a language?

Is there a measure of a language's efficiency, such as a ratio: (information content) ÷ (phonemes needed to express that information)? I'm not asking about spoken language specifically (as in "Speech ...
Geremia
-4 votes
2 answers
94 views

Does the apparent creativity of AI, like GPT, challenge Chomsky's claim that only humans possess a uniquely creative capacity for language?

Chomsky argued that the creative use of language is a uniquely human ability, one that machines inherently lack. In Cartesian Linguistics, he built on Descartes' notion that human language, due to its ...
user49607
1 vote
1 answer
93 views

Is something similar to orthogonality defined in linguistics?

In mathematics, two vectors are orthogonal to each other if you cannot produce one from the other using linear operations (there is a more precise definition, but this is the simplest). For example, ...
0 votes
1 answer
85 views

Is there an error-free web-based parser that automatically draws a syntax tree from an English text?

The only web-based automatic parser I know is CoreNLP version 4.5.5, where you can put in an English text and get a constituency tree (when you select 'parts-of-speech' as Annotations and click '...
JK2
0 votes
0 answers
88 views

Insight into basic machine translation error from major email service

I recently ordered a product through a Japanese company (I live in Tokyo), and received an email response that due to a recent backlog of orders I should expect my product shipped within 2-3 weeks (in ...
pedalferrous
2 votes
1 answer
85 views

Information rate of ultra-information-sparse languages

Pellegrino 2011. Also Pellegrino YoonMi. These widely cited studies found that information density (ID) correlates inversely with speech rate (SR), so that the information rate (IR) of languages clusters around ...
Raxrax
3 votes
0 answers
39 views

List of counterexamples and statistics for Greenberg's universals

I could not find a list of counterexamples or statistics for Greenberg's linguistic universals. For a number of them I could find relevant information on WALS. For some I could not find anything....
Raxrax
0 votes
0 answers
74 views

What language has a small difference in word length between advanced and basic vocabulary?

Word length on the Swadesh list. Zipf's law. Basic vocabulary is used more frequently, and its word length (number of syllables or segments) is generally shorter than that of advanced, technical, or formal vocabulary. In English: ...
Raxrax
2 votes
1 answer
201 views

Did big languages generally have a net loss of inflectional morphology in the past 1-3 millennia and small languages the other way round?

a. R. M. W. Dixon (1998) theorizes that languages normally evolve in a cycle from fusional to analytic to agglutinative to fusional again like a clock. There are two opposing forces: one reduces ...
Raxrax
1 vote
0 answers
27 views

Specifically which corpora were used by the Ofsted Research Team in designing the new curricula for MFL GCSE 2026?

Specifically which corpora were used by the Ofsted Research Team in designing the new curricula for MFL GCSE 2026?
Snowtiger77
0 votes
0 answers
97 views

What is the information density of factual knowledge in large bodies of English text?

An ML paper I was reading mentioned an estimate of no more than 0.7 bits per word, in footnote 4: As of February 1, 2024, English Wikipedia contains a total of 4.5 billion words [...] We estimate ...
MWB
0 votes
0 answers
62 views

What is the most accurate way to parse a text so that we can get the characters and the list of sentences that refer to each character?

I'm trying to come up with a method that will take a text and parse it so that we can get all the characters and a list of the sentences from the text that have references to the character (either ...
user14269
-5 votes
1 answer
250 views

How has computational linguistics contributed to the preservation of endangered languages?

Computational tools and techniques have been applied to the field of historical linguistics, aiding in the analysis of old or endangered languages. This has contributed to the documentation and ...
Arunabh
0 votes
0 answers
71 views

How much text data an AI chatbot is based on vs. how accurate its language use is

This question is motivated by a question I read on another online forum, to which the answerers said that when they tested ChatGPT's Hindi, it made grammatical errors all the time and was also trash ...
Chris Sanders
