
Re: FOSS tool to do general stats from text indata



On Fri, Jun 30, 2023, 8:32 AM Emanuel Berg <incal@dataswamp.org> wrote:
Nicholas Geovanis wrote:

>>> If you have python programming skills, you might consider
>>> NLTK
>>
>> Unbelievable if there are no such tools anywhere already,
>> but I don't have one either so maybe there aren't then?
>>
>
> There's a big subject called computational linguistics.
> They have some specialized tools for what they call corpus
> analysis. Because you mentioned statistics you threw
> everyone off :-) And I really like R.

Okay, so now we are getting somewhere. The technical term and
scientific field of this activity is known as computational
linguistics, and the guys that do that do corpus
analysis. Sweet!

Two standard textbooks are Foundations of Computational Linguistics by R. Hausser and Computational Linguistics: An Introduction by R. Grishman.

Syntactic analysis of human and artificial (programming) languages is well understood. But how do you attach meaning to the symbols? That is semantics. And how do you identify style and emphasis? These are the kinds of questions computational linguistics starts from.
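
For the "general stats from text" part of the question, here is a minimal sketch of the simplest kind of corpus statistics in plain Python (standard library only, so not NLTK). The function name corpus_stats, the sample sentence, and the crude regex tokenizer are all assumptions made for illustration; a real corpus tool would handle tokenization far more carefully.

```python
import re
from collections import Counter

def corpus_stats(text, top_n=3):
    """Return basic word statistics for a piece of text (illustrative only)."""
    # Crude tokenizer (an assumption): lowercase, keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    n_tokens = len(tokens)   # total number of words
    n_types = len(counts)    # number of distinct words
    return {
        "tokens": n_tokens,
        "types": n_types,
        # Type-token ratio: a rough measure of vocabulary variety.
        "type_token_ratio": n_types / n_tokens if n_tokens else 0.0,
        "top": counts.most_common(top_n),
    }

sample = "The cat sat on the mat. The dog sat too."
stats = corpus_stats(sample)
print(stats)
```

Counts like these cover the syntax-free end of corpus analysis; the questions about meaning and style need much more than word frequencies.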

--
underground experts united
https://dataswamp.org/~incal

