
Re: Bug#1014908: ITP: gender-guesser -- Guess the gender from first name



Hi,

On Tue, Jul 19, 2022 at 06:08:16PM +0200, Andrej Shadura wrote:
> Hi,
> 
> On Tue, 19 Jul 2022, at 16:57, Adam Borowski wrote:
> > On Tue, Jul 19, 2022 at 01:48:17PM +0200, Andrej Shadura wrote:
> >> Take Misha/Miša/Миша or Petya/Peťa/Петя.  In Russian tradition, these are
> >> very likely masculine names, from Mikhail and Petr.
> 
> > If only this piece of software had a distinction between "almost always
> > male", "leaning male", "neutral", "leaning female", "almost always
> > female"...  Oh wait, it does!
> > Precisely for the reason you mention.
> 
> No, it does not and cannot, since some names are almost always male in one culture but almost always female in another one.
> 
> >> And we haven’t yet touched the topic of people who were given non-traditional names.
> 
> > In which case it says "unknown".
> 
> No, it cannot know about cases when a person is given a name traditionally given to another gender in another culture. Pretty common in the US, for example. Sure, there probably aren’t many cases of women named Michael, but there are many other names where you wouldn’t be easily able to tell.
> 
> https://en.wikipedia.org/wiki/Category:English_unisex_given_names

Both of you are arguing from wrong data.
If you had just tried the software (clone it from its source; no need
to install anything), you would have noticed that.

>>> import gender_guesser.detector as gender
>>> d = gender.Detector()
>>> print(d.get_gender(u"Andrea"))
female
>>> print(d.get_gender(u"Misha"))
male
>>> print(d.get_gender(u"Miša"))
andy
>>> print(d.get_gender(u"Миша"))
unknown
>>> print(d.get_gender(u"Petya"))
male
>>> print(d.get_gender(u"Peťa"))
unknown
>>> print(d.get_gender(u"Петя"))
unknown

So the software can output "andy", meaning androgynous. It also has the
outputs "mostly_male" and "mostly_female".

Furthermore, reading the README, I noticed I can give it more context:

>>> print(d.get_gender(u"Andrea", "italy"))
male
>>> print(d.get_gender(u"Misha", "slovakia"))
andy
>>> print(d.get_gender(u"Petya", "slovakia"))
andy

However, "unknown" means "Not in my database". It does not mean "Neither
male nor female". So it seems Adam also did not check the results before
posting.

Maybe gender-guesser belongs in Debian and maybe it doesn't. But please
try to at least look at the software before judging it.

-- 
mail / xmpp / matrix: tzafrir@cohens.org.il

