so @rphsoftware alerted me that PHP has a "Gender" class which has constants for all 109 genders, which are and I quote:

FEMALE
MOSTLY FEMALE
MALE
MOSTLY MALE
UNISEX
COUPLE
NOT FOUND
ERROR
ANY
BRITAIN
IRELAND
USA
SPAIN
PORTUGAL
ITALY
MALTA
The reason there are a bunch of countries in there is because the way it works is that it has a bunch of databases of (country,name)->gender probabilities.

So you call Gender::get("foone",USA) to get back a gender ID of some sort.
oh sweet gendered jesus, they are namespaced. So you have to actually pass like Gender::FRANCE.
the example code also shows that the IS_A_COUPLE value is returned when a name is both male and female

as opposed to when it's unisex, which is... different?
ahh, this is in PECL. Not the PHP core.
PECL is the CPAN/PyPi/rubygems/npm equivalent for PHP.
also it doesn't look like it's been significantly touched since about 2007, other than someone fixing an invalid free() in php_gender_obj_destroy() back in 2015.
thank fuck for that. I hate it when people exploit memory bugs in my gender destruction functions and sideload Doom into my gender.
https://twitter.com/bnjmnjhn/status/1343293580392214531
ANYWAY the manual is here:
https://www.php.net/manual/en/class.gender.php

and the module & source is here:
https://pecl.php.net/package/gender/0.6.0
I should launch a service like Pronoun Island ( http://pronoun.is ) for people's profiles, but base it on PHP's gender class.
so if they have phpgender=70 (which matches GENDER::IS_FEMALE) in their profile, their pronouns are she/her,
if it's 109 (GENDER::IS_MOSTLY_MALE) they're he/him/they/them,
and if they're 3 (GENDER::USA) they're free/dom
personally I use 63 (GENDER::IS_UNISEX_NAME) so they/them, though I'm somewhat genderfluid towards 69 (GENDER::ERROR_IN_NAME), with pronouns nice/ERROR: missing null terminator in string! 0▒▒f▒▒$\\D▒▒f▒▒K▒B▒B ▒B(▒D0▒A8▒D@▒8D0A(B BBH▒▒▒▒▒▒P@▒▒▒▒▒▒L▒▒▒g
;kM:hE6uI5lR:sT;wV9vR,{Y5YPEiVFwZDmSKzcIygWmiVyogd`Ql=Aï%.â*.ç(.ç<&û<%ì%1ê+2Æ$3ô*6å46Æ/8ú=&ì6"çD(ùF'åJ4ÆT3ªH'⌐T)╢W)»U2║d+╡j5É[1─\\+╟h+╠u,╙x-╟i4╔v7╘y4╤l-αy>ù:CïVGågHåjVëuYûxXömL▓pLêwhÄwm⌐eS╚xF╩vK╔wT{äm╪ç5▄ù5╤å1ßÖ:π¿:τ▒=▐ñ6┤çUÿàiÿêxÄëvºòy¡Åo¡ûo╚çX╤ÄPΦ½I(2/5)
annoying twitter/gender edge case #209:
your pronouns are too long to fit in a single tweet
Anyway I think the real takeaway we should have from this PHP gender mess (besides the obvious two of "don't use genderizing code" and "probably don't use PHP") is that we really should adopt the mostly-genders.
seriously, if you look at the retweets of this, it's mostly people saying they're MOSTLY_MALE or MOSTLY_FEMALE.
Clearly the two most needed genders for gender selection forms are the mostlies.
also I'm disappointed in the fact that no one has called me out on my numerical error in the first post.
I say "109 genders" but my screenshot shows both a 0 and a 109, which means it'd be 110 genders.
but even that is wrong, because they're not contiguous.
if you check the documentation and count them all up, there's 62 GENDER:: constants.
I really don't know why they're not contiguous.
countries are 0-53, but then the gender return values like IS_MOSTLY_MALE are 32, 63, 67, 69, 70, 77, 102, and 109.
clearly, muchly like Dmitri Mendeleev looking at the periodic table and guessing at the existence and properties of not-yet-discovered elements, we can predict this means there are AS YET UNDISCOVERED GENDERS in these gaps
genders 64-66 are unknown, as is gender 68, genders 71-76, and unstable radioactive genders in the 78-101 and 103-108 area
and some gender physicists believe there may be a metastable island of gender stability up in the 140-160 area, with exotic genders that can maybe someday be created in high-powered gender supercolliders.
BTW, the fact that GENDER::NAME_NOT_FOUND is 32, instead of >53? it means it overlaps with a country.
fun fact: Every person in GENDER::KOSOVO identifies as GENDER::NAME_NOT_FOUND
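For anyone who wants to check the gender-gap math, here's a rough sketch in Python (not PHP, and not the extension's own code). It only uses the numbers quoted in this thread: country codes assumed to run 0 through 53, plus the eight result values. The names `RESULT_CODES`, `COUNTRY_CODES`, and `gaps` are made up for illustration.

```python
# Values as quoted in the thread; nothing here is read from the real extension.
RESULT_CODES = [32, 63, 67, 69, 70, 77, 102, 109]
COUNTRY_CODES = range(0, 54)  # assumption: countries occupy 0..53 inclusive


def gaps(codes):
    """Return inclusive (start, end) ranges of unused integers between codes."""
    ordered = sorted(codes)
    return [(a + 1, b - 1) for a, b in zip(ordered, ordered[1:]) if b - a > 1]


# Only look at result codes that sit above the country range.
above_countries = [c for c in RESULT_CODES if c not in COUNTRY_CODES]
undiscovered = gaps(above_countries)

# Result codes that collide with a country number.
overlap = [c for c in RESULT_CODES if c in COUNTRY_CODES]

# Total number of constants if both assumptions hold.
total_constants = len(COUNTRY_CODES) + len(RESULT_CODES)

print(undiscovered)
print(overlap, total_constants)
```

With those inputs the gaps come out as 64-66, 68, 71-76, 78-101, and 103-108, the only collision is 32, and the total is 62 constants.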
Anyway the fact there are only 53 countries (with one "COUNTRY_ANY") in the pseudo-enum means that the majority of the 193 countries in the world are not allowed to have gender.
here's a question, mainly to myself:
What's the most populous country in the world that's not represented by a PHP gender?
It's Indonesia. Population: 270 million as of July 2019.

Not allowed to have a PHP GENDER.
Fourth most populated nation in the world, completely overlooked by PHP.
What a shame.
#5-#8 (Pakistan, Brazil, Nigeria, and Bangladesh) similarly are not allowed to have PHP GENDERS.
anyway the real failure of this code is this part right here
specifically, this part HERE.
they're coding these genders as integers!
integers! Do you know how computers store integers?
and if there's one thing you should understand as like step zero to understanding gender, it's that it's not a binary.

speaking as a non-binary computer scientist, I am continually bothered by this.
anyway this isn't even a new idea. people have been building Genderizers for years. I have one called Personator 3 (for Windows!) https://twitter.com/Foone/status/1052023503140999168
This sort of software is always fundamentally a bad idea. If you need gender for your database, ask it, don't guess it.
And here's the other thing: You almost certainly don't need it.
The least bullshit reason why you might want it is for titles, because you're expecting to send mail to Mr. Dave Johnson or Ms Sarah Habbershank.

And if that's what you want to do? Just ask for what title they use.
Don't guess it from the gender, and especially don't guess it from the gender you guessed from the name!
And you don't even need to consider genders outside the traditional two to realize why your titling system will completely fail if you're powering it off genders:

FUCKING DOCTORS
Doctor Sarah did not spend 8 years in Gendered Medical School to be addressed as "Miss", thank you very much.
anyway the dark secret of why like 95% of sites have a gender option on sign up... it's just for ads.

Advertisers want to gender-target ads like we're all 7 year olds and they want to sell transforming robot action figures to the boys and Magic Talking Barbie to the girls
for reference, it has been three (3) days since I last purchased a Magic Talking Barbie, but I bought it to disassemble and reverse engineer it? So to simplify based on the advertising stereotypes, my gender is Complex & Confusing. https://twitter.com/Foone/status/1343035443722420225
"What's your gender?"
"I'm a reverse engineer."
"Uhh, what's your sex?"
"Can't, too busy hacking this Barbie."
"No, I mean, what's in your pants?"
"Screwdrivers!"
new bad-gender-form idea: it's a numeric field, and displays the PHP Gender Constants for reference
I need to make a site which is a combination of @GenderRate and r/BadUIBattles, where you can create and submit bad ideas for gender entry fields
I'm not even sure what this would mean.
I'm too lazy to do it but one form idea would be just the emoji flags of all the nations in the PHP GENDER class
my pronouns are 🏳️‍🌈/🏳️‍⚧️or 🇺🇸/🇵🇱
"Hey, I gave you that keyboard for Foone, were you able to give it to 🏳️‍⚧️?"
"Oh, I tried to give it to 🏳️‍⚧️, but 🏳️‍🌈 said 🏳️‍🌈 didn't have room in ⚧️ bag to take it home today. I left it on ⚧️ desk so 🏳️‍🌈 can take it home tomorrow."
(reposted after corrections because emojipronouns are confusing and I was mixing up my 🏳️‍🌈 and 🏳️‍⚧️)
for reference, the "reading key" for this is:
🏳️‍🌈 = they
🏳️‍⚧️= them
⚧️ = their
also in the compose window twitter spaces out the trans flag really oddly. the pride flag, too, and differently.
also if you copy that post and try to re-compose it, like I did, it turns into:
Anyway the compose thing is because twitter has... funky emoji support. they sometimes have "emoji" which aren't really emoji yet, or support emojis that aren't yet finalized. They've got their own code to handle those.
in other words they're basically just forum smilies from the early 2000s.
WHEN ARE ALL OF THESE EMOJIS GONNA BE AVAILABLE, UNICODE CONSORTIUM?
Anyway... ⚧️ has been in unicode since 2005, but as a symbol, not an emoji. It was turned into an emoji this year.
🏳️‍🌈 has been in since 2016.
🏳️‍⚧️ was added this year.
⚧ is U+26A7 Male with Stroke and Male and Female Sign, and is the only one that is a single separate codepoint.
The others are all flag-joiners.
🏳️‍🌈 is 🏳️, a Zero Width Joiner codepoint, then 🌈

🏳️‍⚧️ is the same, except it uses ⚧ instead of 🌈
And twitter is doing some Magic to make sure those work (at least on desktop: on mobile it often just doesn't) despite the fact your OS/browser/fonts may not support them yet.
and if you can't actually read that post, it's:

Rainbow flag is Waving white flag, a Zero Width Joiner codepoint, then Rainbow

Transgender flag is the same, except it uses Transgender Symbol instead of Rainbow
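If you'd rather see the codepoints than trust your font, here's a small Python sketch that builds both flags by hand. One detail the description above glosses over: the white flag (and the transgender symbol) are followed by U+FE0F, a variation selector that asks for emoji presentation. The helper name `describe` is made up for this example.

```python
import unicodedata

ZWJ = "\u200d"             # ZERO WIDTH JOINER
VS16 = "\ufe0f"            # VARIATION SELECTOR-16 (emoji presentation)
WHITE_FLAG = "\U0001F3F3"  # WAVING WHITE FLAG
RAINBOW = "\U0001F308"     # RAINBOW
TRANS_SYMBOL = "\u26a7"    # MALE WITH STROKE AND MALE AND FEMALE SIGN

rainbow_flag = WHITE_FLAG + VS16 + ZWJ + RAINBOW
trans_flag = WHITE_FLAG + VS16 + ZWJ + TRANS_SYMBOL + VS16


def describe(text):
    """List each codepoint in text as 'U+XXXX NAME'."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch, '<unnamed>')}" for ch in text]


for flag in (rainbow_flag, trans_flag):
    print(flag)
    for line in describe(flag):
        print("   ", line)
```

Whether the printed flag renders as one glyph or as a white flag next to a rainbow depends entirely on your terminal and fonts.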
ZWJs are the big idea that has allowed Emojis to expand in amazing amounts without requiring thousands of new codepoints.
Like, we don't need separate codepoints for "Couple in love" "man and woman in love", "woman and woman in love", and "man and man in love".
There's just 💑 with a ZWJ and some gender specifiers.
so like 👨‍❤️‍👨 , the "two men in love" emoji, is actually just MAN + ZWJ + HEART + ZWJ + MAN
And the same thing works for emojis like "health worker".
You've got the generic gender-neutral 🧑‍⚕️, male 👨‍⚕️, and female 👩‍⚕️
They're all just Person/Man/Woman ZWJ'd with a ⚕️ symbol
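The same gluing can be sketched in a few lines of Python. This is just string concatenation with the codepoints as I understand them (the heart and the medical symbol each carry a U+FE0F emoji-presentation selector); the `zwj` helper is a made-up name for illustration.

```python
ZWJ = "\u200d"   # ZERO WIDTH JOINER
VS16 = "\ufe0f"  # VARIATION SELECTOR-16 (emoji presentation)

PERSON = "\U0001F9D1"  # ADULT, the gender-neutral person
MAN = "\U0001F468"
WOMAN = "\U0001F469"
HEART = "\u2764" + VS16    # HEAVY BLACK HEART, emoji style
MEDICAL = "\u2695" + VS16  # STAFF OF AESCULAPIUS, emoji style


def zwj(*parts):
    """Join emoji components with zero width joiners."""
    return ZWJ.join(parts)


two_men_in_love = zwj(MAN, HEART, MAN)
health_workers = {
    "person": zwj(PERSON, MEDICAL),
    "man": zwj(MAN, MEDICAL),
    "woman": zwj(WOMAN, MEDICAL),
}

print(two_men_in_love, *health_workers.values())
```

Swap the base person codepoint and you get a different health worker without anybody having to allocate a new codepoint.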
FUN FACT: All nation-flags work similar to this, but in a slightly different way, technically.
Instead of using ZWJ, there's a set of 26 codepoints called "Regional Indicator Symbols".
And they're equivalent to the letters A-Z: So you just use them to spell out... well not Countries, exactly, they're Regions in the Common Locale Data Repository.
but like you combine REGIONAL INDICATOR SYMBOL LETTER F and REGIONAL INDICATOR SYMBOL LETTER R to get 🇫🇷
(that's the flag of france, if your emoji support is lacking)
Basically this means that Unicode doesn't need to be updated every time countries change names or flags.
They're not exactly countries because two of the flags aren't countries: The UN flag 🇺🇳 and the EU flag 🇪🇺
It also provides a nice fallback: If your display device doesn't have the flag of, say, Qatar handy? It can just fall back to QA, which hopefully makes it at least somewhat clear what nation it means.
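Here's a minimal Python sketch of both directions: spelling a two-letter region code in Regional Indicator Symbols, and the "just show the letters" fallback. The function names are invented for this example, and it doesn't check that the code is an actual region, only that it's two letters.

```python
RI_BASE = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def region_flag(code):
    """Spell a two-letter region code using regional indicator symbols."""
    code = code.upper()
    if len(code) != 2 or not all("A" <= c <= "Z" for c in code):
        raise ValueError("expected a two-letter code like 'FR'")
    return "".join(chr(RI_BASE + ord(c) - ord("A")) for c in code)


def fallback_letters(flag):
    """Turn regional indicator symbols back into plain A-Z letters."""
    return "".join(chr(ord(ch) - RI_BASE + ord("A")) for ch in flag)


france = region_flag("FR")
print(france, fallback_letters(france))
print(region_flag("QA"), fallback_letters(region_flag("QA")))
```

Any two-letter combination produces a valid pair of symbols; whether it shows up as a flag is up to whatever is rendering it.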
Anyway a similar combining trick is used for emoji skin-colors.
You take emojis and follow them directly with the EMOJI MODIFIER FITZPATRICK TYPE codepoints (no ZWJ needed for these), the Fitzpatrick Scale being a scale of skin color.
So you take the MAN codepoint 👨, add Fitz-4, and you get 👨🏽. Use Fitz-6, and it's instead 👨🏿
Two interesting things about the Fitzpatrick scale:
1. It was designed to classify skin based on response to ultraviolet light, so it's actually about how long it takes to get a sunburn and if you can tan.
2. There are 6 values of the Fitzpatrick scale, 1-6:
Unicode only has 5. They merged 1 and 2 together.
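A quick Python sketch of how that plays out in practice. As far as I know, the modifier codepoint is appended directly after the base emoji rather than being joined with a ZWJ, and types 1 and 2 share a single modifier. The table and function names here are made up for illustration.

```python
MAN = "\U0001F468"

# Six Fitzpatrick types map onto five modifier codepoints:
# types 1 and 2 share EMOJI MODIFIER FITZPATRICK TYPE-1-2.
MODIFIER_FOR_TYPE = {
    1: 0x1F3FB,
    2: 0x1F3FB,
    3: 0x1F3FC,
    4: 0x1F3FD,
    5: 0x1F3FE,
    6: 0x1F3FF,
}


def with_skin_tone(base, fitzpatrick_type):
    """Append the skin tone modifier for a Fitzpatrick type (1-6) to an emoji."""
    return base + chr(MODIFIER_FOR_TYPE[fitzpatrick_type])


print(with_skin_tone(MAN, 4), with_skin_tone(MAN, 6))
```

Asking for type 1 or type 2 gives you the exact same string, which is the merge in action.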
anyway my next terrible keyboard is going to be powered by PHP's Gender class.
It maps the 8 gender responses (IS_FEMALE through ERROR_IN_NAME) to 3 bits, and you just have to enter your intended letters to type in binary by combining them... by entering names.
so if you type DAVE SARAH ALEX it maps it to IS_MALE, IS_FEMALE, IS_UNISEX_NAME, or 010 000 011, which is scancode 131 (0x83), or... F7?
this is the second worst keyboard idea I've ever had.
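For the morbidly curious, here's roughly how the bit-packing half could look in Python. Big caveats: the 3-bit values are only the three that appear in the example above (the rest of the table is unspecified), and `NAME_TO_GENDER` is a hard-coded stand-in for the real name lookup, which this sketch does not call at all.

```python
# 3-bit codes taken from the DAVE SARAH ALEX example; other responses unknown.
BITS = {
    "IS_FEMALE": 0b000,
    "IS_MALE": 0b010,
    "IS_UNISEX_NAME": 0b011,
}

# Hypothetical stand-in for asking the gender database about a name.
NAME_TO_GENDER = {
    "DAVE": "IS_MALE",
    "SARAH": "IS_FEMALE",
    "ALEX": "IS_UNISEX_NAME",
}


def names_to_code(typed):
    """Pack one 3-bit gender response per name into a single integer."""
    value = 0
    for name in typed.split():
        value = (value << 3) | BITS[NAME_TO_GENDER[name.upper()]]
    return value


code = names_to_code("DAVE SARAH ALEX")
print(code, hex(code), format(code, "09b"))
```

Three names give nine bits, so 010 000 011 comes out as 131, or 0x83.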
And @i_a_r_n_a just discovered THE SECRET MEANING BEHIND THE GENDER NUMBERS! The original C code this is ported from assigns LETTERs to the gender responses, not numbers. The PHP numbers are the ASCII values! https://twitter.com/i_a_r_n_a/status/1343997131602817025
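You can check this yourself with a couple of lines of Python: run the eight numbers through `chr()` and see which letters fall out. The name-to-number pairings in `KNOWN` are only the ones stated earlier in this thread; the other three numbers are decoded without guessing which constant they belong to.

```python
# The eight result values quoted earlier in the thread.
RESULT_CODES = [32, 63, 67, 69, 70, 77, 102, 109]

# Pairings that the thread states explicitly.
KNOWN = {
    "NAME_NOT_FOUND": 32,
    "IS_UNISEX_NAME": 63,
    "ERROR_IN_NAME": 69,
    "IS_FEMALE": 70,
    "IS_MOSTLY_MALE": 109,
}

letters = [chr(n) for n in RESULT_CODES]
print(letters)

for name, number in KNOWN.items():
    print(f"{name:>16} = {number:3d} = {chr(number)!r}")
```

So female is 'F', mostly male is a lowercase 'm', unisex is a question mark, error is 'E', and "name not found" is a space. The leftover 'C', 'M', and 'f' are left as an exercise for the reader.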
anyway, as my Magic Talking Barbie would say (or will say, once I finish Hacking her)...

Gender Is Hard, Let's Go Get Coffee!
You can follow @Foone.