cadadr: Selfie, I am wearing a coat, a hoodie, an orange beanie, a pair of round glasses. I have light skin, dark hair, dark beard (tho with natural highlights around my chin and in my moustache). Behind me a street with greenery on the one side and houses and parked cars on the others. (Default)

In case anybody out there was using my RSS feeds generated from some Hacettepe University announcement pages and from the LingBuzz updates: they have gone offline, because I deleted my GitLab account. The reason is simple: GitLab is a terrible website and a hassle to deal with, and I ran out of patience.

In case you want to take over, this is the source code for the LingBuzz scraper. Note that the readme is out of date: the repo is a fork, and the readme pertains only to the original code. My fork applies a couple of fixes to the JavaScript, and I used to run it on GitLab CI, generating RSS inside a GitLab Pages website. The Hacettepe scrapers worked similarly, but their source code isn't public for now, and honestly you don't want it anyway: it's a few lines of Ruby and Nokogiri, and you are better off just periodically looking at the pages. Assuming there were any users of the scripts besides me, of course. But life is full of surprises, isn't it :D

Just listened to @nasser@merveilles.town's talk on multilingual programming, titled "A Personal Computer for Children of All Cultures" again.

As a (for now) linguistics student, I really like this talk and highly recommend it. But as a linguistics person coming from a programming background, it also has me thinking, and I have some questions and ideas I want to voice. I believe that asking these questions early on in a project like Ramsey's will help us design solutions such that, in departing from the domination of English in programming languages and communities, we don't involuntarily find ourselves under the dominion of another form of inequity: monolingualism, which comes from the exact same source as English's global dominance and destructive status.

First of all, I think the next step / next big question here is how to enable bilingual programming, code switching in code.

Code switching is extremely common, including in forms we don't often recognise as such. E.g. languages have registers and styles, and we move between these pretty frequently (e.g. formal to informal, programmer jargon to kitchen jargon to just small talk [hehe] vocabulary), besides switching between more major linguistic varieties, like what we call languages and dialects (which are political terms and not linguistically sound, but I'll avoid that discussion here).

Could that happen in code within this framework?

So my languages are Turkish, English, and Italian. With Ramsey's ideas, I can write modules that are in one language or another, and my whole program can be multilingual. But could a single declaration, say the body of a function, code-switch between Turkish and English? I could of course do that with "local identifiers", using Ramsey's terminology, but could I also do it with keywords and external identifiers? Because it's very common for a bilingual community to code-switch not only at the level of a conversation or a whole text, not only between sentences, but even mid-sentence.

So imagine:

#include <stdio.h>

int main (void) {
    const char* w = "world";
    printf("hello, %s\n", w);
    return 0;
}

How could we allow, then:

sayma_s baş (boş) {
    sabit harf* m = "il mondo";
    printf("ciao, %s\n", m);
    ritorna 0;
}

which starts out with Turkish but outputs and ends with Italian, and has an English identifier in the middle. (There's also the %s in there, which is a complicating factor, as it definitely comes from the English word "string", but that could probably be replaced entirely with something like string interpolation.)

This is a toy example of course, but there can be real-world situations where this becomes a cultural question. Imagine me collaborating with an Arabic/Armenian/Greek/Kurdish-speaking programmer on a given module as a speaker of Turkish. There's a cultural domination/injustice relationship there, and every time we decide on a module's language, that will come into play, as I am privileged relative to them. And it's not only a question about me, as it's likely that this decision takes place in Turkish-dominated spaces, in Turkish-dominated conurbations and political settings.

A related question is, of course, which linguistic varieties get to count as a "language" versus a "dialect" versus an "argot/jargon/style/slang" and similar. None of these categories is scientifically sound; they are all political. This is why we invent terms like "variety" and "register" in linguistics: the political categories seldom capture the structural properties.

This of course leads us to the questions of how we encode linguistic varieties, how we decide which linguistic variety is active for a given snippet of code at each level, and how we do this without making it so difficult that English or some other lingua franca takes over all practical uses of the solution. Yes, we have international codes for languages, but they are also centrally gatekept by institutions of the Western world, and they carry the same (de)politicising linguistic ideologies that today govern the statuses and the status quo regarding which varieties get to be called languages and which dialects, which get representation and which are devalued, which are kept around and which are left to wither.
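The "which variety is active at each level" part can at least be pictured as nested scopes, where each scope may carry a label and inner scopes override outer ones. Below is a minimal sketch in C under that assumption. `struct scope` and `active_variety` are hypothetical names, and the labels are free-form strings chosen by whoever writes the code, not codes from any registry. That choice is itself only an assumption for illustration and leaves the political question open.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical scope record: a module, a declaration, or one statement.
   `variety` is a free-form, author-chosen label; NULL means "inherit". */
struct scope {
    const struct scope *parent;
    const char *variety;
};

/* Return the label that applies to `s`: its own if it has one, otherwise
   the nearest enclosing scope's. Falls back to `fallback` when no scope
   on the way up sets a label. */
const char *active_variety(const struct scope *s, const char *fallback)
{
    assert(fallback != NULL);  /* callers always get a usable label back */
    for (; s != NULL; s = s->parent) {
        if (s->variety != NULL)
            return s->variety;
    }
    return fallback;
}
```

A module labelled "Türkçe" containing a function with no label and, inside it, a statement labelled "italiano" would resolve to "Türkçe" for the function and "italiano" for the statement, so switching mid-declaration costs one label and nothing more.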

Another question is how this maps to existing ways of combining multiple programming languages, because it poses both opportunities and challenges.

E.g. we readily use the ironically named FFIs to communicate across programming-linguistic boundaries, so using extern "C" or its analogue in many programming languages, you can combine them at some level. And there are other facilities, like RPy, Pymacs, and similar. I think reworking these a little bit should actually really help with going beyond human-linguistic boundaries in programming too.

For example new ABIs can be developed for existing libraries that do not use the English names, but some other identifiers, hashes or otherwise. I believe (as a fairly inexperienced programmer when it comes to anything beyond small stuff and scripting, but still) that there should be ways to incorporate the existing codebase the world has developed into an emergent multi-human-lingual paradigm of programming without simply having to rewrite it all.
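To make the "some other identifiers" idea a bit more concrete, here is a minimal sketch in C of a two-level table: existing library functions get a language-neutral ID, and each linguistic variety has its own names pointing at those IDs. This is not an ABI, only the name-resolution part of the idea. `SYMBOLS`, `NAMES` and `resolve` are hypothetical names, the numeric IDs are arbitrary stand-ins for hashes, and the Turkish and Italian spellings are illustrative guesses, not established terms. The only real library calls involved are `abs` and `toupper` from the C standard library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

typedef int (*int_fn)(int);

/* Language-neutral IDs for existing C library functions. The numbers are
   arbitrary; a real scheme might use content hashes instead. */
enum { SYM_ABS = 1, SYM_TOUPPER = 2 };

struct symbol { int id; int_fn fn; };

static const struct symbol SYMBOLS[] = {
    { SYM_ABS,     abs     },
    { SYM_TOUPPER, toupper },
};

/* Per-variety names that point at the same IDs (illustrative spellings). */
struct name { const char *variety; const char *spelling; int id; };

static const struct name NAMES[] = {
    { "English",  "abs",        SYM_ABS     },
    { "Türkçe",   "mutlak",     SYM_ABS     },
    { "italiano", "assoluto",   SYM_ABS     },
    { "English",  "toupper",    SYM_TOUPPER },
    { "Türkçe",   "büyük_harf", SYM_TOUPPER },
    { "italiano", "maiuscola",  SYM_TOUPPER },
};

/* Resolve a spelling in a given variety to the underlying function,
   or NULL if that variety has no such name. */
int_fn resolve(const char *variety, const char *spelling)
{
    assert(variety != NULL && spelling != NULL);
    for (size_t i = 0; i < sizeof NAMES / sizeof NAMES[0]; i++) {
        if (strcmp(NAMES[i].variety, variety) != 0 ||
            strcmp(NAMES[i].spelling, spelling) != 0)
            continue;
        for (size_t j = 0; j < sizeof SYMBOLS / sizeof SYMBOLS[0]; j++) {
            if (SYMBOLS[j].id == NAMES[i].id)
                return SYMBOLS[j].fn;
        }
    }
    return NULL;
}
```

Here `resolve("Türkçe", "mutlak")` and `resolve("English", "abs")` hand back the very same function, so the existing compiled code is reused as-is and only the naming layer grows. The English names become one row among others and stop being the privileged key.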

But we also have other ways of multi-programming-lingual combination, or code switching, if you will. These manifest themselves in the likes of Knuth's literate programming, or the similar-but-not-exactly-the-same mechanisms of Emacs' Org mode and R Markdown. Could we exploit these systems' ideas in developing programming environments that can combine multiple human languages and multiple programming languages? Why shouldn't that be possible?

Because in Org mode, which is the system I'm most familiar with at this point, the programming-languages part is already possible, practical, and highly useful. For example, consider this setup script I have for my Raspberry Pi, which liberally combines the Emacs Lisp and Bourne shell programming languages using Org mode's mechanisms for doing so. (You can search for begin_src in the file to explore how the two very different languages are used and combined in the literate script.)

These literate programming environments could easily be used with any compiler for a multi-human-lingual programming language/environment; that's pretty straightforward. What's food for thought is how such a system could take advantage of the ideas and tools developed by these literate environments over the last ~50 years, despite their relative obscurity, especially among professional programmers.

This is all I have for now. I am really excited for a future where programming becomes customarily multilingual in both the human and the programming language dimensions. As someone advancing towards a career in academic scholarship, a long-time hobbyist programmer, and a non-native speaker of English, I have personally experienced how limiting it can be when programming tools are exclusively targeted at English-speaking professionals, and what sort of things become possible once we start breaking those barriers.

I believe Ramsey's doing god's work in breaking some of these barriers by thinking about how to make programming work for all human linguistic varieties, and I hope that this text contributes some questions/ideas to consider in such efforts. Really, thank you Ramsey!
