Allison's bookmarks (tagged language)

Wikifunctions

"Wikifunctions is a Wikimedia project for everyone to collaboratively create and maintain a library of code functions to support the Wikimedia projects and beyond.... We are currently primarily focused on functions related to Wikidata Lexemes. The Lexicographical data from Wikidata and functions to process it are essential for the goal of an Abstract Wikipedia." lots of interesting implementations of nlproc-related stuff!

programming language nlproc

Saved 2025-07-07T21:05:55.109416Z

Artificial intentionality - by Rob Horning - Internal exile

too many good quotes from this, among them: "[T]here are no labor shortcuts for caring, in and of itself, no stretching a little bit of intentionality to provide focused attention across some ever increasing population. Care doesn’t scale; cruelty does. You can’t automate your way around the infinite obligation to the other."

ai communication language

Saved 2024-10-07T16:38:10.951635Z

Getty Vocabularies (Getty Research Institute)

"structured resources for the visual arts domain, including art, architecture, decorative arts, other cultural works, archival materials, visual surrogates, and art conservation" tons of fun stuff in here, love me a controlled vocabulary (via data is plural via lynn cherny)

datasets language art

Saved 2024-09-18T16:20:39.469541Z

The Encyclopedia Project, or How to Know in the Age of AI - Public Books

"[W]hat is currently sold to us as “Artificial Intelligence”... is neither intelligent nor entirely artificial, yet it’s pumping the internet with automated content more quickly than you can fire an editorial office. No system predicated on these assumptions can hope to discern “misinformation” from “information”: both are reduced to equally weighted packets of content, merely seeking an optimization function in a free marketplace of ideas. And both are equally ingested into a great statistical machinery, which weighs only our inability to discern."

epistemology ai text language internet culture

Saved 2024-06-13T22:48:17.355961Z

Poetix – Post Position

Nick Montfort's poetics: "Writing very small-scale computational poems allows me to learn more about computing and its intersection with language and poetry. Not computing in the abstract, but computing as embodied in particular platforms, which are intentionally designed and have platform imaginaries and communities of use and practice surrounding them."

poetics poetry text language computation

Saved 2024-03-02T22:00:42.151663Z

New Words

"a speculative research project exploring the use of machine learning for the evolution of language. Large language models (LLM's) are fantastic at capturing our language as it currently is - but language is constantly evolving and adapting. Can machine learning help us create something truly new and unbounded by its training data?"

poetics machinelearning text language

Saved 2023-12-15T00:59:10.317819Z

An Interview with Kimberly Alidio, Author of Teeter – Nightboat Books

"Working with or against writing systems and what other poets and artists have done with them, we learn something vital about language as it relates to identity that isn’t taught in critical ethnic studies classes or by community elders or culture workers. Or in an MFA poetry workshop, for that matter. And what poets know about language and identity that people whose institutional job or mission it is to know about language and identity do not know is in the poet’s work, in the poems."

poetry poetics language writing

Saved 2023-11-14T15:37:07.564189Z

dell-research-harvard/AmericanStories · Datasets at Hugging Face

"a collection of full article texts extracted from historical U.S. newspaper images [that] includes nearly 20 million scans from the public domain"

datasets corpora language text history

Saved 2023-09-13T18:51:55.137457Z

Pronouns as Linguistic Care Work | Linguistic Society of America

'None of this is predicated on “trying not to misgender someone” or even “trying not to mess up pronouns accidentally and get yelled at.” Linguistic care work, like any care work truly based in principles of a loving community, cannot run on shame-based fuel. Avoiding shame and harm are only the barest, most basic bar to clear—they do not constitute showing affection. Failing to abuse someone isn’t the same as loving them.'

language gender linguistics care

Saved 2023-01-17T21:33:34Z

How a flawed idea is teaching millions of kids to be poor readers | At a Loss for Words | APM Reports

if you think horses and ponies are the same thing, and are content for children to remain ignorant of this fact, you live in a world devoid of wonder and joy

education reading phonics teaching spelling language poetics

Saved 2022-05-06T21:24:22Z

Strange Horizons - Deep Wheel Orcadia by Harry Josephine Giles By Cat Fitzpatrick

incredibly insightful review. "Their exact words, not just their paraphraseable meaning but their precise choices of phrasing, become full of comprehensible information about character, and this gives the characters themselves an unusual reality and presence. As in all good poetry, it is the language itself, and not just the plot and worldbuilding, that makes us care."

poetry poetics scifi language linguistics literature

Saved 2022-04-04T20:07:10Z

Pre-Surrealist Games | MetaFilter

wonderful list of resources relating to early forms of exquisite corpse

wordgames language surrealism poetics

Saved 2022-03-16T17:53:07Z

Some Georgian and Victorian Acrostic Puzzles: Precursors to Crosswords | MetaFilter

wordgames poetics language text

Saved 2022-03-07T17:02:56Z

Brendan Howell – Rustic Computing

generating from a markov model by hand

language languagemodels poetics text

Saved 2022-01-14T18:53:09Z
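
A minimal sketch of what "generating from a markov model" amounts to, in case it helps to see the paper procedure as code. This is not from the Rustic Computing page; the sample sentence and the function names `build_model` and `generate` are made up for illustration.

```python
import random
from collections import defaultdict

def build_model(words, order=1):
    """Map each run of `order` words to the list of words seen right after it."""
    model = defaultdict(list)
    for i in range(len(words) - order):
        state = tuple(words[i:i + order])
        model[state].append(words[i + order])
    return model

def generate(model, start, length, rng):
    """Walk the table: look up the current state, pick a follower, slide along."""
    out = list(start)
    state = tuple(start)
    for _ in range(length):
        followers = model.get(state)
        if not followers:  # dead end: this state never had a follower
            break
        out.append(rng.choice(followers))
        state = tuple(out[-len(state):])
    return out

# Hypothetical sample text; any word list works.
text = "the cat sat on the mat and the cat ate the rat".split()
model = build_model(text)
print(" ".join(generate(model, ("the",), 8, random.Random(0))))
```

Keeping repeated followers in the list (rather than counting them) means a uniform random pick is already weighted by frequency, which is roughly what drawing slips from a pile would do by hand.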

actionscoregenerator - Nathan Walker - Performance Artist

"a website that produces event scores for performance. The material objects, locations and activities within each score are based on the performance archives of Nathan Walker between 2009-2014 and work towards shuffling and redistributing the archival record to create an anarchive."

text language poetics generative fluxus

Saved 2021-11-19T16:20:45Z

etymology - Rhetoric vs. Mathematics: ellipsis/ellipse, parable/parabola, hyperbole/hyperbola - English Language & Usage Stack Exchange

rhetoric and conic sections. this is amazing

language math

Saved 2021-10-08T22:51:22Z

Blabrecs

"BLABRECS is a rules modification for the wordgame SCRABBLE that swaps out the dictionary of real-if-obscure English words for a capricious artificial intelligence. In BLABRECS, real English words aren't allowed! Instead, you have to play nonsense words that sound like English to the AI. These nonsense words are called – you guessed it – BLABRECS."

machinelearning wordgames text poetics language games

Saved 2021-09-04T02:43:22Z

alphacep/vosk-api: Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node

"Vosk is an offline open source speech recognition toolkit. [...] Vosk models are small (50 Mb) but provide continuous large vocabulary transcription, zero-latency response with streaming API, reconfigurable vocabulary and speaker identification." Bindings for various languages, "scales from small devices like Raspberry Pi or Android smartphone to big clusters."

speech nlproc text language poetics

Saved 2021-07-20T05:15:40Z

TextOCR

"TextOCR provides ~1M high quality word annotations on TextVQA images allowing application of end-to-end reasoning on downstream tasks such as visual question answering or image captioning."

text ocr datasets language poetics machinelearning

Saved 2021-05-17T21:47:42Z

Layout Parser

could be fun to play with. "With the help of state-of-the-art deep learning models, Layout Parser enables extracting complicated document structures using only several lines of code. This method is also more robust and generalizable as no sophisticated rules are involved in this process."

language design layout python machinelearning materialoflanguage

Saved 2021-04-30T15:34:18Z

100,000 Podcasts: A Spoken English Document Corpus - ACL Anthology

"a noisy but fascinating collection of documents which can be studied through the lens of natural language processing, information retrieval, and linguistics"

datasets language poetics podcasts audio speech

Saved 2021-03-29T15:48:21Z

SpeechBrain: A PyTorch Speech Toolkit

includes pre-trained models for a bunch of interesting tasks: speech recognition, speaker recognition, speech enhancement, speech processing (including multi-microphone processing)

machinelearning speech language

Saved 2021-03-22T22:02:19Z

Nothing Breaks Like A.I. Heart

machinelearning writing poetics language languagemodels

Saved 2021-03-13T22:24:51Z

Hateful Memes Challenge winners

"Hate speech can come in many forms, including memes that combine text and images. This kind of multimodal content can be particularly challenging for AI to detect because it requires a holistic understanding of the meme." that is not the reason that hate speech is difficult to detect, and it's actually harmful that you think it's the reason, sorry

language culture machinelearning nlproc hatespeech

Saved 2021-01-03T22:42:05Z

on vocal cloning — Are.na

everest's bibliography on text-to-speech and vocal cloning

language nlproc voice speech

Saved 2020-11-11T16:46:03Z

Jurafsky & Martin chapter on constituency grammars

language linguistics syntax poetics text

Saved 2020-08-19T13:20:29Z

Introduction to Semantics | teaching materials by Maria Esipova, Nadine Theiler and Lucas Champollion

"complete sets of teaching materials for an undergraduate-level introduction to semantics"

language linguistics semantics syllabus

Saved 2020-08-19T13:19:52Z

Speech Accent Archive

"a large set of speech samples from a variety of language backgrounds. Native and non-native speakers of English read the same paragraph and are carefully transcribed" (close IPA transcriptions)

language linguistics

Saved 2020-08-07T19:18:35Z

Scientists rename human genes to stop Microsoft Excel from misreading them as dates - The Verge

bglkjawbflablfhbawefjh

language text programming poetics biology

Saved 2020-08-07T19:15:31Z
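
A rough sketch of the problem in the headline above: flagging gene symbols that a spreadsheet might read as dates. The pattern is an assumption (month name or abbreviation plus a day number), not Excel's actual conversion rules, and the symbol list is only illustrative.

```python
import re

# Assumption: month name/abbreviation followed by a day number is "date-like".
# This is a guess at the risky shape, not a reimplementation of Excel.
DATE_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|MAY|JUN|JUL|AUG|SEP|SEPT|OCT|NOV|DEC)-?(\d{1,2})$",
    re.IGNORECASE,
)

def looks_like_date(symbol):
    """Return True if the symbol matches the date-like pattern with a day of 1-31."""
    m = DATE_LIKE.match(symbol.strip())
    return bool(m) and 1 <= int(m.group(2)) <= 31

# Illustrative symbols: some date-like, some not.
for symbol in ["MARCH1", "SEPT2", "DEC1", "MARCHF1", "SEPTIN2", "TP53"]:
    print(symbol, looks_like_date(symbol))
```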

Chinese WeChat Users Are Sharing A Censored Post About COVID-19 By Filling It With Emojis And Writing It In Other Languages

"[T]o avoid the censorship, people have converted parts of the interview into Morse code, filled it up with emojis, or translated it into fictional languages like Sindarin from The Lord of the Rings or Klingon from Star Trek. In one particularly creative example, someone inserted it into the iconic opening crawl of Star Wars."

language text poetics politics censorship china

Saved 2020-03-14T18:14:50Z

The internet could learn a lot from the VR sign language community | Rock Paper Shotgun

"Even the fancier controllers of Valve’s Index kits don’t let you separate your fingers to produce the Ws or Vs necessary for some words. [...] It’s a lovely avenue of human connection, but I can also imagine linguists frothing over VR sign language. There’s a great example in Syrmor’s video where a currently learning interpreter called Quentin explains that because the W restriction means they can’t use the normal word for ‘world’, they instead mimic the appearance of a portal opening up in VR. They’ve also got different ways of signing words depending on your gear, which is both fascinating and mildly concerning." that must feel weird

mol asl language poetics vr internet

Saved 2020-02-04T14:42:06Z

Deaf Anime Girl In VR Talks About Getting Bullied - YouTube

asl language poetics internet culture

Saved 2020-02-04T14:40:34Z

文言 / wenyan‑lang

"an esoteric programming language that closely follows the grammar and tone of classical Chinese literature. Moreover, the alphabet of wenyan contains only traditional Chinese characters and 「」 quotes, so it is guaranteed to be readable by ancient Chinese people." (from one of Golan Levin's students)

programming chinese language text poetics

Saved 2020-01-01T16:55:09Z

mhagiwara/github-typo-corpus: GitHub Typo Corpus: A Large-Scale Multilingual Dataset of Misspellings and Grammatical Errors

"a large-scale dataset of misspellings and grammatical errors along with their corrections harvested from GitHub. It contains more than 350k edits and 65M characters in more than 15 languages, making it the largest dataset of misspellings to date."

datasets language text poetics

Saved 2019-12-11T16:06:06Z

🦄🤝🦄 Encoder-decoders in Transformers: a hybrid pre-trained architecture for seq2seq

this looks promising

machinelearning nlproc text language

Saved 2019-12-10T19:04:02Z

alexwarstadt/blimp: The Benchmark of Linguistic Minimal Pairs

"a challenge set for evaluating what language models (LMs) know about major grammatical phenomena in English" it warms my heart to see an ngram baseline in there, haha

language linguistics text nlproc

Saved 2019-12-09T22:04:59Z
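
For the curious, a toy version of what an ngram baseline on minimal pairs could look like: train a bigram model, score both sentences of a pair, and count it as a win if the acceptable one gets the higher probability. This is a sketch under my own assumptions, not the BLiMP authors' code; the four-sentence corpus, the add-one smoothing, and the function names are all invented for illustration.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count bigrams and their left contexts, with sentence boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
        vocab.update(tokens[1:])
    # +1 leaves room for one unknown word type (an assumption of this sketch)
    return contexts, bigrams, len(vocab) + 1

def log_prob(sentence, model):
    """Add-one smoothed log probability of a sentence under the bigram model."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

# Hypothetical training data and minimal pair (subject-verb agreement).
corpus = ["the cats sleep", "the cat sleeps", "the dogs bark", "the dog barks"]
model = train_bigram(corpus)
good, bad = "the cats sleep", "the cats sleeps"
print(log_prob(good, model) > log_prob(bad, model))
```

Both sentences in a pair have the same number of tokens here, so comparing raw log probabilities is fair; pairs of different lengths would need some normalization.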

Teaching AI Feminism and Making Art

"I run datasets of iconic feminist texts through a simple textRNN, generating new feminists texts in the legendary words of bell hooks, Simone De Beauvoir, Betty Friedan and Audre Lorde. Some are funny. Some are poetic. Some make no sense at all and some are way too real. Information about the model and settings can be found under each post."

poetics text machinelearning language generative feminism theory

Saved 2019-12-06T16:57:07Z

Autocat

very cool online text classifier generator (just upload your data and then you can pip install your model!)

nlproc programming language

Saved 2019-12-01T21:54:11Z

Semantic Specialization of Distributional Representation Models

another tutorial from emnlp-19

nlproc language semantics poetics text

Saved 2019-12-01T21:34:55Z

Data Collection and End-to-End Learning for Conversational AI

overview + materials for emnlp-19 workshop

nlproc language poetics chatbots

Saved 2019-12-01T21:32:18Z

Measuring gender imbalances in reporting on the creative industries

language text nlproc dataviz

Saved 2019-11-22T16:43:29Z

Joel Simon on Twitter: "New work in my Dimension of Dialogue series :) Two neural nets learn to communicate through their own emergent visual language. The resulting alphabet is a product of their adversarial and cooperative relationship. Here set in clay

"Two neural nets learn to communicate through their own emergent visual language."

text language machinelearning poetics mol

Saved 2019-10-28T16:27:05Z

Universal Dependencies

"Universal Dependencies (UD) is a framework for consistent annotation of grammar (parts of speech, morphological features, and syntactic dependencies) across different human languages. UD is an open community effort with over 200 contributors producing more than 100 treebanks in over 70 languages."

datasets nlproc language text poetics

Saved 2019-10-15T14:11:54Z

Jenny Holzer Hits Her Mark in a Major, Largely Unnoticed Retrospective

"Artists aim differently than sharpshooters. They are not typically trying to take something out, but to draw something out. The mark Holzer hits in this case is the mark in the most cave-drawing sense: the effort to leave (or find) a trace of something that is not an opinion, but a register of some kind, certifying a lived experience. There may be no such thing as a permanent record, but the fact that the Washington Post contributor found Holzer’s work dangerous is a sign in and of itself that it has achieved one of its goals: it has carved a deep enough mark to leave a strong impression (for that writer, a menacing one). That’s the most any language or other kind of mark-making can hope to accomplish."

art politics language text poetics

Saved 2019-09-03T15:52:22Z

Wiktionary:Frequency lists - Wiktionary

wiktionary word frequency lists

language data text poetics

Saved 2019-06-13T19:08:24Z

Scotch Snaps in Hip Hop - YouTube

really remarkable speculation/scholarship on rhythmic patterns in music and language

language linguistics music phonetics

Saved 2019-05-20T18:05:21Z

Cancel Culture: The Internet Eating Itself | shattersnipe: malcontent & rainbows

"The problem with the internet is that takes up all three areas on a Venn diagram depicting the overlap between speech and action, and while this has always been the case, we’re only now admitting that it’s a bug as well as a feature."

internet culture history language

Saved 2019-04-09T18:53:23Z

Workshop “Linguistic investigations beyond language: gestures, body movement and primate linguistics”

well this looks fascinating. “How to do things with nonwords: communication, expression, and meaning” “Musical gestures in the typology of linguistic inferences” “Iconic modulation in spoken language: iconicity, intensification, or both?” etc

language linguistics pragmatics

Saved 2019-03-20T13:53:02Z

Tsvetshop: Home

"Yulia Tsvetkov's research group at Language Technologies Institute of Carnegie Mellon University. Our work focuses on natural language processing, particularly cross-lingual approaches, low-resource settings, and social good."

language poetics text machinelearning nlproc

Saved 2019-03-18T18:58:08Z

Re:MARK - Interactive Art by Golan Levin and Collaborators

"...the fiction that speech casts visible shadows. [...] converts speech into whimsically animated letters and shapes that appear to float upwards from the shadow of the speaker's head. Visitors can also manipulate these forms directly, using the shadow of their own body. When a phoneme is recognized by the software with sufficient confidence, it is spelled out on the installation's display."

sound installation materialoflanguage language poetics art

Saved 2019-03-12T20:51:17Z

Heather Dewey-Hagborg | Unlanguage

"In this interactive installation participants enter the first word that comes to their mind in one of two input terminals in any language. These words are then the seed of a generative process that develops a poem, bifurcating and mutating, merging languages, poetic styles, sense and nonsense. Poems overlap and degrade over time, eventually fading away. Phonetics are remapped to a new alphabet of sound referencing the body and incidental noises, creating a unique expression for each word and making literal the arbitrariness of the language. This installation was projected on a massive scale covering the walls and ceiling and filling the hall of the old imperial castle in Poznan, Poland. This video shows a demonstration of the generated poetry."

text language poetry poetics materialoflanguage

Saved 2019-03-12T20:37:17Z

languagemodeling.pptx - languagemodeling.pdf

dan jurafsky intro lecture

computational linguistics language text data

Saved 2019-02-26T16:54:52Z