Mastodawn

Agustin V. Startari Jul 16, 2025

New release – Compiled Norms: Towards a Formal Typology of Executable Legal Speech

🔍 Key contributions:
• Formal grammar criteria
• Validation schema with κ = 0.81 inter-rater agreement
• Parser hot-swap to Spanish civil-law tokens (95 % coverage)
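
The κ = 0.81 figure above is an inter-rater agreement statistic. The post does not say which kappa variant was used, so the sketch below assumes Cohen's kappa for two raters. The labels "compiled" and "declarative" and the ten sample ratings are hypothetical, chosen only to show the arithmetic.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same items (assumed variant)."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given identical labels.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement, from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical ratings of ten legal expressions by two annotators.
annotator_1 = ["compiled"] * 5 + ["declarative"] * 5
annotator_2 = ["compiled"] * 4 + ["declarative"] * 6
kappa = cohen_kappa(annotator_1, annotator_2)
```

In this toy sample the raters agree on nine of ten items and chance agreement is 0.5, so kappa comes out at about 0.8.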

📄 Full paper: https://doi.org/10.5281/zenodo.15881325

#LegalInformatics
#AIandLaw
#SyntacticAuthority
#eGovernance
#FormalGrammar
#LegalTech
#agustinvstartari
#ArtificialIntelligence
#HumanitiesCommons
#social
#Law

Compiled Norms: Towards a Formal Typology of Executable Legal Speech

Abstract: This article introduces a formal typology of executable legal speech. Building on the concept of the regla compilada (compiled rule), it identifies the syntactic conditions under which a legal expression becomes executable by non-human systems. The analysis distinguishes declarative from compiled legal language and proposes four structural criteria for computability: position within the Chomsky grammar hierarchy, closure of rule structure, level of semantic ambiguity, and determinism of parsing. Instead of interpreting legal meaning, the article isolates the formal properties that permit legal norms to function as executable code. The objective is to define a machine-readable grammar of authority in which execution displaces interpretation, and structural form triggers legal action.

DOI: https://doi.org/10.5281/zenodo.15881325
Also published on Figshare, DOI: https://doi.org/10.6084/m9.figshare.29562293
SSRN ID pending; expected Q3 2025.

Resumen (English translation): This article introduces a formal typology of executable legal language. Starting from the concept of the regla compilada, it identifies the syntactic conditions under which a legal expression becomes executable by non-human systems. The analysis distinguishes between declarative legal language and compiled legal language, and proposes four structural criteria for computability: position in the Chomsky grammar hierarchy, closure of the normative structure, level of semantic ambiguity, and determinism in parsing. Rather than interpreting legal meaning, the article isolates the formal properties that allow legal norms to function as executable code. The aim is to define a machine-readable grammar of authority in which execution replaces interpretation and structural form activates legal action.

Zenodo

Agustin V. Startari Jul 11, 2025

🔍 New article: Protocol Without Prognosis: Clinical Authority in Large-Scale Diagnostic Language Models
When uncertainty is syntactically erased, who speaks?
Based on 50k radiology texts across 4 languages.
Audit checkpoints, regulatory misalignment, and the rise of the sovereign executable.
Read: https://zenodo.org/records/15864937
#AI #MedTech #SyntacticAuthority #Regulation #LLMs #ClinicalRisk #HCC #RLI #agustinvstartari #medical #law #legal

Protocol Without Prognosis: Clinical Authority in Large-Scale Diagnostic Language Models

Abstract: This article introduces the concept of syntactic delegation in clinical diagnostic systems. It demonstrates how medical language models issue recommendations without preserving the linguistic markers of clinical uncertainty. The analysis draws from a multilingual corpus of 50,000 radiology reports, balanced across English, Spanish, German, and Mandarin. All data are de-identified and licensed for open research use. Each report is paired with a synthetic rewrite generated by a fine-tuned GPT-4 variant. Two core metrics are introduced. The Hedging Collapse Coefficient (HCC) is defined as 1 − (h / t), where h is the number of hedging tokens retained in the model output and t the total number of hedging tokens in the source report. The Responsibility Leakage Index (RLI) is defined as d / r, where d is the number of AI-generated decisions executed without clinician sign-off and r the total number of decisions requiring such sign-off. For the evaluated corpus, mean HCC = 0.47 and mean RLI = 0.22. Medical reporting is treated as a regla compilada (compiled rule), understood here as a type-0 production within the Chomsky hierarchy (Chomsky 1965, p. 17; Montague 1974, p. 52). This transformation removes syntactic hedging and creates legal ambiguity in informed-consent frameworks. The article compares the FDA Software as a Medical Device (SaMD) guidance with the EU Medical Device Regulation (MDR) and maps both against a single syntactic risk threshold, defined as HCC greater than 0.40 or RLI greater than 0.25. Two legal precedents are analyzed. In United States v. Sorin (2024), a federal court recognized institutional fault after the erasure of diagnostic uncertainty in an AI-generated output. In European Court of Justice C-489/23, liability was affirmed when a medical report produced by a predictive model lacked the modal disclaimers required under EU law. The article proposes implementing syntax-level checkpoints within the inference layer of diagnostic systems. Audits should be conducted every seven days by a designated clinical safety officer. Enforcement is triggered if the weekly HCC average rises more than five percentage points above baseline. See Appendix A for the alignment grid comparing SaMD and MDR requirements against the syntactic risk threshold. The framework of sovereign executable authority is grounded in prior analysis from Algorithmic Obedience (2023, p. 67), where syntactic execution is treated as an operational form of command.

Also published on Figshare, DOI: https://doi.org/10.6084/m9.figshare.29546624
SSRN ID pending; expected Q3 2025.

Resumen (English translation): This article introduces the concept of syntactic delegation in clinical diagnostic systems. It shows that medical language models issue recommendations without preserving the linguistic markers of clinical uncertainty. The analysis is based on a multilingual corpus of 50,000 radiology reports, balanced across English, Spanish, German, and Mandarin. All data have been de-identified and carry an open license for research use. Each report is accompanied by a synthetic rewrite generated by a specialized GPT-4 variant. Two core metrics are introduced. The Hedging Collapse Coefficient (HCC) is defined as 1 − (h / t), where h is the number of hedges retained in the model output and t the total present in the original report. The Responsibility Leakage Index (RLI) is defined as d / r, where d is the number of AI-generated decisions made without clinical validation and r the total number of decisions requiring such validation. In the corpus analyzed, mean HCC is 0.47 and mean RLI is 0.22. The medical report is treated as a regla compilada (compiled rule), understood here as a type-0 production within the Chomsky hierarchy (Chomsky 1965, p. 17; Montague 1974, p. 52). This transformation removes syntactic hedging and generates legal ambiguity in informed-consent frameworks. The article compares the FDA guidance for Software as a Medical Device with the European Union Medical Device Regulation, and relates both to a single syntactic risk threshold defined by HCC above 0.40 or RLI above 0.25. Two legal precedents are analyzed. In United States v. Sorin (2024), a federal court recognized institutional liability after the suppression of diagnostic uncertainty in an AI-generated output. In C-489/23 of the Court of Justice of the European Union, liability was confirmed when a predictive model issued a medical report without the modal qualifiers required by EU law. The article proposes implementing syntactic checkpoints in the inference layer of clinical systems. Audits should be carried out every seven days by a designated clinical safety officer. The enforcement mechanism is activated if the weekly HCC average exceeds the baseline value by more than five percentage points. See Appendix A for the alignment grid comparing FDA and MDR requirements against the syntactic risk threshold. The sovereign executable framework rests on the prior analysis developed in Algorithmic Obedience (2023, p. 67), where syntactic execution is understood as an operational form of command.
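
The two metrics, the risk threshold, and the weekly audit trigger described in the abstract above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation. The function names, the handling of zero denominators, and the choice to average one HCC value per day are all assumptions; only the formulas and the numeric cutoffs come from the abstract.

```python
from statistics import mean

HCC_THRESHOLD = 0.40  # abstract: risk if HCC is greater than 0.40
RLI_THRESHOLD = 0.25  # abstract: risk if RLI is greater than 0.25
DRIFT_LIMIT = 0.05    # abstract: five percentage points above baseline


def hcc(hedges_retained, hedges_in_source):
    """Hedging Collapse Coefficient: 1 - (h / t)."""
    if hedges_in_source <= 0:
        # Assumption: HCC is treated as undefined when the source has no hedges.
        raise ValueError("source report has no hedging tokens")
    return 1 - hedges_retained / hedges_in_source


def rli(unsigned_decisions, decisions_requiring_signoff):
    """Responsibility Leakage Index: d / r."""
    if decisions_requiring_signoff <= 0:
        # Assumption: RLI is treated as undefined when nothing needs sign-off.
        raise ValueError("no decisions require sign-off")
    return unsigned_decisions / decisions_requiring_signoff


def exceeds_risk_threshold(hcc_value, rli_value):
    """True if either metric crosses the syntactic risk threshold."""
    return hcc_value > HCC_THRESHOLD or rli_value > RLI_THRESHOLD


def weekly_audit_triggers(daily_hcc, baseline_hcc):
    """True if the weekly mean HCC exceeds baseline by more than DRIFT_LIMIT."""
    return mean(daily_hcc) - baseline_hcc > DRIFT_LIMIT


# Hypothetical report: 53 of 100 source hedges survive the rewrite,
# and 22 of 100 sign-off decisions were executed without a clinician.
report_hcc = hcc(53, 100)
report_rli = rli(22, 100)
flagged = exceeds_risk_threshold(report_hcc, report_rli)
```

The hypothetical counts were chosen to reproduce the reported corpus means (HCC 0.47, RLI 0.22). Under the stated threshold, those values are flagged on HCC alone, since 0.47 is above 0.40 while 0.22 is below 0.25.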

Zenodo

Agustin V. Startari Jun 25, 2025

🚨 New academic article by Agustín V. Startari:
The Grammar of Objectivity: Formal Mechanisms for the Illusion of Neutrality in Language Models

🔍 Focus: How LLMs use syntax to simulate neutrality without epistemic grounding.
📊 Introduces the Simulated Neutrality Index (INS), based on 1,000 model outputs.
📁 Open access: https://doi.org/10.5281/zenodo.15729518

#LLM #AIethics #SyntacticAuthority #Auditability #Humanities #Epistemology

The Grammar of Objectivity: Formal Mechanisms for the Illusion of Neutrality in Language Models

Abstract: Simulated neutrality in generative models produces tangible harms (ranging from erroneous treatments in clinical reports to rulings with no legal basis) by projecting impartiality without evidence. This study explains how Large Language Models (LLMs) and logic-based systems achieve neutralidad simulada (simulated neutrality) through form, not meaning: passive voice, abstract nouns, and suppressed agents mask responsibility while asserting authority. A balanced corpus of 1,000 model outputs was analysed: 600 medical texts from PubMed (2019-2024) and 400 legal summaries from Westlaw (2020-2024). Standard syntactic parsing tools identified structures linked to authority simulation. For example, a 2022 oncology note states “Treatment is advised” with no cited trial, and a 2021 immigration decision reads “It was determined” without precedent. Two audit metrics are introduced: the agency score (the share of clauses naming an agent) and the reference score (the proportion of authoritative claims with verifiable sources). Outputs scoring below 0.30 on either metric are labelled high-risk; 64 % of medical and 57 % of legal texts met this condition. The framework runs in <0.1 s per 500-token output on a standard CPU, enabling real-time deployment. Quantifying this lack of syntactic clarity offers a practical layer of oversight for safety-critical applications.

Also published on Figshare, DOI: https://doi.org/10.6084/m9.figshare.29390885
SSRN: in process.

Resumen (English translation): Simulated neutrality in generative models produces tangible harms, from erroneous treatments in clinical reports to rulings without legal basis, by projecting impartiality without evidence. This study analyzes how large language models (LLMs) and logic-based systems reproduce this neutrality through form rather than content. Patterns such as passive voice, abstract nouns, and agent suppression conceal responsibility while asserting authority. A balanced corpus of 1,000 model outputs was examined: 600 medical texts from PubMed (2019-2024) and 400 legal summaries from Westlaw (2020-2024). Standard syntactic parsing tools were used to detect structures associated with authority simulation. For example, a 2022 oncology note states “Treatment is advised” without citing clinical trials; a 2021 immigration summary reads “It was determined” with no reference to legal precedent. The article introduces two audit metrics: the agency score, which measures the proportion of clauses with an explicit agent, and the reference score, which calculates the percentage of authoritative claims backed by verifiable sources. Outputs with values below 0.30 on either metric are classified as high-risk; 64 % of the medical texts and 57 % of the legal texts meet this criterion. The framework runs in under 0.1 seconds per 500-token output on a standard CPU, which demonstrates its real-time feasibility. Quantifying this lack of syntactic clarity provides a practical layer of oversight for critical applications.
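
The high-risk labelling rule described above reduces to two ratios and one cutoff. The sketch below assumes the clause and claim counts have already been produced by some upstream parser, which the abstract does not specify. The function names and the zero-denominator behaviour are hypothetical; the two score definitions and the 0.30 cutoff come from the abstract.

```python
HIGH_RISK_CUTOFF = 0.30  # abstract: below 0.30 on either metric is high-risk


def agency_score(clauses_with_agent, total_clauses):
    """Share of clauses that name an explicit agent."""
    if total_clauses <= 0:
        raise ValueError("output contains no clauses")  # assumed behaviour
    return clauses_with_agent / total_clauses


def reference_score(sourced_claims, authoritative_claims):
    """Proportion of authoritative claims backed by a verifiable source."""
    if authoritative_claims <= 0:
        raise ValueError("output contains no authoritative claims")  # assumed
    return sourced_claims / authoritative_claims


def is_high_risk(agency, reference):
    """High-risk if either score falls strictly below the cutoff."""
    return agency < HIGH_RISK_CUTOFF or reference < HIGH_RISK_CUTOFF


# Hypothetical output: 1 of 4 clauses names an agent, 3 of 5 claims are sourced.
agency = agency_score(1, 4)
reference = reference_score(3, 5)
label = "high-risk" if is_high_risk(agency, reference) else "ok"
```

Here the output is labelled high-risk on agency alone (0.25), even though its reference score (0.6) clears the cutoff, which matches the "either metric" wording of the rule.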

Zenodo

Agustin V. Startari Jun 7, 2025

📘 When Language Follows Form, Not Meaning
Formal Syntactic Activation in LLMs

Language models don’t express—they continue.
No referent, no subject, no intention.
Syntax = legitimacy.

📎 https://zenodo.org/records/15616777
🌐 Full archive: https://agustinvstartari.com

Connects with:
– Ethos and AI: disappearance of the subject
– The Passive Voice: authority from structure

#AI #LLMs #AcademicMastodon #SyntacticAuthority #ArtificialGrammar #ComputationalEpistemology #PostReferentiality #Startari