Welcome to my research page, where you can find my latest projects and publications. To help you explore, the interactive word cloud below highlights the main themes in my work: the larger and darker a word appears, the more often it occurs across my research. Click any word, and the page will automatically filter the list to show studies related to that topic.

Reshaping Language Learners’ Languaging Habitus: A World-Englishes-Informed Critical Pedagogy

This paper proposes a critical world Englishes-informed pedagogy designed to disrupt ingrained norms and expectations shaped by learners’ socialization, or their “languaging habitus.” Drawing on Mezirow's transformative learning theory and employing a case-study approach, this study examines how critical pedagogy reshaped the languaging habitus of undergraduate English learners in a Hong Kong English program. Analysis of multimodal learner artifacts, alongside survey and interview data, revealed significant—though uneven—shifts in learners’ sociocultural understandings, self-perceptions, orientations, and attitudes (e.g., describing their own variety as “beautiful”). These findings, we argue, illustrate transformative processes in learners’ languaging habitus, highlighting the dynamic interplay between pedagogy, language ideologies, and identity.

RELC Journal (Sage)

2025

Our People’s Language: Variation and change in the Lánnang-uè of the Manila Lannangs (Dân láng-e uè: Mga Manilá Lánnáng-e Lánnang-uè-e pagka-varỳ kâp pagka-pièn)

This book pioneers the study of Lánnang-uè, a language deeply embedded in the culture of Manila’s Lannang community. It approaches Lánnang-uè not just as a language but as a vibrant social practice, highlighting its variability and complex social meanings (e.g., identity-marking). Drawing on six years of work with more than 150 participants, the monograph integrates contemporary, community-focused, and critical sociolinguistic frameworks to explore and document linguistic variation as well as change signaling attrition, challenging reductive academic views. Employing diverse methodologies (surveys, elicitation, interviews, computational modeling, and ethnography), the work offers a nuanced depiction of Lánnang-uè’s diversity. It advocates a decolonial stance, emphasizing the complex practices that define the language and its speakers’ identity. It critiques the idea of a uniform linguistic standard, presenting Lánnang-uè as shaped by local, diverse, and inclusive practices, urging a reevaluation of language ownership and authenticity. This monograph is crucial for scholars in sociolinguistics, language variation, and contact linguistics, informing language revitalization efforts and enriching global discussions on linguistic diversity and discrimination.

John Benjamins

2025

Shifting style-shifts? Higher-order style-shifting in Hong Kong Cantonese-English languaging through indexicality and audience design

This study examines style-shifting between monolingual and bilingual Cantonese-English styles in Hong Kong through audience design and indexicality. Using narrative elicitation, YouTube data, and an attitudinal experiment, it finds that bilingual styles are often downplayed in public but appear more robust in scripted speech, where speakers exercise greater control and tap into alternative indexical meanings. Attitudinal data confirm that these monolingual/bilingual styles carry distinct social meanings, shaped by processes such as enregisterment, which enable their mobilization. Based on these findings, the study proposes an “independent layers” socio-indexical model, where these styles function as distinct yet overlapping resources.

Language & Communication (Elsevier)

2025

Styling mono-/bi-/multilingualism in the signs of plurilingual landscapes in Hong Kong: variability and socio-indexical meanings

This study investigates the semiotic functions of ‘degree of multilingualism’ (i.e. mono-/bi-/multilingualism) in Hong Kong’s plurilingual landscape. Using linguistic landscape data from Mong Kok (MK), Tin Shui Wai (TSW), and Discovery Bay (DB), we operationalise mono-/bi- and multilingualism as a macro-stylistic resource encoding socio-indexical meanings. Using a mixed-methods approach – combining quantitative, semiotic, and ethnographic analysis – we examine (1) distributional patterns of signs, (2) socio-spatial factors shaping linguistic styling, and (3) the socio-indexical meanings of linguistic choices. Findings reveal notable regional differentiation: MK’s high multilingualism aligns with its commercial vibrancy and cosmopolitan ethos; TSW’s monolingual Chinese signage indexes local identity and Mainland influence; DB’s near-exclusive English use reinforces its expatriate enclave status. Beyond communication, language works hand-in-hand with other semiotic choices to express meanings such as authenticity and prestige. However, the meanings are far from static: for example, while multilingualism often signals cosmopolitanism, in contexts like TSW, it can index commodification and low prestige.

International Journal of Multilingualism (Routledge)

2025

Decolonizing TESOL Classrooms: A Critical World-Englishes-Informed Pedagogy

Power dynamics and soft forms of linguistic discrimination underlie TESOL classrooms, as reflected in the systemic marginalization of “non-standard” world English (WE) varieties that form part of many English learners’ linguistic repertoire. In response to this problem, the study introduces a Critical World Englishes-informed Pedagogy (CWIP) in an undergraduate course in an applied English linguistics program at a Hong Kong university. Our approach specifically involves various pedagogical tools and tasks designed to empower learners to invest in their identities as legitimate users of Hong Kong English (HKE) and other WE varieties. The pedagogy actively encourages learners to question colonial English ideologies, equips them with knowledge about WE, and fosters a critical examination of their perceptions and attitudes toward WE varieties—an approach that emphasizes multilingualism, cultural diversity, and multifaceted identities. Through an examination of a wide range of course artefacts and teacher-learner data through the lenses of positionality, voice, and authority under the decolonial framework of power transformation (Dei, 2019), our findings provide insights into the transformative impact of CWIP on power dynamics within TESOL classrooms. The findings reveal that CWIP effectively facilitates learners’ engagement with WE practices and contributes to the ongoing decolonization of educational practices in the region. Overall, this study highlights the need for critical, decolonizing pedagogy in TESOL—a pedagogy that values the richness of linguistic diversity and challenges traditional but insidious power structures in language education.

Springer

2025

AI NPCs in an Educational Metaverse: Evaluating the Effectiveness of Prompt Templates for Contextual Interactions

The paper evaluates the effectiveness and relevance of AI-powered non-player characters (NPCs) in educational settings within the metaverse platform Classlet, particularly within the domain of applied social sciences. Integrating Speech Act Theory, Prospect Theory, and the Recognition-Primed Decision Model into AI prompt templates enhanced interactions and positively impacted student engagement. The AI NPCs achieved a performance ratio of 128%, providing contextually relevant responses effectively maintained by the prompt templates. A survey on students’ perceptions of the AI and VR integration revealed positive user perceptions, with strong correlations between enjoyment, perceived usefulness, and intent to use. Correlational analysis showed a strong fit (R² = 0.816) for user intent, which had a mean of 74%. U-tests also indicated that female and non-VR users encountered more technical difficulties. Overall, the results suggest new ways of using AI and VR to promote learning engagement and interactions within applied social sciences.

Innovating Education with AI (Springer)

2025

Restricted restrictives on Twitter/X: Analysing and predicting English relativiser choice in the Philippine Twitter-/X-scape

This paper examines relativiser variation in human-antecedent restrictive relative clauses (RRCs) on Twitter/X – a digital platform that has hybrid written, spoken, and electronically mediated characteristics. It employs quantitative, computational methods to examine RRCs extracted from the Twitter Corpus of Philippine Englishes (TCOPE). The findings show notable disparities in the ranking and distribution of relativiser variants between EngPh in speech and on Twitter. Moreover, it was found that within EngPh on Twitter, intra-linguistic, stylistic and extra-linguistic factors jointly restrict or constrain restrictive relativiser choice, with intra-linguistic factors exerting a stronger influence than others. The results provide support for a probabilistic representation of relativiser variation that involves all these factors but assigns greater significance to structural ones.

Corpora (Edinburgh University Press)

2025

Predicting language choice in a digital medium: A computational approach to analyzing WhatsApp code-switching in Hong Kong

This paper explores how bilingual users in Hong Kong alternate between Cantonese and English in the digital messaging medium (specifically WhatsApp). The key question: can we build a predictive model of language choice (Cantonese vs English) at the word-level, using both linguistic (internal) and social (external) factors?

International Journal of Bilingualism (Sage)

2025

Mixed language in flux? The various impacts of multilingual contact on Lánnang-uè’s wh-question system

This study aims to investigate the interactions between speakers’ exposure to, frequency of use of, and proficiency in four languages (English, Tagalog, Hokkien, and Mandarin) and their influences on the wh-fronting-only wh-question system of Lánnang-uè, a mixed language used by the metropolitan Manila Lannangs. It also aims to test the validity of the assertion that symbiotic mixed languages are more likely to be in flux.

International Journal of Bilingualism (Sage)

2024 (online)

Sociolinguistic Analysis with Missing Metadata? Leveraging Linguistic and Semiotic Resources Through Deep Learning to Investigate English Variation and Change on Twitter

This paper highlights a language and sign-based computational solution to the problem of missing social metadata on Twitter (now, ‘X’): demographic prediction using Deep Learning. It aims to apply this method to variationist sociolinguistics research, illustrating how the approach can facilitate analyses with missing metadata (i.e. stylistic age and sex/gender) by deriving this metadata solely from publicly available linguistic and semiotic resources on Twitter profiles (e.g. display pictures and biographies). I use my investigations of English tweets from the Philippines and Hong Kong as case examples, examining the extent to which the use of the copula and the use of will-shall modals on social media are conditioned by diachronic factors as well as factors internal and external to language (e.g. social factors). The results reveal the influence of stylistic gender and age as well as other factors on patterns of variation.

Applied Linguistics (Oxford University Press)

2024 (online)

Adverbial confirm in Colloquial Singapore English: Insights from a text message corpus

In this paper, we provide morphosyntactic and semantic evidence to show that adverbial confirm behaves like a strongly subjective speaker-oriented adverb (SpOA). We argue that adverbial confirm developed as a result of contact with the Mandarin adverbial kending, likely due to English–Mandarin bilingualism amongst Chinese Singaporeans, although its use has proliferated throughout the general CSE-speaking population. Quantitatively, adverbial confirm emerges as the most frequently used SpOA when compared to surface-semantically equivalent strongly subjective SpOAs, such as definitely, sure, for sure and surely. The results suggest an increase in preference by CSE speakers to use confirm to express strong speaker certainty.

Asian Englishes (Routledge)

2024 (online)

Philippine Englishes in the Sino-Philippine Lannang context: Towards a concentric-pluricentric interactional-interplanar model of English

This article explores the relationship between Philippine English and the Lannangs, individuals with Filipino and Southern Chinese cultural heritage. It highlights the multifaceted nature of this English variety by discussing how it interacts with non-English languages in contemporary Lannang communities located in Manila, Iloilo and Cebu. Using corpus data from 10 Lannang linguistic varieties used in these areas, I found that Philippine English has assumed three primary linguistic roles (that of a lexifier, a substrate and a lexical contributor), dynamic roles that are conditioned by the (sociohistorical) context, domain of use and conventions of the specific community. In conjunction with existent accounts of non-Lannang Philippine English(es), my findings justify complexity-based representations or models of Philippine English that have multiple levels (degrees of English influence) and centres (ethno-regional context) that interact with each other across different planes (e.g. social), such as the concentric-pluricentric interactional-interplanar (CPII) model proposed in this paper. They problematize and challenge the notion of a monolithic Manila-based ‘Philippine English.’ The proposed model presents a framework/blueprint for analyzing English in multilingual settings.

World Englishes (Wiley)

2024 (online)

When to (not) split the infinitive: Factors governing patterns of syntactic variation in Twitter-style Philippine English

The paper investigates why and how speakers of Philippine English (PhE) use split infinitives (e.g., to boldly go) versus alternative adverb placements, focusing on Twitter-style PhE. Using a 135-million-word Twitter corpus and a Bayesian multinomial regression model supported by deep-learning demographic inference, it examines both language-internal factors (such as stress, rhythm, adverb type, and length) and language-external factors (like time, age, sex, and geography). The study finds that both sets of factors interact to influence adverb placement, though some patterns diverge from previous findings in American English. By analyzing this underexplored variety, the paper offers a comprehensive, probabilistic account of modified infinitive variation and contributes to understanding how Philippine English negotiates stylistic and structural norms in digital contexts.

English Language & Linguistics (Cambridge University Press)

2024 (online)

Sociolinguistic variation in Colloquial Singapore English sia

Colloquial Singapore English (CSE), also known as ‘Singlish’, features a wide range of sentence-final particles (SFPs) influenced by local languages such as Hokkien, Cantonese, Mandarin and Malay. This study focuses on the SFP sia, a relatively new and less-explored particle with Malay roots. We examine sia and its variants (sia, sial, siak and siol) using data from the Corpus of Singapore English Messages, a 6.9-million-word text-message corpus from 2016 to 2022. While previous research has associated sia and its variants with strong illocutionary contexts, particularly among young male Singaporeans due to its vulgar and masculine connotations, our data indicate that sia is now used more broadly among CSE-speaking youth. It is employed in both strong and weak illocutionary contexts, suggesting a shift away from its negative/vulgar associations. Sia and its variants are emerging as general phatic markers reflecting the identity of CSE-speaking youth.

World Englishes (Wiley)

2024 (online)

World Englishes pedagogy: constructing learner identity

This paper investigates how World-Englishes-based teaching reshapes learner identity. Through classroom reflection data, it shows students shift from native-norm orientation to confidence in localized English use. The approach promotes critical language awareness and inclusive pedagogy.

ELT Journal (Oxford University Press)

2024 (online)

Clause-Final Adverbs in Colloquial Singapore English Revisited

This paper investigates the use of clause-final adverbs such as already, also, and only in Colloquial Singapore English. Using chat-message data from the CoSEM corpus, it models age, gender, and semantic effects on usage. Younger speakers favour already and only, while also remains stable. Results show ongoing grammaticalization and link CFA variation to bilingual English-Mandarin contact.

Journal of English Linguistics (Sage)

2024 (online)

The predictive role of L2 motivation in receptive and productive informal digital learning of English: A chain mediation model

This article draws attention to the ways in which Hong Kong university students are motivated to engage in informal digital learning of English (IDLE) activities. Drawing upon the second language (L2) Motivational Self System (Dörnyei, 2009), it seeks to examine how Hong Kong students’ L2 motivational components, including the ideal L2 self, the ought-to L2 self, and L2 learning experiences, predict their receptive IDLE activities (RIA) and productive IDLE activities (PIA). Using an adapted and validated questionnaire, we collected data from 310 undergraduate students in a Hong Kong public university and analyzed the data following a structural equation modeling approach (Collier, 2020).

Digital Applied Linguistics (Castledown)

2024 (online)

The Holistic Advantage: Unified quantitative modeling for less-biased, in-depth insights into (socio)linguistic variation

This paper explores the consequences of choosing a piecemeal rather than a unified quantitative approach for data interpretation and, consequently, (socio)linguistic theorization. Utilizing Twitter-style English in the Philippines (EngPH) as a case study, I employ the Twitter Corpus of Philippine Englishes (TCOPE) primarily to investigate and elucidate variations in three morphosyntactic variables that have been previously examined using a piecemeal approach. I propose a holistic quantitative approach that incorporates documented linguistic, social, diachronic, and stylistic factors in a unified analysis. The paper illustrates the impacts of adopting this holistic approach through two statistical procedures: Bayesian regression modeling and Boruta feature selection with random forest modeling. In contrast to earlier research findings, my overall results reveal biases in non-unified quantitative analyses, where the confidence in the effects of certain factors diminishes in light of others during analysis. The adoption of a unified analysis or modeling also enhances the resolution at which variations have been examined in EngPH. For instance, it highlights that presumed ‘universals’, such as the hierarchy of linguistic > stylistic > diachronic > social factors in explaining variation in some domains, are contingent on the specific variable under examination. Overall, I argue that unified analyses reduce data distortion and introduce more nuanced interpretations and insights that are critical for establishing a well-grounded empirical theory of EngPH variation and language variation as a whole.

Languages (MDPI Switzerland)

2024

Revitalizing Attitudes Toward Creole Languages

The objective of this chapter is to revitalize attitudes toward Creole languages: to refresh, reroute, and redefine how these languages are perceived, presented, and discussed, particularly in the Global North (cf. Braithwaite & Ali, this volume). This is a key aspect of moving away from hegemonic paradigms and toward social justice and decolonization, which we take to mean forefronting as researchers, teachers, and language users a liberated, anti-exceptionalist narrative about Creoles and their users that emphasizes their normalcy, naturalness, creativity, diversity, and resilience.

Decolonizing Linguistics (Oxford University Press)

2024

New Dimensions: The Impact of the Metaverse and AI Avatars on Social Science Education

This study investigates the integration of the metaverse and artificial intelligence (AI) avatars in social science education, amidst global challenges that necessitate holistic approaches for cultivating critical global citizenship. Despite the promising capabilities of virtual reality (VR) for immersive learning, the effectiveness of the metaverse and the role of AI in enhancing educational experiences remain underexplored. The central question addressed is the impact of immersive learning experiences in the metaverse, including AI avatars, on students’ perceptions and intentions for using this technology for social science education. Findings reveal significant student engagement with immersive learning, aligning with experiential learning theories and suggesting potential for complex topic exploration. However, students found environments with AI avatars harder to use than those without. Given the limitations, such as the small and specific participant pool and brief engagement duration, further research is warranted to understand the long-term effects of immersive learning in the metaverse and the advanced integration of AI avatars across various educational disciplines.

Blended Learning: Intelligent Computing in Education (Springer)

2024

The MULTI Project: Resources for enhancing multifaceted creole language expertise in the linguistics classroom

This paper introduces open teaching resources for creole linguistics that challenge deficit framings. It details design principles and classroom applications promoting linguistic equity. The project broadens access to creole studies and fosters anti-racist pedagogy.

American Speech (Duke University Press)

2024

Variation in Asian and Pacific Islander North American English: What the patterns of scholarship demonstrate about race in sociolinguistics

This paper surveys sociolinguistic studies on Asian and Pacific Islander Englishes in North America. Mapping published work shows heavy focus on East Asian speakers and under-representation of Pacific and Southeast Asian groups. It highlights research inequities and urges more inclusive, community-based approaches.

Asia Pacific Language Variation (John Benjamins)

2024

Broadening horizons in the diachronic and sociolinguistic study of Philippine English with the Twitter Corpus of Philippine Englishes (TCOPE)

This paper presents the Twitter Corpus of Philippine Englishes (TCOPE): a dataset of 27 million tweets amounting to 135 million words collected from 29 cities across the Philippines. It provides an overview of the dataset, and then shows how it can be employed to examine Philippine English (PhilE) and its relationship with extralinguistic factors (e.g. ethno-geographic region, time, sex). The focus is on the patterns of variation involving four PhilE features: (1) irregular past tense morpheme -t, (2) double comparatives, (3) subjunctive were, and (4) phrasal verb base from. My analyses corroborate previous work and further demonstrate structured heterogeneity within PhilE, indicating that it is a multifaceted and dynamic variety. TCOPE has shown itself to be useful for exploring both the “general” features of contemporary PhilE and the different forms of variation within it. It contributes to a deeper understanding of Philippine English(es) over time and in different social contexts.

English World-Wide (John Benjamins)

2023 (online)

Advancing Sino-Philippine linguistics and sociolinguistics using the Lannang Corpus (LanCorp) – a multilingual, POS-tagged, and audio-textual databank

This paper introduces the Lannang Corpus (LanCorp), a public 375,000-word collection of raw and transcribed recordings of Lannang languages spoken in metropolitan Manila, which have been annotated with part-of-speech tags and linked to 40 types of sociolinguistic metadata. It begins by providing an overview of the LanCorp (e.g. design, formats, accessibility). Then, it goes on to show various examples of how the corpus can be used for variationist sociolinguistic research, using Lánnang-uè data as a case study. The findings from the exploratory studies indicate that Lannang languages are influenced by sociolinguistic factors, demonstrating the intricate nature of the Sino-Philippine sociolinguistic ecology. Due to its large size, sociolinguistic metadata, and various formats, LanCorp can be used to study Lannang languages in general and how they are used by specific social groups. It enables scholars to investigate multilingual interactions in a wide range of sociolinguistic factors, furthering the field of Sino-Philippine (socio)linguistics.

International Journal of Corpus Linguistics (John Benjamins)

2023 (online)

Spread, stability, and sociolinguistic variation in multilingual practices: the case of Lánnang-uè and its derivational morphology

This study examines nominal derivational affixes in a multilingual practice in the Philippines involving Hokkien, Tagalog, and English called Lánnang-uè. A feature of this practice is the systematic combination of affixes and roots (henceforth, ‘system’). Certain morphological combinations (e.g. Tagalog prefixes + English root) are used frequently and are regarded by Lánnang-uè users as well-formed, while others are not. This paper seeks to examine the spread, stability, and possible patterns of sociolinguistically-conditioned variation involving this system in the Lánnang-uè-speaking community. I conducted an acceptability judgment experiment involving 65 users in Manila and found high rates of spread and stability within my sample. Factors such as age, sex, and attitudes towards mixing selectively conditioned how some speakers adhered to the system. For example, older users tended not to follow the affix source language, length, and position conditions of the system whereas male users only tended not to follow the first condition. Based on the findings, I argue that the derivational affixation system exhibits conventionalisation, and that it emerged due to identity negotiation practices led by younger and female users. I also argue that conscious positive attitudes towards mixing help shape the stable development of multilingual practices.

International Journal of Multilingualism (Routledge)

2023 (online)

From tweets to trends: analyzing sociolinguistic variation and change using the Twitter Corpus of English in Hong Kong (TCOEHK)

This article introduces the Twitter Corpus of English in Hong Kong (TCOEHK)—a 123-million-word dataset covering tweets from all 18 districts and three macro-regions of Hong Kong (2010–2022). It demonstrates the corpus’s analytic potential through four variables in English in Hong Kong (EngHK): tense marking, -ize/-ise suffix choice, adverb position, and copula (non-)use. Results show that variant distributions align with known Hong Kong English (HKE) patterns while also revealing nuanced effects of linguistic, stylistic, and social factors across variables. The study underscores EngHK’s dynamic, multi-layered variation and positions TCOEHK as a robust resource for tracking ongoing change in digital Hong Kong English.

Asian Englishes (Routledge)

2023 (online)

The Sociolinguistics of Code-switching in Hong Kong’s Digital Landscape: A Mixed-Methods Exploration of Cantonese-English Alternation Patterns on WhatsApp

This paper examines the prevalence of Cantonese-English code-mixing in Hong Kong through an under-researched digital medium. Prior research on this code-alternation practice has often been limited to exploring either the social or linguistic constraints of code-switching in spoken or written communication. Our study takes a holistic approach to analyzing code-switching in a hybrid medium that exhibits features of both spoken and written discourse. We specifically analyze the code-switching patterns of 24 undergraduates from a Hong Kong university on WhatsApp and examine how both social and linguistic factors potentially constrain these patterns. Utilizing a self-compiled sociolinguistic corpus as well as survey data, we discovered that those who identified as male, studied English, and had an English medium-of-instruction (EMI) background tended to avoid intra-clausal code-switching between Cantonese and English.

Journal of English and Applied Linguistics (De La Salle University Press)

2023

Variability in clusters and continuums: The sociolinguistic situation of the Manila Lannangs in the 2010s

This study explores the sociolinguistic situation of a metropolitan Manila Lannang community based on data gathered between 2017 and 2020. A survey was administered to 117 individuals to probe into various dimensions of self-reported language use (e.g., proficiency, confidence) and attitudes (e.g., pride). The results show that, among the Lannangs, there is a range of language use and attitudes, with age and other social factors such as identity impacting the scope of this variability. This variability appears to progress along a continuum in some areas, while forming cluster patterns in others. An examination of the contemporary data alongside data from investigations done in the late 1980s and 1990s reveals some disparities, pointing to generational shifts in language use. The findings demonstrate that the sociolinguistic situation of the Manila Lannang community is unique, dynamic, and complex, enabling us to gain some insights and a nuanced view of the sociolinguistic landscape of the broader Asia-Pacific region.

Asia Pacific Language Variation (John Benjamins)

2023

The Corpus of Singapore English Messages (CoSEM)

This article introduces the first version of the Corpus of Singapore English Messages (CoSEM), a 3.6-million-word monitor corpus of online text messages collected between 2016 and 2019, compiled and managed by a group of scholars who share an interest in Colloquial Singapore English (CSE) research. The paper explains the motivations behind developing a new corpus for the investigation of CSE. It also documents the process of compiling and organizing CoSEM and describes the corpus’s initial structure and composition. We further discuss the social variables used in tagging the data, as well as ethical challenges, advantages, and disadvantages unique to online message datasets. In addition, we present preliminary analyses of two selected CSE features: (1) the Hokkien-derived expression (bo)jio and (2) sentence-final adverbs (already, also, only). As CoSEM is an ongoing project, we conclude the article with notes on future directions.

World Englishes (Wiley)

2023

'Is it' in Colloquial Singapore English: What variation can tell us about its conventions and development

The paper investigates how Singapore English speakers use the question-marking construction is it in online text messaging, focusing on its linguistic, social, and pragmatic variation. Using data from the 3.6-million-word Corpus of Singapore English Messages (CoSEM), the study analyzes 1,902 instances of is it across social groups and contexts. Through statistical modeling and native-speaker coding, it examines how factors such as syntax (polarity vs. clause-initial), orthography (e.g., is it, izzit, issit), social variables (age, gender, race, nationality), and pragmatic meanings (rhetoricity and affect) influence usage. The findings aim to uncover how is it functions as a pragmatic marker in everyday CSE discourse, reflecting speakers’ playful, mocking, or rhetorical stances and broader sociolinguistic patterns in digital interaction. Overall, the study contributes to understanding variation and grammaticalization in Singapore English through an innovative analysis of “finger speech” in online messaging.

English Today (Cambridge University Press)

2022

“Truly a Language of Our Own” A Corpus-Based, Experimental, and Variationist Account of Lánnang-uè in Manila

The paper aims to determine whether Lánnang-uè, as spoken in Manila, functions as a full language or merely as an ad-hoc mixture of others. Specifically, it seeks to locate Lánnang-uè along a continuum of “languageness” by examining how systematic, stable, widespread, and socially meaningful its linguistic patterns are. Drawing on multiple methods—descriptive, corpus-based, experimental, computational, and ethnographic—the study investigates whether Lánnang-uè exhibits properties typical of established languages (such as internal structure and social indexing). Ultimately, it argues that Lánnang-uè shows strong evidence of being a mixed language with high degrees of independence and coherence, challenging folk perceptions that it is simply “broken Hokkien.”

University of Michigan

2022

From Malay to Colloquial Singapore English: A case study of sentence-final particle sia.

In this paper, we offer a preliminary report on the ongoing CoSEM project by focusing on the sentence-final particle (SFP) sia. Its distribution and usage patterns make this particle a good example of CSE’s fluidity. Our goal is to demonstrate a developmental path of sia as well as the corpus’s advantages and disadvantages. For the latter, we are particularly concerned with the practical and ethical considerations of collecting and using online text message data, and with possible future directions for the project. Before moving on to the discussion of the study, we share some sociohistorical background information relevant to Singapore’s language ecology.

LINCOM

2022

Hybridization

In this chapter, I investigate hybrids related to Philippine English (PhE) using a bottom-up approach. I survey related works and analyze linguistic data with the goal of broadening the traditional PhE field by bringing studies of language contact, language documentation, and diaspora sociohistory into PhE research. Another goal of this chapter is to propel studies of PhE beyond the homogenizing paradigm, fulfilling the goal of the field of world Englishes, that is, to study the varied uses of English in various contexts around the world (Smith, 1981).

Philippine English: Development, Structure, and Sociology of English in the Philippines (Routledge)

2022

The Lannang Corpus (LanCorp): A POS-tagged, sociolinguistic corpus containing recordings and transcriptions of Lannang speech collected from the metropolitan Manila Lannangs between 2016 and 2020

This paper introduces a 375,000-word, POS-tagged, audio-linked corpus of multilingual Lannang speech from Metro Manila. It documents how Philippine Hokkien, Tagalog, English, and Lánnang-uè intertwine in everyday interaction. The corpus design, metadata, and annotation pipeline are detailed to support studies of variation, code-switching, and contact. It offers the first open empirical foundation for Sino-Philippine multilingual research.

Deep Blue Data, Deep Blue Repositories

2022

Filipino, Chinese, neither, or both? The Lannang identity and its relationship with language.

This paper focuses on a specific community of individuals who have mixed Southern Chinese and Filipino cultural heritage in the Philippines – the ‘Lannangs’. I investigate the Lannang identity and, with ethnographic interviews and survey data, propose that the identity should be broadly defined as comprising four dynamic parts: being Filipino, being Chinese, being neither, and being both. Focusing on the Manila community, I show how the Lannangs navigate between these orientations depending on the social context and the interlocutors. Moreover, drawing on the notions of indexicality and simultaneity, I investigate the role of language in the characterization of the Lannang identity. I also show that Hokkien, Lánnang-uè, Tagalog, and English, among other languages, are used to embody aspects of ‘Lannang-ness’.

Language & Communication (Elsevier)

2021

Interactions of Sinitic languages in the Philippines: Sinicization, Filipinization, and Sino-Philippine language creation

This chapter surveys linguistic codes related to Sinitic languages in the Philippines and examines the outcomes of their contact with local Philippine languages. It outlines how processes of Filipinization, Sinicization, and Sino-Philippine language creation have shaped the linguistic landscape. Focusing on Southern Min (Hokkien), Cantonese, and Mandarin, the chapter provides a descriptive overview of their use and transformation within the Philippines. It also highlights emergent hybrid languages resulting from these interactions, centering on four key speaker groups: Filipinos, Lannangs, Mainland Chinese migrants, and Sangleys, whose linguistic practices reflect the ongoing negotiation of identity, migration, and heritage across time.

The Palgrave Handbook of Chinese Language Studies (Palgrave Macmillan)

2021

Two Englishes diverged in the Philippines? A substratist account of Manila Chinese English

This study explores distinctive features of Manila Chinese English (MCE), an English contact variety used by Chinese Filipinos in metropolitan Manila. After comparing the frequencies of selected features observed in a 52,000-word MCE database with frequencies in Manila English and American English corpora, this study found that a distinct variety – MCE – most likely emerged in the 1960s due to the extensive contact between general Manila English and local tongues of Chinese Filipinos such as (Hybrid) Hokkien and Tagalog, which function as MCE’s substrate languages. This study takes into account MCE’s structure, sources, and genesis, and discusses MCE in relation to Philippine English as positioned in Schneider’s dynamic model, to demonstrate how intergroup variations coexist but take divergent paths within a world Englishes (WE) variety.

Journal of Pidgin and Creole Languages (John Benjamins)

2020

Ethnic and gender variation in the use of Colloquial Singapore English discourse particles

This study uses the Corpus of Singapore English Messages (CoSEM), a large-scale corpus of texts composed by Singaporeans and sent using electronic messaging services, to investigate gender and ethnic factors as predictors of particle use. The results suggest a strong gender effect as well as several particle-specific ethnic effects. More generally, our study underlines the special nature of the grammatical class of discourse particles in CSE, which is open to new additions as the sociolinguistic and pragmatic need for them develops.

English Language and Linguistics (Cambridge)

2020

Vowel system or vowel systems? Variation in the monophthongs of Philippine Hybrid Hokkien in Manila

The paper examines how the vowel system of Manila Lánnang-uè (Philippine Hybrid Hokkien) reflects its development as a mixed language formed through contact among Hokkien, Tagalog/Filipino, and English. Using acoustic data from 34 speakers and analyzing vowel variation through Pillai scores, the study investigates how far the vowel systems of these source languages have converged. It finds that Lánnang-uè possesses a distinct, unified eight-vowel inventory, rather than a simple blend of its inputs—evidence of structural integration. Sociophonetic patterns show that older women lead in vowel stability, likely playing a key role in the formation of the mixed code, while younger speakers show shifts that may signal either language endangerment or ongoing evolution tied to changing community identity. Overall, the study highlights how Lánnang-uè’s phonological system embodies both its hybrid origins and its speakers’ sociocultural history.

Journal of Pidgin and Creole Languages (John Benjamins)

2020

Trilingual code-switching using quantitative lenses: An exploratory study on Hokaglish

Adopting a quantitative approach, this paper highlights findings of an exploratory study on Hokaglish, initially describing it as a trilingual code-switching phenomenon involving Hokkien, Tagalog, and English in a Filipino-Chinese enclave in Binondo, Manila, the Philippines. Departing from the (socio)linguistic landscape of the archipelagic nation, the discussion eventually leads to a frequency-based description of this phenomenon. Preliminary findings suggest that, in Hokaglish, code-switching from Hokkien to English appears to be the most frequent code-switching combination among the six possible ones and that it is typically found in religious institutions. Hokaglish also yielded more attestations of intrasentential code-switching than of intersentential code-switching, particularly in households. Moreover, findings indicate that switches at the word level are very frequent and that morphological code-switching is virtually non-existent in Hokaglish conversations. The paper ends with a discussion that offers tentative explanations for the findings.

Philippine Journal of Linguistics (Linguistic Society of the Philippines)

2017

Split infinitives across world Englishes: A corpus-based investigation

This article investigates split infinitives in 12 World Englishes using Kachru’s concentric circles framework. Beginning with a brief description of split infinitives, the article explores two significant aspects of splitting: the most common ‘breakers’, and split infinitive use across different genres and domains. Drawing on the International Corpus of English, findings reveal that split infinitive use in Inner Circle and Outer Circle Englishes exhibits both similarities and differences. The seemingly contradictory data indicate that the split between Inner and Outer Circle Englishes is not as defined as Kachru initially hypothesized, but overlapping. While the similarities can partially be attributed to the prevailing first language (L1) prescriptive norms in the Outer Circle, the perceptible divergences in split infinitive use are mainly argued to involve subconscious substratum transfer and identity-formation processes; the deviations from L1 norms can be viewed as a sign of nativization and, perhaps, differentiation from their ex-colonizers or settlers’ English(es).

Asian Englishes (Routledge)

2017

Language contact in the Philippines: The history and ecology from a Chinese Filipino perspective

This article narrates the sociohistory of the Philippines through the lens of a Sinitic minority group – the Chinese Filipinos. It provides a systematic account of the history, language policies, and educational policies in six major eras, beginning from the precolonial period until the Fifth Republic (960 – present). Concurrently, it presents a diachronic narrative on the different linguistic varieties utilized by the ethnic minority, such as English, Hokkien, Tagalog, and Philippine Hybrid Hokkien (PHH). Following an exposition on how these varieties were introduced to the ecology is a discussion focused on contact that highlights potential theories as to how Philippine contact varieties like PHH emerged. How this account contributes to the overall language ecology forms the conclusion. Overall, this article delineates the socio-historical sources that intrinsically play a significant role in the (re)description of Philippine contact varieties. In its breadth, this article goes beyond providing second-hand information, and presents ideas that can be crucial for understanding how Philippine contact languages work.

Language Ecology (John Benjamins)

2017

Philippine Englishes

This paper argues that scholars should adopt the notion of Philippine ‘Englishes’ to acknowledge all substrate-influenced ‘regional’ (e.g. Iloilo English), social, and hybrid varieties (e.g. Hokaglish). Beginning with a brief overview of the current situation, it examines literature hinting at the invalidation of a standard Philippine English, identifying some evidence of variation due to (socio)linguistic factors through a concise survey of local Englishes. The study asserts that the Philippine Englishes model is more encompassing and forward-looking; it also shows some evidence that Philippine English is at the dawn of stage 5 (differentiation) of Schneider’s dynamic model. Although this model might raise more questions, it aims to challenge researchers to embark on new-wave investigations of local Englishes while encouraging them to utilize existing research and frameworks. Ultimately, by introducing this model, the study hopes to provide a fresh perspective on the large body of literature on Philippine English.

Asian Englishes (Routledge)

2017
