Nine years ago, Long Le-Khac set out to use data analytics to map the settings in Asian American literature. Before he could begin, he had to decide which works should be considered “Asian American” for his analysis. This question led him to a broader inquiry about the meaning of the term and who is included under its definition.
Le-Khac, now an assistant professor in Berkeley’s ethnic studies department, said that examining this definition became a central focus of his research. “Wrestling with that definition,” Le-Khac explained, was a “digression that increasingly got bigger and bigger and richer and more interesting to me.”
This work has resulted in a newly published database cataloging 1,900 entries of what scholars have considered part of the Asian American literary canon. Le-Khac hopes the resource will help others trace how that canon has evolved. “How has the canon of Asian American literature changed over time, as defined by the people who study it?” he asked.
The dataset begins in 1971, after the term “Asian American” was introduced in 1968 by two UC Berkeley student activists seeking unity among different ethnic groups and rejecting previous labels. Scholars began using this phrase before it entered popular culture, shaping which works received attention as part of Asian American literature.
Le-Khac noted that definitions differ among academics. Some classify any work by an American writer of Asian descent as “Asian American,” while others focus on content relating specifically to Asian American experiences. He questioned whether works adopting aesthetics from Asian cultures or responding to stereotypes—such as adaptations of Madame Butterfly—should also be included.
“When we were gathering data, we tried not to impose our own definitions so that we could capture the whole mess of various definitions colliding with each other in the formation of this canon,” Le-Khac said.
To build the dataset, the researchers wrote code to scan academic publications that either carried the keyword “Asian American” or appeared in specialized journals. They extracted every piece of media discussed in these sources, yielding 984 works by 783 authors, referenced 1,886 times.
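The selection step described above can be pictured with a short Python sketch. This is only an illustration under stated assumptions, not the team's actual code: the journal name, the `Publication` fields, and the idea that the works discussed have already been pulled out of each publication are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Tuple

KEYWORD = "asian american"
# Hypothetical placeholder; the article does not name the journals used.
SPECIALIZED_JOURNALS = {"Example Journal of Asian American Studies"}


@dataclass(frozen=True)
class Publication:
    title: str
    journal: str
    keywords: Tuple[str, ...]
    works_discussed: Tuple[str, ...]  # assumed to be extracted already


def in_scope(pub: Publication) -> bool:
    """A publication counts if it carries the keyword or sits in a listed journal."""
    has_keyword = any(KEYWORD in k.lower() for k in pub.keywords)
    return has_keyword or pub.journal in SPECIALIZED_JOURNALS


def count_mentions(pubs: Iterable[Publication]) -> Counter:
    """Tally how many in-scope publications discuss each work."""
    counts: Counter = Counter()
    for pub in pubs:
        if in_scope(pub):
            # set() so one publication counts a work at most once
            counts.update(set(pub.works_discussed))
    return counts
```

Note that the filter applies no definition of “Asian American” to the works themselves. It only asks which publications are in scope, which matches Le-Khac's stated aim of not imposing the team's own definitions.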
The collection includes works by playwright David Henry Hwang, poet Naomi Shihab Nye, and novelist and Berkeley professor emerita Maxine Hong Kingston, as well as comic books and films such as Crazy Rich Asians and Breakfast at Tiffany’s, the latter notable for its depiction of a Japanese character played by a white actor.
Researchers collected additional details for each entry: literary form, publication year, number of mentions in academic studies, publisher information, commercial success indicators (like New York Times bestseller status or awards), and author demographics such as gender and ethnicity.
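As a rough picture of what one entry might look like, and of the kind of question such fields make answerable, here is a minimal Python sketch. The field names, the `CanonEntry` type, and the `mention_share_by_decade` helper are assumptions for illustration; the published dataset's real column names and layout may differ.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class CanonEntry:
    # Hypothetical field names modeled on the details listed in the article.
    title: str
    author: str
    form: str                 # literary form, e.g. "novel" or "play"
    year: int                 # publication year of the work
    mentions: int             # number of mentions in academic studies
    publisher: str
    nyt_bestseller: bool
    author_gender: str
    author_ethnicity: str
    awards: List[str] = field(default_factory=list)


def mention_share_by_decade(
    entries: Iterable[CanonEntry],
) -> Dict[int, Dict[str, float]]:
    """For each publication decade, the share of mentions going to each author ethnicity."""
    totals: Dict[int, int] = defaultdict(int)
    by_group: Dict[int, Dict[str, int]] = defaultdict(lambda: defaultdict(int))
    for e in entries:
        decade = e.year // 10 * 10
        totals[decade] += e.mentions
        by_group[decade][e.author_ethnicity] += e.mentions
    return {
        decade: {g: n / totals[decade] for g, n in groups.items()}
        for decade, groups in by_group.items()
        if totals[decade] > 0  # skip decades with no recorded mentions
    }
```

Grouping by the work's publication decade is one choice among several; a study of shifting scholarly attention might instead group by the year of the citing publication, which this sketch does not model.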
Le-Khac emphasized that the project focuses on scholarly attention rather than popular recognition, but he plans future work documenting Asian American works outside academia.
The team intends to update the dataset every five years. The goal is for scholars to explore questions about representation within the canon using quantitative methods—a field known as cultural analytics.
“This canon we have built is uneven, as most canons are. But for a minority canon to replicate unevenness is a fraught and charged thing, and is worth scrutiny,” Le-Khac said. “One of the major impulses behind this dataset is to help us as a community of scholars look at what we are actually building and see if we need to do better.”
He pointed out findings from earlier research showing declines in representation for certain groups such as Filipino authors over time. The dataset may also allow researchers to investigate factors like geographic clustering or commonalities among bestselling writers.
Le-Khac observed that many authors highlighted by academics tend to have advanced academic degrees—a trend he described as problematic given efforts within Asian American studies to challenge stereotypes around educational achievement.
He is also studying how academic definitions compare with those emerging from broader audiences on platforms like Goodreads.
For some of the students who helped assemble the database, including Kate Hao and Taylor Huie, the process provided exposure to new stories reflecting the diverse backgrounds within Asian America. Huie found that researching these texts broadened their understanding beyond the readings assigned in high school.
“[This research] really broadened my horizons, and it got me a lot more excited about Asian American literary works,” they said.



