When fiction is cataloged, keywords never tell the whole story—but they can be suggestive. Above are selected Library of Congress keywords for eight well-known novels. Can you identify the books? Key: (A) Virginia Woolf, Mrs. Dalloway. (B) Nathaniel Hawthorne, The Scarlet Letter. (C) Ira Levin, Rosemary’s Baby. (D) Tom Wolfe, The Bonfire of the Vanities. (E) F. Scott Fitzgerald, The Great Gatsby. (F) Donna Tartt, The Secret History. (G) J. D. Salinger, The Catcher in the Rye. (H) Toni Morrison, Beloved. (Illustration by Nicole J. Melton)

The problem with metadata

What’s in a keyword? Mark Athitakis, AB’95, turns a critical eye on how the Library of Congress and the New Yorker classify fiction.

You’re reading a novel. “What’s it about?” somebody asks. What do you say?

The question grates; there’s no good answer. Book reviewers are trained to avoid all but the briefest sketch of a novel’s story line, because we know that plot summary tends to bore people (“Well, there’s this couple, and they have three kids, and it’s 1986, and they’re unhappy because …”). Talking about themes and ideas instead doesn’t improve matters. Done wrong—and it often is, in conversation—it comes off as highfalutin (“Well, it’s about this couple, but it’s really about how globalization, particularly when it comes to personal technology …”). Maybe it’s best to just answer the question with a grunt about setting and characters (“It’s about an unhappy couple. In rural Oregon ...”).

I imagine this struggle going on among the world’s librarians and metadata experts whenever I look at the Library of Congress cataloging information for a work of fiction. For instance, here’s the complete listing for an acclaimed 2006 novel celebrated for its verve, wit, and sprawl:

1. Young women—Fiction.

An older novel, a National Book Award winner by one of American literature’s signature 20th-century authors, reads:

1. Americans—Mexico—Fiction.
2. Failure (Psychology)—Fiction.
3. Chicago (Ill.)—Fiction.
4. Depression—Fiction.
5. Young men—Fiction.
6. Mexico—Fiction.

And, back to the present again, a relatively recent Pulitzer Prize winner:

1. Greek Americans—Fiction.
2. Detroit (Mich.)—Fiction.
3. City and town life—Fiction.
4. Suburban life—Fiction.

If you keep up with fiction at all, you can probably take a good guess at the last two books. (No need to prolong the mystery: in order, they’re Marisha Pessl’s Special Topics in Calamity Physics, Saul Bellow’s The Adventures of Augie March, and Jeffrey Eugenides’s Middlesex.) But few people would discuss what those novels are about in the Library of Congress’s terms. Indeed, the information for Middlesex seems to avoid the book’s most relevant plot point (Hermaphroditism—Fiction).

For better or worse, my obsession with the limitations of cataloging information has only expanded. On a weekly basis, I’m amused and baffled by the metadata attached to the short stories on the New Yorker’s website; for a little while last year, I got in the habit of logging examples at my Tumblr (markathitakis.tumblr.com). The task of trying to reduce the ineffable qualities of fiction to streams of keywords feels at once charming and childish, like trying to capture moonlight in a jar.

New stories on the New Yorker’s website are keyworded with an entertaining profligacy, as in the case of the Jonathan Lethem story that inspired me to start logging them in the first place:

Pornography. Clerks. Stores. Threesomes. Sex. Videos. New York City. Critics. Reviewers. Transsexuals. Sex Machines. Vomit.

George Saunders’s “The Semplica-Girl Diaries” is tagged with a series of keywords that reads like the exploded id of postcapitalist America: Lottery Winners. Poor People. Illegal Immigrants. Yards. Hoisting. Daughters. Wealth. Birthday Parties. Diaries. Rich People. Arrangements. Girls. Credit Card Debt. Microlines. Economic Classes.

Though the trains of words seem silly when strung together, keywording is serious business. Editors are now constantly logging, tagging, keywording, categorizing, metadata-ing. It is tedious but essential work. The Great God CMS must be pleased. Because there is no telling how articles—sorry, “content”—will be used in the years to come, those words are the necessary toeholds for future databases. And because nobody knows what information we’ll need years (centuries?) from now, the more keywording the better. The New Yorker has done its bit to make sure that anybody researching the role of sex machines, or vomit, in the first decade of the Tea Party era won’t miss the chance to reckon with Jonathan Lethem’s short story “The Porn Critic.”

Older New Yorker stories are keyworded much more parsimoniously. Perhaps this is because the responsible party is concerned only with finding the essence of a story, but more likely it’s because the work is being done in a hurry. Philip Roth’s (AM’55) New Yorker debut, “The Kind of Person I Am”—published in 1958, when he was teaching English at the University of Chicago—is keyworded thusly: Analysis of Habits & Tastes; Parties.

The keywording for “Unguided Tour,” a 1977 story by Susan Sontag, AB’51, is likewise short and sweet: Love; Travel.

Even so, a few classics are so well known that a handful of words are enough to identify them. If you studied English in high school, you know this one: Lots; Mob Violence; Small Towns; Stoning.

This one too: Adolescence; Bathing Suits; New England; Supermarkets.

And any inveterate New Yorker reader can guess this one: Bullet Park; Drinking; Swimming Pools.

Those scattered terms can be enough to let you know what’s in Shirley Jackson’s “The Lottery,” John Updike’s “A&P,” and John Cheever’s “The Swimmer.” But they’re not enough to say what the stories are about. Emotional states don’t get keyworded at the New Yorker. There’s nothing in the metadata for Vladimir Nabokov’s “Symbols and Signs” (Insane; Birthdays; Children; Parents; Russia, Russians; Gifts: New York City; Immigrants) that gets at its tone of emotional devastation, the despair in its line about “neglected children humming to themselves in unswept corners.” The three keywords for Alice Munro’s “A Wilderness Station” (Canada; Letters; Murder) are comically insufficient at summarizing a story about guilt, accusation, and suppression that stretches across decades.

So be it. If fiction could be summarized in a series of nouns it would stop being fiction; its abstractions render abstracts meaningless, or at least beside the point. Still, I was disappointed to see how shabbily James Thurber has been treated by the keepers of the New Yorker archives. “The Secret Life of Walter Mitty,” for instance, is entirely bereft of relevant keywords. (Just “The New Yorker, magazine, subscription”—when in doubt, pitch a subscription, apparently.) If you want to know what “The Secret Life of Walter Mitty” is about, you’re going to have to read it—which, in a perfect world, is just as it should be.


Mark Athitakis is a magazine editor and book critic living outside Washington, DC. His reviews have appeared in the New York Times, Washington Post, Barnes & Noble Review, and other publications, and he serves on the board of directors of the National Book Critics Circle. Since 2008 he has maintained the literary blog American Fiction Notes, where a version of this essay originally appeared.